Comparison of WebCom in the context of Job Management Systems

Dr. John P. Morrison, Mr. Brian Clayton, Mr. Adarsh Patil
Centre for Unified Computing, Dept. of Computer Science, National University of Ireland, Cork, Ireland.
e-mail: j.morrison@cs.ucc.ie
Telephone: +353 21 4902795  Fax: +353 21 4274390

Abstract

A 2001 report for the NSA LUCITE Task Order on Productive Use of Distributed Reconfigurable Computing compares 12 Job Management Systems using 25 criteria designed to reflect "the most relevant system characteristics needed in the targeted distributed system for reconfigurable computing". This paper presents a brief overview of the WebCom metacomputer and uses those characteristics to position it in relation to existing systems.

Introduction to WebCom

WebCom [1] is a platform for the distributed execution of instructions over a network of computers. The basic architecture of WebCom consists of a master and an arbitrary number of client machines (Figure 1). The clients voluntarily contribute to the processing power of the WebCom machine by connecting to the master. Any machine with a Java Virtual Machine can become a client of WebCom.

Figure 1: WebCom architecture.

Execution begins on a single server, to which clients connect using TCP/IP socket connections. Clients typically connect to WebCom via a webserver. The clients download an applet from the server and then execute it. Clients can be promoted to become masters; when promoted, they can recruit clients of their own to execute their atomic instructions, becoming the root of another computation tree in the grid (Figure 2).

Figure 2: Client promotion.

Introduction to Comparison Study

In 2001, researchers from George Mason University [2] built on the work of two previous studies, the "Cluster Computing Review" by Baker et al. [3] and the Job Queuing/Scheduling Software comparative studies by Jones and Bricknell [4, 5], to profile and rank 12 Job Management Systems according to 25 criteria. The 12 systems evaluated were:

- Load Sharing Facility [6]
- Sun Grid Engine (formerly Codine) [7]
- Portable Batch System [8]
- RES [9]
- Condor [10]
- LEGION [11]
- MOSIX [12]
- GLOBUS [13]
- Network Weather Service [14]
- NetSolve [15]
- AppLeS [16]
- Compaq Distributed Computing Engine [17]

The system requirements used to compare each system were:

- Availability and quality of binary code, source code, documentation, roadmap, training and customer support.
- Operating systems/platforms: Linux, Windows NT, Solaris and Tru64 UNIX are defined as the required operating systems.
- Batch job support: batch jobs are defined as jobs not requiring any input, with long run times.
- Interactive job support: interactive jobs are defined as jobs with short turnaround times, which require input and output.
- Parallel job support: the JMS must be able to maintain control over a parallel job, tracking its subprocesses, terminating the job and any of its subprocesses, cleaning up after the job, and providing complete accounting for the job and its subprocesses.
- Resource requests by users: when submitting jobs, users must be able to specify resources such as the number of CPUs per job, the number of nodes per job and swap space.
- Limits on resources by administrators: administrators must be able to set limits, both for groups and for individual users, such as the number of CPUs per job, the number of nodes per job and disk usage.
- File stage-in and stage-out: all files required for job execution must be available on the target machine at execution time.
- Flexible scheduling: the Job Management System should support a range of scheduling policies and be able to alter the scheduling dynamically.
- Job priorities: there should be both user-assigned and system-assigned priorities.
- Timesharing: it should be possible to run multiple jobs on individual nodes.
- Impact on owners of computational nodes: local users should have priority for system resources over jobs executing on any node.
- Checkpointing: it must be possible to store runtime details for any job, so that execution of that job can be halted and resumed with minimal data loss.
- Suspending/resuming/killing jobs: the system must allow the suspension, resumption and killing of jobs by users, administrators and automatic processes.
- Process migration: it must be possible to migrate a job, along with all the information necessary to continue its execution, from one node to another.
- Static load balancing: the system should carry out load balancing according to suitable algorithms before job execution.
- Dynamic load balancing: load balancing by the system should be ongoing throughout job execution.
- Fault tolerance: the system should be able to recover both from job loss on a single node and from the loss of entire nodes, i.e. system failure.
- Job monitoring: the system should provide real-time monitoring of job execution.
- User interface: the JMS should provide at least a CLI to all of its modules; a GUI is preferable.
- Published APIs: the JMS should provide a well-documented API to all of its modules.
- Dynamic system reconfiguration: the system should allow the addition or removal of system resources during run time, ideally without any loss of job information.
- Security: the system should provide mechanisms for authentication, authorisation and encryption.
- Accounting: records of resource usage should be kept.
- Scalability: managing very large clusters (>500 nodes) and allowing very large parallel jobs (>200 nodes).

Evaluating WebCom

The WebCom system is now evaluated using the categories defined in [2].

Operating system, flexibility and user interface

Table 1 shows a summary of WebCom features related to the operating system, flexibility and the user interface.

Table 1: Operating system, flexibility and user interface
  Roadmap: Yes
  Linux support: Yes
  NT support: Yes
  Solaris support: Yes
  Tru64 UNIX support: Yes
  Other OS support: Yes (because of the JVM)
  Impact on owners of computational nodes: Minimal, due to the cycle-stealing approach
  Dynamic system reconfiguration: Yes
  User interface: Client and GUI
  API: No

WebCom is a non-grid-specific metacomputer. It is built on Internet technologies, using an implicitly parallel programming model [18]. WebCom is implemented in Java and so can be ported to any system for which a JVM is available; Windows and Linux versions are currently deployed. WebCom, together with Cyclone [19], adopts a cycle-stealing approach to processing work units. This approach allows the WebCom system to be used on ordinary workstations as well as dedicated grid machines, with no disruption to workstation users.
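As an illustration of this volunteer, cycle-stealing style of participation, the sketch below shows a minimal Java client that connects to a master over a TCP/IP socket, holds each work unit until the host machine is idle, and then returns a result. The line-based protocol, the VolunteerClient class and the idle test are hypothetical stand-ins, not the actual WebCom API.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

public class VolunteerClient {
    public static void main(String[] args) throws Exception {
        String masterHost = args.length > 0 ? args[0] : "localhost";
        int masterPort = args.length > 1 ? Integer.parseInt(args[1]) : 8000;

        try (Socket socket = new Socket(masterHost, masterPort);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()));
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {

            String workUnit;
            while ((workUnit = in.readLine()) != null) {
                // Cycle stealing: hold the unit until the host is idle before running it.
                while (!hostIsIdle()) {
                    Thread.sleep(5_000);
                }
                out.println(execute(workUnit));   // send the result back to the master
            }
        }
    }

    // Placeholder idle test; a real client would consult screensaver state or
    // recent user activity before volunteering cycles.
    private static boolean hostIsIdle() {
        return true;
    }

    // Placeholder execution; WebCom actually downloads and runs an applet
    // rather than interpreting a textual work-unit descriptor.
    private static String execute(String workUnit) {
        return "done:" + workUnit;
    }
}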
WebCom provides GUIs for operator use. These GUIs can be used to set initial parameters for job execution and to reconfigure the system dynamically. WebCom itself reacts to the grid state during execution, altering job execution according to the Condensed Graph.

Scheduling and Resource Management

Table 2 shows a summary of WebCom features related to scheduling and resource management.

Table 2: Scheduling and resource management
  Batch jobs: Yes
  Interactive jobs: Yes
  Parallel jobs: Yes
  Resource requests: Yes
  Limits on resources: Yes
  Flexible scheduling: Yes
  Job priorities: Yes
  Job monitoring: Yes
  Accounting: Yes

Workload in WebCom takes the form of applets downloaded by the clients. These can be batch jobs or interactive jobs. On completion of a batch job, the client sends the results back to WebCom. An interactive job can be assigned to execute on a particular client, receives its input and output from that client, and on completion sends its results back to WebCom. As the Condensed Graph model is implicitly parallel, all WebCom jobs are implicitly parallel. WebCom can replace the job executing on any client at any time: the workload on that client is terminated if necessary and a new workload is downloaded from a URL specified by the server.

WebCom's scheduling approach is extremely flexible. Scheduling decisions are made dynamically based on the state of the grid. A WebCom agent is present on each node in the grid; these agents communicate with each other to pass instructions, results and grid status information. At each node, scheduling actions are performed based on local resource information. WebCom itself can perform a certain amount of resource requisition, and the programmer can explicitly perform additional resource requisition. Instructions can be targeted at specific machines, either explicitly by the programmer or implicitly by the WebCom agents, based on resource availability and suitability. WebCom can request a particular executable, such as Word or Excel, from a client.

WebCom can use information on client machine usage to determine how to schedule jobs on that client. It can operate in four modes: Stopped, Screensaver, Minimal Impact and Mega. In Stopped mode, any executing jobs are terminated and waiting jobs are not launched. In Screensaver mode, WebCom launches jobs only if no user activity has been detected within a predetermined amount of time; this mode also ensures that a job in execution is halted when user activity is detected. In Minimal Impact mode, the job is launched with a low scheduling priority to minimize the impact on the user. Finally, in Mega mode, the job is launched with a high scheduling priority, without regard for the effect it may have on a user. The programmer can ensure that WebCom agents limit their memory usage.

Efficiency and Utilization

Table 3 shows a summary of WebCom features related to efficiency and utilization.

Table 3: Efficiency and utilization
  Stage-in, stage-out: Yes
  Timesharing: Yes
  Process migration: Yes
  Static load balancing: Yes
  Dynamic load balancing: Yes
  Scalability: Yes

Stage-in and stage-out are handled by the WebCom system, with no need for user interaction. File transfer in WebCom is managed through the applet model: the client accesses a URL specified by the server and downloads the work units automatically. WebCom provides timesharing by allowing many clients to run on a single node.
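The URL-based stage-in just described can be pictured with the following minimal Java sketch, in which a client fetches a work unit from a server-specified URL into a local cache before executing it. The URL, the local file name and the WorkUnitFetcher class are illustrative assumptions; WebCom itself delivers work through the applet model rather than through this simplified helper.

import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class WorkUnitFetcher {

    // Downloads the work unit at the given URL into the local cache directory.
    public static Path stageIn(String workUnitUrl, Path cacheDir) throws Exception {
        Files.createDirectories(cacheDir);
        Path target = cacheDir.resolve("workunit.jar");   // assumed local name
        try (InputStream in = new URL(workUnitUrl).openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical server-specified URL; WebCom supplies this at run time.
        Path unit = stageIn("http://master.example.org/units/unit-42.jar",
                            Path.of("cache"));
        System.out.println("Staged work unit to " + unit);
    }
}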
WebCom differs from most Job Management Systems in its approach to process migration. Because WebCom uses the Condensed Graph (CG) model of computing, jobs are migrated throughout the grid according both to the state of the grid and to the Condensed Graph itself.

The CG model also defines WebCom's approach to load balancing. As the graph is executed on the grid, the grid topology evolves to reflect the execution of the graph. If there is a heavy load on one section of the graph, a client in that section may be promoted to become a master and request additional clients to process the workload (Figure 2). As the work is completed, the clients are released for other work.

The client promotion feature of WebCom addresses node scalability issues. Any WebCom client can be promoted to become a WebCom master. Thus any number of clients can be added, simply by creating new masters to handle them. WebCom also features a high degree of process scalability.

Fault tolerance and security

Table 4 shows a summary of WebCom features related to fault tolerance and security.

Table 4: Fault tolerance and security
  Checkpointing: Yes
  Suspending, resuming, killing jobs: Yes
  Fault tolerance: Yes
  Authentication: In secure version
  Authorization: In secure version
  Encryption: In secure version

The CG model provides WebCom with checkpointing. At each level in the execution tree, the graph's results and instructions are saved. If a particular node fails, it is possible to backtrack and restart the job. As WebCom is architecture-neutral, that job may be resumed on any available node. Details of instruction execution and data are stored at each node in the graph. Operators may use the GUI to start, stop and suspend jobs within the WebCom system.

WebCom maintains two distinct connections to each client: one for instructions and one for results. If an instruction connection fails, the instruction being executed by the associated client is rescheduled; the client descriptor is invalidated and subsequently removed. If the result connection fails, WebCom destroys the associated client descriptor and its instruction is re-queued. In both connection failure scenarios, the onus is on the client to re-establish the appropriate connections and rejoin WebCom.

[Figure: WebCom master with instruction and result connections to its clients; a failure on either connection leads to client invalidation.]

Grid security issues are handled in the secure version of WebCom. The secure version contains all the features of WebCom mentioned above as well as a suite of security-specific features such as authentication and authorization.
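To make the connection-failure handling described above concrete, the following minimal Java sketch models the two failure cases. The ClientDescriptor class, the instruction queue and the method names are hypothetical stand-ins for WebCom internals; only the policy itself (reschedule and invalidate on an instruction-connection failure, destroy the descriptor and re-queue on a result-connection failure) follows the description in this section.

import java.util.ArrayDeque;
import java.util.Queue;

public class ConnectionMonitor {

    // Hypothetical per-client bookkeeping held by the master.
    static class ClientDescriptor {
        final String clientId;
        final String currentInstruction;
        boolean valid = true;

        ClientDescriptor(String clientId, String currentInstruction) {
            this.clientId = clientId;
            this.currentInstruction = currentInstruction;
        }
    }

    private final Queue<String> instructionQueue = new ArrayDeque<>();

    // Instruction connection lost: reschedule the instruction, invalidate the
    // client descriptor and remove it.
    void onInstructionConnectionFailure(ClientDescriptor client) {
        instructionQueue.add(client.currentInstruction);
        client.valid = false;
        remove(client);
    }

    // Result connection lost: destroy the descriptor and re-queue its instruction.
    void onResultConnectionFailure(ClientDescriptor client) {
        instructionQueue.add(client.currentInstruction);
        remove(client);
    }

    private void remove(ClientDescriptor client) {
        System.out.println("Removed client " + client.clientId
                + "; re-queued instruction " + client.currentInstruction);
    }

    public static void main(String[] args) {
        ConnectionMonitor monitor = new ConnectionMonitor();
        monitor.onInstructionConnectionFailure(new ClientDescriptor("client-1", "instr-17"));
        monitor.onResultConnectionFailure(new ClientDescriptor("client-2", "instr-23"));
    }
}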
Conclusion

The original NSA study attempted to rank each surveyed system by assigning points for each of the features it contained. No attempt was made to do this for the WebCom system, since the authors did not feel they could objectively score each feature. The WebCom system is shown to exhibit all of the desired characteristics; it compares favourably with respect to each criterion, and the degree of conformance is explained where necessary. Positioning WebCom in relation to its peers has enabled us to make important decisions regarding future research directions. In particular, it is hoped that WebCom can be used in the near term as a general-purpose Grid Management System, facilitating deployment, scheduling and dynamic resource management. This activity will be tested in the context of the Grid Ireland initiative [20].

References

[1] J. P. Morrison, J. J. Kennedy and D. A. Power, "WebCom: A Volunteer Based Web Computer", The Journal of Supercomputing, Vol. 18, No. 1, January 2001.
[2] T. El-Ghazawi, K. Gaj, N. Alexandridis, B. Schott, A. V. Staicu, J. R. Radzikowski, N. Nguyen and S. A. Suboh, "Conceptual Comparative Study of Job Management Systems", a report for the NSA LUCITE Task Order on Productive Use of Distributed Reconfigurable Computing, George Mason University, February 21, 2001.
[3] M. A. Baker, G. C. Fox and H. W. Yau, "Cluster Computing Review", Northeast Parallel Architectures Center, Syracuse University, November 1995.
[4] J. P. Jones, "NAS Requirements Checklist for Job Queuing/Scheduling Software", NAS Technical Report NAS-96-003, April 1996.
[5] J. P. Jones, "Evaluation of Job Queuing/Scheduling Software: Phase 1 Report", NAS Technical Report NAS-96-009, September 1996.
[6] http://www.platform.com/pdfs/brochuresprod/Platform_LSF_Standard.pdf
[7] http://www.sun.com/gridware
[8] http://pbs.mrj.com/
[9] W. W. Carlson, "RES: A Simple System for Distributed Computing", Technical Report SRC-TR-92-067, Supercomputing Research Center.
[10] M. Livny, J. Basney, R. Raman and T. Tannenbaum, "Mechanisms for High Throughput Computing", SPEEDUP Journal, Vol. 11, No. 1, June 1997.
[11] B. S. White, M. Walker, M. Humphrey and A. S. Grimshaw, "LegionFS: A Secure and Scalable File System Supporting Cross-Domain High-Performance Applications", Proceedings of Supercomputing 2001, Denver, Colorado, November 2001.
[12] A. Barak and O. La'adan, "The MOSIX Multicomputer Operating System for High Performance Cluster Computing", Journal of Future Generation Computer Systems, Vol. 13, No. 4-5, pp. 361-372, March 1998.
[13] I. Foster and C. Kesselman, "Globus: A Metacomputing Infrastructure Toolkit", International Journal of Supercomputer Applications.
[14] R. Wolski, N. Spring and J. Hayes, "The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing", Journal of Future Generation Computing Systems, Vol. 15, No. 5-6, pp. 757-768, October 1999.
[15] H. Casanova and J. Dongarra, "NetSolve: A Network Server for Solving Computational Science Problems", The International Journal of Supercomputer Applications and High Performance Computing, Vol. 11, No. 3, pp. 212-223, Fall 1997.
[16] H. Casanova, G. Obertelli, F. Berman and R. Wolski, "The AppLeS Parameter Sweep Template: User-Level Middleware for the Grid", Proceedings of the Supercomputing Conference (SC'2000).
[17] http://www.tru64unix.compaq.com/dce/index.html
[18] J. P. Morrison, "Condensed Graphs: Unifying Availability-Driven, Coercion-Driven and Control-Driven Computing".
[19] J. P. Morrison, K. Power and N. Cafferkey, "Cyclone: A Cycle Brokering System to Harvest Wasted Processor Cycles", PDPTA 2000, Las Vegas, USA, June 26-30, 2000.
[20] B. Coghlan, J. Morrison, A. Shearer and M. Manzke, "Grid Research in Ireland", Proceedings of PDPTA 2001, Las Vegas, Nevada.