Resource Management and Resource Brokering using UNICORE
Jon MacLaren
NeSC, 13th February 2003

Summary of Talk
Overview of UNICORE
  – What is UNICORE?
  – The UNICORE Architecture
  – Advantages of the UNICORE Architecture
Resource Brokering in UNICORE
  – Resources in UNICORE
  – Resources and Seamlessness
  – Resource Brokering – What, Why and How
  – EUROGRID Resource Broker: Overview
  – What do the Brokering Messages Convey?
  – Resource Refinement
  – Abstract Resource Description: Why?
  – What next?
  – What the broker looks like
The UNICORE Model
  – What does a UNICORE Job Look Like?
  – Execution of a UNICORE Job
  – What the User Sees…
  – The UNICORE Security Model
Further Resources

Background on Manchester…
Home of the CSAR National Supercomputing Service:
  – 512-processor O3000, 816-processor T3E, 128-processor O2000
Involved in many major UK e-Science projects:
  – RealityGrid
  – OGSA-DAI, Geodise, myGrid
  – Markets for Computational Economies
Involved in major European projects:
  – EUROGRID, GRIP
Home of e-Science NorthWest (eSNW)
First Access Grid in the UK
My Interests:
  – Resource Brokering (wrote EUROGRID Resource Broker)
  – Advance Reservation (co-chair of GGF GRAAP-WG)
  – Grid Economies (co-chair of GGF GESA-WG)

What is UNICORE?
Grid middleware (pardon?)
Targeted specifically at the execution of compute-intensive tasks on supercomputers and large clusters
Provides single sign-on using X509 certificates
Can interface to practically any hardware and batch queue system, even “exotic” platforms.
UNICORE is a complete Grid computing environment as opposed to a toolkit.
There is a GUI UNICORE Client; a simple command line interface was developed later.
UNICORE can be extended with application-specific interfaces (called Plugins)

The UNICORE Architecture
[Architecture diagram, built up over three slides. Labels: Client, Gateway, NJS, TSI, UUDB, Firewall, Internet, Intranet; links labelled "UPL (over SSL)" and "FNTP"; Vsites "green" and "turing"; Usite "UoM".]

The UNICORE Architecture: Advantages
The load on the UNICORE-enabled compute resources is negligible, as only the lightweight TSI daemons, written in Perl, run on the compute resource itself.
The Gateway and NJS processes are Java programs but run on other, cheaper resources, e.g. Linux workstations.
Porting UNICORE to obscure or rare architectures is much simpler because only a Perl interpreter needs to be available on the target system. Target systems include:
  – Cray T3E, Origin 3000, Fujitsu VPP300, Linux clusters
  – Sony PlayStation 2, IBM Mainframe system
All incoming connections to the UNICORE Vsites are routed via a single port on a single machine, which has proved advantageous when using UNICORE at sites with firewalls.

What Does a UNICORE Job Look Like?
The user composes their job into an AJO (Abstract Job Object). This is a hierarchy of components, such as scripts, compilation tasks, file transfer tasks, or application-specific components.
Structure is added to the job by two mechanisms:
  – Tasks may be grouped together:
      • RepeatGroup, ForGroup, AbstractJob
  – Within a group, dependencies control the order of execution
The AJO components are all members of a Java class package, org.unicore.ajo, which also contains all the methods a user requires to construct an AJO ready for dispatching. Online JavaDoc for these classes is available.
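
The two structuring mechanisms described for the AJO (grouping, and dependencies within a group) can be illustrated with a small sketch. This is Python, not the Java org.unicore.ajo package, and it does not reproduce that API: Task, JobGroup, add, depend and flatten are hypothetical names chosen for illustration, and the task names are invented. It only shows how nested groups plus per-group dependencies determine an execution order.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Task:
    """Hypothetical leaf component, e.g. a script or a file transfer."""
    name: str

    def flatten(self):
        return [self.name]


@dataclass
class JobGroup:
    """Hypothetical stand-in for a grouping component such as AbstractJob."""
    name: str
    children: list = field(default_factory=list)
    # (before, after) pairs of child names: 'after' waits for 'before'
    dependencies: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child

    def depend(self, before, after):
        self.dependencies.append((before.name, after.name))

    def flatten(self):
        """Return leaf task names in an order that respects dependencies."""
        # Map each child to the set of children it must wait for.
        graph = {child.name: set() for child in self.children}
        for before, after in self.dependencies:
            graph[after].add(before)
        by_name = {child.name: child for child in self.children}
        order = []
        for name in TopologicalSorter(graph).static_order():
            order.extend(by_name[name].flatten())
        return order


# Invented example: fetch a source file, build it in a sub-group, then run it.
job = JobGroup("job")
fetch = job.add(Task("import source"))
build = job.add(JobGroup("build"))
compile_step = build.add(Task("compile"))
link_step = build.add(Task("link"))
build.depend(compile_step, link_step)
run = job.add(Task("run script"))
job.depend(fetch, build)
job.depend(build, run)

print(job.flatten())
# ['import source', 'compile', 'link', 'run script']
```

Dependencies are stored per group, so a sub-group is ordered as a single unit by its parent and orders its own children internally.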

Execution of a UNICORE Job
AJO sent to green (primary Vsite)
Input file is exported from user’s workstation
Gaussian98 job runs in batch queue
Intermediate output transferred to bezier for visualisation
After visualisation, the rendered movie is imported to the workstation
NJS incarnates individual tasks:
  – Knows locations of applications
  – Knows details of the batch queue system
NJS gives very simple directives to the TSI:
  – User x places this file here.
  – User y runs this executable in batch queue A.
NJS manages workspace of jobs (Uspace)
[Job diagram. Labels, in order: AbstractJob "green"; FileTransfer "Import Input Deck"; ExecuteTask "Gaussian98 Job"; TransferFile "Transfer Output"; AbstractJob "bezier"; ExecuteTask "Visualisation"; FileTransfer "Import movie".]

What the User Sees…
[Image-only slide]

The UNICORE Security Model
Mutual Authentication (between Gateway/NJS and User) using X509 Certificates.
No proxy certificates, no generalised delegation.
Authorisation performed by the NJS using the UUDB Interface. (So authorisation is (potentially) moved away from the target system.)
Separation of consigner and endorser: only a user can endorse a job; an NJS or a user can consign a job.
The signed AJO is sent to an NJS as a serialised Java object, via UPL (each AbstractJob component is signed).
This model ensures no tampering with multi-Vsite jobs by intermediate NJSs.
The user’s private key never leaves the encrypted keystore on his/her workstation.
At no point is any private key which could be used to impersonate the user (for any lifetime) ever created on a remote resource.

Resources and Seamlessness
Simple extensible resource model.
Resources divided into CapabilityResource and CapacityResource subclasses
Same resource model used to describe resources required by a job, and resources provided by a Vsite.
Example CapabilityResource: SoftwareResource
  – Application name, version number
  – Job meta-data (XML document)
Example CapacityResource: Nodes
  – Number of nodes in the system/required by job
Resource description abstract enough to hide
  – Architectural details
  – Batch queue details (systems, queue names, etc.)
  – Locations of applications
Job can only run on a Vsite if the resources required by the job can be satisfied by the Vsite.

Resource Broker – What, Why and How
What is Resource Brokering for?
Simply put, Resource Brokering is a method for a user to discover resources suitable for running their work on.
Why do you need it?
At the moment, people use Grid middleware to access resources that they are already familiar with. They manually target their work at the machine that suits their needs best. For Grids to offer something genuinely new, they need to become much larger, so the user can find and use resources they have never even heard of before. However, as Grids become larger, the manual solution is simply not scalable…
How does it work?
The user describes their needs to a third-party piece of software, a Resource Broker, in a Resource Description Language they both understand, together with a description of their preferences (e.g. quick turnaround time at any cost). The broker searches for suitable resources, and passes these back to the user…

EUROGRID Resource Broker: Overview
Resource Broker component locates suitable execution environments for the user’s jobs.
Resource Broker functionality is part of NJS
Protocol can be used symmetrically by Broker, allowing multiple stages.
Broker is configured with a list of target NJSs to get offers from
[Message diagram. Labels: User, Broker NJS, two Execution NJSs; numbered messages: 1 CheckQoS; 2 CheckQoS (shown twice); 3 CheckQoS_Outcome (shown twice); 4 CheckQoS_Outcome.]

What do these Messages Convey?
CheckQoS: a TaskResourceDAG containing the resources specified by the job
CheckQoS_Outcome: a set of Estimates for a number of different Vsites, each containing:
  – Start time (earliest and latest);
  – End time (earliest and latest);
  – Cost of job (and units of cost);
  – A replacement resource set; and
  – A Ticket object, with a validity time (or an advertising string).
The intended semantics are that, for the user to accept the Estimate, they must present the job at the target Vsite together with the Tickets, using the replacement resource sets supplied with the Tickets. Provided that the job arrives within the lifetime of the Tickets, the Vsite should turn the job around within the stated range for the end time, and for the estimated cost.

Resource Refinement
Resource Refinement: the Brokering protocol supports the inclusion of a modified resource set with a Ticket. This resource set must be adopted by the client for the Ticket to be valid. This allows:
  – Brokers to make offers on resource requirements they cannot exactly match, e.g. a 256-processor Origin could return an offer when the user asked for 512 processors. If the turnaround time and cost were right, the user may accept.
  – Brokers to offer different versions of requested applications, or different applications supporting the same API.
  – Most importantly, Brokers to be extended to handle abstract, application-specific resource descriptions.

Abstract Resource Description: Why?
Currently, the user of an application has to know about the performance of this code on a number of architectures. This is a waste of the application scientist’s time, and is unnecessary.
Users are not interested in knowing the performance characteristics of a particular application on a given parallel architecture. They want to think in terms of what their job does (e.g. simulates 24 hours of weather over the Manchester area), turnaround time, and cost.
We can do this using the job meta-data of the SoftwareResource: it is XML, so it can encode a set of integers, or a Gaussian Input Deck.
The broker uses its programmed knowledge of the application’s performance, and the characteristics of the machines that it is brokering for, to turn this into concrete resource requirements, e.g. 4 hours on 32 processors of Manchester’s Cray T3E, or 2 hours on 128 processors of Manchester’s Origin 3000.
So the user gives their application domain requirement, and gets back a list of turnaround times and costs.

What next?
A mechanism will be added to allow the Broker to find out actual execution time and cost, allowing comparison with the estimate.
  – This would provide feedback suitable for input into a learning engine.
  – The Broker could modify its application performance characteristics.
  – For a University of Manchester broker for Gaussian98 jobs, there would be many points of comparison every day.
  – This would mean that its estimates would increase in accuracy over time.
  – Can also assess the reliability of certain sites, etc.
Interoperability with other Grid middleware. Already have some interoperability with Globus Toolkit V2. Interoperability will increase as UNICORE moves towards the Open Grid Services Architecture.
With multi-tier brokering, one can imagine a series of steps of resource abstraction/refinement.
Payment for jobs, with the brokers taking a cut.

What the Broker Looks Like
[Image-only slides]

Further Resources
UNICORE downloads:
  – http://www.unicore.org
EUROGRID website:
  – http://www.eurogrid.org
Grid Interoperability website:
  – http://www.grid-interoperability.org
Online JavaDoc for AJO classes:
  – http://people.man.ac.uk/~zzcgujm/unicore/AJO_V4/
GGF GRAAP-WG Home Page:
  – http://people.man.ac.uk/~zzcgujm/GGF/sched-graap-2.0.html
GGF GESA-WG Home Page:
  – http://www.doc.ic.ac.uk/~sjn5/GGF/GESA-WG3.htm
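
Illustrative sketch: abstract-to-concrete resource refinement
The refinement step described under "Abstract Resource Description: Why?" can be sketched as follows. This is a minimal Python illustration, not the EUROGRID Resource Broker: Vsite, Estimate, PERFORMANCE and broker are hypothetical names, the application name "weather" is invented, and the performance table is an assumption that simply encodes the example figures from the talk (4 hours on 32 T3E processors, 2 hours on 128 Origin 3000 processors for one simulated day). Treating the processor counts of the CSAR machines as the Nodes capacity is also a simplification, and cost and Tickets are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vsite:
    name: str
    nodes: int            # capacity resource offered by the Vsite
    software: frozenset   # capability resources (application names)


@dataclass(frozen=True)
class Estimate:
    vsite: str
    processors: int       # refined, concrete resource requirement
    hours: float          # estimated turnaround


# Assumed performance knowledge held by the broker:
# (application, Vsite name) -> (processors, wall-clock hours per simulated day)
PERFORMANCE = {
    ("weather", "T3E"): (32, 4.0),
    ("weather", "Origin 3000"): (128, 2.0),
}


def broker(application, simulated_days, vsites):
    """Turn an abstract request into concrete offers, quickest first."""
    estimates = []
    for site in vsites:
        model = PERFORMANCE.get((application, site.name))
        # Skip Vsites the broker has no model for or that lack the software.
        if model is None or application not in site.software:
            continue
        processors, hours_per_day = model
        # A job can only run if the Vsite can satisfy the required capacity.
        if processors > site.nodes:
            continue
        estimates.append(
            Estimate(site.name, processors, hours_per_day * simulated_days)
        )
    return sorted(estimates, key=lambda e: e.hours)


vsites = [
    Vsite("T3E", 816, frozenset({"weather"})),
    Vsite("Origin 3000", 512, frozenset({"weather"})),
    Vsite("Origin 2000", 128, frozenset({"gaussian98"})),
]
offers = broker("weather", 1, vsites)
for offer in offers:
    print(offer.vsite, offer.processors, offer.hours)
# Origin 3000 128 2.0
# T3E 32 4.0
```

The user supplies only the domain-level request (application and simulated days); the processor counts and turnaround times come back from the broker, which is the point of the abstract description.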