Grid(Lab) Resource Management System (GRMS) and GridLab Services
Krzysztof Kurowski
Poznan Supercomputing and Networking Center, Poland
SUN HPC Consortium, Heidelberg 2004

What can you do with the Grid?
- Access: secure, transparent, remote, wireless, …
- Visualization: access to computers and services, not server…
- On demand: get resources you need, when you need, …
- Sharing: share data & resources over the net, …
- Failover: migrate and restart applications, …
- Balance between distributed and central control…
(by Brian Hammond, 4.30 p.m., 21 June 2004, Heidelberg)

Krzysztof Kurowski, PSNC HPC SUN Consortium, Heidelberg 2004

GridLab Project
- Funded by the EU (5+ M€), January 2002 – March 2005
- SUN is our commercial partner
- Open source license for our software
- Main goal: to develop a Grid Application Toolkit (a set of high-level tools and libraries) together with a set of grid middleware services/systems for:
  - resource management (GRMS)
  - data management
  - monitoring
  - adaptive components
  - mobile user support
  - security services
  - portals
  - mobile access
- ... and test all GridLab technologies/applications on real testbeds...
GridLab Project and GRMS (1)
GridLab Project and GRMS (2)
GridLab Project and GRMS (3)

GRMS and Core Services
- GRMS uses C/Java APIs to Globus 2.x and pre-WS services, namely GRAM, GridFTP, and GRIS/GIIS
- GRMS stores all historic data in a database and various logs
- The Mercury Monitoring System is the lowest-level service in GridLab (a generic monitoring framework for the grid)
  - Provides instant information about the state of hosts, services and jobs
  - Provides monitoring data represented as metrics, via both pull and push data access models, and also supports steering through controls
  - Based on the Grid Monitoring Architecture (GMA) as proposed by the GGF
  - Supports application steering (SIGNALS)

GRMS and Middleware Services
MORE...
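The pull and push access models mentioned for Mercury on the Core Services slide can be illustrated with a small sketch. This is not the Mercury API: the class `MetricSource`, its methods, and the metric name `host.cpu.load` are all hypothetical, chosen only to show the difference between asking for a value (pull) and being notified of it (push).

```python
from typing import Callable, Dict, List

Callback = Callable[[str, float], None]


class MetricSource:
    """Hypothetical in-memory metric source; NOT the Mercury API."""

    def __init__(self) -> None:
        self._values: Dict[str, float] = {}
        self._subscribers: Dict[str, List[Callback]] = {}

    def publish(self, name: str, value: float) -> None:
        # Producer side: record the latest value, then push it to subscribers.
        self._values[name] = value
        for callback in self._subscribers.get(name, []):
            callback(name, value)

    def get(self, name: str) -> float:
        # Pull model: the consumer asks for the current value when it wants it.
        return self._values[name]

    def subscribe(self, name: str, callback: Callback) -> None:
        # Push model: the consumer registers once and is notified on each update.
        self._subscribers.setdefault(name, []).append(callback)


source = MetricSource()
received = []

# Push: every published value is delivered to the callback.
source.subscribe("host.cpu.load", lambda n, v: received.append((n, v)))
source.publish("host.cpu.load", 0.42)
source.publish("host.cpu.load", 0.87)

# Pull: only the most recent value is visible.
latest = source.get("host.cpu.load")
```

In this sketch the push consumer sees every update, while the pull consumer sees only the value current at the moment it asks. A client such as a resource broker might pull when making a scheduling decision and subscribe when it has to react to job state changes.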
Various clients to GRMS

Applications and GRMS
- Example: Pegasus/Chimera workflow

GRMS Job Description
- XML-based language
- Job executable
  - File location
  - Arguments
  - File arguments (files that have to be present in the working directory of the running executable)
  - Environment variables
  - Standard input
  - Standard output
  - Standard error
  - Checkpoint file (user-level checkpoint)
- Resource requirements
  - Name of host for job execution (if provided, no scheduling algorithm is used)
  - Operating system
  - Required local resource manager
  - Network parameters
  - Lots of constraints:
    - Minimum memory required
    - Minimum number of CPUs required
    - Minimum CPU speed
    - ...

What is the functionality of the GRMS?
GRMS v1.9.0 released on 11 May 2004!
- to act on behalf of users on resources and meet application requirements concerning resources, data, etc.
- to stage in and stage out files required by jobs before and after execution, using Core Services (GridFTP/GASS/FTP) or GridLab Middleware Services (Replica Catalog Service and Data Movement Service)
- to use GAS for more advanced security scenarios
- to run and control batch jobs remotely
- to run and control MPI batch jobs remotely
- to run Java applications remotely
- to register GAT applications and receive unique JOB IDs
- to checkpoint GAT applications remotely
- to migrate GAT applications remotely
- to store all historic information about job statuses and resources which have been used during a job submission process
- to contact the Information Service to receive static and dynamic information about resources
- to contact an Adaptive Components Service to get additional information about distributed resources and networks

GRMS statistics

Who is using our software?
N*Grid, Cactus, UCoMS, GriPhyN, Griphyn The Grid Infrastructure, Geon, EPhysics Portal, CLUSTERIX, CASPer, GridOneD, GEO 600, Einstein@home, VLE, GriKS, NRL Protean Group, GEMSS, Ibis, and more...

What can you do with the Grid?
- Access: secure (GSI, GAS), transparent (GridLab Middleware), remote, wireless (portal, mobile phone)
- Visualization: access to computers and services, not server (Vis service, mobile client to GRMS)
- On demand: get resources you need, when you need (Job description)
- Sharing: share data & resources over the net (GRMS and data management services)
- Failover: migrate and restart applications (user-level checkpointing and migration)

More Information / Summary
Please visit:
www.gridlab.org
www.gridlab.org/WorkPackages/wp-9/
Thank you!
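Supplementary sketch for the GRMS Job Description slides: the fields listed there (executable location, arguments, environment variables, resource constraints) could be assembled into an XML document roughly as follows. The real GRMS schema is not shown in these slides, so every element and attribute name below, the helper `build_job_description`, and the example URL are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET


def build_job_description(executable_url, arguments, environment,
                          min_memory_mb, min_cpus):
    """Build an illustrative XML job description.

    All element and attribute names are hypothetical; they are NOT
    taken from the actual GRMS job description schema.
    """
    job = ET.Element("job")

    # Executable section: where the binary lives and how to start it.
    exe = ET.SubElement(job, "executable")
    ET.SubElement(exe, "location").text = executable_url
    args = ET.SubElement(exe, "arguments")
    for value in arguments:
        ET.SubElement(args, "value").text = value
    env = ET.SubElement(exe, "environment")
    for name, value in environment.items():
        ET.SubElement(env, "variable", {"name": name}).text = value

    # Resource section: minimum requirements a broker would match against.
    res = ET.SubElement(job, "resource")
    ET.SubElement(res, "min_memory_mb").text = str(min_memory_mb)
    ET.SubElement(res, "min_cpus").text = str(min_cpus)

    return ET.tostring(job, encoding="unicode")


# Placeholder values; the URL is an example, not a real endpoint.
xml_text = build_job_description(
    executable_url="gsiftp://example.org/bin/app",
    arguments=["--steps", "100"],
    environment={"OMP_NUM_THREADS": "4"},
    min_memory_mb=512,
    min_cpus=4,
)
print(xml_text)
```

Leaving out a fixed host name, as this sketch does, corresponds to the case on the slide where the scheduling algorithms choose the resource from the stated constraints.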