UNICORE and the DEISA supercomputing grid
Jules Wolfrat, wolfrat@sara.nl
Amsterdam, 28 June 2006
DEISA UNICORE tutorial

Outline
• DEISA overview
• UNICORE history
• UNICORE architecture
• Demo?

THE DEISA SUPERCOMPUTING GRID
• AIX distributed super-cluster
• Vector systems (NEC, …)
• Linux systems (SGI, IBM, …)

DEISA objectives
• To enable Europe’s terascale science by the integration of Europe’s most powerful supercomputing systems.
• Enabling scientific discovery across a broad spectrum of science and technology is the only criterion for success.
• DEISA is a European Supercomputing Service built on top of existing national services. This service is based on the deployment and operation of a persistent, production quality, distributed supercomputing environment with continental scope.
• The integration of national facilities and services, together with innovative operational models, is expected to add substantial value to existing infrastructures.
• Main focus is High Performance Computing (HPC).

Participating Sites
Site | Organisation | Country
BSC | Barcelona Supercomputing Centre | Spain
CINECA | Consorzio Interuniversitario per il Calcolo Automatico | Italy
CSC | Finnish Information Technology Centre for Science | Finland
EPCC/HPCx | University of Edinburgh and CCLRC | UK
ECMWF | European Centre for Medium-Range Weather Forecasts | UK (int)
FZJ | Research Centre Juelich | Germany
HLRS | High Performance Computing Centre Stuttgart | Germany
IDRIS | Institut du Développement et des Ressources en Informatique Scientifique - CNRS | France
LRZ | Leibniz Rechenzentrum Munich | Germany
RZG | Rechenzentrum Garching of the Max Planck Society | Germany
SARA | Dutch National High Performance Computing and Networking centre | The Netherlands

The DEISA supercomputing environment
(21,900 processors and 145 teraflops in 2006, more than 190 teraflops in 2007)
• IBM AIX Super-cluster
 – FZJ Jülich, 1312 processors, 8.9 teraflops peak
 – RZG Garching, 748 processors, 3.8 teraflops peak
 – IDRIS, 1024 processors, 6.7 teraflops peak
 – CINECA, 512 processors, 2.6 teraflops peak
 – CSC, 512 processors, 2.6 teraflops peak
 – ECMWF, 2 systems of 2276 processors each, 33 teraflops peak
 – HPCx, 1536 processors, 11 teraflops peak
• BSC, IBM PowerPC Linux system (MareNostrum), 4864 processors, 40 teraflops peak
• SARA, SGI ALTIX Linux system, 416 processors, 2.2 teraflops peak
• LRZ, Linux cluster (2.7 teraflops) moving to SGI ALTIX system (5120 processors and 33 teraflops peak in 2006, 70 teraflops peak in 2007)
• HLRS, NEC SX8 vector system, 576 processors, 12.7 teraflops peak
• Systems interconnected with a dedicated 1 Gb/s network, currently being upgraded to 10 Gb/s, provided by GEANT and the NRENs

The technology cycle
[Diagram labels: Technology pull, Service definitions, Technology specifications, Technology providers, R&D projects, Technology watch, DEISA strategic and technological management]
• WAN GPFS (IBM): completed
• Multi-cluster batch processing (IBM): completed
• GPFS for non-IBM systems (IBM): ongoing
• Co-scheduling (Platform): in preparation

How is DEISA enhancing HPC services in Europe?
• Running larger parallel applications in individual sites, by a cooperative reorganization of the global computational workload on the whole infrastructure, or by the operation of the job migration service inside the AIX super-cluster.
• Enabling workflow applications with UNICORE (complex applications that are pipelined over several computing platforms)
• Enabling coupled multiphysics Grid applications (when it makes sense)
• Providing a global data management service whose primary objectives are:
 – Integrating distributed data with distributed computing platforms
 – Enabling efficient, high performance access to remote datasets (with Global File Systems and striped GridFTP). We believe that this service is critical for the operation of (possible) future European petascale systems
 – Integrating hierarchical storage management and databases in the supercomputing Grid.
• Deploying portals as a way to hide complex environments from new user communities, and to interoperate with other existing grid infrastructures.

Basic Services: Global File Systems
[Diagram: HPC system at site A, showing nodes, network, disk space and the global file system]
• Sophisticated software environment, necessary to provide a single system image of a clustered computing platform.
• They provide global data management.
• Data in the GFS is “symmetric” with respect to all computing nodes.

The DEISA integration concept
[Diagram: Site A, Site B, Site C and Site D joined by a network interconnect (reserved bandwidth)]
• Global distributed GPFS file system with continental scope.
• Global resource pool is dynamic: nodes can enter and leave the pool without disrupting the national services.

DEISA Global File System integration in 2006 (based on IBM’s GPFS)
[Diagram labels: AIX IBM domain with an HPC Common Global File System (similar architectures / operating systems, high bandwidth, 10 Gbit/s); High Performance Common Global File System (various architectures / operating systems, high bandwidth, up to 10 Gbit/s); Linux SGI; Linux PowerPC. Sites shown: IDRIS (FR), ECMWF (UK), SARA (NL), RZG (DE), CSC (FI), LRZ (DE), BSC (ES), CINECA (IT), FZJ (DE)]

Demonstration
Demonstration of transparent data access in a heterogeneous configuration:
(1) A 64-processor job is running at SARA (SGI Altix system).
(2) The input data for this run are read from the Linux GPFS at SARA.
(3) The output data are written into the BSC GPFS system in Spain.
(4) Visualization on the RZG system reads the output data produced by the application from the BSC GPFS.
[Diagram: SARA (NL), BSC (ES) and RZG (DE) connected through the Global File System]

Global File System interoperability demo during the Supercomputing Conference 2005 in Seattle
American and European supercomputing infrastructures linked: bridging communities with scalable, wide-area global file systems
• DEISA sites: Amsterdam, NL; Jülich, DE; Orsay, FR; Garching, DE; Bologna, IT
• TeraGrid sites: Argonne, IL; Bloomington, IN; San Diego, CA; Urbana-Champaign, IL

Basic services: workflow simulations using UNICORE
UNICORE supports complex simulations that are pipelined over several heterogeneous platforms (workflows).
UNICORE handles a workflow as a single job and transparently moves the output and input data along the pipeline.
UNICORE clients that monitor the application can run on laptops.
UNICORE has a user-friendly graphical interface.
DEISA has developed a command line interface for UNICORE.
The UNICORE infrastructure, including all sites, has full production status. It has proven to be very stable during the last few months.

Other basic services
• Job migration inside the AIX super-cluster. Based on LoadLeveler MultiCluster, it allows system administrators to reroute jobs to other sites, in a way that is transparent to the end users. Used to move away simple jobs of “implicit users” to make room for a bigger application at a site. Full production status.
• Co-allocation. We are starting to prepare a first generation co-allocation service on the full heterogeneous infrastructure, using LSF Multi-cluster. Important for coupled Grid applications and for data movement. Service in development phase, prototype expected in 6-9 months.
• Remote I/O using Global File Systems and fast data transfers. See next slide.
• Integrating hierarchical data management and databases in the supercomputing Grid. In progress.

Accessing remote data: high performance remote I/O and file transfer
• Remote I/O with global file systems implicitly moves data across platforms (in production today)
• DEISA will also deploy explicit high performance data movers, using GridFTP
[Diagram: data repository accessed via GridFTP by co-scheduled, parallel data mover tasks]

Summary
• DEISA provides an integrated supercomputing environment, with efficient data sharing through high performance global file systems. This is highly transparent to end users.
• DEISA enables job migration across sites (also transparent to end users).
  Exceptional resources for very demanding applications are made available by the operation of the global resource pool. We are load balancing computational workload at a European scale.
• Huge, demanding applications can be run “as such”.
• Support of Grid applications (which are distributed by design).
• With this operational model, the DEISA super-cluster is not very different from a “true” monolithic European supercomputer (which must be partitioned in any case for fault tolerance and QoS).
• The main difference comes from the coexistence of several independent administration domains. This requires, as in TeraGrid, coordinated production environments.

UNICORE
UNiform Interface to COmputer Resources
Following material thanks to the UNICORE team

Highlights
• Excellent workflow support
• Transparent data staging / transfer
• Multi-site, multi-step jobs: heterogeneous meta-computing
• Uniform user authentication and security mechanisms
• Each site maintains full control over its resources
• UNICORE Client offers
 – Uniform GUI for job creation and monitoring
 – Easy integration of applications through plugins

History I
• Development started in 1997
• Projects UNICORE and UNICORE Plus
 – Funded by the German Ministry of Education and Research (until 12/2002)

History II
Developments in EC funded projects:
• EUROGRID (11/2000 – 01/2004)
 – IST-1999-20247
 – Resource broker, standards-based file transfer (GridFTP)
 – Biomolecular simulations, weather prediction, coupled CAE simulations, structural analysis
• GRIP (01/2002 – 02/2004)
 – IST-2000-32257
 – Interoperability between UNICORE and Globus (integration of Globus-maintained resources as target systems in UNICORE)
• OpenMolGRID (09/2002 – 11/2004)
 – IST-2001-37238
 – Use UNICORE for molecular engineering
 – Focus on scientific workflows

History III
• Collaborators:
 – Intel GmbH (formerly Pallas GmbH)
 – Fujitsu Laboratories of Europe (formerly fecit)
 – Forschungszentrum Jülich
 – Deutscher Wetterdienst
 – Genias, RUS, RUKA, LRZ, PC2, ZHR, ZIB
 – CNRS-IDRIS (F), CSCS (CH), GIE EADS CCR (F), ICM (PL), Parallab (N), Soton (UK), UoM (UK), ANL (US), UT (EE), UU (UK), ComGenex (HU), Negri (I)

Features
 – Intuitive GUI with single sign-on
 – X.509 certificates for AA and job/data signing
 – Only one open port in the firewall required
 – Workflow engine for
   • complex multi-site multi-step workflows
   • job monitoring
 – Extensible application support
 – Secure data transfer
 – Integrated resource management
 – Easy installation and configuration of client and server components
 – Full control of resources remains
 – Production quality, …

Software Status I
• Current version 5.6 (Client) / 4.6 (Server)
• User Client is platform independent (Java)
• Servers (Unix)
• Target systems (Unix)
 – “no batch”
 – T3E, SP3, VPP, hpcLine, SR 8000, SX-5, PC-Clusters, …, Globus 2.x as targets
 – NQS, LL, LSF, PBS, CCS, SGE, ...

Software Status II
• UNICORE available at SourceForge as Open Source under BSD license
• http://unicore.sourceforge.net
• UNICORE Forum e.V.
• http://www.unicore.org
• Public test system for testing (standard) client functions available

Deployment
• At all project partner sites
• DEISA sites (IDRIS, CINECA, RZ Garching, ...)
• Naregi project (Japan)
• …

UNICORE Architecture
[Diagram: the UNICORE Client and the UNICORE Grid]

ARCHITECTURE
[Diagram labels: the Client connects over SSL, through an optional firewall, to the Gateway of a Usite (authentication). Behind the Gateway, optionally behind a further firewall, each Vsite has an NJS with its UUDB (authorization) and IDB (incarnation), a TSI, and the local RMS and disc. Jobs are abstract down to the NJS and non-abstract from the TSI onwards. Multi-site jobs travel between Usites.]

Client
[Screenshot labels: Usites, Vsites, Job Preparation, Workflow Management, Job Monitoring]

UNICORE Server
• Gateway
• Network Job Supervisor
 – Configuration
 – UNICORE User Data Base
• Target System Interface
• Demo package containing preconfigured components available on sourceforge.net/projects/unicore

Server Components
[Diagram: Gateway (conf), Network Job Supervisor (conf, UUDB = UNICORE User DB), Target System Interface]

Server Prerequisites
• Gateway and NJS:
 – Java 1.4.2
 – X.509 certificates for Gateway and NJS
 – Signer certificate(s)
• TSI:
 – Perl (≥ 5.004)

Gateway
• Entry point of a UNICORE Site
• Accepts SSL connections from Clients and NJSs
• Accepts valid certificates from all signers known to it (authentication)
• Talks UNICORE Protocol Layer (UPL) on connections to the outside world
• Sends/receives AJOs to/from the NJSs

Gateway connections
[Diagram: Gateway (conf) connected to Network Job Supervisor (conf, UUDB = UNICORE User DB)]
gateway.properties:
    gw.gateway_host_name=<host name>
    gw.port=<port>
connections:
    <Vsite name> <NJS machine> <NJS port>

Network Job Supervisor (NJS)
• UNICORE scheduler
• Receives/sends AJOs from/to local Gateway
• Translates AJO into batch job for target
• Maps the user’s Ulogin to Xlogin
• Sends sub-AJOs to corresponding Gateway according to dependencies
• Polls for status and output of sub-AJOs
• Sends batch jobs and requests to TSI
• Polls TSI for job status and output

NJS Connections
[Diagram: Gateway (conf, connections), Admin, NJS (conf, UUDB = UNICORE User DB), TSI]
njs.properties:
    njs.gateway=<host name>
    njs.vsite_name=<name>
    njs.gateway_port=<port>
    njs.admin_port=<port>

Incarnation Data Base
Static definitions and translation table; contains definitions for
• GENERAL properties (file spaces, descriptions, …)
• EXECUTION_TSI (host + ports, resources, batch queues, …)
• STORAGE_TSI (for file transfers and management)
• RUN (translation rules for target)
• IMPORT, EXPORT, CLEANUP, LIST_DIRECTORY, RENAME, COPY_FILE, DELETE_FILE, CHANGE_PERMISSIONS
• FORTRAN, LINK

UNICORE User Data Base
• Management of Ulogin – Xlogin mapping information
• NJS accesses this information
• Basic version maps one certificate to exactly one Xlogin
• The NJS to UUDB interface is defined so that it can be adapted to site-specific user data bases (e.g. LDAP)
• The contributions section of http://www.unicore.org/downloads.htm offers an alternative UUDB in which certificate/project-ID pairs are mapped to Xlogins

NJS connections
[Diagram: Gateway (conf, connections), Admin, NJS (conf, UUDB = UNICORE User DB), TSI]
njs.properties:
    njs.gateway=<host name>
    njs.vsite_name=<name>
    njs.gateway_port=<port>
    njs.admin_port=<port>
njs.idb:
    SOURCE <TSI machine> <port1> <port2>

Target System Interface
• Interface to target operating and batch system
• Perl scripts and modules
• Needs root privileges to act on behalf of the user (uses setreuid)
• Provides interface to local system for
 – Job submission
 – Status query, job monitoring
 – File handling
 – …

Example: Submit.pm
    $jobname = $2 if $1 eq "JOBNAME";
    $outcome_dir = $2 if $1 eq "OUTCOME_DIR";
    $uspace_dir = $2 if $1 eq "USPACE_DIR";
    $time = $2 if $1 eq "TIME";
    $memory = $2 if $1 eq "MEMORY";
    $nodes = $2 if $1 eq "NODES";
    …
    $memory = "-lM $memory"."Mb";
    my $command = "$main::submit_cmd $queue $nodes $email $memory $time $jobname $stdout_loc $stderr_loc $Submit::tsi_unique_file_name";

TSI connections
[Diagram: NJS (conf, UUDB = UNICORE User DB) connected to TSI]
njs.idb (NJS side):
    SOURCE <TSI machine> <port1> <port2>
tsi.properties (TSI side):
    $main::njs_machine = shift || "NJS host";
    $main::njs_port = shift || "port1";
    $main::my_port = shift || "port2";

Overview: Server connections
[Diagram labels: Client to Gateway over SSL (gw.port + Client Usite list); Gateway to NJS over SSL or plain socket (njs.gateway_port + GW connections file); NJS to TSI over plain sockets (idb: SOURCE host p1 p2 + tsi script); Admin client to NJS (njs.admin_port)]

Firewall Issues
• Client – Gateway
 – Internet
 – Allow connections to the Gateway for the https protocol on the port the Gateway is listening on
 – Client side has to allow outgoing traffic on any port
• Gateway – NJS
 – Intranet
 – All connections from Gateway to NJS system and NJS’s Gateway port
• NJS – TSI
 – Intranet
 – All connections from NJS to TSI system and TSI’s NJS port
 – All connections from TSI to NJS system and NJS’s TSI port

Current Trends and a look into the UNICORE future...
• Web services for interoperability
• “Open up” the architecture
• ...but keep the UNICORE strengths
 – abstraction and virtualisation
 – workflows
 – easy application integration

Acronyms I
Usite | Site providing UNICORE services
Vsite | Computing resource, target system
Ujob | UNICORE Job
Ulogin | UNICORE Login, X.509 certificate
Xlogin | Unix Login at Vsite
Uspace | Temporary file space for Ujob at Vsite
Xspace | File space at the Vsite
Nspace | File space on the computer where Client runs

Acronyms II
AJO | Abstract Job Object
UPL | UNICORE Protocol Layer (Client – Gateway)
NJS | Network Job Supervisor
Incarnation | Translation of AJO into batch job using IDB
IDB | Incarnation Data Base
UUDB | UNICORE User Data Base
TSI | Target System Interface
JPA | Job Preparation Agent, part of Client
JMC | Job Monitor Controller, part of Client

UNICORE meets DEISA
[Diagram: users at FZJ, CNE, RZG, IDR, CSC, SARA, BSC and LRZ each reach a DEISA gateway in their site’s DMZ; behind each gateway sits that site’s NJS on the intranet]

UNICORE Security
• Security model based on X.509 public key infrastructure
• Credential consists of a public and a private key
• No userid and password authentication
• Password protected keystore
• Single sign-on
• UNICORE accepts the following private key formats:
 – RSA (pkcs12)
   • E.g. OpenSSL 0.9.7x
 – Java keystore (jks)
   • SUN Java
• Certificates provided e.g. by DFN CA
• Two server-side security entities:
 – Gateway: Authentication
 – NJS: Authorisation

UNICORE Security: Client
• Access to password protected keystore
• Encrypted keystore contains all imported certificate(s) and the user’s private key(s)
• The UNICORE Keystore editor can
 – Generate an X.509 certificate request
 – Import/export .p12 or .jks keystores
 – Import public keys
• The user has to import (at least) three certificates into the Client
 – Plugin signer’s certificate (public key)
 – Gateway signer’s certificate (public key)
 – User’s signed public key

UNICORE Security: Gateway
• Gateway authenticates the user
• The following checks are performed on certificates presented by a client
 – Certificate is issued by one of the trusted CAs (e.g. DFN-CA)
 – Certificate is within its validity period
 – Certificate has not been revoked (if checking of Certificate Revocation Lists (CRL) is activated)
• Gateway accepts only SSL connections from Clients and other NJSs
 – SSL handshake
• Optional SSL connection between Gateway and NJS

Behind the scenes: Authentication
[Diagram: the Client sends the user certificate and the Gateway sends the gateway certificate; the Client asks “Trust gateway certificate issuer?”, the Gateway asks “Trust user certificate issuer?”; the SSL connection is then established]

UNICORE Security: NJS
• NJS authorizes the user
• Accesses the UNICORE User Data Base (UUDB)
 – Maps the user’s certificate to his Xlogin on the target system
• Only users presenting certificates stored in the UUDB can connect to the target system
• NJS authorises other NJSs
• Explicit UUDB entry

Behind the Scenes: Authorisation
[Diagram labels: a typical UNICORE user with Certificates 1 to 5 and Logins A to E; the Client sends the AJO with the user certificate through the Gateway to the NJS; the NJS consults the UUDB with the user certificate and the IDB, then passes the user login to the TSI]

UNICORE Job
• Job contains
 – Sub-jobs and tasks
 – Dependency information
   • Without dependencies all tasks of a job are executed in “parallel”
 – Workflow: doN, loops, if-then-else
 – Target system location
• Tasks are translated into batch jobs for the destination system by the servers (NJSs)

Abstract Job Object (AJO)
• Abstract, target system independent representation of a job
• Specifies actions to be performed by UNICORE
 – Execute task
 – File transfer task
 – Control task
• Contains dependency graph
• Contains resource requests (nodes, memory, time, ...)
• Contains data set descriptions for data to be streamed
• Realised as Java classes

UNICORE @ SourceForge
• Open Source under BSD license
• Supported by FZJ
 – Integration of own results and from other projects
 – Release Management
 – Problem tracking
 – CVS, Mailing Lists
 – Documentation
 – Assistance
• Viable basis for many projects
 – DEISA, UniGrids, NaReGI, …
• http://unicore.sourceforge.net
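
Supplementary sketch: dependency ordering in a job
The UNICORE Job and AJO slides state that a job carries a dependency graph and that tasks without dependencies are executed in “parallel”. The Python sketch below illustrates that idea only; it is not UNICORE code and does not use the AJO Java classes. The function name execution_waves, the task names and the dictionary layout are all assumptions made for this example. It groups tasks into successive "waves", where every task in a wave has all of its prerequisites finished.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_waves(dependencies):
    """Group tasks into waves of tasks that could run in parallel.

    `dependencies` maps each task name to the set of tasks it waits for.
    (Hypothetical layout for illustration; a real AJO is a Java object.)
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        # Every task returned here has no unfinished prerequisite.
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


# Hypothetical workflow: stage input, run two independent simulations,
# then export the combined results.
job = {
    "import_input": set(),
    "simulate_a": {"import_input"},
    "simulate_b": {"import_input"},
    "export_results": {"simulate_a", "simulate_b"},
}

# A job whose tasks declare no dependencies collapses into a single wave.
independent = {"task_1": set(), "task_2": set(), "task_3": set()}

if __name__ == "__main__":
    print(execution_waves(job))
    print(execution_waves(independent))
```

In this sketch the first job yields three waves (import, the two simulations together, export), while the job without dependencies yields one wave containing all tasks, which mirrors the statement on the UNICORE Job slide. How the NJS actually schedules sub-AJOs across Gateways is not shown here.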