Globus Installation

• Secure Sockets Layer (SSL): OpenSSL 0.9.6 (or SSLeay 0.9.0b)
• Lightweight Directory Access Protocol (LDAP): OpenLDAP 1.2.7 patched for Globus
• Globus v1.1.4 (2.0 out soon…)
    – plus additional packages, e.g. MPICH-G2, Condor-G, Nimrod-G, SSH-G
• Grid Starter Kit: see http://www.grid-support.ac.uk
• RPMs are available for Linux: see http://www.grid-support.ac.uk and http://www.GridPP.ac.uk

22nd October 2001 UK Grid Support Centre - GSC

Where do I need to use Globus?

• Front end to a major resource
    – workstation pool
    – parallel computer or Beowulf (commodity) system
    – data store
    – instrument
    – visualisation facility

Daresbury Grid Testbed

[Figure: testbed diagram; diagram labels: Globus, Condor, LoadLeveler]
• IBM PPC (AIX, MyProxy server)
• IBM PPC cluster (4xPPC, AIX, Web server)
• Beowulf1 (32xPII, Linux, PBS)
• SP (48xPower, AIX, LoadLeveler)
• Loki (64xAlpha, Linux, PBS)
• SUN cluster (2xUltra, Solaris, GIIS server)

Installing SSL

• SSL (Secure Sockets Layer) is the protocol standardised by the IETF as TLS
• Can be obtained from
    – SSLeay v0.9.0b from E.A. Young, ftp://ftp.psy.uq.oz.au/Crypto/SSL
    – OpenSSL v0.9.6 from http://www.openssl.org
    – Proprietary versions, e.g.
      Sun
    – or http://esc.dl.ac.uk/StarterKit
• Installation is straightforward:
    – cd /usr/local/ssl
    – ./Configure <system_type>
    – make
    – make install
    – make tests
• This should install into /usr/local/ssl

Installing LDAP

• OpenLDAP v1.2.7
    – a special version is available from Globus: ftp://ftp.globus.org/pub/globus/OpenLDAP-1.2.7-globuslatest.tar.gz
    – or http://esc.dl.ac.uk/StarterKit
• Installation:
    – cd /usr/local/ldap
    – ./configure --prefix=/usr/local/ldap --enable-slapd --enable-shell --disable-ldbm --without-threads
    – make depend
    – make
    – make install
• This should install into /usr/local/ldap

Installing Globus

• Globus v1.1.4 has been fully evaluated
    – ftp://ftp.globus.org/pub/globus/globus-latest.tar.gz
    – or http://esc.dl.ac.uk/StarterKit
• RPMs are available for Linux
    – see http://www.grid-support.ac.uk and http://www.GridPP.ac.uk
• Installation notes are available from
    – http://esc.dl.ac.uk/StarterKit
• Evaluation reports are also available, plus
    – “Globus Guide”
• The current version is Globus v2.0, but this is still a beta release

Globus Directory Structure

A “globus” UNIX id should be created, with home directory /home/globus.
Two layouts are shown: 1) build in /usr/local, 2) build in /home/globus.

Source directory
• 1) /usr/local/globus    2) /home/globus
Install directory
• 1) /usr/local/globus-install    2) /home/globus/globus-build
• or /usr/local/globus/globus-build
• can be NFS mounted and shared for re-deployment
Deploy directory
• 1) /usr/local/globus-deploy    2) /home/globus/globus-deploy
• or /opt/globus
• can be a link, but must NOT be NFS mounted

Installing Globus

As “globus” do:
• cd /usr/local/globus
• ./globus_install --prefix=/usr/local/globus-install --with-ssl-path=/usr/local/ssl --with-ldap-path=/usr/local/ldap
• globus-setup
• globus-local-deploy

During these steps you will need to do the following:
• create the deploy
  directory
• define the GIIS machine in a configuration file
• obtain certificates
• edit the grid-mapfile

Start Globus:
• cd /opt/globus/sbin
• SXXGlobus start

Should install in:
• ~/.globus/userkey.pem
• ~/.globus/usercert.pem
• /opt/globus/etc/certificates

    issuer :/C=US/O=Globus/CN=Globus Certification Authority
    subject:/C=US/O=Globus/O=Central Laboratory of the Research Councils/OU=Daresbury Laboratory/OU=Computational Science and Engineering Department/CN=R J Allan
    serial :1053
    Certificate:
        Data:
            Version: 3 (0x2)
            Serial Number: 4179 (0x1053)
            Signature Algorithm: md5WithRSAEncryption
            Issuer: C=US, O=Globus, CN=Globus Certification Authority
            Validity
                Not Before: Jun 6 14:33:52 2001 GMT
                Not After : Jun 6 14:33:52 2002 GMT
            Subject: C=US, O=Globus, O=Central Laboratory of the Research Councils, OU=Daresbury Laboratory, OU=Computational Science and Engineering Department, CN=R J Allan
            Subject Public Key Info:
                Public Key Algorithm: rsaEncryption
                RSA Public Key: (1024 bit)
                    Modulus (1024 bit):
                        ...
                    Exponent: 65537 (0x10001)
            X509v3 extensions:
                Netscape Cert Type:
                    0xC0
        Signature Algorithm: md5WithRSAEncryption
            ...
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----

Host and User Certificates can be obtained from ca@grid-support.ac.uk

/opt/globus/etc/grid-info.conf

    # These values are set by globus-setup
    SETUP_GRID_INFO_MODEL="MDS_SITE_INDEX"
    # SETUP_GRID_INFO_HOST defines the GIIS server
    SETUP_GRID_INFO_HOST="tcs10.dl.ac.uk"
    # SETUP_GRID_INFO_PORT defines the port
    SETUP_GRID_INFO_PORT="2136"
    SETUP_GRID_INFO_BASEDN=""
    SETUP_GRID_INFO_ORGANIZATION_DN="dc=dl, dc=ac, dc=uk, o=Grid"
    SETUP_GRID_INFO_ORGANIZATION_ADMIN_DN=""

/opt/globus/etc/grid-mapfile

Maps distinguished names (as in the user cert) onto local UNIX ids:

    "/O=Grid/O=Globus/OU=dl.ac.uk/CN=Steve Andrews" sja56
    "/O=Grid/O=Globus/OU=dl.ac.uk/CN=R J Allan" rja
    "/O=Grid/O=Globus/OU=rl.ac.uk/CN=Ronald Fowler" rff
    "/O=Grid/O=Globus/OU=rl.ac.uk/CN=Pete Oliver" pxo
    "/O=Grid/O=Globus/OU=man.ac.uk/CN=Stephen Pickles" zzcgusp
    "/O=Grid/O=Globus/OU=man.ac.uk/CN=Robin Pinning" zzcgurp
    "/O=Grid/O=Globus/OU=dl.ac.uk/CN=Daniel Hanlon" djn

Globus: The major components

• GIIS - Grid Index Information Server
• GRIS - Grid Resource Information Server
    – GIIS and GRIS together provide information on Grid resources
    – both speak GRIP, the Grid Resource Information Protocol
• GSI - Grid Security Infrastructure, based on GSS-API (with TLS, formerly SSL)
• GRAM - Globus Resource Allocation Manager
    – manages resources via:
        • fork, PBS, LSF, LoadLeveler, NQE, GridEngine etc.
• GASS - Global Access to Secondary Storage
• GridFTP - new parallel FTP in Globus 2.0
• DUROC - co-allocator (being replaced by GARA)

Table 4: Unix Port Numbers used by Grid software

    Port     Service
    22       GSISSH (tcp)
    80       http server
    443      https server
    >1024    GridFTP data return
    2119     Globus GRAM resource manager (Gatekeeper, tcp)
    2135     Globus local GRIS server (tcp, udp)
    2136     Globus GIIS server (tcp)
    2811     GridFTP contact (tcp)
    7512     MyProxy server
    8080     Web cache server

Resource Discovery Command Line Example

Initialise a proxy certificate:

    tci18.dl.ac.uk> grid-proxy-init
    Enter PEM pass phrase:
    ..+++++
    .............+++++

Do a GIIS search and print the 5-minute average CPU load:

    tci18.dl.ac.uk> grid-info-search -b "o=Grid" '(objectclass=GlobusComputeResource)' dn cpuload5
    hn=tci18.dl.ac.uk, dc=dl, dc=ac, dc=uk, o=Grid
    cpuload5=3.62
    hn=tcs10.dl.ac.uk, dc=dl, dc=ac, dc=uk, o=Grid
    cpuload5=0.34
    hn=tcs7.dl.ac.uk, dc=dl, dc=ac, dc=uk, o=Grid
    cpuload5=0.30

GIIS/GRIS C API Example 1

    #include "globus_common.h"              /* Globus common module */
    #include "lber.h"
    #include "ldap.h"
    #include <stdlib.h>                     /* atoi */
    #include <string.h>

    #define GRID_INFO_HOST   "tcs10.dl.ac.uk"   /* name of server to contact */
    #define GRID_INFO_PORT   "2136"             /* specify GIIS or GRIS port */
    #define GRID_INFO_BASEDN "o=Grid"
    #define search_format    "(objectclass=GlobusServiceJobManager)"

    int main(int argc, char * argv[])
    {
        ...
        rc = globus_module_activate(GLOBUS_COMMON_MODULE);   /* activate it */
        filter = globus_libc_malloc(strlen(search_format)+5);
        globus_libc_sprintf(filter, search_format);
        result = get_ldap_attribute(filter);   /* contact GIIS and process search */
        globus_module_deactivate_all();        /* deactivate all Globus modules */
    } /* main */

Example 1 (cont.)

LDAP programming...
    char * get_ldap_attribute(char * search_filter)
    {
        LDAP * ldap_server;
        LDAPMessage * reply;
        char * attrs[1] = { NULL };          /* NULL-terminated attribute list */
        char * server = GRID_INFO_HOST;
        int port = atoi(GRID_INFO_PORT);
        char * base_dn = GRID_INFO_BASEDN;

        ldap_server = ldap_open(server, port);      /* open contact to server on port */
        ldap_simple_bind_s(ldap_server, "", "");    /* bind to server */
        ldap_search_s(ldap_server, base_dn,         /* run LDAP search query */
                      LDAP_SCOPE_SUBTREE, search_filter, attrs, 0, &reply);
        /* now parse entries in reply...*/
    } /* get_ldap_attribute */

GASS API Example

Create a GASS server on tcs10; it returns a URL:

    tcs10> globus-gass-server -w
    https://tcs10.dl.ac.uk:53895

Files can now be copied to it:

    tci18> gass-copy myfile https://tcs10.dl.ac.uk:53895/export/home/rja/Globus/newfile

The same access from C:

    #include <globus_common.h>
    #include <globus_gass_file.h>

    int main(int argc, char * argv[])
    {
        ...
        globus_module_activate(GLOBUS_GASS_FILE_MODULE);
        in = globus_gass_fopen(argv[1],"r");     /* open input for reading */
        out = globus_gass_fopen(argv[2],"w");    /* open output for writing */
        globus_gass_fclose(in);
        globus_gass_fclose(out);
        globus_module_deactivate(GLOBUS_GASS_FILE_MODULE);
    } /* main */

Resource Management Architecture

[Figure: Application -> (RSL) -> Broker (RSL specialization; Queries & Info exchanged with the Information Service) -> (Ground RSL) -> Co-allocator -> (Simple ground RSL) -> GRAM, GRAM, GRAM -> Local resource managers: LSF, Condor, NQE]

Running Remote Jobs

• Gatekeeper
    – Single point of entry
    – Authenticates user, maps to local security environment, runs service
    – In essence, a “secure inetd”
• Job manager
    – A gatekeeper service
    – Layers on top of local resource management system;
      supported DRM systems include:
        • PBS, NQE, LoadLeveler, LSF, GridEngine, Condor, Nimrod, etc.
    – Handles remote interaction with the job

GRAM Components

[Figure: GRAM components. The client makes MDS client API calls to locate resources (MDS: Grid Index Info Server) and to get resource info (MDS: Grid Resource Info Server), makes GRAM client API calls to request resource allocation and process creation, and receives GRAM client API state change callbacks. Inside the site boundary: Gatekeeper (Grid Security Infrastructure) creates the Job Manager; the Job Manager parses the request with the RSL Library, asks the Local Resource Manager to allocate and create processes, and monitors and controls the processes. A further label reads "Query current status of resource".]

Job Submission Interfaces

• Globus Toolkit includes several command line programs for job submission
    – globus-job-run: Interactive jobs
    – globus-job-submit: Batch/offline jobs
    – globusrun: Flexible scripting infrastructure
• Others are building better interfaces
    – General purpose
        • Condor-G, PBS, GRD, Hotpage, etc.
    – Application specific
        • ECCE', Cactus, Web portals

globus-job-run

• For running of interactive jobs
• Additional functionality beyond rsh
    – Ex: Run 2 process job with executable staging

          globus-job-run -: host -np 2 -s myprog arg1 arg2

    – Ex: Run 5 processes across 2 hosts

          globus-job-run \
              -: host1 -np 2 -s myprog.linux arg1 \
              -: host2 -np 3 -s myprog.aix arg2

    – For list of arguments run:

          globus-job-run -help

globus-job-submit

• For running of batch/offline jobs
    – globus-job-submit        Submit job
        • Same interface as globus-job-run
        • Returns immediately
    – globus-job-status        Check job status
    – globus-job-cancel        Cancel job
    – globus-job-get-output    Get job stdout/stderr
    – globus-job-clean         Cleanup after job

globusrun

• Flexible job submission for scripting
    – Uses an RSL string to specify job request
    – Contains an embedded globus-gass-server
        • Defines GASS URL prefix in RSL substitution variable:
              (stdout=$(GLOBUSRUN_GASS_URL)/stdout)
    – Supports both interactive and offline jobs
• Complex to use
    – Must write RSL by hand (or from a program, e.g. Web portal)
    – Must understand its esoteric features
    – Generally you should use globus-job-* commands instead

Resource Specification Language

• Common notation for exchange of information between components
    – Syntax similar to MDS/LDAP filters (BNF grammar)
• RSL provides two types of information:
    – Resource requirements: machine type, number of nodes, memory, etc.
    – Job configuration: directory, executable, args, environment
• Globus Toolkit provides an API/SDK for manipulating RSL

Using globusrun with RSL

    tcs10> globusrun -f tci18.rsl -r tci18/jobmanager-fork

-f names the RSL file; -r names the remote machine and job manager.

Contents of tci18.rsl (a file name such as stderr.txt could also be a GASS URL):

    & (executable="test_gpfa.out")
      (directory="/home2/rja/Globus/FFT/IBM")
      (stdout="output.txt")
      (stderr="stderr.txt")

Resource Management APIs

• The globus_gram_client API provides access to all of the core job submission and management capabilities, including callback capabilities for monitoring job status.
• The globus_rsl API provides convenience functions for manipulating and constructing RSL strings.
• The globus_gram_myjob API allows multi-process jobs to self-organize and to communicate with each other.
• The globus_duroc_control and globus_duroc_runtime APIs provide access to multirequest (co-allocation) capabilities.

API Examples of GRAM

    #include <stdio.h>
    #include <string.h>
    #include "globus_gram_client.h"

    /* called by GRAM whenever the job changes state */
    static void callback_func(void * user_callback_arg, char * job_contact,
                              int state, int errorcode);

    /* shared between main() and the callback */
    typedef struct {
        globus_mutex_t mutex;
        globus_cond_t  cond;
        globus_bool_t  done;
    } my_monitor_t;

Initialisations

    int main(int argc, char * argv[])
    {
        int callback_fd;
        int job_state_mask;
        int rc;
        char * callback_contact;
        char * job_contact;
        char * rm_contact;
        char * specification;
        float confidence;
        globus_bool_t done;
        globus_gram_client_time_t estimate;
        globus_gram_client_time_t interval_size;
        my_monitor_t Monitor;

Gram client job request

        rc = globus_module_activate(GLOBUS_GRAM_CLIENT_MODULE);
        rm_contact = strdup(argv[1]);        /* resource manager contact */
        specification = strdup(argv[2]);     /* RSL job specification */
        job_state_mask = GLOBUS_GRAM_CLIENT_JOB_STATE_ALL;

        /* initialize callback function and mutex */
        ...
        printf("\n\tTEST: submitting to resource manager...\n");
        rc = globus_gram_client_job_request(rm_contact, specification,
                                            job_state_mask, callback_contact,
                                            &job_contact);

        /* use callback function and mutex to wait for a signal from job */
        ...
        globus_gram_client_job_contact_free(job_contact);
        globus_module_deactivate(GLOBUS_GRAM_CLIENT_MODULE);
        printf("\tTEST completed\n");
        return 0;
    } /* main */

Callback function

    static void callback_func(void * user_callback_arg, char * job_contact,
                              int state, int errorcode)
    {
        my_monitor_t * Monitor = (my_monitor_t *) user_callback_arg;

        switch(state) {
            case GLOBUS_GRAM_CLIENT_JOB_STATE_PENDING:
                ...
            case GLOBUS_GRAM_CLIENT_JOB_STATE_ACTIVE:
                ...
            case GLOBUS_GRAM_CLIENT_JOB_STATE_FAILED:
                printf("a message...\n");
                /* tell main() that the job has finished */
                globus_mutex_lock(&Monitor->mutex);
                Monitor->done = GLOBUS_TRUE;
                globus_cond_signal(&Monitor->cond);
                globus_mutex_unlock(&Monitor->mutex);
                break;
            case GLOBUS_GRAM_CLIENT_JOB_STATE_DONE:
                … /* send signal as above */
        }
    } /* callback_func() */

Use of Mutex in callback

standard threads programming...

Initialise mutex and condition variable and set up callback function:

    globus_mutex_init(&Monitor.mutex, (globus_mutexattr_t *) NULL);
    globus_cond_init(&Monitor.cond, (globus_condattr_t *) NULL);
    globus_mutex_lock(&Monitor.mutex);
    Monitor.done = GLOBUS_FALSE;
    globus_mutex_unlock(&Monitor.mutex);
    globus_gram_client_callback_allow(callback_func, (void *) &Monitor,
                                      &callback_contact);

Wait for signal from callback function, destroy mutex and condition variable:

    globus_mutex_lock(&Monitor.mutex);
    while (!Monitor.done) {
        globus_cond_wait(&Monitor.cond, &Monitor.mutex);
    }
    globus_mutex_unlock(&Monitor.mutex);
    globus_mutex_destroy(&Monitor.mutex);
    globus_cond_destroy(&Monitor.cond);

Advance Reservation and Other Generalizations

• General-purpose Architecture for Reservation and Allocation (GARA)
    – 2nd generation resource management services
• Broadens GRAM on two axes
    – Generalize to support various resource types
        • CPU, storage, network, devices, etc.
    – Advance reservation of resources, in addition to allocation
• Currently a research prototype

Co-allocation

• Simultaneous allocation of a resource set
    – Handled via optimistic co-allocation based on free nodes or queue prediction
    – In the future, advance reservations will also be supported (already in prototype)
• Globus APIs/SDKs support the co-allocation of specific multirequests
    – Uses a Globus component called the Dynamically Updated Request Online Co-allocator (DUROC)

Multirequest: “+”

• A multirequest allows us to specify multiple resource needs, for example

      + (& (count=5)(memory>=64)
           (executable=p1))
        (& (network=atm)
           (executable=p2))

    – Execute 5 instances of p1 on a machine with at least 64M of memory
    – Execute p2 on a machine with an ATM connection
• Multirequests are central to co-allocation

A Co-allocation Multirequest

    +( & (resourceManagerContact=
            "flash.isi.edu:754:/C=US/…/CN=flash.isi.edu-fork")
         (count=1)
         (label="subjob A")
         (executable=my_app1)
     )
     ( & (resourceManagerContact=
            "sp139.sdsc.edu:8711:/C=US/…/CN=sp097.sdsc.edu-lsf")
         (count=2)
         (label="subjob B")
         (executable=my_app2)
     )

The two subjobs use different resource managers, different counts and different executables.

GARA: The Big Picture

[Figure: a Co-Reservation Agent and the MDS Info Service, with four Gatekeepers, each in front of a resource manager (RM): GRIO RM, Scheduler RM, Diffserv RM, DSRT RM]

Resource Management Futures: GRAM-2 (planned for 2002)

• Advance reservations
    – As prototyped in GARA in previous 2 years
• Multiple resource types
    – Manage anything: storage, networks, etc.
• Recoverable requests, timeout, etc.
• Better lifetime management
• Policy evaluation points for restricted proxies
• Use of Web Services (WSDL, SOAP)