Getting Access to the Grid
Hamza Mehammed
National e-Science Centre Edinburgh
CCPB Protein Modeling on the NGS Training
25 November 2008

Outline
• Grid Computing
• Grid Infrastructure
• Grid Components
• Methods of Accessing the Grid

Grid Computing
• Heterogeneous, dynamic, geographically distributed
• PCs, servers, storage, special devices, services, …
• HPC (parallel jobs), HTC (dispersed cycles)
• On-demand access
• Co-ordination and collaboration
• Transparent for the user
• VO
[Diagram: user, gsissh, Grid]

Grid Architecture
- Connectivity
- Resources
- Tools
- Applications
Taken from: http://www.gridcafe.org/

Virtual Organisation
• Distributed users and resources
• Different admin. domains
• Sharing resources
• Dynamic environment
[Diagram: Member 1, Member 2, VO]

Globus Toolkit
• Argonne Lab., Chicago, USA (globus.org)
• Software for Grid infrastructure
• Globus Alliance: SW, documentation, mailing lists, …
• GT4: service oriented
• GT2: pre-Web-service oriented
• Installing and maintaining is complicated
• Provides all Grid components (GSI, RM, DM, IM)
• Hosts incubator projects
    • Meta-scheduler GridWay

Globus SW Infrastructure
[Diagram labels: Users; Applications; Portals; Virtual organisations; Commandline; API; Grid Environment; Grid Security Infrastructure; Data mgmt.; Information mgmt.; Resource mgmt.]

Grid Security Infrastructure (GSI)
• Authentication (user/host)
    • Proxy (signed by owner, avoids re-entering the passphrase)
• Authorisation of the user: access control
• Public Key Infrastructure (PKI)
    • Public key and private key
    • Sender encryption: sender's private key (key 1) + receiver's public key (key 2)
    • Receiver decryption: receiver's private key (key 2) + sender's public key (key 1)
• Certificate Authority (CA)
    • Digital certificate: subject, public key, CA info
    • Global name space (DN): "/O=UK/O=eScience/OU=Edinburgh/CN=Hamza Mehammed"

GSI (Cont.)
• Proxies
    • Without account
    • Single-Sign-On
    • Credential delegation through creating another proxy
• MyProxy:
    • Repository
    • Credential renewal
    • Global access
    • Used by Grid Portals
    • Store credential

Resource Management (GRAM)
Resource (Job) Management (RM)
• Job submission:
    globus-job-run <host> /bin/hostname -f
• Job submission (batch):
    globus-job-submit <host>/<jm> /bin/hostname -f
• Job status, output and clean:
    globus-job-status <URI>
    globus-job-get-output <URI>
    globus-job-clean <URI>
• Scheduler
    • Fork, PBS, LSF, Condor and SGE

Data Management
• FTP + GSI = GridFTP
• Separates data and control channel
• Configurable
    • buffer size, multiple streams, ...
• Syntax
    globus-url-copy [param.] <source> <dest>
• Example: copy from local to remote
    globus-url-copy file:///tmp/file1 gsiftp://<remote host>/tmp/file1copy
• Source/dest format
    <protocol>://<host>:<port>/<path>
• Supported protocols
    • https, http, gsiftp, ftp, and file

Information Management
• Monitoring and Discovery
• Software:
    • Application
    • Compiler
    • Library, ...
• Hardware resources:
    • Operating system
    • RAM, storage, ...
    • Network comp.
• Services:
    • Grid and/or software

Accessing the Grid
[Diagram: the Grid is reached through Portal, Commandline, API and GSI-SSH]

Portal
• Web server providing a Grid gateway
• Web-based interface (GUI)
• No software installation (IE, Firefox, ...)
• Services (GSI, RM, DM, IM)
• Performs Single-Sign-On
• Uses MyProxy servers
• Platform independent
• Portal username/password
[Diagram labels: user; username/password; Portal; MyProxy Server; user proxy; Grid nodes]

Commandline
• Needs client installation
• Complex installation
• For experts
• Flexible and efficient
• Platform dependent (GT2 only Linux)
• Basic knowledge of Unix
• Memorizing commands
• Can use MyProxy server

Using Specification Language
• Portal or Commandline
• Specifying the required resources
• Requires basic programming skill
• Flexible (lots of parameters)
• JSDL (XML based) and RSL
    <JobDefinition>
      <JobDescription>
        <JobIdentification ... />?
        <Application ... />
        <Resource ... />*
        <DataStaging ... />*
      </JobDescription>
    </JobDefinition>

GSI-SSH
• OpenSSH + GSI
• Single-Sign-On and credential delegation
• Included in Globus
• Authentication using:
    • MyProxy server
    • Local proxy
    • Browser certificate
• Platform independent

Conclusion
• Why:
    • Too many ways to utilize resources
    • To save time, resources, ...
    • To achieve collaboration, ...
• Application:
    • Compute and data intensive
    • Long running, interactive, not network intensive, ...
• Standard

THE END!
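
Backup: GridFTP URL sketch
Returning to the Data Management slide, the source/destination format <protocol>://<host>:<port>/<path> used by globus-url-copy can be sketched as a small shell helper. This is only an illustration under stated assumptions: make_url, make_file_url and the host grid.example.org are hypothetical names, port 2811 is assumed to be the GridFTP server's control port (the usual default), and the command is printed rather than executed so the sketch runs without a Globus client installed.

```shell
#!/bin/sh
# Sketch: compose the source/destination URLs passed to globus-url-copy.
# All names below are illustrative assumptions, not part of Globus.

# Build <protocol>://<host>:<port><path>; the path is given with its leading slash.
make_url() {
    printf '%s://%s:%s%s\n' "$1" "$2" "$3" "$4"
}

# Local files carry no host or port, e.g. file:///tmp/file1
make_file_url() {
    printf 'file://%s\n' "$1"
}

GRID_HOST="grid.example.org"   # hypothetical host: replace with your grid node
GRIDFTP_PORT=2811              # assumption: usual GridFTP control port

SRC=$(make_file_url /tmp/file1)
DST=$(make_url gsiftp "$GRID_HOST" "$GRIDFTP_PORT" /tmp/file1copy)

# Print the command instead of running it (dry run).
echo globus-url-copy "$SRC" "$DST"
```

With a valid proxy and a reachable GridFTP server, removing the echo would perform the actual copy; the explicit port can usually be left out when the server uses its default.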