Globus Computing Infrastructure Software
Globus Toolkit

Grid computing software infrastructure
• Primary objective: to make a seamless environment for users to access distributed resources.
• Key aspects:
  – Secure envelope: over all transactions
  – Single sign-on: being able to access all available resources after providing credentials ONCE
  – Data management
  – Information services: providing characteristics of resources and their status (including dynamic load)
  – APIs and services that enable applications themselves to take advantage of the Grid platform
  – Convenient user interfaces (??)

Globus Project
• Open source software toolkit developed for Grid computing.
• Roots in the I-WAY experiment, led by Ian Foster.
• Work started in 1996.
• Now up to Version 5.
• Reference implementations of Grid computing standards.
• De facto standard for Grid computing and one of the most influential projects.

Globus Toolkit
• “Toolkit” of services and packages for creating basic grid computing infrastructure. One may use parts of the toolkit as needed.
• Five major parts:
  – Common runtime: libraries and services
  – Security: components to provide secure access
  – Execution management: executing, monitoring, and management of jobs
  – Data management: discovery, access, and transfer of data
  – Information: discovery and monitoring of resources and services

Globus Toolkit Versions
• Version 1: essentially a research prototype, not widely used.
• Version 2: widely used; not web-service based.
• Version 3: web-service based, but not widely accepted because of the way services were implemented and a lack of robustness.
• Version 4: web-service based. Some non-web-service code remains, either from earlier versions (legacy) or where a change to web services was not appropriate (for efficiency, etc.).
• Version 5: returned to the non-web-service approach of Version 2.
• We are using Globus Version 4.0 as it is mature, widely used, and we did not want to incur new software problems in class.

Timeline of Globus Toolkit
[Figure: timeline of Globus Toolkit releases. Recoverable labels: Globus 5.0.0, Globus 5.0.4, 2011.]

Globus Open Source Grid Software Version 4
[Figure: GT4 component diagram (I. Foster).
 Columns: Security, Data Management, Execution Management, Information Services, Common Runtime.
 Rows: Web Services Components, Non-WS Components.
 Version tags: GT2, GT3, GT4.
 Component labels: Community Scheduler Framework [contribution], Delegation Service, Python WS Core [contribution], C WS Core, Community Authorization Service, OGSA-DAI [Tech Preview], WS Authentication Authorization, Reliable File Transfer, Grid Resource Allocation Mgmt (WS GRAM), Monitoring & Discovery System (MDS4), Java WS Core, GridFTP, Grid Resource Allocation Mgmt (Pre-WS GRAM), Monitoring & Discovery System (MDS2), C Common Libraries, Pre-WS Authentication Authorization, Replica Location Service, XIO, Credential Management.]

Major Globus 5 changes over Version 4
“Most components of GT5 are incremental updates (numerous bug fixes and new features) over their GT4 counter-parts (e.g. GridFTP, RLS, MyProxy, GSI-OpenSSH)”
• Some components taken out, to be replaced: GT4 Java Core, WS GRAM4, RFT.
• GRAM implementation: built on the pre-WS GRAM2 code base and GRAM2 compatible.
• NO WEB SERVICE COMPONENTS
http://www.globus.org/toolkit/docs/5.0/5.0.0/rn/

Information services are currently not shown in Version 5. The new Globus Crux project will address this.
http://www.globus.org/toolkit/about.html

Some basic Globus components
• GSI (Grid Security Infrastructure)
  – Provides a security envelope around Grid resources
  – Uses public key cryptography
• GRAM (Globus/Grid Resource Allocation Management)
  – Globus’ basic execution management component
  – Used to issue and manage jobs
• GridFTP
  – For transferring files between resources
• MDS (Monitoring and Discovery System)
  – To discover resources and their status

Security Issues
• Has to cross administrative domains.
• Need agreed mechanisms and standards.
• Focus on Internet security mechanisms, modified to handle the special needs of Grid computing.
• Distributed resources must be protected from unauthorized access.

GSI (Grid Security Infrastructure)
Globus components for creating the security envelope
• Requires each user to be authenticated (their identity proved).
• Uses public key cryptography (the basis of Internet security).
• Each user must possess a (digital) certificate, signed by a trusted certificate authority.
• Users will also need to be able to give their authority to Grid components to act on their behalf, through so-called proxy certificates (see later).
• Users generally will also need accounts on the resources they intend to use (authorization).

Resource Discovery
Globus MDS (Monitoring and Discovery System)
• Still primitive and in research, but the ideal is to be able to submit a job and have the system find the best grid resources for that job across the whole grid.
• Users might access MDS to discover the status of compute resources. In practice, users often know what resources are there but not their dynamic load.
• MDS might be used by other Grid components such as schedulers.

Executing a Job
GRAM (Globus or Grid Resource Allocation Management)
• Users typically want to submit jobs for execution.
• Grid computing environments are mostly Linux-based and were originally, and still are commonly, accessed through a command line.

Job submission command-line interface
• Once you have established your security credentials, to run a simple job you might issue the GRAM command (*):
    globusrun-ws -submit -c prog1
  where prog1 is the executable of the job.
• The executable needs to be present on the compute resource that is to execute it.
• The above command does not specify a compute resource, and hence the computer executing the globusrun-ws command will execute prog1.
(*) The Globus 5 command is globusrun (not a web service).

GridFTP command to transfer files
    globus-url-copy \
        gsiftp://www.coit-grid02.uncc.edu/~abw/prog1out \
        file:///home/abw/
First argument: source location.
Second argument: destination location.
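As a hedged illustration of the two commands above, the following Python sketch assembles their argument lists and works out where a copied file would land on the local machine. The helper names (build_submit_command, build_copy_command, local_destination) are hypothetical, chosen for this example; only the command names, the -submit and -c flags, and the URLs come from the slides. Nothing is executed, so the sketch runs without a Globus installation. It assumes, as the slide's example implies, that a destination ending in "/" is a directory and the file keeps its source name.

```python
import posixpath
from urllib.parse import urlsplit


def build_submit_command(executable):
    """Argument list for the GT4 job submission command shown on the slide."""
    return ["globusrun-ws", "-submit", "-c", executable]


def build_copy_command(source_url, destination_url):
    """Argument list for globus-url-copy: source first, destination second."""
    return ["globus-url-copy", source_url, destination_url]


def local_destination(source_url, destination_url):
    """Local path that a file: destination resolves to.

    Assumption for this sketch: a destination ending in "/" is a
    directory, and the copied file keeps the source's file name.
    """
    dest = urlsplit(destination_url)
    if dest.scheme != "file":
        raise ValueError("destination is not a local file: URL")
    if dest.path.endswith("/"):
        name = posixpath.basename(urlsplit(source_url).path)
        return posixpath.join(dest.path, name)
    return dest.path


source = "gsiftp://www.coit-grid02.uncc.edu/~abw/prog1out"
destination = "file:///home/abw/"

submit_cmd = build_submit_command("prog1")
copy_cmd = build_copy_command(source, destination)
landing_path = local_destination(source, destination)

print(" ".join(submit_cmd))
print(" ".join(copy_cmd))
print(landing_path)  # /home/abw/prog1out
```

On a machine with the toolkit installed and valid credentials, either list could be handed to subprocess.run; that step is left out here because it depends on the local Grid setup.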
In the globus-url-copy command above, the file www.coit-grid02.uncc.edu/~abw/prog1out is transferred to /home/abw/prog1out on the local computer.

Scenario of a user employing Globus services and facilities

Grid portals
• A command-line interface is a very primitive way of interacting with Grid resources.
• A portal offers a higher-level, Web-based interface for accessing and controlling grid resources and for communicating with other members of a Virtual Organization.

Gridsphere
• Gridsphere is a toolkit to build a portal.
• We are starting with a portal. Next we will use the command line.
• Later we will have an assignment of building a portal.

Proxies
• To use many services, you are required to have a proxy certificate (a proxy), derived from your user certificate.
• Proxies enable resources to be accessed on the user’s behalf.
• Proxies are part of the Grid security infrastructure, discussed later in the course.
• A credential management service called MyProxy is used to hold proxies.
• Usually, Gridsphere automatically obtains a proxy from the MyProxy server for you when you log in.

Proxy management tab

Questions

Quiz
Question: What is meant by "single sign-on"?
(a) Allowing only one person to sign onto a computer
(b) Not allowing a person to log onto a computer more than once in any one period
(c) A mechanism in which a user does not need to sign on again to acquire additional resources
(d) None of the other answers

Question: What is authentication and what is authorization? What’s the difference?

Question: What does GRAM do?

Question: What does MDS do?

Question: What component in the Globus toolkit provides the means to transfer files?

Discussion Question
Is it possible to use the traditional security method of username/password on a grid? What problems exist for this method?