Enabling Grids for E-sciencE
Concepts of grid computing
Guy Warner, NeSC Training Team
gcw@nesc.ac.uk
www.eu-egee.org
INFSO-RI-508833
Multimodal Behavioural Data and e-Collaboration, NeSC, Edinburgh, 14 July 05

Acknowledgements
• This talk was prepared by Mike Mineter of NeSC and includes slides from previous tutorials and talks delivered by:
  – Dave Berry, Richard Hopkins, Guy Warner (National e-Science Centre)
  – the EDG training team
  – Ian Foster, Argonne National Laboratory
  – Jeffrey Grethe, SDSC
  – EGEE colleagues
  – Mark Baker, The Distributed Systems Group, University of Portsmouth, http://dsg.port.ac.uk/mab
• Talks at the 3rd EGEE conference by
  – Kyriakos Baxevanidis, Deputy Head, Unit of Research Infrastructures, European Commission, DG INFSO
  – Dr Spyros Konidaris, European Commission, DG INFSO

The Grid Metaphor
[Diagram: grid middleware connecting mobile access, workstations, supercomputers and PC clusters, data storage, sensors, experiments and visualisation, over the Internet and other networks]

The grid vision
• The grid vision is of “Virtual computing” (+ information services to locate computation, storage resources)
  – Compare: The web: “virtual documents” (+ search engine to locate them)
• MOTIVATION: collaboration through sharing resources (and expertise) to expand horizons of
  – Research
  – Commerce – engineering, … “the knowledge economy”
  – Public service – health, environment, …

“A grid”
• The initial vision: “The Grid”
• The present reality: Many “grids”
• Each grid is an infrastructure enabling one or more “virtual organisations” (VOs) to share computing resources
• What’s a VO?
  – People in different organisations seeking to cooperate and share resources across their organisational boundaries
• Why establish a Grid?
  – Share data
  – Pool computers
  – Collaborate
[Diagram: a VO spanning Institute A, Institute B, Institute C and Institute D]

The Single Computer
• The Operating System enables easy use of
  – Input devices
  – Processor
  – Disks
  – Display
  – Any other attached devices
[Diagram, layers from top to bottom: Application Software; Operating System; Disks, Processor, Memory, …]

Resources on a Local Area Network
User just perceives “shared resources”, with no regard to location in the organisation:
  – Authenticated by username / password
  – Authorised to use own files, …
[Diagram, layers from top to bottom: Application Software; Middleware for sharing computers, servers, printers, …; Operating System on each computer; Resources connected by a LAN]

Resources on a grid
[Diagram, layers from top to bottom: Application Software; Interface between application and grid; Grid Middleware: “collective services”; Grid Middleware on each resource; Operating System on each resource; Resources connected by the Internet]

A Grid
• Grid Middleware on each shared Resource
• Local Area Networks
  – Connected by Internet
  – Data Storage
  – (Usually) batch jobs on pools of processors
• Users join VOs
• Virtual organisation negotiates with sites to agree access to resources
• Distributed services (both people and middleware) enable the grid, allow single sign-on
[Diagram: sites connected through the Internet]

What characterises a grid?
• Co-ordinated resource sharing
  – No centralised point of control
  – Different administrative domains
• Standard, open, general-purpose protocols and interfaces
  – NOT specific to an application
  – EGEE, NGS support multiple VOs
• Delivering non-trivial qualities of service
  – Co-ordinated to deliver combined services, greater than the sum of the individual components
• http://www.gridtoday.com/02/0722/100136.html

The components of a Grid
• Resources
  – networking, computers, storage, data, instruments, …
• Grid Middleware
  – the “operating system of the grid”
• Operations infrastructure
  – Run enabling services (people + software)
• Virtual Organization management
  – Procedures for gaining access to resources

Key concepts
• Virtual organisation: people and resources collaborating across administrative and organisational boundaries
• Single sign-on
  – I connect to one machine; some sort of “digital credential” is passed on to any other resource I use. It is the basis of:
    Authentication: How do I identify myself to a resource without a username/password for each resource I use?
    Authorisation: What can I do? Determined by
      • My membership of a VO
      • VO negotiations with resource providers
• Grid middleware runs on each resource
• User just perceives “shared resources” with no concern for location or owning organisation

The first driver: e-Science
• What is e-Science? Collaborative science that is made possible by the sharing across the Internet of resources (data, instruments, computation, people’s expertise...)
  – Often very compute intensive
  – Often very data intensive (both creating new data and accessing very large data collections): data deluges from new technologies
  – Crosses organisational boundaries

The expanding horizons of grids
[Diagram labels: Curation, discovery, reuse of knowledge; e-Research; e-Science]

The National Grid Service (NGS)
• NGS is a production service
  – Therefore cannot include latest research prototypes!
  – ETF recommends what should be deployed
• Core sites provide computation and also data services
• NGS is evolving
  – OMII, EGEE, Globus Alliance all have middleware under assessment by the ETF for the NGS
• Selected, deployed middleware currently provides “low-level” tools
  – New deployments will follow soon
  – New sites and resources being added!

[Diagram: NGS sites. Labels: U of D, U of A, HPCx, Bristol, Commercial Provider, PSRE, Man., Leeds, GOSC, RAL, Oxford, CSAR, U of B, Cardiff, U of C]
• NGS Core Nodes: Host core services, coordinate integration, deployment and support, plus free-to-access resources for all VOs. Monitored interfaces + services.
• NGS Partner Sites: Integrated with NGS, some services/resources available for all VOs. Monitored interfaces + services.
• NGS Affiliated Sites: Integrated with NGS, support for some VOs. Monitored interfaces (+ security etc.)

EGEE – building e-infrastructure
EGEE is building a large-scale production grid service to:
• Underpin research, technology and public service
• Link with and build on national, regional and international initiatives
• Foster international cooperation both in the creation and the use of the e-infrastructure
[Diagram labels: Pan-European Grid; Operations, Support and training; Collaboration; Network infrastructure & Resource centres]

EGEE Communities
• Initially supported two communities: High Energy Physics and Bioinformatics
  – Most VOs linked to a particular experiment
• Additional Communities have since been added:
  – Geophysics
  – Earth Observation
  – Chemistry Pilot Added
• Working with other communities
  – E.g. Digital Libraries

If “The Grid” vision leads us here… … then where are we now?

Grids: where are we now?
• Many key concepts identified and known
• Many grid projects have tested, and benefit from, these
• Major efforts now on establishing:
  – Standards (a slow process) (e.g. Global Grid Forum, http://www.gridforum.org/ )
  – Production Grids for multiple VOs
    “Production” = Reliable, sustainable, with commitments to quality of service
      • In Europe, EGEE
      • In UK, National Grid Service
      • In US, TeraGrid
    One stack of middleware that serves many research (and other!) communities
    Operational procedures and services (people!, policy, ...)
  – New user communities
• … whilst research & development continues

Summary of grid computing concepts
• Flexible collaboration across multiple administrative domains
  – sharing data, computers, instruments, application software, ...
• Single sign-on to resources in multiple organisations
  – Authorisation, authentication
• Need for people-services as well as middleware services
  – credential authorities, VO managers, support
• The drive is towards
  – Production services (reliable, sustainable, …) against which research projects can plan with confidence
      • In Europe, EGEE
      • In UK, National Grid Service
  – Standards
  – Empowering new user communities
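
The single sign-on and VO-based authorisation ideas summarised above can be illustrated with a small sketch. It is a toy model under stated assumptions, not how EGEE or NGS middleware is implemented: the `Credential` and `Resource` classes, the VO names "physics" and "biology", and the user "alice" are all hypothetical, and no real authentication (certificate checking) is performed. It only shows that one credential carrying a VO membership is enough to decide access at several sites, once each site has agreed which VOs it serves.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Credential:
    """Hypothetical stand-in for a digital credential (not a real certificate)."""
    subject: str  # who the user is
    vo: str       # the virtual organisation the user belongs to


@dataclass
class Resource:
    """A shared resource at a site, with the VOs that site has agreed to serve."""
    name: str
    site: str
    agreed_vos: set = field(default_factory=set)

    def authorise(self, cred: Credential) -> bool:
        # Access follows from VO membership plus the VO's agreement with the
        # site, not from a separate username/password on each resource.
        return cred.vo in self.agreed_vos


def usable_resources(cred, resources):
    """Return the names of the resources this single credential can use."""
    return [r.name for r in resources if r.authorise(cred)]


# Illustrative data only: three sites, each serving a different set of VOs.
resources = [
    Resource("cluster", "Institute A", {"physics", "biology"}),
    Resource("storage", "Institute B", {"physics"}),
    Resource("sensors", "Institute C", {"biology"}),
]

cred = Credential(subject="alice", vo="physics")
print(usable_resources(cred, resources))  # prints ['cluster', 'storage']
```

The user presents the same credential everywhere; what changes from site to site is only the set of VOs the site has agreed to support, which mirrors the split between "my membership of a VO" and "VO negotiations with resource providers".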