Enabling Grids for E-sciencE

European and UK e-Science Resources
Mike Mineter
Training Outreach and Education
National e-Science Centre
mjm@nesc.ac.uk
www.eu-egee.org
EGEE-II INFSO-RI-031688

Contents
• e-Science
• e-Infrastructure (~ cyberinfrastructure)
• Grid concepts
• NGS: National Grid Service (UK)
• EGEE: Enabling Grids for e-Science (EU funded)
• Building future infrastructure: European Grid Initiative

‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’
John Taylor, Director General of Research Councils, Office of Science and Technology, 2000

e-Infrastructure
e-Infrastructure = Networks + Grids ..
  + Operations, Support, Training…
  + Data centres, archives, instruments…
• Networks connect resources
• Grids enable flexible use of networked resources: “virtual computing”

Grid concepts

Grids: a foundation for e-Science
• Enabling a whole-system approach
• Effect > Σparts
[Diagram: a Grid linking computers, software, sensor nets, instruments, colleagues and shared data archives. Diagram derived from Ian Foster’s slide.]

Virtual organisations and grids
• What is a Virtual Organisation?
  – People in different organisations seeking to cooperate and share resources across their organisational boundaries
  – E.g. a research collaboration
• Each grid is an infrastructure enabling one or more “virtual organisations” to share and access resources
• Each resource is exposed to the grid through an abstraction that masks heterogeneity, e.g.
  – Multiple diverse computational platforms
  – Multiple data resources
• Resources are usually owned by VO members. Negotiations lead to VOs sharing resources.

Typical current grid
• Virtual organisations bring and/or negotiate access to resources
• Grid middleware runs on each shared resource
• Provides
  – Data services
  – Computation services
  – Single sign-on
• Distributed services (both people and middleware) enable the grid
[Diagram label: INTERNET]

The Role of the Virtual Organisation (VO)
[Diagram labels: Compute Center, VO Service, Compute Center]
Slide based on presentation given by Carl Kesselman at GGF Summer School 2004.

The many scales of grids
[Diagram: grids at increasing scale, with the axis label “Wider collaboration, greater resources”]
• Scales shown:
  – International grid (EGEE)
  – UK: National Grid Service
  – Regional grids
  – Campus grids
  – Condor pools, clusters
  – Desktop
• Resources shown alongside: International instruments,..; National datacentres, HPC, instruments; Institutes’ data
• Little interoperability across these scales of grids (yet).

(Some of the) Basic grid services
• In both EGEE and NGS:
  – Authorisation and authentication underpin it all
      Grid Security Infrastructure: X.509 certificates, issued by a Certificate Authority
      Additional VO credentials: “VOMS”
  – Compute services
      Broker: user submits job “to the grid”
      Jobs run in batch mode under e.g. LSF, PBS,…
  – Data services (next slide)
  – VO-specific and “Higher level services” build on these
      Portals,…
      Application hosting services

2 main types of data services on Grids

Simple data files on grid-specific storage
• Middleware supporting
  – Replica files
      To be close to where you want computation
      For resilience
  – Logical filenames
  – Catalogue: maps logical name to physical storage device/file
  – Virtual filesystems, POSIX-like I/O
  – Services provided: storage, transfer, catalogue that maps logical filenames to replicas
• Solutions include
  – gLite data service (EGEE)
  – Globus: Data Replication Service
  – Storage Resource Broker

Other data, e.g. ….
  – Structured data: RDBMS, XML databases,…
  – Files on project’s filesystems
  – Data that may already have other user communities not using a Grid
• Require extendable middleware tools to support
  – Computation near to data
  – Controlled exposure of data without replication
• Basis for integration and federation
• OGSA-DAI
  – In Globus 4
  – Not (yet...) in gLite

EGEE – international e-infrastructure
A four year programme (from April 2004):
• Build, deploy and operate a consistent, robust and large-scale production grid service that
  – Links with and builds on national, regional and international initiatives
• Improve and maintain the middleware in order to deliver a reliable service to users
• Attract new users from research and industry and ensure training and support for them
[Diagram labels: Pan-European Grid; Operations, Support and training; Collaboration; Network infrastructure & Resource centres]

Production service
[Chart: number of sites, Apr-04 to Dec-05]
[Chart: number of CPUs by date, Apr-04 to Feb-06]
Size of the infrastructure today:
• 192 sites in 40 countries
• ~25 000 CPU
• ~3 PB disk, + tape MSS

The Vision of the NGS
• National infrastructure services which allow researchers to:
  – systematically create, process, preserve and publish digital information;
  – easily navigate through the available resources;
  – be confident in the quality of the services available;
  – tie into international efforts
• To achieve this, the NGS will
  – Lead the deployment of a common grid infrastructure
  – Promote common open standards
  – Through the NGS Partnership programme, integrate services to access a growing number, scale and variety of resources

NGS & Partners, 2006
• A production service

To e- or not to e-, that is the question
• And in the geo world it’s a no-brainer.
  – Integrating resources (data, CPUs, expertise) across semantic and admin domains
  – Orchestrating services: data, computation and models
  – Collaborating in key research & public service support
• BUT: are the foundations of grids strong enough?
  – Do NGS, EGEE have adequate Authentication & Authorisation? Often, yes! But richer authorisation services are needed
• Perhaps the biggest problems of all
  – Are we willing to invest in and sustain production quality services for others to use? An ecology of geo-services….
  – Will “The People Grid” grow? Will competition squash cooperation?
    This workshop & OGF-GIS WG,… are reasons for optimism

EGEE - Further information
• EGEE: www.eu-egee.org
• EGEE digital library: http://egee.lib.ed.ac.uk/
• gLite: http://www.glite.org/
• Real-time monitors: http://gridportal.hep.ph.ic.ac.uk/rtm
• EGEE training: http://egee.nesc.ac.uk
INFSO-RI-508833

NGS Information
• http://www.ngs.ac.uk
• Wiki: http://wiki.ngs.ac.uk
• To see what’s happening: http://ganglia.ngs.rl.ac.uk/
• Training events: http://www.nesc.ac.uk/training