Grid computing: an introduction
Lionel Brunie
Institut National des Sciences Appliquées (INSA), Laboratoire LIRIS – UMR CNRS 5205 – Equipe DRIM, Lyon, France

A Brain is a Lot of Data! (Mark Ellisman, UCSD)
And comparisons must be made among many. We need to get down to one micron to know the location of every cell; we are just now starting to get to 10 microns.

Data-intensive applications
• Nuclear and high-energy physics
• Simulations
  – Earth observation, climate modelling
  – geophysics, earthquake modelling
  – aerodynamics and fluid dynamics
  – pollutant dispersion
• Astronomy: the coming telescopes will produce more than 10 petabytes per year!
• Genomics
• Chemistry and biochemistry
• Financial applications
• Medical imaging

Evolution of the performance of computing components
• Network vs. processor performance
  – processor speed doubles every 18 months
  – network speed doubles every 9 months
  – disk storage capacity doubles every 12 months
• 1986 to 2000
  – processors: x 500
  – networks: x 340,000
• 2001 to 2010
  – processors: x 60
  – networks: x 4,000
(A back-of-the-envelope check of these doubling factors is sketched after the TOP500 snapshots below.)
[Chart: Moore's law vs. storage improvements vs. optical improvements; graph from Scientific American (Jan. 2001) by Cleo Vilett, source Vinod Khosla, Kleiner Perkins Caufield & Byers.]
Conclusion: invest in networks!

Hansel and Gretel are lost in the forest of definitions
• Distributed system
• Parallel system
• Cluster computing
• Meta-computing
• Grid computing
• Peer-to-peer computing
• Global computing
• Internet computing
• Network computing
• Cloud computing

Distributed system
• n autonomous computers (sites): n administrators, n data/control flows
• an interconnection network
• User view: one single (virtual) system
  – "A distributed system is a collection of independent computers that appear to the users of the system as a single computer" (Distributed Operating Systems, A. Tanenbaum, Prentice Hall, 1994)
• "Traditional" programmer view: client-server

Parallel system
• 1 computer, n nodes: one administrator, one scheduler, one power source
• Memory: it depends
• Programmer view: one single machine executing parallel code; various programming models (message passing, distributed shared memory, data parallelism…)

Examples of parallel systems
[Diagrams: a CC-NUMA architecture (CPUs and memories connected through an interconnection network) and a shared-nothing architecture (CPU/memory/peripheral nodes connected through a network).]

Cluster computing
• Use of PCs interconnected by a (high-performance) network as a (cheap) parallel machine
• Two main approaches
  – dedicated network (based on a high-performance interconnect: Myrinet, SCI, InfiniBand, Fibre Channel…)
  – non-dedicated network (based on a (good) LAN)

Where are we today?
• A source of efficient and up-to-date information: www.top500.org – the 500 fastest machines
• No. 1: 1457 (1105) Tflops! No. 500: 22 (13) Tflops
• Sum (1-500) = 16,953 Tflops
• 31% in Europe, 59% in North America
• 1 flops = 1 floating-point operation per second; 1 teraflops = 1000 gigaflops

How does it grow?
• In 1993 (prehistoric times!)
  – no. 1: 59.7 Gflops
  – no. 500: 0.4 Gflops
  – sum = 1.17 Tflops
• In 2004 (yesterday)
  – no. 1: 70 Tflops (x 1118)
  – no. 500: 850 Gflops (x 2125)
  – sum = 11,274 Tflops and 408,629 processors
• 2007/11 best (http://www.top500.org/): peak 596 Tflops!
• 2008/11 best: peak 1457 Tflops!
• 2009/11 best: peak 2331 Tflops!
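A back-of-the-envelope sketch of the doubling factors quoted above (assuming plain exponential doubling over the 1986-2000 window; the figures are order-of-magnitude only):

# Sanity check of the "performance doubles every N months" figures quoted above.
# Assumes plain exponential growth: factor = 2 ** (months / doubling_period).

def growth_factor(months: float, doubling_period_months: float) -> float:
    """Multiplicative improvement after `months`, given a doubling period."""
    return 2 ** (months / doubling_period_months)

if __name__ == "__main__":
    window = 14 * 12  # 1986 to 2000, in months
    for component, period in [("processors", 18), ("networks", 9), ("storage", 12)]:
        print(f"{component:>10}: x {growth_factor(window, period):,.0f}")
    # processors: ~x650, networks: ~x420,000, storage: ~x16,000
    # -> same orders of magnitude as the x500 and x340,000 quoted in the slides.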
[Charts from top500.org: performance evolution, projected performance, architecture distribution, interconnection-network distribution.]

NEC Earth Simulator (1st in 2004; 30th in 2007)
• Single-stage crossbar: 2700 km of cables
• A MIMD machine with distributed memory
• 700 TB of disk space; 1.6 PB of mass storage
• Footprint: 4 tennis courts, 3 floors

BlueGene
• 212,992 processors – 3D torus
• Rmax = 478 Tflops; Rpeak = 596 Tflops

RoadRunner
• 3456 nodes (18 clusters) – two-stage fat tree, InfiniBand (optical)
• 1 node = 2 AMD Opteron dual-core + 4 IBM PowerXCell 8i
• Rmax = 1.1 Pflops; Rpeak = 1.5 Pflops
• 3.9 MW (0.35 Gflops/W)

Jaguar
• 224,162 cores – memory: 300 TB – disk: 10 PB
• AMD x86_64 Opteron six-core, 2600 MHz (10.4 Gflops)
• Rmax = 1759 Tflops – Rpeak = 2331 Tflops
• Power: 6.95 MW
• http://www.nccs.gov/jaguar/

Network computing
• From LAN (cluster) computing to WAN computing
• A set of machines distributed over a MAN/WAN that are used to execute loosely coupled parallel codes
• Depending on the infrastructure (software and hardware), network computing specializes into Internet computing, P2P, grid computing, etc.

Meta-computing (early 90's)
• Definitions become fuzzy...
• A meta-computer = a set of (widely) distributed (high-performance) processing resources that can be associated to process a parallel, not so loosely coupled, code
• A meta-computer = a parallel virtual machine over a distributed system
[Diagram: clusters of PCs on SANs/LANs, a supercomputer and a visualization cluster interconnected through a WAN.]

Internet computing
• Use of (idle) computers interconnected by the Internet for processing large-throughput applications
• Example: SETI@home
  – 5M+ users since launch
  – 2009/11: 930k users, 2.4M computers; 190k active users, 278k active computers; 2M years of CPU time
  – 234 "countries"
  – 10^21 floating-point operations since 1999
  – 769 Tflop/s!
  – BOINC infrastructure (Décrypthon, RSA-155…)
• Programmer view: a single master, n servants

Global computing
• Internet computing on a pool of sites
• Meta-computing with loosely coupled codes
• Grid computing with poor communication facilities
• Example: Condor (invented in the 80's)

Peer-to-peer computing
• A site is both client and server: a "servent"
• Dynamic servent discovery by "contamination"
• Two approaches:
  – centralized management: Napster, Kazaa, eDonkey…
  – distributed management: Gnutella, KAD, Freenet, BitTorrent…
• Application: file sharing

Grid computing (1)
"coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organisations" (I. Foster)

Grid computing (2)
• Information grid – large-scale access to distributed data (the Web)
• Data grid – management and processing of very large distributed data sets
• Computing grid – a meta-computer
• Examples: Globus, Legion, UNICORE…

Parallelism vs. grids: some recalls
• Grids date back "only" to 1996
• Parallelism is older! (first classification in 1972)
• Motivations:
  – need for more computing power (weather forecasting, atomic simulation, genomics…)
  – need for more storage capacity (petabytes and more)
  – in a word: improve performance!
• Three ways...
  – Work harder --> use faster hardware
  – Work smarter --> optimize algorithms
  – Get help --> use more computers!

The performance? Ideally it grows linearly
• Speed-up:
  – if TS is the best time to process a problem sequentially,
  – then its parallel time should be TP = TS/P with P processors!
  – Speedup = TS/TP
  – It is limited (Amdahl's law): any program has a sequential part and a parallel part, TS = F + T//
  – hence the speedup is bounded: S = (F + T//)/(F + T///P), which tends to (F + T//)/F as P grows; with normalized times (TS = 1), S < 1/F
  (A small numeric illustration is sketched after the intrinsic-limitations slide below.)
• Scale-up:
  – if TPS is the time to process a problem of size S with P processors,
  – then TPS should also be the time to process a problem of size n*S with n*P processors

Grid computing: starting point
• A real need for very high-performance infrastructures
• Basic idea: share computing resources
  – "The sharing that the GRID is concerned with is not primarily file exchange but rather direct access to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering" (I. Foster)

Applications
• Distributed supercomputing
• High-throughput computing
• On-demand (real-time) computing
• Data-intensive computing
• Collaborative computing

An example virtual organization: CERN's Large Hadron Collider
• Worldwide LHC Computing Grid (WLCG)
• 1800 physicists, 140 institutes, 33 countries
• 10 PB of data per year; 50,000 CPUs?

LCG system architecture
• A four-tier computing model
  – Tier-0 (1 site, CERN): the accelerator
    • data acquisition and reconstruction
    • distribution of the data to the Tier-1 sites (~online)
  – Tier-1 (11 sites, ~10 countries)
    • 24x7 access and availability
    • quasi-online data acquisition
    • grid data service: recording using a mass-storage service
    • heavy data analysis
  – Tier-2 (~150 sites, ~40 countries)
    • simulation
    • end users, batch and interactive data analysis
  – Tier-3 (~50 sites)
    • end users, scientific analysis
• LHC numbers: 40 million collisions per second; ~100 collisions of interest per second after filtering; 1-10 MB of data per collision; acquisition rate: 0.1 to 1 GB/s; 10^10 collisions recorded each year; ~10 petabytes/year

LCG system architecture (continued)
[Diagram: the trigger and data-acquisition system feeds Tier-0, which is connected to the Tier-1 sites by 10 Gbps links over an optical private network (to almost all sites); the Tier-2 sites are reached over general-purpose/academic/research networks. From F. Malek – LCG France.]

Back to the roots (routes)
• Railways, telephone, electricity, roads, the banking system
• Complexity, standards, distribution, integration (large/small)
• Impact on society: how the US grew
• Big differences:
  – the clients (the citizens) are NOT the providers (the State or companies)
  – small number of actors/providers
  – small number of applications
  – strong supervision/control

Computational grid
• "Hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities"
• Performance criteria: security, reliability, computing power, latency, throughput, scalability, services

Some recalls about parallelism
Sources of parallelism
• pipeline parallelism
• intra-operator parallelism
• inter-operator parallelism
• inter-query parallelism

Parallel execution plan (PEP)
• The "scenario" of the query processing
• Defines the role played by every processor and its interactions with the other ones
• Ingredients: plan model, cost model, search heuristics, search space

Intrinsic limitations
• Startup time
• Contention:
  – concurrent accesses to shared resources
  – sources of contention: architecture, data partitioning, communication management, execution plan
• Load imbalance
  – response time = slowest process
  – NEED to balance data, I/O, computations, communications
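A small numeric illustration of the speedup bound recalled above (plain Python; the 5% sequential fraction is an arbitrary illustrative value, not taken from the slides):

# Minimal illustration of Amdahl's law as recalled above.
# F      : time of the sequential part
# T_par  : time of the parallelizable part
# With P processors the parallel time is F + T_par / P, hence
# speedup S(P) = (F + T_par) / (F + T_par / P), bounded by (F + T_par) / F.

def amdahl_speedup(f_seq: float, t_par: float, p: int) -> float:
    """Speedup obtained with p processors for a program whose
    sequential part takes f_seq and parallelizable part takes t_par."""
    t_serial = f_seq + t_par
    t_parallel = f_seq + t_par / p
    return t_serial / t_parallel

if __name__ == "__main__":
    f_seq, t_par = 0.05, 0.95        # normalized times: 5% strictly sequential
    for p in (1, 2, 8, 64, 1024):
        print(f"P={p:5d}  speedup={amdahl_speedup(f_seq, t_par, p):6.2f}")
    print(f"asymptotic bound 1/F = {1 / f_seq:.1f}")   # here: 20.0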
Parallel execution scenario
---> operator processing ordering
---> degree of inter-operation parallelism
---> access method (e.g. indexed access)
---> characteristics of intermediate relations (e.g. cardinality)
---> degree of intra-operation parallelism
---> algorithm (e.g. hybrid hash join)
---> redistribution procedures
---> synchronizations
---> scheduling
---> mapping
---> control strategies

Execution control
• Precisely planning the processing of a query is impossible!
• Global and partial information; dynamic parameters
• It is mandatory to control the execution and to dynamically re-optimize the plan: load balancing, re-parallelization
End of recalls…

Levels of cooperation in a computing grid
• End system (computer, disk, sensor…)
  – multithreading, local I/O
• Cluster (heterogeneous)
  – synchronous communications, DSM, parallel I/O
  – parallel processing
• Intranet
  – heterogeneity, distributed administration, distributed file systems and databases
  – low supervision, resource discovery
  – high throughput
• Internet
  – no control, collaborative systems, (international) WAN
  – brokers, negotiation

Grid characteristics
• Large scale
• Heterogeneity
• Multiple administration domains
• Autonomy… and coordination
• Dynamicity
• Flexibility
• Extensibility
• Security

Basic services
• Authentication/authorization/traceability
• Activity control (monitoring)
• Resource discovery
• Resource brokering
• Scheduling
• Job submission, data access/migration and execution
• Accounting

Layered grid architecture (by analogy with the Internet architecture, from I. Foster)
• Application
• Collective – "coordinating multiple resources": ubiquitous infrastructure services, application-specific distributed services
• Resource – "sharing single resources": negotiating access, controlling use
• Connectivity – "talking to things": communication (Internet protocols) and security
• Fabric – "controlling things locally": access to, and control of, resources
(Internet protocol architecture counterparts: application, transport, internet, link.)

Aspects of the problem (from I. Foster)
• Need for interoperability when different groups want to share resources
  – diverse components, policies, mechanisms
  – e.g., standard notions of identity, means of communication, resource descriptions
• Need for shared infrastructure services to avoid repeated development and installation
  – e.g., one port/service/protocol for remote access to computing, not one per tool/application
  – e.g., certificate authorities: expensive to run
• A common need for protocols and services

Resources
• Description
• Advertising
• Cataloging
• Matching
• Claiming
• Reserving
• Checkpointing
(A minimal matchmaking sketch is given after the resource-layers slide below.)

Resource layers
• Application layer
  – tasks, resource requests
• Application resource management layer
  – inter-task resource management, execution environment
• System layer
  – resource matching, global brokering
• Owner layer
  – owner policy: who may use what
• End-resource layer
  – end-resource policy (e.g. the OS)
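A minimal sketch of the resource "matching" step listed above, in the spirit of matchmaking between advertised resources and a request (the data model and names below are invented for the illustration; they do not come from any particular middleware). It anticipates the example request on the next slide (10 to 100 computing elements, at least 128 MB of RAM and 50 MIPS each):

# Minimal resource-matchmaking sketch: advertised computing elements are
# filtered against a request; the request fails if fewer than min_count
# candidates remain. Purely illustrative data model.
from dataclasses import dataclass

@dataclass
class ComputingElement:
    name: str
    ram_mb: int
    mips: int
    free: bool

def match(resources, min_ram_mb, min_mips, min_count, max_count):
    """Return a list of suitable computing elements, or [] if the request
    cannot be satisfied (fewer than min_count candidates)."""
    candidates = [r for r in resources
                  if r.free and r.ram_mb >= min_ram_mb and r.mips >= min_mips]
    if len(candidates) < min_count:
        return []
    return candidates[:max_count]

if __name__ == "__main__":
    pool = [ComputingElement(f"ce{i:03d}", ram_mb=64 + 64 * (i % 4),
                             mips=40 + 5 * (i % 10), free=(i % 3 != 0))
            for i in range(200)]
    selected = match(pool, min_ram_mb=128, min_mips=50, min_count=10, max_count=100)
    print(f"{len(selected)} computing elements selected")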
Resource management (1)
• Services and protocols depend on the infrastructure
• Some parameters:
  – stability of the infrastructure (same set of resources or not)
  – freshness of the resource-availability information
  – reservation facilities
  – multiple-resource or single-resource brokering
• Example request: I need from 10 to 100 CEs, each with at least 128 MB of RAM and a computing power of 50 MIPS

Resource management and scheduling (1)
• Levels of scheduling
  – job scheduling (global level; performance metric: throughput)
  – resource scheduling (metrics: fairness, utilization)
  – application scheduling (metrics: response time, speedup, produced data…)
• Mapping/scheduling
  – resource discovery and selection
  – assignment of tasks to computing resources
  – data distribution
  – task scheduling on the computing resources
  – (communication scheduling)

Resource management and scheduling (2)
• Individual performances are not necessarily consistent with the global (system) performance!
• Grid problems
  – predictions are not definitive: dynamicity!
  – heterogeneous platforms
  – checkpointing and migration

A resource management system example (Globus)
[Diagram: the application expresses its needs in RSL (Resource Specification Language); a broker specializes the RSL using queries to, and information from, the information service; the resulting ground RSL is handed to a co-allocator, which splits it into simple ground RSL requests sent to the local resource managers through GRAM (e.g. LSF, Condor, NQE).]
• LSF: Load Sharing Facility (task scheduling and load balancing; developed by Platform Computing)
• NQE: Network Queuing Environment (batch management; developed by Cray Research)

Resource information (1)
• What is to be stored?
  – virtual organizations, people, computing resources, software packages, communication resources, event producers, devices…
  – what about data???
• A key issue in such dynamic environments
• A first approach: a (distributed) directory (LDAP)
  – easy to use
  – tree structure
  – distribution
  – static
  – mostly read; updating is not efficient
  – hierarchical
  – poor procedural language

Resource information (2)
• But:
  – dynamicity
  – complex relationships
  – frequent updates
  – complex queries
• A second approach: a (relational) database

Programming on the grid: potential programming models
• Message passing (PVM, MPI) – a minimal sketch is given after the next slide
• Distributed shared memory
• Data parallelism (HPF, HPC++)
• Task parallelism (Condor)
• Client/server – RPC
• Agents
• Integration systems (CORBA, DCOM, RMI)

Program execution: issues
• Parallelize the program with the right job structure, communication patterns/procedures and algorithms
• Discover the available resources
• Select the suitable resources
• Allocate or reserve these resources
• Migrate the data
• Initiate the computations
• Monitor the executions; checkpoints?
• React to changes
• Collect the results
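A minimal sketch of the message-passing model listed above, assuming Python with the mpi4py binding (the slides mention MPI/PVM in general, not this particular binding; the file name in the comment is illustrative):

# Minimal message-passing sketch (MPI via mpi4py): rank 0 scatters work,
# every rank computes a partial result, rank 0 gathers and combines them.
# Run with e.g.:  mpirun -np 4 python partial_sums.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Rank 0 builds the work list and scatters one chunk per process.
if rank == 0:
    data = list(range(1_000_000))
    chunks = [data[i::size] for i in range(size)]
else:
    chunks = None
chunk = comm.scatter(chunks, root=0)

partial = sum(chunk)                              # local computation
total = comm.reduce(partial, op=MPI.SUM, root=0)  # global combination

if rank == 0:
    print(f"total = {total}")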
Data management
• It was long forgotten!!!
• Though it is a key issue!
• Issues:
  – indexing
  – retrieval
  – replication
  – caching
  – traceability (auditing)
• And security!!! (Brunie, not Bruni!)

From computing grids to information grids

From computing grids to information grids (1)
• Grids lack most of the tools needed to share (index, search, access), analyze, secure and monitor semantic data (information)
• Several reasons: history, money, difficulty
• Why is it so difficult?
  – sensitivity, but openness
  – multiple administrative domains, multiple actors, heterogeneity, but a single global architecture/view/system
  – dynamicity and unpredictability, but robustness
  – wideness, but high performance

From computing grids to information grids (2) – example: the replica management problem
• Maintain a mapping between logical names for files and collections and one or more physical locations
• Decide where and when a piece of data must be replicated
• Important for many applications
• Example: CERN high-level trigger data
  – multiple petabytes of data per year
  – a copy of everything at CERN (Tier-0)
  – subsets at national centers (Tier-1)
  – smaller regional centers (Tier-2)
  – individual researchers have copies of pieces of data
• Much more complex with sensitive and complex data such as medical data!!!
(A toy replica-catalog sketch is given after the functional view below.)

From computing grids to information grids (3): some (still…) open issues
• Security, security, security (incl. privacy, monitoring, traceability…) at a semantic level
• Access protocols (incl. replication, caching, migration…)
• Indexing tools
• Brokering of data (incl. accounting)
• (Content-based) query optimization and execution
• Mediation of data
• Data integration, data warehousing and analysis tools
• Knowledge discovery and data mining

Functional view of grid data management
[Diagram: the application uses a metadata service (location based on data attributes), a replica location service (location of one or more physical replicas) and information services (state of grid resources, performance measurements and predictions); a planner selects data locations, replicas, and compute and storage nodes; an executor initiates the data transfers and computations (data movement, data access) on the compute and storage resources, under security and policy control.]
A bit simplistic, though…
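A toy sketch of the replica-location idea discussed above: an in-memory catalog mapping logical file names to physical replicas, with a trivial site-preference policy (names and URLs are invented for the illustration; this is not the interface of an actual replica location service):

# Toy replica catalog: maps a logical file name (LFN) to the physical
# locations (PFNs) where copies are stored, and picks a replica according
# to a trivial "preferred site" policy.
from collections import defaultdict
from typing import Optional, Set

class ReplicaCatalog:
    def __init__(self):
        self._replicas = defaultdict(set)   # LFN -> set of PFNs

    def register(self, lfn: str, pfn: str) -> None:
        self._replicas[lfn].add(pfn)

    def lookup(self, lfn: str) -> Set[str]:
        return set(self._replicas.get(lfn, set()))

    def select(self, lfn: str, preferred_site: str) -> Optional[str]:
        """Pick a replica at the preferred site if any, else any replica, else None."""
        pfns = self.lookup(lfn)
        for pfn in pfns:
            if preferred_site in pfn:
                return pfn
        return next(iter(pfns), None)

if __name__ == "__main__":
    rc = ReplicaCatalog()
    rc.register("lfn:/lhc/run42/event-007", "gsiftp://tier1.example.fr/data/event-007")
    rc.register("lfn:/lhc/run42/event-007", "gsiftp://tier2.example.de/data/event-007")
    print(rc.select("lfn:/lhc/run42/event-007", preferred_site="example.fr"))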
Grid security (1): why grid security is hard
• The resources being used may be extremely valuable and the problems being solved extremely sensitive
• Resources are often located in distinct administrative domains
  – each resource may have its own policies and procedures
• Users may be different
• The set of resources used by a single computation may be large, dynamic and/or unpredictable
  – not just client/server
• The security service must be broadly available and applicable
  – standard, well-tested, well-understood protocols
  – integration with a wide variety of tools

Grid security (2): various views
• User view: 1) easy to use; 2) single sign-on; 3) run applications: ftp, ssh, MPI, Condor, Web…; 4) user-based trust model; 5) proxies/agents (delegation)
• Resource owner view: 1) specify local access control; 2) auditing, accounting, etc.; 3) integration with the local system: Kerberos, AFS, license managers; 4) protection from compromised resources
• Developer view: API/SDK with authentication, flexible message protection, flexible communication, delegation…; direct calls to various security functions (e.g. GSS-API), or security integrated into higher-level SDKs (e.g. GlobusIO, Condor)

Grid security (3): requirements
• Authentication
• Authorization and delegation of authority
• Assurance
• Accounting
• Auditing and monitoring
• Traceability
• Integrity and confidentiality

Access to data and mediation
• Heavens, where are the data? Use case: a German tourist has a heart attack in Nice
• Data inside the grid ≠ data at the side of the grid!
• Basic idea: use of metadata/indexes. Problem: indexes are (sensitive) information
• Alternatives: encrypted indexes, use of views, proxies

DSEM/DM2
• Mediation – no single view of the world; mechanisms for interoperability, ontologies
• Negotiation: a key open issue

Caching
• Motivation:
  – collaborative caching has been shown to be efficient
  – each institution wants to control the access to its own data
  – no caching standard exists in grids
• Proposal:
  – use metadata to collaborate / index data
  – a two-level cache: local caches and a global virtual cache
  – a single access point to internal data (a proxy)

Query optimization and execution
• Old wine in new bottles?
• Yes and no: the problem seems unchanged, but the operational context has changed so much that classical heuristics and methods are no longer pertinent
• Key issues:
  – dynamicity
  – unpredictability
  – adaptability
• Very few works have specifically addressed this problem

An application example: GGM (Grille Géno-Médicale, the geno-medical grid)

An application example: GGM – biomedical grids
• Biomedical applications are perfect candidates for gridification:
  – huge volumes of data (a hospital = several TB per year)
  – dissemination of data
  – collaborative work (health networks)
  – very hard requirements (e.g. response time)
• But
  – partially structured semantic data
  – very strong privacy issues
• … a perfect playground for researchers!

An application example: GGM – motivation (1)
• Dissemination of new "high-bandwidth" technologies in genome and proteome research (e.g. micro-arrays)
  – huge volumes of structural (gene localization) and functional (gene expression) data
• Generalization of digital patient records and digital medical images
• Implementation of (regional and international) health networks
• All the information is available and people are connected to the network
• The question is: how can we use it?

An application example: GGM – motivation (2)
• Need for an information infrastructure to
  – index, exchange/share and process all these data
  – while preserving their privacy, at a very large scale
• That is… just a good grid!
• Application objectives:
  – correlation of genomic and medical data: fundamental research and, later, medical decision making
  – patient-centered medical data integration: monitoring of the patient inside and outside the hospital
  – epidemiology
  – training

An application example: GGM – motivation (3)
• References:
  – "Synergy between medical informatics and bioinformatics: facilitating genomic medicine for future healthcare", BIOINFOMED Working Group, Jan. 2003, European Commission
  – Proceedings of the Healthgrid conferences (Lyon 2003, Clermont-Ferrand 2004, Oxford 2005, Valencia 2006)

An application example: GGM – scope
"The goal of the GGM project is, on top of a grid infrastructure, to propose a software architecture able to manage heterogeneous and dynamic data stored in distributed warehouses, for intensive analysis and processing purposes."
• Distributed data warehouses
• Query optimization
• Data access [and control]
• Data mining

An application example: GGM – coming back to semantic data
• A medical piece of data (age, image, biological result, salient object in an image) has a meaning
  – it conveys information that can be (differently) interpreted
• Metadata can be attached to medical data… or not
  – pre-processing is necessary
• Medical data are often private
  – privacy/delegation
• The medical data of a patient are often disseminated over multiple sites
  – access rights/authentication problems, collection/integration of the data into partial views, identification of data/users
• Medical (meta)data are complex and not yet standardized
  – no global structure

An application example: GGM – architecture
[Architecture diagram: DW-GUI clients on top of the GGM components (NDS, DW, DM, OQS, "Patate", DA + cache, wrappers) running over the grid middleware; experiments: GO, medical data.]

Virtual data warehouses on the grid

Virtual data warehouses on the grid (1)
• Almost nothing exists…
• Why is it so difficult?
  – multiple administrative domains
  – very sensitive data => security/privacy issues
  – wide distribution
  – unpredictability
  – relationship with data replicas
  – heterogeneity
  – dynamicity (permanent production of large volumes of data)
• A centralized data warehouse?
  – not realistic at a large scale, and not acceptable

Virtual data warehouses on the grid (2)
• A possible direction of research: virtual data warehouses on the grid
• Components:
  – a federated schema
  – a set of partial views ("chunks") materialized at the local-system level
• Advantages
  – flexibility with respect to users' needs
  – good use of the storage capacity of the grid, and scalability
  – security control at the local level
  – a global view of the disseminated data

Virtual data warehouses on the grid (3)
• Drawbacks and open issues
  – maintenance protocols
  – indexing tools
  – access to data and negotiation
  – query processing
• Use of mobile agents?

Access to data and collaborative brokers

Access to data and collaborative brokers (1)
• Brokers act as interfaces between data, services and applications
• Possible locations
  – at the interface between the grid and the external data repositories
  – on the grid storage elements
  – at the interface between the grid and the user
  – inside the network (e.g. routers)
• Open issues
  – caching: computation results, partial query results…
  – data indexing
  – prefetching
  – user customization
  – inter-broker collaboration
  – a key issue: security and privacy

Access to data and collaborative brokers (2): security and privacy
• Medical data belong to the patient, who should be able to grant access rights to whomever he wants
• To whom do processed (even anonymized) data belong?
• How can one combine privacy and dissemination/replication/caching?
• What about traceability?

Data mining and knowledge extraction on the grid
• Structure of the data: few records, many attributes
• Parallelizing data-mining algorithms for the grid
  – volatility of the resources (data, processing)
  – fault tolerance, checkpointing
  – distribution of the data: local data exploration + an aggregation function to converge towards a unified model (see the sketch below)
  – incremental production of the data => active data-mining techniques
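A minimal sketch of the "local data exploration + aggregation" pattern mentioned above: each site summarizes its own records and only the summaries are combined into a global model (attribute names and values are invented for the illustration):

# Toy "local exploration + aggregation" pattern: each site computes a summary
# of its local records (count and sum of one attribute); a coordinator combines
# the summaries into a global model (here simply the global mean) without ever
# moving the raw records off-site.

def local_summary(records):
    """Computed at each site on its own data."""
    values = [r["cholesterol"] for r in records]
    return {"count": len(values), "sum": sum(values)}

def aggregate(summaries):
    """Computed by the coordinator from the per-site summaries only."""
    total_count = sum(s["count"] for s in summaries)
    total_sum = sum(s["sum"] for s in summaries)
    return total_sum / total_count if total_count else float("nan")

if __name__ == "__main__":
    site_a = [{"cholesterol": 1.8}, {"cholesterol": 2.4}]
    site_b = [{"cholesterol": 2.1}, {"cholesterol": 2.9}, {"cholesterol": 1.7}]
    global_mean = aggregate([local_summary(site_a), local_summary(site_b)])
    print(f"global mean = {global_mean:.2f}")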
A short overview of some grid middlewares

The Legion system
• University of Virginia
• Object-oriented approach: objects = data, applications, sensors, computing resources, codes… everything is an object!
• Loosely coupled codes
• Single naming space
• Reuse of existing OSes and protocols; definition of message formats and high-level protocols
• Core objects: naming, binding, object creation/activation/deactivation/destruction
• Methods: described via an IDL
• Security: in the hands of the users
• Resource allocation: a site can define its own policy

High-throughput computing: Condor
• A high-throughput computing platform for mapping many tasks to idle computers
• Since 1986!
• Major components
  – a central manager manages pool(s) of [distributively owned or dedicated] computers; a CM = scheduler + coordinator
  – DAGMan manages user task pools
  – the matchmaker schedules tasks onto computers using classified ads
  – checkpointing and process migration
  – no simple communications
• Parameter studies, data analysis
• Condor married Globus: Condor-G
• Several hundred Condor pools in the world; or on your machine!

Defining a DAG
• A DAG is defined by a .dag file, listing each of its nodes and their dependencies (here, the diamond A -> {B, C} -> D):

  # diamond.dag
  Job A a.sub
  Job B b.sub
  Job C c.sub
  Job D d.sub
  Parent A Child B C
  Parent B C Child D

• Each node runs the Condor job specified by its accompanying Condor submit file (from the Condor tutorial)
(See the dependency-ordering sketch below.)
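A minimal sketch of what DAGMan-style dependency handling amounts to for the diamond DAG above: release a job only once all of its parents have completed (plain Python, not Condor code):

# Dependency-ordering sketch for the diamond DAG above:
# a job becomes ready only when all of its parents have run (topological order).
from collections import deque

def release_order(dependencies):
    """dependencies: dict mapping each job to the set of its parent jobs."""
    remaining = {job: set(parents) for job, parents in dependencies.items()}
    ready = deque(sorted(job for job, parents in remaining.items() if not parents))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for other, parents in remaining.items():
            if job in parents:
                parents.remove(job)
                if not parents and other not in order and other not in ready:
                    ready.append(other)
    return order

if __name__ == "__main__":
    diamond = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
    print(release_order(diamond))   # -> ['A', 'B', 'C', 'D']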
Executing jobs => placing data! The Stork approach
• STORK: Making Data Placement a First Class Citizen in the Grid, Tevfik Kosar and Miron Livny, University of Wisconsin-Madison, ICDCS 2004, Tokyo, Japan

The Globus toolkit
• A set of integrated executable management (GEM) services for the grid
• Services
  – resource management (GRAM – DUROC)
  – communication (Nexus – MPICH-G2, globus_io)
  – information (MDS)
  – data management (replica catalog)
  – security (GSI)
  – monitoring (HBM)
  – remote data access (GASS – GridFTP – RIO)
  – executable management (GEM)
  – execution
  – commodity grid kits (Java, Python, CORBA, Matlab…)

Components in Globus Toolkit 3.0
• Security: GSI, WS-Security
• Data management: WU GridFTP, RFT (OGSI), RLS
• Resource management: pre-WS GRAM, WS GRAM (OGSI)
• Information services: MDS2, WS-Index (OGSI)
• WS core: Java WS Core (OGSI), OGSI C bindings

Components in Globus Toolkit 3.2
• Same areas as 3.0, adding: CAS (OGSI) and SimpleCA (security), OGSI-DAI (data management), OGSI Python bindings and pyGlobus (contributed), and XIO

Planned components in GT 4.0
• GSI, new GridFTP, pre-WS GRAM, WS-Security, RFT (WSRF), WS-GRAM (WSRF), CAS (WSRF), RLS, CSF (contribution), SimpleCA, OGSI-DAI, MDS2, Java WS Core (WSRF), WS-Index (WSRF), C WS Core (WSRF), pyGlobus (contributed), an authorization framework and XIO – grouped, as before, into security, data management, resource management, information services and WS core

GT2 GRAM
[Diagram: the requestor invokes the gatekeeper, which runs as root on the server with the host credentials (trusted by server and user) and starts a job manager under the user's account.]

GT3 GRAM
[Diagram: the requestor invokes the MMJFS, which runs in a non-privileged Globus account (trusted by the server); a root host-environment starter, using the GRIM host credentials, launches the job manager in the user's account with GRIM credentials.]

GT4 GRAM
• http://www-unix.globus.org/toolkit/docs/3.2/gram/ws/developer/architecture.html

Conclusion (2005)
• Just a new toy for scientists, or a revolution?
• Huge investments
• Classical issues, but a very complex functional, operational and applicative context
• Complexity arising from heterogeneity, wide distribution, security, dynamicity
• A functional shift from computing to information
• Data management in grids: not prehistory, but still the Middle Ages
• Still much work to do!!!
• A global framework for grid computing, pervasive computing and Web services?

Conclusion (2008)
• Just a new toy for scientists, or a revolution? Neither of them!
• Huge investments: too much?!
• Classical issues, but a very complex functional, operational and applicative context
• Complexity arising from heterogeneity, wide distribution, security, dynamicity
• A functional shift from computing to information
• Data management in grids: no longer the Middle Ages, but not the 21st century either => services
• Supercomputing is still alive
• A global framework for grid computing, pervasive computing and Web services… and SOA!
• Some convergence between P2P and grid computing
• The time of industrialization

And in 2009, a new paradigm: cloud computing
• Amazon, Google, Microsoft… even L'Oréal!
• Cloud = the Internet
• Resource = a service provided by the network
• Behind the scenes: a grid???
• Old wine in a new bottle?
• See the seminar presentations ("exposés")