The French ACI GRID* initiative and its latest achievements using Grid'5000
* Member of the "non-usual suspect" National Grid Initiatives (W. Gentzsch)

Thierry Priol, Director of the French ACI GRID - Thierry.Priol@inria.fr
Franck Cappello, Director of ACI GRID Grid'5000 - Franck.Cappello@inria.fr

Contents
- An overview of the ACI GRID initiative and some of its projects
- The Grid'5000 project
- Concluding remarks

Objectives of the ACI GRID
- Push the national research effort on grid computing
- Increase the visibility of French Grid research activities
- Fund medium- and long-term research activities on Grids using a bottom-up approach (nothing imposed!)
- Stimulate synergies between research groups
- Encourage experimentation with the grid infrastructures being deployed through national projects
- Develop new software for experimental grid infrastructures: new systems and programming environments for distributed computing and large-scale data management

Organisation
- Programme Director: Thierry Priol (since January 2004; M. Cosnard before)
- Scientific council: Brigitte Plateau
- Budget: ~8 M€* (including 8 PhD grants), spread over four calls between 2001 and 2007:
  Call 1: 18 projects, 2.25 M€
  Call 2: 12 projects, 3 M€
  Call 3 (G5K): 5 projects, 1 M€
  Call 4 (G5K): 6 projects, 1 M€
- This is incentive funding (the overall French Grid research effort is estimated at around 98.3 M€ by GridCoord)
* Computing and network infrastructures and the salaries of permanent researchers are already paid by the state.

Several kinds of projects
- Multidisciplinary projects
- Software projects
- Young research teams
- International collaborations
- Testbeds

ACI GRID projects (grouped on the slide into middleware/tools/environments, algorithms, compiler techniques, networks and communication, applications, grid testbeds, and support for dissemination)
- TAG (S. Genaud, LSIIT), ANCG (N. Emad, PRISM), DOC-G (V-D. Cung, UVSQ), Métacompil (G-A. Silbert, ENMP)
- CiGri-CIMENT (L. Desbat, UJF), Mecagrid (H. Guillard, INRIA), GLOP (V. Breton, IN2P3), GRID5000 (F. Cappello, INRIA)
- Applications: COUMEHY (C. Messager, LTHE) - Climate; GenoGrid (D. Lavenier, IRISA) - Bioinformatics; GeoGrid (J-C. Paul, LORIA) - Oil reservoirs; IDHA (F. Genova, CDAS) - Astronomy; Guirlande-fr (L. Romary, LORIA) - Language; GriPPS (C. Blanchet, IBCP) - Bioinformatics; HydroGrid (M. Kern, INRIA) - Environment; Medigrid (J. Montagnat, INSA-Lyon) - Medical
- CGP2P (F. Cappello, LRI/CNRS), ASP (F. Desprez, ENS Lyon/INRIA), EPSN (O. Coulaud, INRIA), PADOUE (A. Doucet, LIP6), MEDIAGRID (C. Collet, IMAG), DARTS (S. Frénot, INSA-Lyon), Grid-TLSE (M. Daydé, ENSEEIHT), RMI (C. Pérez, IRISA), CONCERTO (Y. Mahéo, VALORIA), CARAML (G. Hains, LIFO), ARGE (A. Schaff, LORIA), GRID2 (J-L. Pazat, IRISA/INSA), DataGRAAL (Y. Denneulin, IMAG), RESAM (C. Pham, ENS Lyon), ALTA (C. Pérez, IRISA/INRIA)

GRID ASP: Client/Server Approach for Simulation over the Grid
- Call 1 (2001-2003). Project coordinator: F. Desprez. E-mail: Frederic.Desprez@inria.fr. Web: http://graal.ens-lyon.fr/ASP/
- Participants: ENS-Lyon, INRIA, LORIA, LIFC, IRCOM, LST, SRSMC, Physique Lyon 1
- Objective: build a portable set of tools for computational servers in an ASP (Application Service Provider) model: DIET (Distributed Interactive Engineering Toolbox)
- Focus on resource localization, (hierarchical) scheduling, performance evaluation (both static and dynamic), data persistence, and data redistribution between servers
- Porting of several different applications: physics, geology, chemistry, electronic device simulation, robotics, ...
[Figure: DIET architecture - clients (C, C++, Scilab, web browser) issue calls such as grpc_call(MatPROD, A, B); a hierarchy of agents/schedulers with distributed software and performance databases selects servers S1-S3 (direct connection or batch system); a visualization server displays results.]
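The grpc_call(MatPROD, A, B) in the figure illustrates the GridRPC programming model that DIET exposes to clients. Below is a minimal client-side sketch in that spirit, following the GGF GridRPC API that DIET implements; the "grpc.h" header name, the exact argument list of the call, and the "MatPROD" service name are assumptions for illustration, not the actual DIET distribution.

```c
/* Hedged sketch of a GridRPC client, in the spirit of grpc_call(MatPROD, A, B).
 * Signatures follow the GGF GridRPC API; the service name and argument layout
 * are illustrative assumptions. */
#include <stdio.h>
#include "grpc.h"                 /* GridRPC client header (name assumed) */

#define N 512

int main(int argc, char *argv[])
{
    static double A[N * N], B[N * N], C[N * N];   /* input/output matrices */
    grpc_function_handle_t handle;

    if (grpc_initialize(argv[1]) != GRPC_NO_ERROR) {  /* argv[1]: client config */
        fprintf(stderr, "cannot contact the agent hierarchy\n");
        return 1;
    }

    /* Let the agent hierarchy pick a server offering the "MatPROD" service. */
    grpc_function_handle_default(&handle, "MatPROD");

    /* Synchronous remote call: a server is chosen by the scheduler, the data
     * are shipped, the product is computed remotely and C is sent back. */
    if (grpc_call(&handle, N, A, B, C) != GRPC_NO_ERROR)
        fprintf(stderr, "remote MatPROD call failed\n");

    grpc_function_handle_destruct(&handle);
    grpc_finalize();
    return 0;
}
```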
TLSE: a Web expert site for sparse matrices based on a grid infrastructure
- Call 2 (2002-2004). Project coordinator: Michel Daydé. E-mail: Michel.Dayde@enseeiht.fr. Web: http://www.enseeiht.fr/lima/tlse/
- Participants: CERFACS, FéRIA-IRIT, LIP-ENSL, LaBRI, CEA, CNES, EADS, EDF, IFP
- Objectives: design a Web expertise site for sparse matrices and disseminate expertise in sparse linear algebra
- Easy access and experimentation with software and tools: only statistics are provided, not computing resources
- Exploitation of the computing power of the grid for parametric studies
- Contents: sparse matrix software, bibliography, collections of sparse matrices

CGP2P: Global P2P Computing - a fusion of Desktop Grid and P2P systems
- Call 1 (2001-2003). Coordinator: Franck Cappello. E-mail: fci@lri.fr. Web: www.lri.fr/~fci
- Participants: LRI, LIFL, ID IMAG, LARIA, LAL, EADS
- Model: clients (PCs) submit requests for computations or data to a coordination system; service providers (PCs) accept work and provide results, with potential direct communications between providers for parallel (MPI) applications
- Results: the XtremWeb Desktop Grid middleware; the fault-tolerant MPI library MPICH-V; the SBSLM sandbox for binary applications; the US large-scale storage system; the YML workflow/dataflow language; the SimLargeGrid scheduling simulator; an analysis of French ADSL connectivity; theoretical proofs of the protocols; convergence/integration with Grid middleware (GT3)
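To make the coordination cycle on this slide concrete, here is a schematic worker loop in C: a service provider repeatedly pulls a task from the coordination system, computes, and returns the result. The coordinator is stubbed out locally; all names and structures are invented for illustration and are not the XtremWeb API.

```c
/* Schematic of the pull-style coordination pictured on the CGP2P slide:
 * providers volunteer for work ("accept"), compute, and return results
 * ("provide"). The coordinator is faked locally; this is the protocol
 * shape only, not XtremWeb. */
#include <stdio.h>

struct task { int id; double input; };

/* ---- stand-in for the coordination system ---------------------------- */
static int next_task = 0;
static int coordinator_fetch(struct task *t)      /* provider: "accept"   */
{
    if (next_task >= 5) return 0;                 /* no pending request   */
    t->id = next_task; t->input = next_task * 1.5; next_task++;
    return 1;
}
static void coordinator_provide(int id, double r) /* provider: "provide"  */
{
    printf("result for task %d: %g\n", id, r);    /* forwarded to client  */
}

/* ---- worker loop on the service provider (PC) ------------------------ */
int main(void)
{
    struct task t;
    while (coordinator_fetch(&t)) {               /* poll for work         */
        double result = t.input * t.input;        /* the actual computation */
        coordinator_provide(t.id, result);        /* send the result back  */
    }
    return 0;
}
```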
RMI: Programming the Grid with distributed objects
- Call 1 (2001-2003). Project coordinator: C. Pérez. E-mail: Christian.Perez@irisa.fr. Web: http://www.irisa.fr/Grid-RMI/en/
- Participants: IRISA, ENS-Lyon, LIFL, INRIA, EADS
- Objectives: support, on the same platform, parallel programming (message-based runtimes such as MPI and PVM, DSM-based runtimes such as TreadMarks and Mome) and distributed programming (RPC/RMI-based middleware such as DCE, CORBA and Java RMI, plus HLA middleware for discrete-event simulation)
- Get the maximum performance from the network: offer a zero-copy mechanism to middleware and runtimes
- Provide a framework (PadicoTM) to combine various communication middleware and runtimes: a personality layer (MPICH, DSM, OmniORB, MICO, Orbacus, Orbix/E, Kaffe JVM, CERTI HLA) on top of the PadicoTM core services, the Madeleine communication engine, and the Marcel I/O-aware multi-threading library
- Portability across networks: TCP, Myrinet, SCI
[Figure: bandwidth (MB/s) versus message size (bytes) for CORBA, MPI and Java over Myrinet-2000, CORBA and MPI over SCI, and TCP over Ethernet-100.]

HydroGrid: distributed code coupling in hydrogeology, using software components
- Call 2 (2002-2004). Project coordinator: M. Kern. E-mail: Michel.Kern@inria.fr. Web: http://www-rocq.inria.fr/~kern/HydroGrid/HydroGrid-en.html
- Participants: INRIA Rocquencourt, INRIA Rennes, IMFS Strasbourg, Géosciences Rennes
- Objectives: simulate flow and transport of pollutants in the subsurface, taking into account the couplings between different physical phenomena (flow, transport, chemistry, meshing)
- Couple parallel codes on a grid, reusing software from the ACI GRID RMI project
- Study the links between numerical coupling and software coupling
- Example applications: reactive transport, density-driven flow (mass fraction), fractured media; visualization of the results
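The coupling described here boils down to an exchange of fields between solvers at each time step. The sketch below shows that control flow in plain C with stand-in solvers; it only illustrates the coupling pattern, not the HydroGrid components themselves, which communicate through distributed software components on the grid.

```c
/* Schematic of the flow/transport coupling pattern: at each time step the
 * flow code produces a velocity field that the transport code consumes.
 * Real solvers are parallel and far more complex; the "ports" are plain
 * function calls here so the control flow is visible. */
#include <stdio.h>

#define NCELL 1000

static void flow_step(double t, double vel[NCELL])
{
    for (int i = 0; i < NCELL; i++)
        vel[i] = 1.0 + 0.1 * t;                 /* fake Darcy velocity */
}

static void transport_step(const double vel[NCELL], double conc[NCELL], double dt)
{
    for (int i = NCELL - 1; i > 0; i--)         /* crude upwind advection */
        conc[i] += dt * vel[i] * (conc[i - 1] - conc[i]);
}

int main(void)
{
    double vel[NCELL] = {0}, conc[NCELL] = {0};
    double dt = 0.01;
    conc[0] = 1.0;                              /* pollutant injected at the inlet */
    for (int step = 0; step < 100; step++) {
        flow_step(step * dt, vel);              /* component 1: flow      */
        transport_step(vel, conc, dt);          /* component 2: transport */
    }
    printf("concentration at cell 10 after 100 steps: %g\n", conc[10]);
    return 0;
}
```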
Main feedback from Call 1 & Call 2 projects
- Lack of a large-scale testbed available for experiments: several small-scale testbeds at the regional level, and duplication of effort when setting up testbeds
- Various types of Grids to study
- Need to be able to experiment with the various software layers, which is incompatible with a production Grid

How to proceed?
[Figure: log(cost & coordination) versus log(realism), from mathematical models and protocol proofs, through simulation (SimGrid, MicroGrid, Bricks, NS, ...) and emulation (Data Grid eXplorer, WANinLab, Emulab), to live systems (DAS, PlanetLab, Naregi testbed, NSF GENI); Grid'5000 sits at the "major challenge" end as a live system.]
- In the first half of 2003, the design and development of an experimental platform for Grid researchers was decided: Grid'5000, as a real-life system

The Grid'5000 project: objective
- Deploy an experimental large-scale computing infrastructure allowing any kind of experiment, on any kind of grid (virtual supercomputer, Desktop Grid, ...)
- Experimental conditions: configuration of the entire software stack, from the application down to the operating system (application, middleware, OS, ..., BIOS), on both the computer testbed and the grid testbed

The Grid'5000 project
- Build a nation-wide experimental platform for large-scale Grid & P2P experiments: 9 geographically distributed sites, each hosting a cluster (from 256 CPUs to 1K CPUs), all connected by RENATER (the French research and education network), which hosts probes to trace network load conditions
- Design and develop a system/middleware environment to safely test and repeat experiments
- Use the platform for Grid experiments in real-life conditions: port and test applications, develop new algorithms
- Address critical issues of Grid system/middleware: programming, scalability, fault tolerance, scheduling
- Address critical issues of Grid networking: high-performance transport protocols, QoS
- Investigate original mechanisms: P2P resource discovery, Desktop Grids

Planning
- Timeline from June 2003 to 2007: discussions, prototypes, installation of clusters & network, preparation and calibration, then experiments; international collaborations (CoreGRID); first experiments once funded
- Processor count ramping from about 1250 towards 5000; 2300 CPUs at the time of the talk

Grid'5000 foundations: measurements and condition injection
- Quantitative metrics:
  Performance: execution time, throughput, overhead, QoS (batch, interactive, soft real-time, real-time)
  Scalability: resource occupation (CPU, memory, disk, network), application algorithms, number of users, number of resources
  Fault tolerance: tolerance to very frequent failures (volatility), tolerance to massive failures (a large fraction of the system disconnects), fault-tolerance consistency across the software stack
- Experimental condition injection:
  Background workloads (CPU, memory, disk, network) and traffic injection at the network edges
  Stress: high numbers of clients, servers, tasks, data transfers
  Perturbation: artificial faults (crash, intermittent failure, memory corruption, Byzantine), rapid platform reduction/increase, slowdowns, etc.
- Allow users to run their favorite measurement tools and experimental condition injectors

Grid'5000 principle: a highly reconfigurable experimental platform
- Let users create, deploy and run their own software stack (application, programming environment, runtime, Grid or P2P middleware, operating system, networking), including the software under test and its environment, plus measurement tools and experimental condition injectors
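As an example of the kind of user-supplied experimental condition injector mentioned above (here, a background CPU workload), the following small POSIX C program spawns a given number of busy processes for a given duration. It is only an illustrative sketch, not a Grid'5000 tool.

```c
/* Illustrative background-CPU-load injector: spawn `nproc` busy processes
 * for `seconds` seconds. Not a Grid'5000 tool; just an example of a
 * user-supplied experimental-condition injector. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
    int nproc   = (argc > 1) ? atoi(argv[1]) : 1;   /* how many cores to load */
    int seconds = (argc > 2) ? atoi(argv[2]) : 60;  /* for how long           */

    for (int i = 0; i < nproc; i++) {
        if (fork() == 0) {                    /* child: spin until the deadline */
            time_t deadline = time(NULL) + seconds;
            volatile double x = 0.0;
            while (time(NULL) < deadline)
                x += 1e-9;                    /* pure CPU burn */
            _exit(0);
        }
    }
    for (int i = 0; i < nproc; i++)
        wait(NULL);                           /* parent waits for all spinners */
    printf("injected %d busy processes for %d s\n", nproc, seconds);
    return 0;
}
```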
Experiment workflow
1. Log into Grid'5000 and import data/codes.
2. If a custom environment must be built: reserve one node, reboot it into an existing environment*, adapt the environment, reboot and check it until it is OK.
3. Reserve the nodes corresponding to the experiment.
4. Reboot the nodes into the user's experimental environment (optional).
5. Transfer parameters and run the experiment.
6. Collect the experiment results and exit Grid'5000.
* Environments available on all sites: Fedora4all, Ubuntu4all, Debian4all.

Grid'5000 map: planned CPUs per site (installed at the time in parentheses)
- Lille: 500 (106); Nancy: 500 (94); Rennes: 518 (518); Orsay: 1000 (684); Bordeaux: 500 (96); Toulouse: 500 (116); Lyon: 500 (252); Grenoble: 500 (270); Sophia Antipolis: 500 (434)
- (The Orsay figure was due to be updated on the day of the talk.)

[Slide: photos of the Rennes, Lyon, Sophia, Grenoble, Bordeaux and Orsay clusters.]

Hardware configuration
[Slide: per-site hardware configuration table.]

Grid'5000 network, provided by RENATER
- 10 Gb/s dark fiber, dedicated lambdas: fully isolated traffic

Grid'5000 as an instrument
- Strong security for Grid'5000 and the Internet, despite the deep reconfiguration feature: Grid'5000 is confined, and communications between the sites and the Internet are isolated in both directions (level-2 MPLS, dedicated lambdas)
- A software infrastructure allowing users to access Grid'5000 from any Grid'5000 site with a simple view of the system: a single account per user, Grid'5000 seen as a cluster of clusters, 9 unsynchronized home directories (1 per site)
- Reservation/scheduling tools allowing users to select nodes and schedule experiments: a reservation engine plus a batch scheduler per site, and OAR Grid, a co-reservation scheduling system
- A user toolkit to reconfigure the nodes: a software image deployment and node reconfiguration tool

OS reconfiguration techniques: reboot or virtual machines
- Reboot approach: remote control with IPMI, RSA cards, etc.; disk repartitioning if necessary; reboot or kernel switch (kexec)
- Virtual machine approach: no reboot needed, but selecting the technology is not easy; Xen has limitations (Xen 3 only in "initial support" status for Intel VT-x, Xen 2 does not support x86-64, many patches unsupported, high overhead on high-speed networks)
- Currently reboot is used, but Xen will be offered in the default environment, letting users choose between fully dedicated nodes and nodes shared within virtual machines

Resource usage: activity (Feb. '06)
- Activity above 70% in April, just before the SC'06 and Grid'06 submission deadlines

Community: Grid'5000 users
- 345 registered users coming from 45 laboratories: IBCP, IMAG, INRIA-Alpes, INSA-Lyon, PRiSM-Versailles, BRGM, INRIA, CEDRAT, IME/USP.br, INF/UFRGS.br, LORIA, Univ. Nantes, UFRJ.br, France Télécom Sophia, LaBRI, LRI, CS-VU.nl, LIFL, IDRIS, FEW-VU.nl, ENS-Lyon, AIST.jp, Univ. Nice, EC-Lyon, UCD.ie, ENSEEIHT, IRISA, LIPN-Paris XIII, CICT, RENATER, U-Picardie, IRIT, IN2P3, EADS, CERFACS, LIFC, EPFL.ch, ENSIACET, LIP6, LAAS, INP-Toulouse, UHP-Nancy, ICPS-Strasbourg, SUPELEC

About 230+ experiments

About 200 publications

A series of events

Grid@work (October 10-14, 2005)
- A series of conferences and tutorials including the Grid PlugTest (N-Queens and Flowshop contests). The objective of this event was to bring together ProActive users, to present and discuss current and future features of the ProActive Grid platform, and to test the deployment and interoperability of ProActive Grid applications on various Grids.
- Don't miss Grid@work 2006, Nov. 26 to Dec. 1: http://www.etsi.org/plugtests/Upcoming/GRID2006/GRID2006.htm
- The N-Queens contest (4 teams): find the number of solutions to the N-queens problem, for N as large as possible, in a limited amount of time (a sequential sketch of the counting kernel follows this slide).
- The Flowshop contest (3 teams).
- 1600 CPUs in total: 1200 provided by Grid'5000, 50 by other Grids (EGEE, DEISA, NorduGrid), and 350 CPUs on clusters.
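For concreteness, here is a minimal sequential N-queens solution counter in C (bitmask backtracking). Contest entries distributed this search over the Grid, for instance by splitting on the placements of the first rows; that decomposition and the ProActive deployment are not shown, and this code is only an illustrative sketch.

```c
/* Minimal sequential N-queens solution counter (bitmask backtracking),
 * to make the PlugTest contest task concrete. */
#include <stdio.h>
#include <stdlib.h>

static unsigned long long count;
static unsigned all;                 /* bitmask with the lowest N bits set */

static void solve(unsigned cols, unsigned diag1, unsigned diag2)
{
    if (cols == all) { count++; return; }         /* every row has a queen */
    unsigned avail = all & ~(cols | diag1 | diag2);
    while (avail) {
        unsigned bit = avail & -avail;            /* lowest free column    */
        avail -= bit;
        solve(cols | bit, (diag1 | bit) << 1, (diag2 | bit) >> 1);
    }
}

int main(int argc, char *argv[])
{
    int n = (argc > 1) ? atoi(argv[1]) : 8;
    all = (1u << n) - 1;
    solve(0, 0, 0);
    printf("%d-queens: %llu solutions\n", n, count);
    return 0;
}
```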
Experiment - geophysics: seismic ray tracing in a 3D mesh of the Earth
- Stéphane Genaud, Marc Grunberg and Catherine Mongenet, IPGS (Institut de Physique du Globe de Strasbourg)
- Goal: build a seismic tomography model of the Earth's geology using seismic wave propagation characteristics. Seismic waves are modeled from events detected by sensors; a ray tracing algorithm reconstructs the waves from rays traced between each epicenter and each sensor.
- An MPI parallel program composed of 3 steps:
  1) master-worker: ray tracing and mesh update by each process, with blocks of rays successively fetched from the master process;
  2) all-to-all communications to exchange sub-mesh information between the processes;
  3) merging of the cell information of the sub-mesh associated with each process.
- Reference execution: 32 CPUs.
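The 3-step structure above maps naturally onto MPI. The sketch below shows only that structure: block size, ray layout, mesh size and the tracing kernel are placeholder assumptions, not the IPGS application, and steps 2-3 are collapsed into a single global reduction.

```c
/* Structural sketch of the 3-step MPI scheme: blocks of rays fetched from a
 * master, local mesh update, then exchange and merge of sub-mesh data. */
#include <mpi.h>
#include <string.h>

#define BLOCK 1024          /* rays handed out per request (assumption) */
#define CELLS 100000        /* local sub-mesh size (assumption)         */
enum { TAG_REQ = 1, TAG_RAYS = 2, TAG_STOP = 3 };

/* placeholder kernel: the real code traces rays through the Earth mesh
 * and accumulates per-cell information */
static void trace_block(const double *rays, int n, double *mesh)
{
    for (int i = 0; i < n; i++)
        mesh[i % CELLS] += rays[i * 6];
}

int main(int argc, char *argv[])
{
    int rank, size;
    static double mesh[CELLS], merged[CELLS];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                       /* step 1, master: hand out ray blocks */
        long remaining = 1000000;          /* total number of rays (placeholder)  */
        int stopped = 0;
        while (stopped < size - 1) {
            int src;
            MPI_Status st;
            MPI_Recv(&src, 1, MPI_INT, MPI_ANY_SOURCE, TAG_REQ, MPI_COMM_WORLD, &st);
            if (remaining > 0) {
                double rays[BLOCK * 6];    /* 6 doubles per ray (placeholder layout) */
                memset(rays, 0, sizeof rays);  /* ... fill from the event/sensor data ... */
                MPI_Send(rays, BLOCK * 6, MPI_DOUBLE, st.MPI_SOURCE, TAG_RAYS, MPI_COMM_WORLD);
                remaining -= BLOCK;
            } else {
                MPI_Send(NULL, 0, MPI_DOUBLE, st.MPI_SOURCE, TAG_STOP, MPI_COMM_WORLD);
                stopped++;
            }
        }
    } else {                               /* step 1, worker: fetch and trace */
        for (;;) {
            double rays[BLOCK * 6];
            MPI_Status st;
            MPI_Send(&rank, 1, MPI_INT, 0, TAG_REQ, MPI_COMM_WORLD);
            MPI_Recv(rays, BLOCK * 6, MPI_DOUBLE, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP) break;
            trace_block(rays, BLOCK, mesh);           /* local mesh update */
        }
    }

    /* steps 2-3: a global sum stands in for the all-to-all exchange and the
     * merge of sub-mesh cell information described on the slide */
    MPI_Allreduce(mesh, merged, CELLS, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```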
JXTA DHT scalability
- Goals: study of the JXTA "DHT" (the rendezvous peers form the DHT): organization of a JXTA overlay of edge and rendezvous peers (peerview protocol); performance and scalability of this DHT?
- Each rendezvous peer has a local view of the other rendezvous peers; a loosely-consistent DHT between rendezvous peers, with a mechanism for ensuring convergence of the local views
- Benchmark: time for the local views to converge, with up to 580 nodes on 6 sites
- Results: 1) it takes 2 hours to contact all rendezvous peers; 2) with the default settings, the view of every rendezvous peer is limited to 300 rendezvous peers only; 3) the view of every rendezvous peer is very unstable

Fully distributed batch scheduler (L. Rilling et al., 2006)
- Motivation: evaluation of a fully distributed resource allocation service (batch scheduler), Vigne: unstructured network, flooding (random walk optimized for scheduling)
- Experiment: a bag of 944 homogeneous tasks on 944 CPUs (objective: 1 task per CPU); synthetic sequential code (Monte Carlo application); measure the mean execution time of a task (computation time depends on the resource) and the overhead compared with an ideal execution (central coordinator)
- Tested configuration: 944 CPUs over Bordeaux (82), Orsay (344), Rennes paraci (98), Rennes parasol (62), Rennes paravent (198) and Sophia (160); duration: 12 hours

  Grid'5000 site       #CPUs used   Execution time
  Bordeaux                  82          1740 s
  Orsay                    344          1959 s
  Rennes (paraci)           98          1594 s
  Rennes (parasol)          62          2689 s
  Rennes (paravent)        198          2062 s
  Sophia                   160          1905 s

- Results:

  Submission interval   Average execution time   Overhead
  5 s                          2057 s              4.80%
  10 s                         2042 s              4.00%
  20 s                         2039 s              3.90%

Large-scale experiment with DIET, a GridRPC environment (Raphaël Bolze)
- 1120 clients submitted more than 45,000 real GridRPC requests (dgemm matrix multiplications) to GridRPC servers
- 7 sites: Lyon, Orsay, Rennes, Lille, Sophia, Toulouse, Bordeaux; 8 clusters, 585 machines, 1170 CPUs
- Objectives: prove that the DIET environment is scalable, and test the functionality of DIET at large scale

Solving the flow-shop scheduling problem (E. Talbi, N. Melab, 2006)
- "One of the hardest challenge problems in combinatorial optimization"
- Schedule a set of jobs on a set of machines while minimizing the makespan; the job order must be respected and each machine can execute only one job at a time (the sketch after this slide shows the makespan evaluation)
- The complexity is very high for large instances: exhaustive enumeration of all possible schedules would take several years, so the challenge is to reduce the number of explored solutions; the problem cannot be efficiently solved without computational grids
- A new exact Grid method based on the branch-and-bound algorithm (Talbi, Melab, et al.), combining new approaches to combinatorial algorithmics, grid computing, load balancing and fault tolerance
- Problem instance: 50 jobs on 20 machines, optimally solved for the first time, with 1245 CPUs at peak, using Grid'5000 (6 sites: Bordeaux, Lille, Orsay, Rennes, Sophia Antipolis and Toulouse) and other clusters simultaneously; the optimal solution required a wall-clock time of 1 month and 3 weeks
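To make the objective function concrete, here is a minimal C evaluation of the makespan of a permutation flow-shop schedule: the completion time of a job on machine m is the later of its completion on machine m-1 and the previous job's completion on machine m, plus its processing time. The branch-and-bound method on the slide explores permutations and prunes them with lower bounds on this quantity; the tiny instance in main() is invented for illustration.

```c
/* Makespan evaluation for a permutation flow-shop schedule. */
#include <stdio.h>

#define MAXJ 50
#define MAXM 20

/* perm[i] = index of the i-th job in the schedule; p[j][m] = processing time */
int makespan(int njobs, int nmach, const int perm[], int p[MAXJ][MAXM])
{
    int c[MAXM] = {0};                        /* completion time per machine */
    for (int i = 0; i < njobs; i++) {
        int j = perm[i];
        c[0] += p[j][0];                      /* machine 0: jobs just queue up */
        for (int m = 1; m < nmach; m++)
            c[m] = (c[m] > c[m - 1] ? c[m] : c[m - 1]) + p[j][m];
    }
    return c[nmach - 1];                      /* last job leaves the last machine */
}

int main(void)
{
    /* tiny 3-job / 3-machine instance, for illustration only */
    int p[MAXJ][MAXM] = { {2, 3, 1}, {4, 1, 3}, {3, 2, 2} };
    int perm[] = {0, 1, 2};
    printf("makespan = %d\n", makespan(3, 3, perm, p));
    return 0;
}
```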
TCP limits over 10 Gb/s links (P. Primet et al., 2006)
- Highlighting TCP stream interaction issues on very high-bandwidth links (congestion collapse) and poor bandwidth fairness
- Evaluation of the Grid'5000 10 Gb/s connections and of TCP variants over them (BIC TCP, H-TCP, Westwood, ...)
- Interaction of ten 1 Gb/s TCP streams over the 10 Gb/s Rennes-Nancy link during 1 hour: an aggregated bandwidth of 9.3 Gb/s over a time interval of a few minutes, followed by a very sharp drop of the bandwidth on one of the connections

Grid'5000: main achievements in 2006
- A large-scale and highly reconfigurable Grid experimental platform, used by Master's students, Ph.D. students, postdocs and researchers (with results presented in their reports, theses, papers, etc.)
- Grid'5000 offers in 2006: 9 clusters distributed over 9 sites in France, about 10 Gb/s of (directional) bandwidth, and the capability for all users to reconfigure the platform (protocols/OS/middleware/runtime/application); open to French Grid researchers since July 2005
- Grid'5000 results in 2006: 300+ users, ~200 publications, ~230 planned experiments
- Grid'5000 opened to other communities in 2006 (CoreGRID); Grid'5000 winter school (Philippe d'Anfray, ~January 2007)
- Connection to other Grid experimental platforms: the Netherlands (from October 2006), Japan (under discussion)
- Sustainability ensured by INRIA after 2007

Concluding remarks
- GRID in its wider definition: computing, data and knowledge Grids, P2P; not only focusing on the use of supercomputers, nor on Globus; an emphasis on middleware, but also on applications and algorithms to make them Grid-aware
- The French ACI GRID led to many European initiatives: several groups from the ACI GRID projects are involved in EU-funded projects (almost absent in FP5, involved in 10 FP6 projects and leading 3 of them); the idea to set up a Network of Excellence in Grid research came from the ACI GRID (M. Cosnard)
- On-going discussions to give Grid'5000 a European dimension funded under the 7th Framework Programme
[Figure: the DAS-3 and Grid'5000 platforms (1500 and 2600 CPUs, Oct. 2006).]
- Funding of Grid research is still available, through the "Agence Nationale de la Recherche"
- To get more information about the ACI GRID: http://www-sop.inria.fr/aci/grid - Thierry.Priol@inria.fr

Announcement
- Project consultation meeting: Bridging Global Computing with Grid (BIGG), in conjunction with ..., November 28-29, 2006
- The objective of the workshop is to provide a direct gateway facilitating interactions between two different communities: Grid and Global Computing