Weaving the World-Wide Grid Marketplace: An Economic Paradigm for Distributed Resource Management and Scheduling for Service-Oriented Computing
Rajkumar Buyya, Melbourne, Australia -- www.buyya.com/ecogrid

Vision: a World Wide Grid for service-oriented computing, driven by Nimrod-G.

Overview
- A quick glance at Grid computing
- Resource management challenges for next-generation Grid computing
- A glance at approaches to Grid computing
- Grid architecture for computational economy
- Nimrod-G: a Grid resource broker
- Scheduling experiments on the World Wide Grid: both real and simulated
- Conclusions

Scalable HPC: Breaking Administrative Barriers and New Challenges
(Figure: performance grows as systems scale from a desktop SMP or supercomputer, to a local cluster, to an enterprise cluster/grid, to a global cluster/grid, and on to an "inter-planetary Grid".)
- Administrative barriers crossed along the way: individual, group, department, campus, state, national, globe, inter-planet, galaxy.

Why supercomputing? Large-scale explorations need it -- killer applications.
- Solving grand-challenge applications using modeling, simulation and analysis: aerospace, Internet and e-commerce, life sciences, CAD/CAM, digital biology, military applications.

What is a Grid?
A paradigm/infrastructure that allows sharing, selection, and aggregation of geographically (wide-area) distributed resources:
- Computers -- PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDAs, etc.
- Software -- e.g., ASPs renting expensive special-purpose applications on demand
- Catalogued data and databases -- e.g., transparent access to the human genome database
- Special devices/instruments -- e.g., a radio telescope (SETI@Home searching for life in the galaxy)
- People/collaborators
[depending on their availability, capability, cost, and user QoS requirements] for solving large-scale problems/applications -- thus enabling the creation of "virtual enterprises" (VEs).

P2P/Grid Applications: Drivers
- Distributed HPC (supercomputing): computational science
- High-capacity/throughput computing: large-scale simulation/chip design and parameter studies
- Content sharing (free or paid): sharing digital content among peers (e.g., Napster)
- Remote software access/renting services: application service providers (ASPs) and Web services
- Data-intensive computing: drug design, particle physics, stock prediction, ...
- On-demand, real-time computing: medical instrumentation and mission-critical uses
- Collaborative computing: collaborative design, data exploration, education
- Service-oriented computing (SOC): computing as a competitive utility -- new paradigm, new industries, and new business

Building and Using Grids Require (e.g., Globus, Nimrod-G)
- Services that enable the execution of a job on a resource in a different administrative domain
- Security mechanisms that permit resources to be accessed only by authorized users
- Application/data security -- a must for commercial users (protection from GSPs and other users)
- (New) programming tools that make our applications Grid-ready
- Tools that can translate the requirements of an application/user into the requirements of computers, networks, and storage
- Tools that perform resource discovery, trading, selection/allocation, scheduling and distribution of jobs, and collect results

Resource Management Challenges in Grid Computing Environments

A Typical Grid Computing Environment
(Diagram: applications hand work to a Grid Resource Broker, which consults Grid Information Services and dispatches jobs to distributed resources R1 ... RN.)

What do users want?
Users in a Grid Economy, and Their Strategies
- Grid consumers
  - Execute jobs of varying problem size and complexity
  - Benefit by selecting and aggregating resources wisely
  - Trade off timeframe and cost
  - Strategy: minimise expenses
- Grid providers
  - Contribute ("idle") resources for executing consumer jobs
  - Benefit by maximizing resource utilisation
  - Trade off local requirements and market opportunity
  - Strategy: maximise return on investment

Sources of Complexity in Grid Resource Management and Scheduling
- Size (large number of nodes, providers, consumers)
- Heterogeneity of resources (PCs, workstations, clusters, supercomputers, instruments, databases, software)
- Heterogeneity of fabric management systems (single-system-image OS, queuing systems, etc.)
- Heterogeneity of fabric management policies
- Heterogeneity of application requirements (CPU, I/O, memory, and/or network intensive)
- Heterogeneity in resource demand patterns (peak, off-peak, ...)
- Applications need different QoS at different times (time-critical results); the utility of experimental results varies from time to time
- Geographical distribution of users, located in different time zones
- Differing goals (producers and consumers have different objectives and strategies)
- Insecure and unreliable environment

Need Grid Tools for Managing
- Security, uniform access, computational economy, resource discovery, resource allocation and scheduling, data locality, system management, network management, application development tools

Traditional approaches to resource management and scheduling are NOT useful for the Grid
- They use a centralised policy that needs complete state information and a common fabric management policy, or a decentralised consensus-based policy.
- Due to too many heterogeneous parameters in the Grid, it is impossible to define or obtain a system-wide performance metric and a common fabric management policy acceptable to all.
- The "economic" paradigm has proved an effective institution for managing the decentralisation and heterogeneity present in human economies!
- Hence, we propose/advocate the use of "computational economy" principles in managing resources and scheduling computations on the Grid.

Benefits of Computational Economies
- A nice paradigm for managing self-interested and self-regulating entities (resource owners and consumers)
- Helps in regulating supply and demand for resources
- User-centric / utility-driven: value for money!
- Scalable: no central coordinator is needed (during negotiation); resources (sellers) and users (buyers) make their own decisions and try to maximize utility and profit; services can be priced so that equilibrium is maintained
- Adaptable: helps in offering different QoS (quality of service) to different applications depending on the value users place on them
- Improves the utilisation of resources
- Offers an incentive for resource owners to be part of the Grid
- Offers an incentive for resource consumers to be good citizens
- A large body of proven economic principles and techniques is available; we can easily leverage it

New Challenges of a Computational Economy
- Resource owners: How do I decide prices (economic models)? How do I specify them? How do I enforce them? How do I advertise and attract consumers? How do I do accounting and handle payments? ...
- Resource consumers: How do I decide expenses? How do I express QoS requirements? How do I trade between timeframe and cost? ...
- Are any tools, traders and brokers available to automate the process?
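The consumer-side questions above (expenses, QoS, timeframe versus cost) boil down to a deadline, a budget, and an optimisation preference, which is what the broker described later consumes. A minimal, hypothetical sketch of capturing such QoS requirements; the field names are illustrative and are not Nimrod-G's actual interface.

    # Hypothetical container for a consumer's QoS requirements; field names are
    # not Nimrod-G's API, just the quantities the broker needs to know.
    from dataclasses import dataclass

    @dataclass
    class QoSRequirements:
        deadline_s: float          # latest acceptable completion time (seconds from now)
        budget_units: float        # maximum spend in Grid currency (G$)
        optimize: str = "cost"     # "cost", "time", "cost-time", or "conservative-time"

    # Example in the spirit of "do this in 30 min. for $10", minimising cost
    req = QoSRequirements(deadline_s=30 * 60, budget_units=10, optimize="cost")
    print(req)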
Approaches to Grid computing (a glance): object-oriented, Internet/partial-P2P, network-enabled solvers, market/computational economy, and mix-and-match.

Building an Economy Grid (Next-Generation Grid Computing!)
- To enable the creation and promotion of a (competitive) Grid marketplace, ASPs, service-oriented computing, ...
- And let users focus on their own work (science, engineering, or commerce)!

GRACE: A Reference Grid Architecture for Computational Economy
(Architecture diagram; see the PDPTA 2000 paper. A Grid consumer's application and programming environment drive the Grid Resource Broker -- Job Control Agent, Schedule Advisor, Trade Manager, Grid Explorer, Deployment Agent -- which signs on, consults Grid Information Server(s) and Grid Market Services, and trades with Grid Service Providers R1 ... Rm. Each provider node runs a Trade Server with its pricing algorithms, plus Grid middleware services for job execution, secure storage, accounting, resource allocation and reservation, health monitoring, QoS, and a Grid Bank for payments.)

Economic Models
- Price-based: supply, demand, value and wealth of the economic system
- Commodity market model
- Posted price model
- Bargaining model
- Tendering (contract-net) model
- Auction model: English, first-price sealed-bid, second-price sealed-bid (Vickrey), and Dutch (consumer: low to high at some rate; producer: high to low at some rate)
- Proportional resource sharing model
- Shareholder model
- Partnership model
- Monopoly (one provider) and oligopoly (a few players): consumers may not have any influence on prices
- Bartering
See the SPIE ITCom 2001 paper (with Heinz Stockinger, CERN)!

Cost Model
- Without a cost model, any shared system becomes unmanageable
- Charge users more for remote facilities than their own; choose cheaper resources before more expensive ones
- Cost units (G$) may be dollars or shares in a global facility, stored in a bank
- Cost matrix at each Grid site (users x machines): non-uniform costing encourages use of local resources first; with a real accounting system, users can control machine usage
- Resource cost = function(CPU, memory, disk, network, software, QoS, current demand, etc.)
- Simple scheme: price based on peak time and off-peak time, with a discount when demand is low (a toy sketch follows below)
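The cost-model slide above defines resource cost as a function of many factors (CPU, memory, network, QoS, current demand) and suggests peak/off-peak pricing with discounts when demand is low. A minimal sketch of one such pricing function; the coefficients and the peak-hour window are made-up illustration values, not GRACE's actual pricing algorithm.

    # Toy pricing function in the spirit of:
    #   Resource Cost = Function(cpu, memory, disk, network, software, QoS, current demand, ...)
    # All coefficients and the peak window are illustrative assumptions.

    def price_per_cpu_second(base_rate, hour_of_day, utilisation):
        """Return a G$ price per CPU-second for one resource."""
        peak = 9 <= hour_of_day < 18            # assume business hours are peak time
        time_factor = 2.0 if peak else 0.5      # charge more at peak, discount off-peak
        demand_factor = 1.0 + utilisation       # raise the price as current demand grows
        return base_rate * time_factor * demand_factor

    # Example: a resource with a base rate of 5 G$/CPU-sec, at 10am, 60% utilised
    print(price_per_cpu_second(5.0, hour_of_day=10, utilisation=0.6))  # 16.0 G$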
Nimrod-G: The Grid Resource Broker
- A soft-deadline- and budget-based economy Grid resource broker for parameter processing (task-farming applications) on Grids.
- Nimrod/G is a resource broker for managing, steering, and executing task-farming (parameter-sweep/SPMD) applications on the Grid, based on deadlines and computational economy.
- Based on users' QoS requirements, the broker dynamically leases services at runtime depending on their quality, cost, and availability.

Key Features
- A single window to manage and control an experiment
- Persistent and programmable task-farming engine
- Resource discovery
- Resource trading
- Scheduling and predictions
- Generic dispatcher and Grid agents
- Transportation of data and results
- Steering and data management
- Accounting

Parametric Computing (what users think of Nimrod's power)
A table of parameters (e.g., Age: 23, 23, 28, 28, 19, 10, -4000000; Hair: clean, beard, goatee, clean, moustache, clean, too much) feeds a "magic engine" that produces multiple runs: same program, multiple data. A killer application class for the Grid! (Courtesy: Anand Natrajan, University of Virginia.)

Sample Parameter-Sweep / Task-Farming Applications
- Bioinformatics: drug design / protein modelling
- Sensitivity experiments on smog formation
- Ecological modelling: control strategies for cattle tick
- Combinatorial optimization: meta-heuristic parameter estimation
- Data mining
- Computer graphics: ray tracing
- High-energy physics: searching for rare events
- Electronic CAD: field-programmable gate arrays
- VLSI design: SPICE simulations
- Finance: investment risk analysis
- Civil engineering: building design
- Automobile: crash simulation
- Network simulation
- Aerospace: wing design
- Astrophysics

Distributed Drug Design: Data-Intensive Computing on the Grid
- A Virtual Laboratory environment for "molecular docking for drug design" on the Grid: http://www.buyya.com/vlab
- Provides tools for screening millions of chemical compounds (molecules) in the Chemical DataBase (CDB) to identify those with potential use in drug design (acting as inhibitors).
- In collaboration with Kim Branson, Structural Biology, Walter and Eliza Hall Institute (WEHI).

Docking Application Input: Data Configuration File

    score_ligand               yes
    minimize_ligand            yes
    multiple_ligands           no
    random_seed                7
    anchor_search              no
    torsion_drive              yes
    clash_overlap              0.5
    conformation_cutoff_factor 3
    torsion_minimize           yes
    match_receptor_sites       no
    random_search              yes
    . . . . . . . . . . . .
    maximum_cycles             1
    ligand_atom_file           S_1.mol2                     (molecule to be screened)
    receptor_site_file         ece.sph
    score_grid_prefix          ece
    vdw_definition_file        parameter/vdw.defn
    chemical_definition_file   parameter/chem.defn
    chemical_score_file        parameter/chem_score.tbl
    flex_definition_file       parameter/flex.defn
    flex_drive_file            parameter/flex_drive.tbl
    ligand_contact_file        dock_cnt.mol2
    ligand_chemical_file       dock_chm.mol2
    ligand_energy_file         dock_nrg.mol2

Parameterize the Dock Input File (using Nimrod tools: GUI/language)

    score_ligand               $score_ligand
    minimize_ligand            $minimize_ligand
    multiple_ligands           $multiple_ligands
    random_seed                $random_seed
    anchor_search              $anchor_search
    torsion_drive              $torsion_drive
    clash_overlap              $clash_overlap
    conformation_cutoff_factor $conformation_cutoff_factor
    torsion_minimize           $torsion_minimize
    match_receptor_sites       $match_receptor_sites
    random_search              $random_search
    . . . . . . . . . . . .
    maximum_cycles             $maximum_cycles
    ligand_atom_file           ${ligand_number}.mol2        (molecule to be screened)
    receptor_site_file         $HOME/dock_inputs/${receptor_site_file}
    score_grid_prefix          $HOME/dock_inputs/${score_grid_prefix}
    vdw_definition_file        vdw.defn
    chemical_definition_file   chem.defn
    chemical_score_file        chem_score.tbl
    flex_definition_file       flex.defn
    flex_drive_file            flex_drive.tbl
    ligand_contact_file        dock_cnt.mol2
    ligand_chemical_file       dock_chm.mol2
    ligand_energy_file         dock_nrg.mol2
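The two listings above turn a concrete Dock input file into a template whose values are placeholders such as $random_seed and $ligand_number. A minimal sketch of that substitution idea using Python's string.Template (which happens to use the same $name syntax); Nimrod itself performs this through its plan file and node:substitute tasks, shown next, so this is only an illustration of the idea, not Nimrod's implementation.

    # Illustration of per-job parameter substitution into the Dock input template.
    # Nimrod's plan language does this with "node:substitute"; this sketch only
    # shows the idea with Python's string.Template ($name placeholders).
    from string import Template

    dock_template = Template(
        "score_ligand        $score_ligand\n"
        "random_seed         $random_seed\n"
        "ligand_atom_file    ${ligand_number}.mol2\n"   # molecule to be screened
    )

    # One job per molecule in the screening range (the plan file uses 1..2000).
    for ligand_number in range(1, 4):
        job_input = dock_template.substitute(
            score_ligand="yes", random_seed=7, ligand_number=ligand_number)
        print(job_input)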
Create the Dock Plan File -- 1. Define the parameters and their values

    parameter database_name label "database_name" text select oneof
        "aldrich" "maybridge" "maybridge_300" "asinex_egc" "asinex_epc" "asinex_pre"
        "available_chemicals_directory" "inter_bioscreen_s" "inter_bioscreen_n"
        "inter_bioscreen_n_300" "inter_bioscreen_n_500" "biomolecular_research_institute"
        "molecular_science" "molecular_diversity_preservation" "national_cancer_institute"
        "IGF_HITS" "aldrich_300" "molecular_science_500" "APP" "ECE"
        default "aldrich_300";
    parameter CDB_SERVER text default "bezek.dstc.monash.edu.au";
    parameter CDB_PORT_NO text default "5001";
    parameter score_ligand text default "yes";
    parameter minimize_ligand text default "yes";
    parameter multiple_ligands text default "no";
    parameter random_seed integer default 7;
    parameter anchor_search text default "no";
    parameter torsion_drive text default "yes";
    parameter clash_overlap float default 0.5;
    parameter conformation_cutoff_factor integer default 5;
    parameter torsion_minimize text default "yes";
    parameter match_receptor_sites text default "no";
    . . . . . . . . . . . .
    parameter maximum_cycles integer default 1;
    parameter receptor_site_file text default "ece.sph";
    parameter score_grid_prefix text default "ece";
    parameter ligand_number integer range from 1 to 2000 step 1;    (molecules to be screened)

Create the Dock Plan File -- 2. Define the task that each job needs to do

    task nodestart
        copy ./parameter/vdw.defn node:.
        copy ./parameter/chem.defn node:.
        copy ./parameter/chem_score.tbl node:.
        copy ./parameter/flex.defn node:.
        copy ./parameter/flex_drive.tbl node:.
        copy ./dock_inputs/get_molecule node:.
        copy ./dock_inputs/dock_base node:.
    endtask
    task main
        node:substitute dock_base dock_run
        node:substitute get_molecule get_molecule_fetch
        node:execute sh ./get_molecule_fetch
        node:execute $HOME/bin/dock.$OS -i dock_run -o dock_out
        copy node:dock_out ./results/dock_out.$jobname
        copy node:dock_cnt.mol2 ./results/dock_cnt.mol2.$jobname
        copy node:dock_chm.mol2 ./results/dock_chm.mol2.$jobname
        copy node:dock_nrg.mol2 ./results/dock_nrg.mol2.$jobname
    endtask

Nimrod-G Broker: Automating Distributed Processing -- compose, submit, and play!

Nimrod and Its Associated Family of Tools
- Parameter-sweep application composition: Nimrod/enFuzion
- Resource management and scheduling: Nimrod-G broker
- Design optimisations: Nimrod-O
- Application composition and online visualization: Active Sheets
- Grid simulation in Java: GridSim
- Drug design on the Grid: Virtual Lab
- Plus a remote execution server (an on-demand Nimrod agent) and a file transfer server.

A Glance at the Nimrod-G Broker
(Diagram; see the HPC Asia 2000 paper. Nimrod/G clients talk to the Nimrod/G engine, which uses a Schedule Advisor, Trading Manager (TM), Grid Store, Grid Explorer (GE) and Grid Dispatcher on top of Grid middleware -- Globus, Legion, Condor, etc. -- and Grid Information Servers (GIS). Each Globus-, Legion- or Condor-enabled node runs a local resource manager (RM) and a trade server (TS).)

Nimrod/G Grid Broker Architecture
(Diagram: Nimrod-G clients -- legacy applications, customised apps such as Active Sheet, P-tools (GUI/scripting, parameter modeling), and monitoring/steering portals -- drive the farming engine and meta-scheduler. The broker manages programmable entities (resources, jobs, tasks, channels, agents, agent scheduler, job server, database), a schedule advisor, a trading manager, a grid explorer, and a dispatcher with actuators for Globus, Legion, Condor and P2P middleware (GTS, GMD, G-Bank). Underneath sits the Grid fabric: computers, PC/WS/clusters with local schedulers (Condor/LL/NQS), storage, networks, databases, and instruments such as a radio telescope. The middleware layer forms the "IP hourglass" narrow waist of the stack.)
A Nimrod/G Monitor
(Screenshot: the monitor shows the user's deadline and cost, and the Globus and Legion hosts in use, overlaid on a map of Virginia; the machine bezek is in both the Globus and Legion domains.)

User Requirements: Deadline/Budget
(Screenshot of the deadline and budget controls.)

Another User Interface: Active Sheet for Spreadsheet Processing on the Grid
(Screenshots: a spreadsheet drives Nimrod/G through a Nimrod proxy.)

Nimrod/G Interactions
(Diagram: a user asks the broker "Do this in 30 min. for $10?". The Nimrod-G Grid broker -- grid scheduler, task-farming engine, and grid dispatcher -- consults the Grid information server and Grid trade server, then sends work via a process server and local resource manager to compute nodes, where a Nimrod agent runs the user process; files move between the user node, file server, and Grid node.)

Adaptive Scheduling Steps
- Discover resources
- Establish rates
- Compose and schedule
- Distribute jobs
- Evaluate and reschedule, discovering more resources if needed
- Meet requirements? Check remaining jobs, deadline, and budget, then repeat.
(A skeleton of this loop follows below.)

Deadline and Budget Constrained Scheduling Algorithms

    Algorithm/Strategy    | Execution time (deadline, D) | Execution cost (budget, B)
    Cost Opt              | Limited by D                 | Minimize
    Cost-Time Opt         | Minimize when possible       | Minimize
    Time Opt              | Minimize                     | Limited by B
    Conservative-Time Opt | Minimize                     | Limited by B, but all unprocessed jobs have a guaranteed minimum budget
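The adaptive scheduling cycle above (discover resources, establish rates, schedule, dispatch, evaluate and reschedule against the remaining deadline and budget) can be summarised as a periodic loop. A minimal, runnable skeleton of that loop; the discovery and rate-establishment steps are simulated with random data here rather than real Grid Explorer / Trade Manager / Dispatcher calls, so this is only a sketch of the control flow, not Nimrod-G code.

    # Skeleton of the adaptive scheduling cycle described above; the "grid" is
    # simulated, standing in for the Grid Explorer / Trade Manager / Dispatcher.
    import random

    def broker_loop(n_jobs, deadline, budget, poll=10):
        """One simplified broker: discover, establish rates, schedule, dispatch, evaluate."""
        clock, spent, remaining = 0, 0.0, n_jobs
        while remaining and clock < deadline and spent < budget:
            resources = [{"name": "R%d" % i,                 # discovery (simulated)
                          "price": random.randint(1, 10),    # rate established this interval
                          "slots": random.randint(1, 5)} for i in range(3)]
            resources.sort(key=lambda r: r["price"])         # cost-optimising schedule
            for r in resources:                               # dispatch to cheapest first
                run = min(remaining, r["slots"])
                cost = run * r["price"]
                if spent + cost > budget:
                    break
                remaining -= run
                spent += cost
            clock += poll                                     # evaluate, then go around again
        return remaining, spent, clock

    print(broker_loop(n_jobs=165, deadline=120, budget=800_000))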
Application Scheduling Experiments on the World Wide Grid
Task-farming applications on the WW Grid.

The World Wide Grid Sites
- Australia: Monash/Melbourne, VPAC/Melbourne
- Asia: TI-Tech/Tokyo, ETL/Tsukuba, AIST/Tsukuba, Kasetsart/Bangkok, Singapore
- North America: ANL/Chicago, USC-ISI/LA, UTK/Tennessee, UVa/Virginia, Dartmouth/NH, BU/Boston, UCSD/San Diego
- South America: Santiago/Chile
- Europe: ZIB/Germany, PC2/Germany, AEI/Germany, Lecce/Italy, CNR/Italy, Calabria/Italy, Poznan/Poland, Lund/Sweden, CERN/Switzerland, CUNI/Czech Republic, Vrije/Netherlands, Cardiff/UK, Portsmouth/UK, Manchester/UK

World Wide Grid (WWG) Resources (Globus/Legion plus GRACE trade servers)
- Australia: Monash University: cluster (running Nimrod/G); VPAC: Alpha
- North America: ANL: SGI/Sun/SP2; USC-ISI: SGI; UVa: Linux cluster; UD: Linux cluster; UTK: Linux cluster; UCSD: Linux PCs; BU: SGI IRIX
- Asia: Tokyo Institute of Technology: Ultra workstation; AIST, Japan: Solaris cluster; Kasetsart, Thailand: cluster; NUS, Singapore: O2K
- South America: Chile: cluster
- Europe: ZIB: T3E/Onyx; AEI: Onyx; Paderborn: HPCLine; Lecce: Compaq SC; CNR: cluster; Calabria: cluster; CERN: cluster; CUNI (Czech Republic): Onyx; Poznan: SGI/SP2; Vrije University: cluster; Cardiff: Sun E6500; Portsmouth: Linux PC; Manchester: O3K

Experiment-1: Peak and Off-Peak Scheduling
- Workload: 165 jobs, each needing 5 minutes of CPU time
- Deadline: 1 hour; budget: 800,000 units
- Strategy: minimize cost and meet the deadline
- Execution cost with cost optimisation: 471,205 G$ at Australian peak time; 427,155 G$ at Australian off-peak time

Resources Selected and Price per CPU-second (Experiment-1)

    Resource type & size     | Owner and location | Grid services   | Peak-time cost (G$) | Off-peak cost (G$)
    Linux cluster (60 nodes) | Monash, Australia  | Globus/Condor   | 20                  | 5
    IBM SP2 (80 nodes)       | ANL, Chicago, US   | Globus/LL       | 5                   | 10
    Sun (8 nodes)            | ANL, Chicago, US   | Globus/Fork     | 5                   | 10
    SGI (96 nodes)           | ANL, Chicago, US   | Globus/Condor-G | 15                  | 15
    SGI (10 nodes)           | ISI, LA, US        | Globus/Fork     | 10                  | 20

Execution at Australian peak time
(Chart: number of jobs in execution over time on each resource; the legend shows price per CPU-second: Linux cluster Monash (20), Sun ANL (5), SP2 ANL (5), SGI ANL (15), SGI ISI (10).)

Execution at Australian off-peak time
(Chart: as above at off-peak prices: Linux cluster Monash (5), Sun ANL (10), SP2 ANL (10), SGI ANL (15), SGI ISI (20).)

Experiment-2 Setup
- Workload: 165 jobs, each needing 5 minutes of CPU time
- Deadline: 2 hours; budget: 396,000 G$
- Strategies: 1. minimise cost; 2. minimise time
- Execution: cost optimisation spent 115,200 G$ (finished in about 2 hours); time optimisation spent 237,000 G$ (finished in about 1.25 hours)
- In this experiment the time-optimised run costs roughly double the cost-optimised run; users can now trade off time versus cost.

Resources Selected and Price per CPU-second (Experiment-2)

    Resource & location                  | Grid services & fabric | Cost (G$/CPU-sec) | Jobs (time opt) | Jobs (cost opt)
    Linux cluster, Monash, Melbourne, AU | Globus, GTS, Condor    | 2                 | 64              | 153
    Linux, Prosecco, CNR, Pisa, Italy    | Globus, GTS, Fork      | 3                 | 7               | 1
    Linux, Barbera, CNR, Pisa, Italy     | Globus, GTS, Fork      | 4                 | 6               | 1
    Solaris/Ultra2, TITech, Tokyo, Japan | Globus, GTS, Fork      | 3                 | 9               | 1
    SGI, ISI, LA, US                     | Globus, GTS, Fork      | 8                 | 37              | 5
    Sun, ANL, Chicago, US                | Globus, GTS, Fork      | 7                 | 42              | 4
    Total experiment cost (G$)           |                        |                   | 237,000         | 115,200
    Time to complete experiment (min)    |                        |                   | 70              | 119

Resource scheduling charts
(Charts: number of tasks in execution over time on each resource -- Condor-Monash, Linux-Prosecco-CNR, Linux-Barbera-CNR, Solaris/Ultra2-TITech, SGI-ISI, Sun-ANL -- under DBC time optimization and under DBC cost optimization.)
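As a sanity check on the Experiment-2 table above, the total cost is simply jobs x job CPU time x price per CPU-second, summed over the selected resources (each job needs 5 minutes = 300 CPU-seconds). The short calculation below reproduces the reported totals of 237,000 G$ (time optimisation) and 115,200 G$ (cost optimisation).

    # Reproduce the Experiment-2 cost totals from the table above.
    # Each job needs 5 minutes of CPU time = 300 CPU-seconds.
    JOB_CPU_SECONDS = 300

    # (resource, price in G$/CPU-sec, jobs under time-opt, jobs under cost-opt)
    rows = [
        ("Linux cluster, Monash",  2, 64, 153),
        ("Linux Prosecco, CNR",    3,  7,   1),
        ("Linux Barbera, CNR",     4,  6,   1),
        ("Solaris/Ultra2, TITech", 3,  9,   1),
        ("SGI, ISI",               8, 37,   5),
        ("Sun, ANL",               7, 42,   4),
    ]

    time_opt = sum(price * jobs_t * JOB_CPU_SECONDS for _, price, jobs_t, _ in rows)
    cost_opt = sum(price * jobs_c * JOB_CPU_SECONDS for _, price, _, jobs_c in rows)
    print(time_opt, cost_opt)   # 237000 115200, matching the reported totals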
Experiment-3 Setup: Using GridSim
- Workload: 200 jobs (Gridlets), each with a processing requirement of 10K MI (SPEC-rated), with random variation of 0-10%
- Exploration of many scenarios: deadline varied from 100 to 3600 simulation time units in steps of 500; budget varied from 500 to 22,000 G$ in steps of 1000
- DBC strategies: cost optimisation and time optimisation
- Resources: simulated WWG resources

Simulated WWG Resources

    Resource | Simulated characteristics (vendor, type, OS, #PEs) | Equivalent WWG resource (hostname, location) | PE SPEC/MIPS rating | Resource manager | Price (G$/PE time unit) | MIPS per G$
    R0  | Compaq AlphaServer, OSF1, 4           | grendel.vpac.org, VPAC, Melbourne, Australia     | 515 | Time-shared  | 8 | 64.37
    R1  | Sun Ultra, Solaris, 4                 | hpc420.hpcc.jp, AIST, Tokyo, Japan               | 377 | Time-shared  | 4 | 94.25
    R2  | Sun Ultra, Solaris, 4                 | hpc420-1.hpcc.jp, AIST, Tokyo, Japan             | 377 | Time-shared  | 3 | 125.66
    R3  | Sun Ultra, Solaris, 2                 | hpc420-2.hpcc.jp, AIST, Tokyo, Japan             | 377 | Time-shared  | 3 | 125.66
    R4  | Intel Pentium/VC820, Linux, 2         | barbera.cnuce.cnr.it, CNR, Pisa, Italy           | 380 | Time-shared  | 2 | 190.0
    R5  | SGI Origin 3200, IRIX, 6              | onyx1.zib.de, ZIB, Berlin, Germany               | 410 | Time-shared  | 5 | 82.0
    R6  | SGI Origin 3200, IRIX, 16             | onyx3.zib.de, ZIB, Berlin, Germany               | 410 | Time-shared  | 5 | 82.0
    R7  | SGI Origin 3200, IRIX, 16             | mat.ruk.cuni.cz, Charles U., Prague, Czech Rep.  | 410 | Space-shared | 4 | 102.5
    R8  | Intel Pentium/VC820, Linux, 2         | marge.csm.port.ac.uk, Portsmouth, UK             | 380 | Time-shared  | 1 | 380.0
    R9  | SGI Origin 3200, IRIX, 4 (accessible) | green.cfs.ac.uk, Manchester, UK                  | 410 | Time-shared  | 6 | 68.33
    R10 | Sun Ultra, Solaris, 8                 | pitcairn.mcs.anl.gov, ANL, Chicago, USA          | 377 | Time-shared  | 3 | 125.66

DBC cost optimisation and time optimisation results
(3-D plots: number of Gridlets completed, time utilised, and budget spent as the deadline and budget are varied, under cost optimisation and under time optimisation. Comparison plots at deadline D = 3100 with the budget varied show execution time versus budget and processing expenses versus budget for the two strategies.)

Conclude with a comparison to the electrical Grid: where are we? (Courtesy: Domenico Laforenza)
- Alessandro Volta, in Paris in 1801 inside the French National Institute, demonstrates the battery in the presence of Napoleon I. Fresco by N. Cianfanelli (1841), Zoological Section "La Specola" of the Natural History Museum of Florence University.
- "What?!?! Oh, mon Dieu! This is a mad man..."
- "...and in the future, I imagine a worldwide power (electrical) Grid..."
2002 - 1801 = 201 years
- Electric Grid management and delivery methodology is now highly advanced: production feeds local, regional and central grids, through utilities, to consumption.
- Whereas our computational Grid is still in a primitive, infant state.

Can we predict its future?
- "I think there is a world market for about five computers." -- Thomas J. Watson Sr., IBM founder, 1943

Summary and Conclusion
- Grid computing is emerging as a next-generation computing platform for solving large-scale problems through the sharing of geographically distributed resources.
- Resource management is a complex undertaking, as systems need to be adaptive, scalable, competitive, ..., and driven by QoS.
- We proposed a framework based on "computational economies" for resource allocation and for regulating supply and demand for resources.
- Scheduling experiments on the World Wide Grid demonstrate the Nimrod-G broker's ability to dynamically lease services at runtime based on their quality, cost, and availability, depending on consumers' QoS requirements.
- Easy-to-use tools for creating Grid applications are essential to attracting the application community and getting it on board.
- The use of the economic paradigm for resource management and scheduling is essential for pushing Grids into mainstream computing and weaving the World-Wide Grid Marketplace!

Download Software & Information
- Nimrod & parametric computing: http://www.csse.monash.edu.au/~davida/nimrod/
- Economy Grid & Nimrod/G: http://www.buyya.com/ecogrid/
- World Wide Grid (WWG) testbed: http://www.buyya.com/ecogrid/wwg/
- Grid simulation (GridSim) toolkit (Java based): http://www.buyya.com/gridsim/
- Virtual Laboratory toolset for drug design: http://www.buyya.com/vlab/
- Cluster and Grid info centres: www.buyya.com/cluster/ || www.gridcomputing.com

Selected GridSim users!

Final word?

Backup Slides

Further Information
- Books: High Performance Cluster Computing, V1 & V2, R. Buyya (Ed.), Prentice Hall, 1999; The GRID, I. Foster and C. Kesselman (Eds.), Morgan Kaufmann, 1999.
- IEEE Task Force on Cluster Computing: http://www.ieeetfcc.org
- Global Grid Forum: www.gridforum.org
- IEEE/ACM CCGrid'xy: www.ccgrid.org; CCGrid 2002, Berlin: ccgrid2002.zib.de
- Grid workshop: www.gridcomputing.org

Further Information
- Cluster Computing Info Centre: http://www.buyya.com/cluster/
- Grid Computing Info Centre: http://www.gridcomputing.com
- IEEE DS Online, Grid Computing area: http://computer.org/dsonline/gc
- Compute Power Market project: http://www.ComputePower.com

Deadline and Budget-based Cost Minimization Scheduling
1. Sort resources by increasing cost.
2. For each resource in that order, assign as many jobs as possible to the resource without exceeding the deadline.
3. Repeat all steps until all jobs are processed.

Deadline and Budget Constrained (DBC) Time Minimization Scheduling
1. For each resource, calculate the next completion time for an assigned job, taking into account previously assigned jobs.
2. Sort resources by next completion time.
3. Assign one job to the first resource for which the cost per job is less than the remaining budget per job.
4. Repeat all steps until all jobs are processed. (This is performed periodically or at each scheduling event.)
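A minimal sketch of the two heuristics above (cost minimisation and time minimisation), under simplifying assumptions: every job takes a fixed, known time on each resource, prices are static, and the schedule is computed in one pass rather than re-evaluated periodically as the real broker does. The example resources and prices are taken from the Experiment-2 table (Monash at 2 G$/CPU-sec, ISI at 8 G$/CPU-sec, 300-second jobs).

    # Sketch of the DBC cost-minimisation and time-minimisation heuristics above.
    # Assumptions: fixed, known job runtime per resource; static prices; a single
    # scheduling pass (the real broker repeats this periodically).

    def cost_min(resources, n_jobs, deadline):
        """Assign as many jobs as possible to the cheapest resources first."""
        plan = {}
        for r in sorted(resources, key=lambda r: r["price"]):
            fit = int(deadline // r["job_time"]) * r["cpus"]   # jobs finishable by D
            take = min(n_jobs, fit)
            plan[r["name"]], n_jobs = take, n_jobs - take
        return plan, n_jobs        # n_jobs > 0 means the deadline cannot be met

    def time_min(resources, n_jobs, budget):
        """Repeatedly give one job to the resource that would finish it soonest,
        provided its cost per job stays within the remaining budget per job."""
        plan = {r["name"]: 0 for r in resources}
        for _ in range(n_jobs):
            ok = [r for r in resources
                  if r["price"] * r["job_time"] <= budget / max(n_jobs, 1)]
            if not ok:
                break
            nxt = min(ok, key=lambda r: (plan[r["name"]] // r["cpus"] + 1) * r["job_time"])
            plan[nxt["name"]] += 1
            budget -= nxt["price"] * nxt["job_time"]
            n_jobs -= 1
        return plan, n_jobs

    grid = [{"name": "Monash", "price": 2, "job_time": 300, "cpus": 60},
            {"name": "ISI",    "price": 8, "job_time": 300, "cpus": 10}]
    print(cost_min(grid, 165, deadline=3600))
    print(time_min(grid, 165, budget=396000))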
DBC Conservative Time Minimization Scheduling
1. Split resources by whether their cost per job is less than the budget per job.
2. For the cheaper resources, assign jobs in inverse proportion to the job completion time (e.g., a resource with completion time 5 gets twice as many jobs as a resource with completion time 10).
3. For the dearer resources, repeat all steps (with a recalculated budget per job) until all jobs are assigned. [Schedule/reschedule]
4. Repeat all steps until all jobs are processed.

Deadline-based Cost-Minimization Scheduling (details)
- M resources, N jobs, deadline D; Ct is the current time (time of access).
- RL: the resource list, maintained in increasing order of cost, so any Ri costs no more than Ri+1 ... Rm.
- Ti: the average job runtime on resource Ri, updated periodically; Ti acts as a load-profiling parameter.
- Ai: the number of jobs assigned to Ri, where Ai = min(number of still-unassigned jobs, number of jobs Ri can complete by the remaining deadline), i.e. Ai = min(N - (A1 + ... + A(i-1)), (D - Ct) div Ti).
- Algorithm: invoke Job Assignment()/Reassignment() periodically until all jobs are done. Each invocation establishes (RL, Ct, Ti, Ai) dynamically via resource discovery, then, for each resource i = 1 to M, assigns Ai jobs to Ri if required.

What is a Grid? (revisited)
An infrastructure that logically couples wide-area distributed resources:
- Computers -- PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDAs, etc.
- Software -- e.g., ASPs renting expensive special-purpose applications on demand
- Catalogued data and databases (data archives) -- e.g., transparent access to the human genome database
- Special devices -- e.g., a radio telescope (SETI@Home searching for life in the galaxy)
- People/collaborators
and presents them as an integrated global resource for solving large-scale problems. It enables the creation of virtual enterprises (VEs) for resource sharing and aggregation.

Virtual Enterprise
A temporary alliance of enterprises or organizations that come together to share resources, skills, or competencies in order to better respond to business opportunities or challenges, and whose cooperation is supported by computer networks.

Many Testbeds -- but who pays, and who regulates supply and demand?
- GUSTO (decommissioned), World Wide Grid (WW Grid), Legion testbed, NASA IPG

Testbeds so far: observations
- Who contributed resources, and why?
  - Volunteers: for fun, challenge, fame, charismatic apps, or the public good -- as in the distributed.net and SETI@Home projects.
  - Collaborators: sharing resources while developing new technologies of common interest -- Globus, Legion, Ninf, MC Broker, Lecce GRB, ...
- How long? Short term: excitement is lost, there is too much administrative overhead (Globus installation and more), no incentive, policies change, ...
- Unless you know the lab leaders, it is impossible to get access!
- What we need: a Grid Marketplace! It regulates supply and demand, offers an incentive for being a player, and is a simple, scalable, quasi-deterministic solution -- a proven model in the real world.

Grid Open Trading Protocols
- API between the Trade Manager (consumer side) and the Trade Server (provider side, with its pricing rules): Get Connected; Call for Bid (DT); Reply to Bid (DT); Negotiate Deal (DT); Confirm Deal (DT, Y/N); Cancel Deal (DT); Change Deal (DT); Get Disconnected.
- DT, the deal template, carries: resource requirements (set by the TM); resource profile (set by the TS); price (any party can set it); status. These values can be changed so that negotiation can continue, ending in accept/decline, and the deal has a validity period.
(A toy sketch of this exchange follows below.)
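A minimal sketch of the open-trading message flow listed above (call for bid, reply, negotiate, confirm) as a toy exchange between a trade manager and a trade server. The deal-template fields follow the DT description on the slide, but the message handling itself is invented for illustration and is not the GRACE implementation.

    # Toy illustration of the GRACE open trading exchange described above.
    # The deal-template (DT) fields mirror the slide; the negotiation logic is
    # invented for illustration only.

    def trade_server_reply(dt, pricing_rule):
        """Trade Server: quote a price for the requested resource profile."""
        return dict(dt, price=pricing_rule(dt["resource_requirements"]), status="bid")

    def trade_manager_negotiate(dt, max_price):
        """Trade Manager: accept if within budget, otherwise counter-offer."""
        if dt["price"] <= max_price:
            return dict(dt, status="accept")
        return dict(dt, price=max_price, status="counter")

    # Call for Bid: the consumer asks for 300 CPU-seconds, deal valid for 60 seconds.
    dt = {"resource_requirements": {"cpu_seconds": 300},
          "resource_profile": None, "price": None,
          "status": "call_for_bid", "validity_period": 60}

    dt = trade_server_reply(dt, pricing_rule=lambda req: req["cpu_seconds"] * 5)
    dt = trade_manager_negotiate(dt, max_price=1200)
    print(dt["status"], dt["price"])   # a deal is accepted, or a counter-offer goes back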
Layered Grid Architecture
- APPLICATIONS: applications and portals -- scientific, engineering, collaboration, problem-solving environments, Web-enabled apps.
- USER-LEVEL MIDDLEWARE: development environments and tools -- languages/compilers, libraries, debuggers, monitors, Web tools -- and resource management, selection, and aggregation (brokers).
- CORE MIDDLEWARE: distributed resource coupling services -- security, information, data, process, trading, QoS -- over a security layer.
- FABRIC: local resource managers (operating systems, queuing systems, libraries and application kernels, Internet protocols) and networked resources across organizations (computers, networks, storage systems, data sources, scientific instruments).

Grid Components
The same stack, by component class: applications and portals are the Grid apps; development environments, tools and resource brokers are the Grid tools; distributed resource coupling services (security, information, process, resource trading, market information, QoS) are the Grid middleware; local resource managers (operating systems, queuing systems, libraries and app kernels, TCP/IP and UDP) and networked resources across organisations (computers, clusters, storage systems, data sources, scientific instruments) are the Grid fabric.

Economy Grid = Globus + GRACE
- Grid apps: science, engineering, commerce; portals such as ActiveSheet.
- Grid tools (high-level services and tools): Cactus, MPI-G, CC++, the Nimrod parametric language, the Nimrod-G broker, higher-level resource aggregators.
- Middleware (core services): MDS, GRAM, GASS, DUROC, GARA, GMD, GBank, GTS, over the Globus Security Interface (GSI).
- Grid fabric (local services): Condor, LSF, GRD, PBS, QBank, eCash, Grid JVM, TCP, UDP, Linux, Irix, Solaris.

Virtual Drug Design: A Virtual Lab for "Molecular Modeling for Drug Design" on a P2P Grid
(Diagram: the user asks the resource broker to "screen 2K molecules in 30 min. for $10"; the broker queries the Grid Market Directory, Grid Information Service and Data Replica Catalogue ("give me a list of PDB sources of type aldrich_300?"), maps jobs onto suitable Grid nodes and Protein DataBank servers (PDB1, PDB2), and individual molecules are fetched ("mol.5 please?") through each node's Grid Trade Server (GTS).)

P-study Applications: Characteristics
- Code: a single program, sequential or threaded
- High resource requirements
- Long-running instances
- Numerous instances (multiple data)
- High computation-to-communication ratio
- Embarrassingly/pleasantly parallel

Many Grid Projects & Initiatives
- Australia: Nimrod-G, GridSim, Virtual Lab, Active Sheets, DISCWorld, ... new ones coming up
- Europe: UNICORE, MOL, UK eScience, MC Broker (Poland), EU Data Grid, EuroGrid, MetaMPI, Dutch DAS, XW, JaWS, and many more
- Japan: Ninf, DataFarm, and many more
- USA: Globus, Legion, OGSA, Javelin, AppLeS, NASA IPG, Condor-G, Jxta, NetSolve, AccessGrid, and many more
- Cycle-stealing and .com initiatives: distributed.net, SETI@Home, ..., Entropia, UD, Parabon, ...
- Public forums: Global Grid Forum, P2P Working Group, IEEE TFCC, Grid and CCGrid conferences
- http://www.gridcomputing.com

Using pure Globus/Legion commands: do it all yourself, manually. Total cost: $???
Build your own distributed application and scheduler: the application is built case by case, with complicated construction (e.g., AppLeS/MPI based). Total cost: $???

Experiment-3 Setup
- Workload: 200 jobs, each needing 10 minutes of CPU time
- Deadline: 4 hours; budget: 250,000 G$
- Strategies: 1. minimise cost; 2. minimise time
- Execution: cost optimisation spent 141,869 G$ (finishing in about 258 minutes); time optimisation spent 199,968 G$ (finishing in 150 minutes, i.e. 2.5 hours)
- In this experiment the time-optimised run costs about 40% more than the cost-optimised run; users can now trade off time versus cost.
WWG Resources Used in Experiment-3 and Jobs Processed

    Organization & location                    | Vendor, resource type, #CPU, OS, hostname        | Price (G$/CPU-sec) | Jobs (time opt) | Jobs (cost opt)
    Monash University, Melbourne, Australia    | Sun Ultra-1, 1 node, bezek.dstc.monash.edu.au    | --  | --  | --
    VPAC, Melbourne, Australia                 | Compaq Alpha, 4 CPU, OSF1, grendel.vpac.org      | 1   | 7   | 59
    AIST, Tokyo, Japan                         | Sun Ultra-4, 4 nodes, Solaris, hpc420.hpcc.jp    | 2   | 14  | 2
    AIST, Tokyo, Japan                         | Sun Ultra-4, 4 nodes, Solaris, hpc420-1.hpcc.jp  | 1   | 7   | 3
    AIST, Tokyo, Japan                         | Sun Ultra-2, 2 nodes, Solaris, hpc420-2.hpcc.jp  | 1   | 8   | 50
    University of Lecce, Italy                 | Compaq Alpha cluster, OSF1, sierra0.unile.it     | 2   | 0   | 0
    Italian National Research Council, Pisa    | Dual-CPU PC, Linux, barbera.cnuce.cnr.it         | 1   | 9   | 1
    Italian National Research Council, Pisa    | Dual-CPU PC, Linux, novello.cnuce.cnr.it         | 1   | 0   | 0
    Konrad-Zuse-Zentrum Berlin, Germany        | SGI Onyx2K, IRIX, 6, onyx1.zib.de                | 2   | 38  | 5
    Konrad-Zuse-Zentrum Berlin, Germany        | SGI Onyx2K, IRIX, 16, onyx3.zib.de               | 3   | 32  | 7
    Charles University, Prague, Czech Republic | SGI Onyx2K, IRIX, mat.ruk.cuni.cz                | 2   | 20  | 11
    University of Portsmouth, UK               | Dual-CPU PC, Linux, marge.csm.port.ac.uk         | 1   | 1   | 25
    University of Manchester, UK               | SGI Onyx3K, 512 nodes, IRIX, green.cfs.ac.uk     | 2   | 15  | 12
    Argonne National Lab, Chicago, USA         | SGI, IRIX, lemon.mcs.anl.gov                     | 2   | 0   | 0
    Argonne National Lab, Chicago, USA         | Sun Ultra-8, 8 CPU, Solaris, pitcairn.mcs.anl.gov| 1   | 49  | 25
    Total experiment cost (G$)                 |                                                  |     | 199,968 | 141,869
    Time to finish experiment (min)            |                                                  |     | 150 | 258

Grid services, fabric, and role: bezek at Monash served as the master node, running Globus, Nimrod-G, the CDB server, and Fork; all other machines were worker nodes running Globus, GTS and a local manager (Fork on most, with RMS on one and NQS on another of the worker nodes).

Job progress charts
(Charts: per-resource progress over time across the fourteen worker hosts listed above -- jobs completed under DBC time optimization, and budget spent under DBC cost optimization.)

Active Sheet: Microsoft Excel Spreadsheet Processing on the Grid
(Diagram: Excel drives Nimrod-G on the World-Wide Grid through a Nimrod proxy.)