The CC – GRID? Era
Infinite processing, storage, and bandwidth @ zero cost and latency
Gordon Bell (gbell@microsoft.com)
Bay Area Research Center
Microsoft Corporation
Copyright Gordon Bell    Clusters & Grids

Déjà vu
• ARPAnet: c1969
  – To use remote programs & data
  – Got FTP & mail. Machines & people overloaded.
• NREN: c1988
  – BW => Faster FTP for images, data
  – Latency => Got http://www…
  – Tomorrow => Gbit communication BW, latency
• <'90: Mainframes, minis, PCs/WSs
• >'90: very large, dep't, & personal clusters
• VAX: c1979 one computer/scientist
• Beowulf: c1995 one computer/scientist
• 1960s batch: opti-use allocate, schedule, $
• 2000s GRID: opti-use allocate, schedule, $ (… security, management, etc.)

Some observations
• Clusters are purchased, managed, and used as a single, one-room facility.
• Clusters are the "new" computers. They present unique, interesting, and critical problems… then Grids can exploit them.
• Clusters & Grids have little to do with one another… Grids use clusters!
• Clusters should be a good simulation of tomorrow's Grid.
• Distributed PCs: Grids or Clusters?
• Perhaps some clusterable problems can be solved on a Grid… but it's unlikely.
  – Lack of understanding of clusters & variants
  – Socio-, political, eco- wrt the Grid.

Some observations
• GRID was/is an exciting concept …
  – They can/must work within a community, organization, or project. What binds it?
  – "Necessity is the mother of invention."
• Taxonomy… interesting vs necessity
  – Cycle scavenging and object evaluation (e.g. seti@home, QCD)
  – File distribution/sharing aka IP theft (e.g. Napster, Gnutella)
  – Databases &/or programs and experiments (astronomy, genome, NCAR, CERN)
  – Workbenches: web workflow chem, bio…
  – Single, large problem pipeline… e.g. NASA.
  – Exchanges… many sites operating together
  – Transparent web access aka load balancing
  – Facilities managed PCs operating as cluster!
Grids: Why?
• The problem or community dictates a Grid
• Economics… thief or scavenger
• Research funding… that's where the problems are

In 5-10 years we can/will have:
• more powerful personal computers
  – processing 10-100x; multiprocessors-on-a-chip
  – 4x resolution (2K x 2K) displays to impact paper
  – Large, wall-sized and watch-sized displays
  – low cost, storage of one terabyte for personal use
• adequate networking? PCs now operate at 1 Gbps
  – ubiquitous access = today's fast LANs
  – Competitive wireless networking
• One chip, networked platforms e.g. light bulbs, cameras
• Some well-defined platforms that compete with the PC for mind (time) and market share: watch, pocket, body implant, home (media, set-top)
• Inevitable, continued cyberization… the challenge… interfacing platforms and people.

SNAP … c1995
Scalable Network And Platforms
A View of Computing in 2000+
We all missed the impact of WWW!
Gordon Bell, Jim Gray
[Figure labels: Platform, Network]

How Will Future Computers Be Built?
Thesis: SNAP: Scalable Networks and Platforms
• Upsize from desktop to world-scale computer
• based on a few standard components
Because:
• Moore's law: exponential progress
• Standardization & Commoditization
• Stratification and competition
When: Sooner than you think!
• Massive standardization gives massive use
• Economic forces are enormous
[Figure labels: Platform, Network]

[Figure: performance vs. price chart, titled "Volume drives simple, cost to standard price for platforms". Labels: stand-alone desktops; PCs; distributed workstations; clustered computers; 1-4 processor mP; 1-20 processor mP; MPPs; high speed interconnect]

Computing SNAP built entirely from PCs
Figure labels:
• Wide-area global network
• Mobile Nets
• Wide & Local Area Networks for: terminal, PC, workstation, & servers
• Person servers (PCs)
• ???
• TC=TV+PC home ...
  (CATV or ATM or satellite)
• Portables
• Legacy mainframe & minicomputer servers & terminals
• scalable computers built from PCs
• Centralized & departmental uni- & mP servers (UNIX & NT)
• Centralized & departmental servers built from PCs
A space, time (bandwidth), & generation scalable environment

SNAP Architecture

GB plumbing from the baroque: evolving from the 2 dance-hall model
Mp -- S -- Pc
      :
      |
      :
      |---------- S.fiber ch. -- Ms
      |
      :
      |-- S.Cluster
      |-- S.WAN --
vs.
MpPcMs -- S.Lan/Cluster/Wan --
      :

Modern scalable switches … also hide a supercomputer
• Scale from <1 to 120 Tbps
• 1 Gbps ethernet switches scale to 10s of Gbps, scaling upward
• SP2 scales from 1.2

Interesting "cluster" in a cabinet
• 366 servers per 44U cabinet
  – Single processor
  – 2 - 30 GB/computer (24 TBytes)
  – 2 - 100 Mbps Ethernets
• ~10x perf*, power, disk, I/O per cabinet
• ~3x price/perf
• Network services… Linux based
* 42, 2 processors, 84 Ethernet, 3 TBytes

ISTORE Hardware Vision
• System-on-a-chip enables computer, memory, without significantly increasing size of disk
• 5-7 year target:
• MicroDrive: 1.7" x 1.4" x 0.2"  2006: ?
  – 1999: 340 MB, 5400 RPM, 5 MB/s, 15 ms seek
  – 2006: 9 GB, 50 MB/s ? (1.6X/yr capacity, 1.4X/yr BW)
• Integrated IRAM processor
  – 2x height
• Connected via crossbar switch
  – growing like Moore's law
• 16 Mbytes; 1.6 Gflops; 6.4 Gops
• 10,000+ nodes in one rack!
• 100/board = 1 TB; 0.16 Tf

The Disk Farm? or a System On a Card?
• The 500GB disc card: an array of discs [Figure dimension: 14"]
• Can be used as
  – 100 discs
  – 1 striped disc
  – 50 FT discs
  – ....etc
• LOTS of accesses/second of bandwidth
• A few disks are replaced by 10s of Gbytes of RAM and a processor to run Apps!!
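The ISTORE figures above (340 MB and 5 MB/s in 1999, growing 1.6X/yr in capacity and 1.4X/yr in bandwidth) can be checked with simple compounding. What follows is a minimal sketch, assuming a 7-year span (1999 to 2006) and reading "100/board" as 100 nodes per board, each with the projected 9 GB disk and 1.6 Gflops. The function and variable names are illustrative and do not come from the slides.

```python
# Compound-growth check of the ISTORE MicroDrive projection.
# Slide figures: 1999 baseline of 340 MB and 5 MB/s,
# growing 1.6x/yr (capacity) and 1.4x/yr (bandwidth) to 2006.


def project(base, rate_per_year, years):
    """Return base * rate_per_year**years (simple compound growth)."""
    return base * rate_per_year ** years


YEARS = 2006 - 1999  # assumption: 7 full years of growth

capacity_mb = project(340, 1.6, YEARS)        # megabytes
bandwidth_mb_per_s = project(5, 1.4, YEARS)   # megabytes per second

# Board-level totals. Assumption: "100/board" means 100 nodes per board,
# each node carrying a 9 GB disk and 1.6 Gflops of compute.
NODES_PER_BOARD = 100
board_capacity_tb = NODES_PER_BOARD * 9 / 1000   # decimal terabytes
board_tflops = NODES_PER_BOARD * 1.6 / 1000      # teraflops

print(f"2006 capacity:  {capacity_mb / 1000:.1f} GB")
print(f"2006 bandwidth: {bandwidth_mb_per_s:.1f} MB/s")
print(f"Per board:      {board_capacity_tb:.2f} TB, {board_tflops:.2f} Tflops")
```

Under these assumptions the compounded values land near the slide's rounded "9 GB, 50 MB/s", and 100 nodes give roughly the quoted "1 TB; 0.16 Tf" per board.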
The virtuous cycle of bandwidth supply and demand
[Figure: cycle diagram. Labels: Increased Demand; Increase Capacity (circuits & bw); Standards; Create new service; Lower response time. Services: Telnet & FTP, EMAIL, WWW, Audio, Voice!, Video]

Map of Gray Bell Prize results
• single-thread single-stream tcp/ip via 7 hops desktop-to-desktop …Win 2K out of the box performance*
[Figure: map. Labels: Redmond/Seattle, WA; New York; Arlington, VA; San Francisco, CA; 5626 km; 10 hops]

The Promise of SAN/VIA: 10x in 2 years
http://www.ViArch.org/
• Yesterday:
  – 10 MBps (100 Mbps Ethernet)
  – ~20 MBps tcp/ip saturates 2 cpus
  – round-trip latency ~250 µs
• Now
  – Wires are 10x faster: Myrinet, Gbps Ethernet, ServerNet,…
  – Fast user-level communication
    - tcp/ip ~ 100 MBps 10% cpu
    - round-trip latency is 15 µs
• 1.6 Gbps demoed on a WAN
[Figure: bar chart "Time µs to Send 1KB", axis 0 to 250 µs; bars for 100Mbps, Gbps, SAN; components: Transmit, receiver cpu, sender cpu]

1st, 2nd, 3rd, or New Paradigm for science?

Labscape

Lessons from Beowulf
Courtesy of Dr. Thomas Sterling, Caltech
• An experiment in parallel computing systems
• Established vision: low cost high end computing
• Demonstrated effectiveness of PC clusters for some (not all) classes of applications
• Provided networking software
• Provided cluster management tools
• Conveyed findings to broad community
• Tutorials and the book
• Provided design standard to rally community!*
• Standards beget: books, trained people, software … virtuous cycle*
*observations

The End
How can GRIDs become a non-ad hoc computer structure?
Get yourself an application community!