The PC+ Era
Infinite processing, memory, and bandwidth @ zero cost.
Gordon Bell
Bay Area Research Center
Microsoft Corporation
Copyright Gordon Bell & Jim Gray PC+

The only thing that matters at the end of the day is, it’s a great building.

The Highly Probable Future c2025
83 items from J. Coates, Futurist, Vol. 84, 1994
• 8.4 B, English speaking, personally tagged & identified, prosthetic assisted and/or mutant, tense people who have access & control of their medical records
• Everything will be smart, responsive to environment.
  – Sensing of everything… challenge for science & engineering!
  – Fast broadband network
  – Smart appliances & AI
  – Tele-all: shop, vote, meet, work, etc.
  – Robots do everything, but there may be conflict with labor…
• A “managed”, physical and man-made world
  – Reliable weather reports
  – “Many natural disasters e.g. floods, earthquakes, will be mitigated, controlled or prevented”
• Nobel prize to “economist” for “value of information”

PC At An Inflection Point
[Figure labels: PCs; Non-PC devices and Internet]

The Dawn Of The PC-Plus Era, Not The Post-PC Era…
[Figure labels: TV/AV; Mobile Companions; Consumer PCs; Communications; Automation & Security; Household Management]

PCTV a.k.a. MilliBillg
Using PCs to drive large screens e.g. tv sets, Plasma Panels
Gordon Bell, Jim Gemmell
Bay Area Research Center
Microsoft Research
Copyright 1999 Microsoft Corporation

Another big bang? Internet to TV and audio: The Net, PC meet the TV

“milliBill” Home CATV
[Diagram labels: Video capture; PC broadcasts are mixed into home CATV in analog and/or MPEG digital; Settop box; Analog/digital cable distribution; Ethernet; Home network]
Basic ideas:
1. PC records or plays thru video cable channels.
2. PC “broadcasts” art images, webcams, presentations, videos, DVDs, etc.
3. Ethernet not cable?
PC will prevail for the next decade as the dominant platform… to HPCC community it’s COTS!
• Moore’s Law to reduce price
• Lack of last mile bandwidth to move pictures, data, and interact favors home mainframes aka PCs
• Very large disks (1TB by 2005) to “store everything” personal
• Screens to enhance use
• Home entertainment server…
• Office and portable requirements
• Etc.

My betting record: No losses … so far (>5 year old bets)
• TMC & MPP will not be dominant by 1995
• Video On Demand will not exist by 1995
• AT&T acquisition of NCR will not be successful
• 10K desktop-desktop will not exist by 1/2001
• 1 B internet users by 1/2001 or 1/2002
• Cars won’t drive themselves by 2005
• PCs continue with 2 digit growth through 2002

Outline
• Future predictions… 2020 and the world
• Caveat: How far out can we see? WWW just >5 years old
• Background: SNAP at RCI 3/95 conference, Albuquerque
• My own history of supercomputing… data/compute
• The hardware scene in 5-10 years?
  – Processing and Moore’s Law
  – Networking
  – Disks
• Challenges:
  – OSS
  – Communities with dbases & hs nets
  – ASP: workbenches
  – If simulation is third mode after theory, expt, what is 4th? connection with the experimental world for data; then control… biologist workbench where work is being done.

SNAP … as given at RCI, 3/95
Scalable Network And Platforms
A View of Computing in 2000+ (I missed the impact of WWW)
Gordon Bell, Jim Gray
[Figure labels: Platform; Network]

How Will Future Computers Be Built?
Thesis: SNAP: Scalable Networks and Platforms
• upsize from desktop to world-scale computer
• based on a few standard components
• similar to NEC’s Computers & Communications 1983 vision
[Figure labels: Platform; Network]
Because:
• Moore’s law: exponential progress
• Standardization & Commoditization
• Stratification and competition
When: Sooner than you think!
• massive standardization gives massive use
• economic forces are enormous

[Figure: Performance versus time for various microprocessors (DEC, PC, MIPS); log scale, 1978-2000]

[Figure: performance vs. price. Text: “Volume drives simple, standard platforms”; “cost to price for high speed interconnect”. Labels: Stand-alone Desk tops; PCs; Distributed workstations; Clustered Computers; 1-4 processor mP; 1-20 processor mP; MPPs]

Section: The economics of operating systems and databases (or why NT has the advantage over proprietary or vanity chips and UNIX dialects)

The UNIX Trap: creating the myth of “open systems”
• “Standard” now means different!
• VendorIX platforms have created the “downsizing” market that provides an apparent, order of magnitude cost reduction
• Hardware platform vendors lock-in users with servers of proprietary UNIX dialects and unique chips to maintain margins for chip and UNIX development
• Users hostage with client-server, database, and apps
• An implicit or unconscious cartel forms that maintains the industry status quo

The UNIX Cartel and Tax: It’s not competitive and it introduces higher downstream costs
• 10,000 programmers @75 companies maintain dialects
• R & D costs $1.4 - $2 billion
• Implied selling price $10 - 14 billion for $1.4 billion, or a sales tax of $10,000 on each of 1 million UNIX units
• Cost could be reduced to $400 million for ONE UNIX; sales price for 1 million units would be $2,400 - 4,000
• NT sales price is $650; OS2 needs to sell for $1.2b/6m
Furthermore:
• The downstream effect on database vendors is 40% R&D efficiency, causing an implied database tax of 2.5x the sales price!
• The downstream effect on apps vendors is similar

Section: SNAP Architecture

Computing SNAP built entirely from PCs
[Diagram labels: Person servers (PCs); Portables; Mobile Nets; Wide-area global network; global data comm world; ATM† & Local Area Networks for: terminal, PC, workstation, & servers; Legacy mainframe & minicomputer servers & terminals; scalable computers built from PCs; Centralized & departmental uni- & mP servers (UNIX & NT); Centralized & departmental servers built from PCs; TC=TV+PC home ... (CATV or ATM or satellite); ???]
A space, time (bandwidth), & generation scalable environment

GB with NT, Compaq, & HP cluster

In a decade we can/will have:
• more powerful personal computers
  – processing 10-100x
  – 4x resolution (2K x 2K) displays to impact paper
  – Large, wall-sized and watch-sized displays
  – low cost, storage of one terabyte for personal use
• adequate networking? PCs now operate at 1 Gbps
  – ubiquitous access = today’s fast LANs
  – Competitive wireless networking
• One chip, networked platforms e.g. light bulbs, cameras everywhere, etc. managed by PCs!
• Some well-defined platforms that compete with the PC for mind (time) and market share: watch, pocket, body implant, home
• Inevitable, continued cyberization… the challenge… interfacing platforms and people.
High Performance Computing
A 60+ year view

Star Bridge

Linux super howls

Dead Supercomputer Society
ACRI; Alliant; American Supercomputer; Ametek; Applied Dynamics; Astronautics; BBN; CDC; Convex; Cray Computer; Cray Research; Culler-Harris; Culler Scientific; Cydrome; Dana/Ardent/Stellar/Stardent; Denelcor; Elxsi; ETA Systems; Evans and Sutherland Computer; Floating Point Systems; Galaxy YH-1; Goodyear Aerospace MPP; Gould NPL; Guiltech; Intel Scientific Computers; International Parallel Machines; Kendall Square Research; Key Computer Laboratories; MasPar; Meiko; Multiflow; Myrias; Numerix; Prisma; Tera; Thinking Machines; Saxpy; Scientific Computer Systems (SCS); Soviet Supercomputers; Supertek; Supercomputer Systems; Suprenum; Vitesse Electronics

Steve Squires & Cray

[Figure: Bell Prize and Future Peak Tflops (t); log scale 0.0001 to 1000, 1985-2010. Labels: XMP; NCube; CM2; NEC; *IBM Petaflops study target]

Top 10 tpc-c
Top two Compaq systems are:
• 1.1 & 1.5X faster than IBM SPs
• 1/3 price of IBM
• 1/5 price of SUN

High Performance Computing
• Supers we knew are Japanese…
• scalability & COTS in… but you have to roll your own else pay the Unix & proprietary taxes
• Beowulf is $14K/TB (6 x 4 x 40 GB)
• IBM 4000R 1 rack: 2x42 500 MHz processors, 84 GB, 84 disks (3 TB @ 36 GB/disk), $420K … still cheaper than the “big buys”
• $10-20K/node for special purpose vs $2K for a MAC
• EMC, IBM at $1 million/TB; vs $14K

High performance architectures timeline
[Figure: timeline 1950-2000. Technology: Vtubes; Trans.; MSI(mini); Micro; RISC; nMicr; “IBM PC”; “killer micros”. Processor overlap, lookahead. Cray era: 6600, 7600, Cray1, X, Y, C, T. Func, Pipe, Vector -> SMP. SMP: mainframes; “multis”; DSM?? (Mmax., KSR, SGI). Clusters: Tandm, VAX, IBM, UNIX. MPP if n>1000: Ncube, Intel, IBM. Local NOW and Global Networks n>10,000: Grid]

High performance architectures timeline
[Figure: the same timeline with programming models added. Sequential programming (single execution stream e.g. Fortran). SIMD, Vector: Parallelization. “THE NEW BEGINNING”: Parallel programs aka Cluster Computing; multicomputers; MPP era. Added labels: DASH; Beowulf]

High performance architecture/program timeline
[Figure: timeline 1950-2000. Sequential programming (single execution stream); SIMD, Vector: Parallelization; Parallel programs aka Cluster Computing; multicomputers; MPP era. ultracomputers: 10X in size & price! (10x MPP). “in situ” resources: 100x in //sm (NOW, VLSCC). geographically dispersed: Grid]

Computer types
[Figure: computer types by connectivity (WAN/LAN, SAN, DSM, SM). Labels: Networked Supers…; GRID; Legion; Condor; VPPuni; NEC super; NEC mP; Cray X…T (all mPv); Clusters; T3E; SGI DSM; Mainframes; SP2(mP) clusters; Multis; Beowulf; NOW; SGI DSM clusters; WSs; PCs; NT clusters]

Technical computer types: Pick of: 4 nodes, 2-3 interconnects
[Figure: columns SAN, DSM, SMP. Labels: NEC; Fujitsu; Hitachi; IBM; ?PC?; SGI cluster; SGI DSM; Beow/NT; T3; HP?; NEC super; Cray ???; Fujitsu; Hitachi; HP; IBM; Intel; SUN; plain old PCs]

Technical computer types
[Figure: the connectivity chart (WAN/LAN, SAN, DSM, SM) annotated “Old World (one program stream)” and “New world: Clustered Computing (multiple program streams)”]

Technical computer types
[Figure: the connectivity chart annotated with programming approaches. Labels: Vectorize; Parallelize; Linda, PVM, MPI, Cactus, ???; distributed; function; Computing]

Gaussian Parallelism

Beyond Moore’s Law …>10 yrs
• Just FCB (faster, cheaper, better)… COTS will soon mean consumer off the shelf
• Moore’s Law and technology progress likely to continue for another decade for: processing & memory, storage, LANs, & WANs are really evolving
• System-on-a chip of interesting sizes will emerge to create 0 cost systems
• No DNA, molecular, or quantum computers, or new stores
• Any displacement technology is unlikely … Carver Mead’s Law c1980: A technology takes 11 years to get established
• On the other hand, we are on Internet time!
We get more of everything

[Figure: Computer ops/sec x word length / $; log scale 1.E-06 to 1.E+09, 1880-2000. Trend annotations: doubles every 7.5; doubles every 2.3; doubles every 1.0. Fits: 1.565^(t-1959.4); y = 1E-248 e^(0.2918x)]

Growth of microprocessor performance
[Figure: Performance in Mflop/s, log scale 0.01 to 10000. Supers: Cray 1S; Cray X-MP; Cray 2; Cray Y-MP; Cray C90; Cray T90. Micros: 8087; 80287; 6881; 80387; R2000; i860; RS6000/540; RS6000/590; Alpha]

Albert Yu predictions ‘96
When    Clock (MHz)    MTransistors    Mops      Die (sq. in.)
2000    900            40              2400      1.1
2006    4000           350             20,000    1.4

Processor Limit: DRAM Gap
[Figure: Performance (log scale 1 to 1000) vs. year, 1980-2000. CPU (“Moore’s Law”): µProc 60%/yr. DRAM: 7%/yr. Processor-Memory Performance Gap: (grows 50% / year)]
• Alpha 21264 full cache miss / instructions executed: 180 ns/1.7 ns = 108 clks x 4 or 432 instructions
• Caches in Pentium Pro: 64% area, 88% transistors
*Taken from Patterson-Keeton Talk to SigMod

Sony Playstation export limits

System-on-a-chip alternatives
FPGA                Sea of un-committed gate arrays                  Xilinx, Altera
Compile a system    Unique processor for every app                   Tensilica
Systolic | array    Many pipelined or parallel processors
DSP | VLIW          Special purpose processors                       TI
Pc & Mp. ASICS      Gen. Purpose cores. Specialized by I/O, etc.     Intel, Lucent, IBM
Universal Micro     Multiprocessor array, programmable I/o           Cradle

Cradle: Universal Microsystem trading Verilog & hardware for C/C++
UMS : VLSI = microprocessor : special systems
Software : Hardware
• Single part for all apps
• Programming @ run time via FPGA & ROM
• 5 quad mPs at 3 Gflops/quad = 15 Gflops
• Single shared memory space, caches
• Programmable periphery including: 1 GB/s; 2.5 Gips
• PCI, 100 baseT, firewire
• $4 per flops; 150 mW/Gflops

UMS Architecture
[Diagram labels: DRAM CONTROL; CLOCKS, DEBUG; MEMORY; M/S/P processor groups; PROG I/O; NVMEM; DRAM]
• Memory bandwidth scales with processing
• Scalable processing, software, I/O
• Each app runs on its own pool of processors
• Enables durable, portable intellectual property

Free 32 bit processor core

Linus’s Law: Linux everywhere
• Software is or should be free
• All source code is “open”
• Everyone is a tester
• Everything proceeds a lot faster when everyone works on one code
• Anyone can support and market the code for any price
• Zero cost software attracts users!
• All the developers write lots of code

ISTORE Hardware Vision
• System-on-a-chip enables computer, memory, without significantly increasing size of disk
• 5-7 year target:
• MicroDrive: 1.7” x 1.4” x 0.2”
  – 1999: 340 MB, 5400 RPM, 5 MB/s, 15 ms seek
  – 2006: 9 GB, 50 MB/s ? (1.6X/yr capacity, 1.4X/yr BW)
• Integrated IRAM processor
  – 2x height
• Connected via crossbar switch growing like Moore’s law
  – 16 Mbytes; 1.6 Gflops; 6.4 Gops
• 10,000+ nodes in one rack!
• 100/board = 1 TB; 0.16 Tf

The Disk Farm? or a System On a Card?
The 500GB disc card (14")
• An array of discs
• Can be used as
  – 100 discs
  – 1 striped disc
  – 50 FT discs
  – ....etc
• LOTS of accesses/second of bandwidth
• A few disks are replaced by 10s of Gbytes of RAM and a processor to run Apps!!

Nanochip.com

Disk vs Tape
At 10K$/TB disks are competitive with nearline tape.
Disk                          Tape
40 GB                         40 GB
20 MBps                       10 MBps
5 ms seek time                10 sec pick time
3 ms rotate latency           30-120 second seek time
7$/GB for drive               2$/GB for media
3$/GB for ctlrs/cabinet       8$/GB for drive+library
4 TB/rack                     10 TB/rack                   (Guestimates)
1 hour scan                   1 week scan                  (Cern: 200 TB)
3480 tapes: 2 col = 50 GB; Rack = 1 TB = 20 drives
The price advantage of tape is narrowing, and the performance advantage of disk is growing

1988 Federal Plan for Internet

The virtuous cycle of bandwidth supply and demand
[Figure: cycle of Increased Demand; Increase Capacity (circuits & bw); Standards; Create new service; Lower response time. Services: Telnet & FTP; EMAIL; WWW; Audio; Voice!; Video]

Single Stream tcp/ip throughput
• 744 Mbps over 5000 km to transmit 14 GB ~ 4e15 bit meters per second
• 4 Peta Bmps (“peta bumps”)
Information Sciences Institute; Microsoft; QWest; University of Washington; Pacific Northwest Gigapop; HSCC (high speed connectivity consortium); DARPA

Map of Gray Bell Prize results
single-thread single-stream tcp/ip, desktop-to-desktop … Win 2K out of the box performance*
[Map labels: Redmond/Seattle, WA; New York; Arlington, VA; San Francisco, CA; via 7 hops; 5626 km; 10 hops]

Ubiquitous 10 GBps SANs in 5 years
• 1 Gbps Ethernet is a reality now.
  – Also FiberChannel, MyriNet, GigaNet, ServerNet, ATM,…
• 10 Gbps x4 WDM deployed now (OC192)
  – 3 Tbps WDM working in lab
• In 5 years, expect 10x, wow!!
[Figure label: 1 GBps]
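
The bandwidth-distance figure quoted on the single-stream tcp/ip slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the 744 Mbps, 5000 km, and 14 GB values come from the slide, while the function names and the decimal unit conventions (1 GB = 1e9 bytes, 1 Mbps = 1e6 bits/s) are assumptions made here.

```python
# Sanity check of the "4 Peta Bmps" figure: throughput (bits/s) x distance (m).
# 744 Mbps, 5000 km, and 14 GB are from the slide; all names here are hypothetical.
# Assumption: 1 GB = 1e9 bytes and 1 Mbps = 1e6 bits/s.


def bit_meters_per_second(throughput_bps: float, distance_m: float) -> float:
    """Bandwidth-distance product in bit-meters per second."""
    return throughput_bps * distance_m


def transfer_seconds(size_bytes: float, throughput_bps: float) -> float:
    """Time to move size_bytes over a link running at throughput_bps."""
    return size_bytes * 8 / throughput_bps


THROUGHPUT_BPS = 744e6       # 744 Mbps single-stream tcp/ip
DISTANCE_M = 5000 * 1000     # 5000 km expressed in meters
PAYLOAD_BYTES = 14e9         # 14 GB

product = bit_meters_per_second(THROUGHPUT_BPS, DISTANCE_M)
seconds = transfer_seconds(PAYLOAD_BYTES, THROUGHPUT_BPS)

print(f"{product:.2e} bit-meters/s ({product / 1e15:.2f} peta bmps)")
print(f"14 GB takes about {seconds:.0f} s at this rate")
```

Under these assumptions the product comes to roughly 3.7e15 bit-meters per second, which the slide rounds to ~4e15, and the 14 GB transfer takes on the order of two and a half minutes.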
[Figure labels for “Ubiquitous 10 GBps SANs in 5 years”: 120 MBps (1 Gbps); 80 MBps; 40 MBps; 20 MBps; 5 MBps]

The Promise of SAN/VIA: 10x in 2 years
http://www.ViArch.org/
• Yesterday:
  – 10 MBps (100 Mbps Ethernet)
  – ~20 MBps tcp/ip saturates 2 cpus
  – round-trip latency ~250 µs
• Now
  – Wires are 10x faster: Myrinet, Gbps Ethernet, ServerNet,…
  – Fast user-level communication
    - tcp/ip ~ 100 MBps 10% cpu
    - round-trip latency is 15 µs
    - 1.6 Gbps demoed on a WAN
[Figure: Time µs to Send 1KB (0 to 250 µs) for 100Mbps, Gbps, and SAN; components: Transmit, receiver cpu, sender cpu]

How much does wire-time cost? $/Mbyte?
Odlyzko, 1998 & Jim Gray
                      Cost ($)    Time
Gbps Ethernet         .2µ         10 ms
100 Mbps Ethernet     .3µ         100 ms
OC12 (650 Mbps)       .003        20 ms
DSL                   .0006       25 sec
POTs                  .002        200 sec
Wireless              .80         500 sec

            Seat cost $/3y    Bandwidth B/s    $/MB      Time (s)
GBpsE       2000              1.00E+08         2.E-07    0.010
100MbpsE    700               1.00E+07         7.E-07    0.100
OC12        12960000          5.00E+07         3.E-03    0.020
OC3         3132000           3.00E+06         1.E-02    0.333
T1          28800             1.00E+05         3.E-03    10.000
DSL         2300              4.00E+04         6.E-04    25.000
POTS        1180              5.00E+03         2.E-03    200.000

Modern scalable switches … are also supercomputers
• Scale from <1 to 120 Tbps
• 1 Gbps ethernet switches scale to 10s of Gbps, scaling upward
• SP2 scales from 1.2

So where are the challenges?
• Continued development based on clusters …
• Scalar processors need to compete with vectors.
• The U.S. has cast its lot with COTS!
• WWW is here. Now exploit it in every respect.
  – Exploit OSS!
• Grid
• Application Service Providers for scientific and technical apps
  – Biologist and chemist workbenches are prototypes
  – Labscape @ Cell laboratory, U. of WA
  – Sloan sky survey

1st, 2nd, 3rd, or New Paradigm for science?
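
The $/MB and time columns of the wire-time cost table appear to follow from the seat cost and bandwidth columns. The sketch below recomputes them under one assumed reading, which is not stated on the slide: the 3-year seat cost is spread over every megabyte the link could carry when run flat out for three 365-day years. The function and constant names are hypothetical.

```python
# Recompute the $/MB and time-per-MB columns of the wire-time cost table.
# Assumption (not stated on the slide): $/MB = 3-year seat cost divided by
# the megabytes a fully used link moves in three 365-day years.

SECONDS_PER_3_YEARS = 3 * 365 * 24 * 3600

# (link, seat cost in $ over 3 years, bandwidth in bytes/s), as listed on the slide
LINKS = [
    ("GBpsE", 2000, 1.00e8),
    ("100MbpsE", 700, 1.00e7),
    ("OC12", 12960000, 5.00e7),
    ("OC3", 3132000, 3.00e6),
    ("T1", 28800, 1.00e5),
    ("DSL", 2300, 4.00e4),
    ("POTS", 1180, 5.00e3),
]


def cost_per_mb(seat_cost: float, bytes_per_s: float) -> float:
    """Dollars per megabyte if the link is saturated for three years."""
    mb_moved = bytes_per_s * SECONDS_PER_3_YEARS / 1e6
    return seat_cost / mb_moved


def seconds_per_mb(bytes_per_s: float) -> float:
    """Seconds needed to send one megabyte (1e6 bytes)."""
    return 1e6 / bytes_per_s


for name, cost, bandwidth in LINKS:
    print(f"{name:9s} {cost_per_mb(cost, bandwidth):8.1e} $/MB "
          f"{seconds_per_mb(bandwidth):9.3f} s per MB")
```

With this reading, each recomputed $/MB value agrees with the table to the single significant digit shown there, and the time column is simply one megabyte divided by the bandwidth.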
Labscape

Labscape sensors
• Location tracking of people/samples
  – multiple resolutions
  – passive and active tags
• Manual tasks (e.g., use of reagents, tools)
• Audio/video records, vision and indexing
• Networked instruments (e.g., pipettes, refrigerators, etc.)

What am I willing to predict?
• Processing can be anywhere…
  – Maui… in the winter. BW is the limiter!
  – Japan… if supers are so super, otherwise use PCs
  – In the disks
  – Application Service Providers: separation of our data from ourselves and businesses
• The GRID e.g. biologist & chemist workbenches iff the IP doesn’t get in the way
• Collaboration ala astrophysics (high energy physics, math, earth sci. and any pure science if pure science continues!)
• OSS is the big bang for supercomputing??

The End