Jim Gray Talk at University of Tokyo

Personal views on PITAC report: invest in long term research
Preview of Turing lecture: 10 long term research problems
• Bush: Summarize info in cyberspace
• Turing: Intelligent Computers
• 7 9s: build systems that are always up and prove it.
5-Minute rule
• For disks
• For tapes
Sorting Progress
• PennySort
• Terabyte Sort (!)
Slides will be at http://research.Microsoft.com/~Gray/talks

Jim Gray / Presented at U. Tokyo / 23 Jan 1999

Presidential Advisory Committee on High Performance Computing and Communications, Information Technologies, and the Next Generation Internet
Information Technology
http://www.ccic.gov/ac/interim/
or
http://research.microsoft.com/~Gray/papers/PITAC_Interim_Report_8_98.doc

Charter for the Committee
Provide an independent assessment of:
High-Performance Computing and Communications (HPCC)
• Progress
• Balance among research components
Next Generation Internet initiative
• Progress
• Balance
IT research and development
• Maintain United States leadership in IT and applications

Committee Members
Co-Chairs:
• Bill Joy, Sun Microsystems
• Ken Kennedy, Rice University
Members:
• Eric Benhamou, 3Com
• Ching-chih Chen, Simmons
• Steve Dorfman, Hughes
• Bob Ewald, SGI
• Sherri Fuller, U. of Washington
• Susan Graham, UC Berkeley
• Danny Hillis, Disney, Inc
• David Nagel, AT&T
• Ted Shortliffe, Stanford
• Joe Thompson, Miss. State U.
• Andy Viterbi, Qualcomm
• Irving Wladawsky-Berger, IBM
• Vinton Cerf, MCI
• David Cooper, LLNL
• David Dorman, PointCast
• David Farber, U. of Pennsylvania
• Hector Garcia-Molina, Stanford
• Jim Gray, Microsoft
• John Miller, Montana State Univ.
• Raj Reddy, Carnegie Mellon
• Larry Smarr, U. of Illinois @ UC
• Les Vadasz, Intel
• Steve Wallach, Centerpoint

My Summary of the Report
1/3 of the US economic growth since 1992 was in the IT sector.
IT is key to our health, wealth, and safety.
Created 400 B$ of wealth in the last 3 years (!!)
Federal IT research funding of twenty years ago created the boom.
Federal IT research funding for the last decade has been flat.
Research funding is increasingly near-term & applied development.
The committee recommends:
Increase long-term research funding in:
• Software design and implementation technologies
• Technologies to scale the Next Generation Internet to 6 billion users.
• Tools, algorithms, and systems for high-performance computing.
Spend a billion dollars over the next 5 years on Lewis and Clark style "expeditions" into cyberspace (in constant dollars).

Myths
1. Now that IT is a big business, industry will do long-term research.
FACT: industry spends LITTLE on long-term research. It is not in their best interest.
2. IT research = buy computers for scientists.
FACT: computer science research is different from the application of computers to some discipline.

Research Priorities
Findings:
• Total federal information technology R&D investment is inadequate
• Federal IT R&D is excessively focused on near-term problems
Recommendations:
• Create a strategic initiative in long-term IT R&D
• Increase the investment for research in software, scalable information infrastructure, high-end computing, and socio-economic and workforce impacts

Software Research
Findings:
• Demand for software far exceeds the nation’s ability to produce it
• The nation depends on fragile software
• Technologies to build reliable and secure software are inadequate
• The nation is under-investing in fundamental software research
Recommendations:
• Fund more fundamental research in software development methods and component technologies
• Sponsor a national library of software components
• Make software research a substantive component of every major IT research initiative
• Support research in human-computer interfaces and interaction
Make fundamental software research an absolute priority

Scalable Information Infrastructure
Findings:
• The Internet has grown well beyond the intent of its original designers
• Our nation’s dependence on the information infrastructure is increasing daily
• We cannot safely extend what we currently know to more complex systems
• Learning how to build large-scale, highly reliable and secure systems requires research
Recommendations:
• Increase funding in research and development of core software and communications technologies aimed directly at the challenge of scaling the information infrastructure
• Expand the Next Generation Internet test beds to include additional industry partnerships in order to foster the rapid commercialization and deployment of enabling technologies

High-End Computing
Findings:
HEC is
• essential for science and engineering research
• an element of the United States national security
• ripe for new applications
HEC suppliers suffer from unusual market pressures
Research & Development Recommendations:
• Fund innovative technologies and architectures
• Fund HEC software (parallel programming)
• Aim for a real application petaops by 2010 through both hardware and software strategies
• Fund HEC systems for science and engineering research

Social, Economic, Workforce Recommendations
Expand research on the social and economic impacts of information technology diffusion and adoption
Expand initiatives to increase IT literacy, access and research capabilities
Address the shortage of high-technology workers
• Programs to re-train “stale” IT workers
• Encourage participation by women and minorities
• Short-term increase in immigration of skilled IT workers

Conclusions
IT is an essential foundation for commerce, education, health care, environmental stewardship, and national security. It will:
• Dramatically transform the way we communicate, learn, deal with information and conduct research
• Transform the nature of work, nature of commerce, product design cycle, practice of health care, and the government itself
The total Federal IT R&D investment is inadequate
The Federal IT R&D is excessively focused on near-term problems
U. S. government must:
• Create a strategic initiative in long-term IT R&D
• Establish an effective structure for managing and coordinating IT

Vannevar Bush: Memex
Memex: Proposed putting all information online (1945)
It will happen
Result: InfoGlut. Too much information in the shoebox
Challenge:
• Organize the information.
• Give answers as good as an expert in the field.
• Anticipate questions and so inform “subscriber”
Protect personal privacy
• A hacker cannot get access to your personal information without your consent.

Turing’s Test (1950): Intelligent Machines
Computers helped with the 4-color problem end game
Computers (and people) won world chess championship
Computers will likely be our 5th brain
• Augment our intelligence
• See for us, hear for us, read for us,
• Prosthetic eyes, ears, voices, arms, legs,….
Probably computers will be intelligent like plants and animals.
Perhaps computers can be intelligent like people
• Pass the Turing Test (easy/impossible?) (70%, 5 minutes, B can lie)
• Translating telephone (as good as a human translator)
• Read a textbook and pass the written exam.
• Pass a graduate programming class
• Pass a graduate literature class
Radical: Download someone.

Dependable Systems
Build a system used by millions of people each day. Then:
• Prove that it does what it is supposed to do (code matches spec).
• Prove that it delivers 99.99999% (7 9s) availability (1 hr per millennium)
• Prove that it cannot be “hacked” for less than 1B$ (Y2K $)
Then build the system automatically from the specification.
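
The "7 9s" availability target on the Dependable Systems slide can be sanity-checked with a little arithmetic. The Python sketch below converts an availability fraction into permitted downtime. The function name `downtime_hours` and the 365.25-day year are assumptions made for illustration; they are not part of the talk.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumed average year length, in hours


def downtime_hours(availability: float, years: float) -> float:
    """Hours of permitted downtime over `years` at the given availability."""
    return (1.0 - availability) * years * HOURS_PER_YEAR


if __name__ == "__main__":
    per_millennium = downtime_hours(0.9999999, 1000)
    print(f"7 9s allows {per_millennium:.2f} hours of downtime per millennium")
```

Under these assumptions the result is a little under one hour per millennium, in line with the slide's "1 hr per millennium".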

Storage Hierarchy (9 levels)
Cache 1, 2
Main (1, 2, 3 if NUMA).
Disk (1 (cached), 2)
Tape (1 (mounted), 2)

Meta-Message: Technology Ratios Are Important
If everything gets faster & cheaper at the same rate THEN nothing really changes.
Things getting MUCH BETTER:
• communication speed & cost 1,000x
• processor speed & cost 100x
• storage size & cost 100x
Things staying about the same
• speed of light (more or less constant)
• people (10x more expensive)
• storage speed (only 10x better)

Today’s Storage Hierarchy: Speed & Capacity vs Cost Tradeoffs
[Figure: two log-log plots against Access Time (seconds). "Size vs Speed" plots Typical System (bytes); "Price vs Speed" plots $/MB. Points in each: Cache, Main, Secondary Disc, Online Tape, Nearline Tape, Offline Tape.]

Storage Ratios Changed
10x better access time
10x more bandwidth
4,000x lower media price
DRAM/DISK 100:1 to 10:1 to 50:1
[Figure: three plots against Year (1980 to 2000). "Disk Performance vs Time (accesses/second & Capacity)": Accesses per Second and Disk Capacity (GB). "Disk Performance vs Time": access time (ms) and bandwidth (MB/s). "Storage Price vs Time": $/MB.]

The 5 Minute Rule Derived
M$: cost of a RAM page (per unit time)
\[ \text{M\$} = \frac{\text{RAM \$/MB}}{\text{PagesPerMB} \times \text{Lifetime}} \]
A$: cost of a disk access
\[ \text{A\$} = \frac{\text{DiskPrice}}{\text{AccessesPerSec} \times \text{Lifetime}} \]
RI: Reference Interval, the time between accesses to a page
Breakeven: the reference interval at which
\[ \text{M\$} = \frac{\text{A\$}}{\text{RI}} \]
\[ \text{RI} = \frac{\text{A\$}}{\text{M\$}} = \frac{\text{DiskPrice} \times \text{PagesPerMB}}{\text{RAMprice} \times \text{AccPerSec}} \]

The Five Minute Rule Observations
\[ \text{BreakEvenReferenceInterval} = \frac{\text{PagesPerMBofRAM}}{\text{AccessPerSecondPerDisk}} \times \frac{\text{PricePerDiskDrive}}{\text{PricePerMBofRAM}} \qquad (1) \]
Break even has two terms:
(1) Technology term: PageSize / DiskAccPerSec ~ 8KB : 80 = 100:1
(2) Economic term: DiskPrice / RAM_MB_Price ~ 400:4 = 100:1
Economic term trends down
Technology term trends up to compensate.
Still at 5 minute for random, 1 minute sequential

Shows Best Index Page Size ~16KB
[Figure: two plots of Utility against Page Size (KB). "Index Page Utility vs Page Size and Disk Performance" and "Index Page Utility vs Page Size and Index Element Size".]

Index Page Utility vs Page Size and Disk Performance
| Page Size (KB) | 2 | 4 | 8 | 16 | 32 | 64 | 128 |
|---|---|---|---|---|---|---|---|
| 40 MB/s | 0.65 | 0.74 | 0.83 | 0.91 | 0.97 | 0.99 | 0.94 |
| 10 MB/s | 0.64 | 0.72 | 0.78 | 0.82 | 0.79 | 0.69 | 0.54 |
| 5 MB/s | 0.62 | 0.69 | 0.73 | 0.71 | 0.63 | 0.50 | 0.34 |
| 3 MB/s | 0.51 | 0.56 | 0.58 | 0.54 | 0.46 | 0.34 | 0.22 |
| 1 MB/s | 0.40 | 0.44 | 0.44 | 0.41 | 0.33 | 0.24 | 0.16 |

Index Page Utility vs Page Size and Index Element Size
| Page Size (KB) | 2 | 4 | 8 | 16 | 32 | 64 | 128 |
|---|---|---|---|---|---|---|---|
| 16 B entries | 0.64 | 0.72 | 0.78 | 0.82 | 0.79 | 0.69 | 0.54 |
| 32 B entries | 0.54 | 0.62 | 0.69 | 0.73 | 0.71 | 0.63 | 0.50 |
| 64 B entries | 0.44 | 0.53 | 0.60 | 0.64 | 0.64 | 0.57 | 0.45 |
| 128 B entries | 0.34 | 0.43 | 0.51 | 0.56 | 0.56 | 0.51 | 0.41 |
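
The break-even formula from the Five Minute Rule slides is easy to evaluate directly. The Python sketch below does so with the slides' round numbers (8 KB pages, 80 accesses per second per disk, a 400$ disk, 4$ per MB of RAM). The function name `break_even_seconds` is hypothetical, and the technology term is counted in pages per MB of RAM (128 for 8 KB pages), as the formula is written.

```python
def break_even_seconds(pages_per_mb_ram: float,
                       accesses_per_sec_per_disk: float,
                       price_per_disk: float,
                       price_per_mb_ram: float) -> float:
    """Reference interval (seconds) at which caching a page in RAM
    costs the same as re-reading it from disk on every access."""
    technology_term = pages_per_mb_ram / accesses_per_sec_per_disk
    economic_term = price_per_disk / price_per_mb_ram
    return technology_term * economic_term


if __name__ == "__main__":
    pages_per_mb = 1024 / 8  # 8 KB pages give 128 pages per MB
    interval = break_even_seconds(pages_per_mb, 80, 400, 4)
    print(f"break-even reference interval: {interval:.0f} s "
          f"({interval / 60:.1f} minutes)")
```

With these inputs the interval comes out at roughly 160 seconds, i.e. a few minutes. Pages re-referenced more often than that are cheaper to keep in RAM; colder pages are cheaper to leave on disk.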

Standard Storage Metrics
Capacity:
• RAM: MB and $/MB: today at 10MB & 100$/MB
• Disk: GB and $/GB: today at 10 GB and 200$/GB
• Tape: TB and $/TB: today at .1TB and 25k$/TB (nearline)
Access time (latency)
• RAM: 100 ns
• Disk: 10 ms
• Tape: 30 second pick, 30 second position
Transfer rate
• RAM: 1 GB/s
• Disk: 5 MB/s - - - Arrays can go to 1GB/s
• Tape: 5 MB/s - - - striping is problematic

New Storage Metrics: Kaps, Maps, SCAN?
Kaps: How many KB objects served per second
• The file server, transaction processing metric
• This is the OLD metric.
Maps: How many MB objects served per sec
• The Multi-Media metric
SCAN: How long to scan all the data
• The data mining and utility metric
And
• Kaps/$, Maps/$, TBscan/$

For the Record (good 1998 devices packaged in system)
http://www.tpc.org/results/individual_results/Dell/dell.6100.9801.es.pdf

| | DRAM | DISK | TAPE robot |
|---|---|---|---|
| Unit capacity (GB) | 1 | 18 | 35 X 14 |
| Unit price $ | 4000 | 500 | 10000 |
| $/GB | 4000 | 28 | 20 |
| Latency (s) | 1.E-7 | 1.E-2 | 3.E+1 |
| Bandwidth (MB/s) | 500 | 15 | 7 |
| Kaps | 5.E+5 | 1.E+2 | 3.E-2 |
| Maps | 5.E+2 | 13.04 | 3.E-2 |
| Scan time (s/TB) | 2 | 1200 | 70000 |
| $/Kaps | 9.E-11 | 5.E-8 | 3.E-3 |
| $/Maps | 8.E-8 | 4.E-7 | 3.E-3 |
| $/TBscan | $0.08 | $0.35 | $211 |

[Figure: log-scale bar chart of the same data, comparing DRAM, DISK and TAPE robot on $/GB, Bandwidth, Kaps, Maps, Scan time, $/Kaps, $/Maps and $/TBscan.]

How To Get Lots of Maps, SCANs
parallelism: use many little devices in parallel
At 10 MB/s: 1.2 days to scan
1,000 x parallel: 100 seconds SCAN.
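
The scan-time figures above follow from dividing the data size by the aggregate transfer rate. The Python sketch below reproduces them; the function name `scan_seconds` and the decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) are assumptions for illustration, as is the idealization that the devices share the work evenly.

```python
TB = 1e12  # bytes, decimal terabyte (assumed)
MB = 1e6   # bytes, decimal megabyte (assumed)


def scan_seconds(data_bytes: float, bytes_per_sec: float, devices: int = 1) -> float:
    """Time to read all the data once, split evenly over `devices`."""
    return data_bytes / (bytes_per_sec * devices)


if __name__ == "__main__":
    serial = scan_seconds(1 * TB, 10 * MB)
    parallel = scan_seconds(1 * TB, 10 * MB, devices=1000)
    print(f"one device:    {serial:.0f} s ({serial / 86400:.2f} days)")
    print(f"1,000 devices: {parallel:.0f} s")
```

One device needs 100,000 seconds (about 1.2 days); a thousand devices working in parallel need 100 seconds.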
[Figure: 1 Terabyte scanned at 10 MB/s, one device versus many in parallel.]
Parallelism: divide a big problem into many smaller ones to be solved in parallel.
Beware of the media myth
Beware of the access time myth

The Disk Farm On a Card
The 1 TB disc card
An array of discs
Can be used as
• 100 discs
• 1 striped disc
• 10 Fault Tolerant discs
• ....etc
LOTS of accesses/second and bandwidth
[Figure: a 14" card holding an array of discs.]
Life is cheap, it’s the accessories that cost ya.
Processors are cheap, it’s the peripherals that cost ya (a 10k$ disc card).

Tape Farms for Tertiary Storage, Not Mainframe Silos
One robot: 10K$, 14 tapes, 500 GB, 5 MB/s, 20$/GB, 30 Maps
Scan in 27 hours.
100 robots: 1M$, 50TB, 50$/GB, 3K Maps, 27 hr Scan
Many independent tape robots (like a disc farm)

Tape & Optical: Beware of the Media Myth
Optical is cheap: 200 $/platter, 2 GB/platter => 100$/GB (2x cheaper than disc)
Tape is cheap: 30 $/tape, 20 GB/tape => 1.5 $/GB (100x cheaper than disc).

Tape & Optical Reality: Media is 10% of System Cost
Tape needs a robot (10 k$ ... 3 M$)
10 ... 1000 tapes (at 20GB each) => 20$/GB ... 200$/GB (1x…10x cheaper than disc)
Optical needs a robot (100 k$)
100 platters = 200GB (TODAY) => 400 $/GB (more expensive than mag disc)
Robots have poor access times
Not good for Library of Congress (25TB)
Data motel: data checks in but it never checks out!

The Access Time Myth
The Myth: seek or pick time dominates
The reality:
(1) Queuing dominates
(2) Transfer dominates BLOBs
(3) Disk seeks often short
Implication: many cheap servers better than one fast expensive server
• shorter queues
• parallel transfer
• lower cost/access and cost/byte
This is now obvious for disk arrays
This will be obvious for tape arrays
[Figure: access time broken down into Wait, Transfer, Rotate and Seek, shown for two cases.]
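
The Tape Farms slide argues that many small independent robots scale capacity and bandwidth together, so the scan time stays flat as the farm grows. The Python sketch below illustrates this with the slide's single-robot numbers (10K$, 500 GB, 5 MB/s). The `TapeRobot` class and `farm` helper are hypothetical names; the sketch assumes 1 GB = 1000 MB and that the robots scan fully in parallel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TapeRobot:
    price_dollars: float
    capacity_gb: float
    bandwidth_mb_per_s: float

    def dollars_per_gb(self) -> float:
        return self.price_dollars / self.capacity_gb

    def scan_hours(self) -> float:
        # One sequential pass over the full capacity; 1 GB = 1000 MB (assumed).
        seconds = self.capacity_gb * 1000 / self.bandwidth_mb_per_s
        return seconds / 3600


def farm(robot: TapeRobot, count: int) -> TapeRobot:
    """Treat `count` independent robots as one aggregate device."""
    return TapeRobot(robot.price_dollars * count,
                     robot.capacity_gb * count,
                     robot.bandwidth_mb_per_s * count)


if __name__ == "__main__":
    robot = TapeRobot(price_dollars=10_000, capacity_gb=500, bandwidth_mb_per_s=5)
    big = farm(robot, 100)
    print(f"one robot:  {robot.dollars_per_gb():.0f} $/GB, "
          f"scan in {robot.scan_hours():.1f} hours")
    print(f"100 robots: {big.capacity_gb / 1000:.0f} TB, "
          f"{big.bandwidth_mb_per_s:.0f} MB/s, scan in {big.scan_hours():.1f} hours")
```

A single robot works out to 20 $/GB and a scan of just under 28 hours. A farm of 100 holds 50 TB, and because bandwidth grows with capacity the scan time is unchanged.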

Penny Sort Ground Rules
http://research.microsoft.com/barc/SortBenchmark
How much can you sort for a penny.
• Hardware and Software cost
• Depreciated over 3 years
• 1M$ system gets about 1 second,
• 1K$ system gets about 1,000 seconds.
• Time (seconds) = 946,080 / SystemPrice ($)
Input and output are disk resident
Input is
• 100-byte records (random data)
• key is first 10 bytes.
Must create output file and fill with sorted version of input file.
Daytona (product) and Indy (special) categories

PennySort
Hardware
• 266 MHz Intel PPro
• 64 MB SDRAM (10ns)
• Dual Fujitsu DMA 3.2GB EIDE disks
Software
• NT workstation 4.3
• NT 5 sort
Performance
• sort 15 M 100-byte records (~1.5 GB)
• Disk to disk
• elapsed time 820 sec (cpu time = 404 sec)
PennySort Machine (1107$)
• cpu 32%
• Disk 25%
• board 13%
• Memory 8%
• Other 22%: Cabinet + Assembly 7%; Network, Video, floppy 9%; Software 6%

Sort Speed Doubles Every Year
[Figure: sort speed versus year.]

Recent Results
NOW Sort: 9 GB on a cluster of 100 UltraSparcs in 1 minute
MilleniumSort: 16x Dell NT cluster: 100 MB in 1.8 Sec (Datamation)
Tandem/Sandia Sort: 68 CPU ServerNet, 1 TB in 47 minutes
Rumor of IBM Sort: 7000 cpu Blue Pacific, 1 TB in 1024 seconds (17 minutes). 10 Mrps (1GBps)
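
The PennySort time budget follows from the ground rules: the system price is depreciated over 3 years, so one penny buys 946,080 / SystemPrice seconds of machine time. The Python sketch below works this out; the function name `penny_budget_seconds` is hypothetical, and a 365-day year is assumed, which is what yields the constant 946,080.

```python
SECONDS_PER_3_YEARS = 3 * 365 * 24 * 3600  # 94,608,000 s, assuming 365-day years
PENNY = 0.01  # dollars


def penny_budget_seconds(system_price_dollars: float) -> float:
    """Seconds of machine time one penny buys under 3-year depreciation."""
    dollars_per_second = system_price_dollars / SECONDS_PER_3_YEARS
    return PENNY / dollars_per_second


if __name__ == "__main__":
    for price in (1_000_000, 1_000, 1_107):
        print(f"{price:>9,}$ system: {penny_budget_seconds(price):8.1f} s per penny")
```

A 1M$ system gets just under one second and a 1K$ system about 946 seconds, matching the rounded figures in the ground rules. The 1107$ PennySort machine gets about 855 seconds, so the reported 820 sec elapsed time fits inside the penny.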