Jim Gray Talk at University of Tokyo Personal views on PITAC report:

Jim Gray
Talk at University of Tokyo
 Personal views on PITAC report:
invest in long term research
 Preview of Turing lecture: 10 long term research problems
• Bush: Summarize info in cyberspace
• Turing: Intelligent Computers
• 7 9s: build systems that are always up and prove it.
 5-Minute rule
• For disks
• For tapes
 Sorting Progress
• PennySort
• Terabyte Sort (!)
 Slides will be at http://research.Microsoft.com/~Gray/talks
Jim Gray / Presented at U. Tokyo / 23 Jan 1999
Presidential Advisory Committee
on
High Performance Computing and
Communications,
Information Technologies, and the
Next Generation Internet
Information Technology
http://www.ccic.gov/ac/interim/ or
http://research.microsoft.com/~Gray/papers/PITAC_Interim_Report_8_98.doc
Charter for the Committee:
provide an independent assessment of
 High-Performance Computing and Communications (HPCC)
• Progress
• Balance among research components;
Next Generation Internet initiative;
• Progress
• Balance
IT Research and development
• Maintain United States leadership in
—IT and
—Applications
Committee Members
 Co-Chairs:
• Bill Joy, Sun Microsystems
• Ken Kennedy, Rice University
 Members:
• Eric Benhamou, 3Com
• Ching-chih Chen, Simmons
• Steve Dorfman, Hughes
• Bob Ewald, SGI
• Sherri Fuller, U. of Washington
• Susan Graham, UC Berkeley
• Danny Hillis, Disney, Inc
• David Nagel, AT&T
• Ted Shortliffe, Stanford
• Joe Thompson, Miss. State U.
• Andy Viterbi, Qualcomm
• Irving Wladawsky-Berger, IBM
• Vinton Cerf, MCI
• David Cooper, LLNL
• David Dorman, PointCast
• David Farber, U. of Pennsylvania
• Hector Garcia-Molina, Stanford
• Jim Gray, Microsoft
• John Miller, Montana State Univ.
• Raj Reddy, Carnegie Mellon
• Larry Smarr, U. of Illinois @ UC
• Les Vadasz, Intel
• Steve Wallach, Centerpoint
My Summary of the Report
 1/3 of the US economic growth since 1992 was in the IT sector.
IT is key to our health, wealth, and safety.
 Created 400 B$ of wealth in last 3 years (!!)
 Federal IT research funding of twenty years ago created the boom.
 Federal IT research funding has been flat for the last decade.
 Research funding increasingly goes to near-term, applied development.
 The committee recommends increasing long-term research funding in:
• Software design and implementation technologies
• Technologies to scale the Next Generation Internet to 6 billion users.
• Tools, algorithms, and systems for high-performance computing.
 Spend a billion dollars over the next 5 years on
Lewis and Clark style "expeditions" into cyberspace.
(in constant dollars).
Myths
1. Now that IT is a big business,
Industry will do long term research.
FACT:
Industry spends LITTLE on long-term research;
it is not in their best interest.
2. IT research = buy computers for scientists.
FACT:
Computer science research
is different from
the application of computers to some discipline.
Research Priorities
Findings:
• Total federal Information technology R&D investment is
inadequate
• Federal IT R&D is excessively focused on near-term
problems
Recommendations:
• Create a strategic initiative in long-term IT R&D
• Increase the investment for research in
software,
scalable information infrastructure,
high-end computing, and
socio-economic and workforce impacts
Software Research
 Findings:
• Demand for software far exceeds the nation’s ability to produce it
• The nation depends on fragile software
• Technologies to build reliable and secure software are inadequate
• The nation is under-investing in fundamental software research
 Recommendations:
• Fund more fundamental research in software development methods and
component technologies
• Sponsor a national library of software components
• Make software research a substantive component of every major IT
research initiative
• Support research in human-computer interfaces and interaction
 Make fundamental software research an absolute priority
Scalable Information Infrastructure
 Findings:
• The Internet has grown well beyond the intent of its original
designers
• Our nation’s dependence on the information infrastructure is
increasing daily
• We cannot safely extend what we currently know to more complex
systems
• Learning how to build large-scale, highly reliable and secure
systems requires research
 Recommendations:
• Increase funding in research and development of core software and
communications technologies aimed directly at the challenge of
scaling the information infrastructure
• Expand the Next Generation Internet test beds to include additional
industry partnerships in order to foster the rapid commercialization
and deployment of enabling technologies
High-End Computing
Findings: HEC
• is essential for science and engineering research
• is an element of United States national security
• is ripe for new applications
• has suppliers that suffer from unusual market pressures
Research & Development Recommendations:
• Fund innovative technologies and architectures
• Fund HEC software (parallel programming)
• Aim for a real application petaops by 2010
through both hardware and software strategies
• Fund HEC systems for science and engineering research
Social, Economic, Workforce Recommendations
Expand research on the social and economic impacts of
information technology diffusion and adoption
Expand initiatives to increase IT literacy, access and
research capabilities
Address the shortage of high-technology workers
Programs to re-train “stale” IT workers
Encourage participation by women and minorities
Short-term increase in immigration of skilled IT workers
Conclusions
 IT is an essential foundation for commerce, education, health care,
environmental stewardship, and national security:
• Dramatically transform the way we communicate, learn, deal with
information and conduct research
• Transform the nature of work, nature of commerce, product design cycle,
practice of health care, and the government itself
 The total Federal IT R&D investment is inadequate
 The Federal IT R&D is excessively focused on near-term problems
 U. S. government must:
• Create a strategic initiative in long-term IT R&D
• Establish an effective structure for managing and coordinating IT
Vannevar Bush: Memex
 Memex: Proposed putting all information online (1945)
 It will happen
 Result: InfoGlut. Too much information in the shoebox
 Challenge:
• Organize the information.
• Give answers as good as an expert in the field.
• Anticipate questions and so inform “subscriber”
 Protect personal privacy
• A hacker cannot get access to your personal information
without your consent.
Turing’s Test (1950): Intelligent Machines
 Computers helped with the 4-color problem end game
 Computers (and people) won world chess
championship
 Computers will likely be our 5th brain
• Augment our intelligence
• See for us, hear for us, read for us,
• Prosthetic eyes, ears, voices, arms, legs,….
 Probably computers will be intelligent like plants and
animals.
 Perhaps computers can be intelligent like people
• Pass the Turing Test (easy/impossible?) (70%, 5 minutes, B can lie)
• Translating telephone (as good as a human translator)
• Read a textbook and pass the written exam.
• Pass a graduate programming class
• Pass a graduate literature class
 Radical: Download someone.
Dependable Systems
 Build a system used by millions of people each day.
 Then:
• Prove that it does what it is supposed to do
(code matches spec).
• Prove that it delivers 99.99999% (7 9s) availability
(1 hr per millennium)
• Prove that it cannot be “hacked”
for less than 1B$ (Y2K $)
 Then build the system automatically from the
specification.
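The "7 9s" figure can be checked with a little arithmetic. The sketch below is only an illustration: the function name and the 365.25-day year are assumptions, not part of the talk. It converts a number of nines into allowed downtime and gives roughly 3.16 seconds per year, or a little under an hour per millennium.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed average year length


def downtime_seconds(nines: int, period_years: float = 1.0) -> float:
    """Allowed downtime, in seconds, for an availability with `nines` nines."""
    unavailability = 10.0 ** (-nines)  # e.g. 7 nines -> 1e-7
    return unavailability * period_years * SECONDS_PER_YEAR


per_year = downtime_seconds(7)
per_millennium = downtime_seconds(7, period_years=1000)
print(f"7 nines: {per_year:.2f} s/year, {per_millennium / 60:.0f} min/millennium")
```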
Storage Hierarchy (9 levels)
Cache 1, 2
Main (1, 2, 3 if NUMA).
Disk (1 (cached), 2)
Tape (1 (mounted), 2)
Meta-Message:
Technology Ratios Are Important
 If everything gets faster & cheaper
at the same rate
THEN nothing really changes.
 Things getting MUCH BETTER:
• communication speed & cost 1,000x
• processor speed & cost 100x
• storage size & cost 100x
 Things staying about the same
• speed of light (more or less constant)
• people (10x more expensive)
• storage speed (only 10x better)
Today’s Storage Hierarchy :
Speed & Capacity vs Cost Tradeoffs
[Figure: two log-log panels. Size vs Speed: typical system size (bytes) against access time (seconds). Price vs Speed: $/MB against access time (seconds). Each panel plots cache, main memory, secondary (disc), online tape, nearline tape, and offline tape.]
Storage Ratios Changed
10x better access time
10x more bandwidth
4,000x lower media price
DRAM/DISK 100:1 to 10:1 to 50:1
[Figure: Disk Performance vs Time, 1980 to 2000: accesses per second, disk capacity (GB), bandwidth (MB/s), and access time (ms) by year]
[Figure: Storage Price vs Time, 1980 to 2000: $/MB by year]
The 5 Minute Rule Derived
M$: cost of a RAM page
\[ M_\$ = \frac{\text{RAM \$/MB}}{\text{PagesPerMB} \times \text{Lifetime}} \]
A$: cost of a disk access
\[ A_\$ = \frac{\text{DiskPrice}}{\text{AccessesPerSec} \times \text{Lifetime}} \]
RI: Reference Interval
time between accesses to page
Breakeven:
\[ M_\$ = \frac{A_\$}{RI} \]
\[ RI = \frac{A_\$}{M_\$} = \frac{\text{DiskPrice} \times \text{PagesPerMB}}{\text{RAM \$/MB} \times \text{AccessesPerSec}} \]
The Five Minute Rule Observations
\[ \text{BreakEvenReferenceInterval} \approx \frac{\text{PagesPerMBofRAM}}{\text{AccessPerSecondPerDisk}} \times \frac{\text{PricePerDiskDrive}}{\text{PricePerMBofRAM}} \]
Break even has two terms:
(1) Technology term:
PageSize / DiskAccPerSec ~ 8KB : 80 = 100:1
(2) Economic term:
DiskPrice / RAM_MB_Price ~ 400:4 = 100:1
Economic term trends down
Technology term trends up to compensate.
Still at 5 minute for random, 1 minute sequential
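As a rough illustration, the break-even formula can be evaluated directly. This is a minimal sketch: the function and variable names are hypothetical, and the inputs are the approximate figures quoted on this slide (8 KB pages, about 80 accesses per second, a 400$ disk, RAM at 4$/MB). The technology term here is computed in pages per MB per access per second, so its numeric value differs from the slide's byte-based 100:1 shorthand. With these inputs the result is on the order of a few minutes, and it moves a lot with the prices assumed.

```python
def break_even_interval_s(pages_per_mb: float,
                          accesses_per_s: float,
                          disk_price: float,
                          ram_price_per_mb: float) -> float:
    """Break-even reference interval in seconds: technology term x economic term."""
    technology_term = pages_per_mb / accesses_per_s  # pages/MB per (accesses/s)
    economic_term = disk_price / ram_price_per_mb    # $/disk per ($/MB of RAM)
    return technology_term * economic_term


# Assumed inputs, read off the slide: 8 KB pages, ~80 accesses/s,
# a ~400$ disk drive, and RAM at ~4$/MB.
PAGE_KB = 8
pages_per_mb = 1024 / PAGE_KB  # 128 pages per MB

interval = break_even_interval_s(pages_per_mb, 80, 400, 4)
print(f"break-even reference interval: {interval:.0f} s ({interval / 60:.1f} min)")
```

A page that is re-referenced more often than this interval is cheaper to keep in RAM; a colder page is cheaper to re-read from disk.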
Shows Best Index Page Size ~16KB
[Figure: two panels plotting index page utility (0.40 to 1.00) against page size (2 to 128 KB), one curve per disk transfer rate and one curve per index element size]

Index Page Utility vs Page Size and Disk Performance
Page Size (KB)   2     4     8     16    32    64    128
40 MB/s          0.65  0.74  0.83  0.91  0.97  0.99  0.94
10 MB/s          0.64  0.72  0.78  0.82  0.79  0.69  0.54
5 MB/s           0.62  0.69  0.73  0.71  0.63  0.50  0.34
3 MB/s           0.51  0.56  0.58  0.54  0.46  0.34  0.22
1 MB/s           0.40  0.44  0.44  0.41  0.33  0.24  0.16

Index Page Utility vs Page Size and Index Element Size
Page Size (KB)   2     4     8     16    32    64    128
16 B             0.64  0.72  0.78  0.82  0.79  0.69  0.54
32 B             0.54  0.62  0.69  0.73  0.71  0.63  0.50
64 B             0.44  0.53  0.60  0.64  0.64  0.57  0.45
128 B            0.34  0.43  0.51  0.56  0.56  0.51  0.41
Standard Storage Metrics
Capacity:
• RAM: MB and $/MB: today at 10MB & 100$/MB
• Disk: GB and $/GB: today at 10 GB and 200$/GB
• Tape: TB and $/TB: today at .1TB and 25k$/TB (nearline)
Access time (latency)
• RAM: 100 ns
• Disk: 10 ms
• Tape: 30 second pick, 30 second position
Transfer rate
• RAM: 1 GB/s
• Disk: 5 MB/s - - - Arrays can go to 1GB/s
• Tape: 5 MB/s - - - striping is problematic
New Storage Metrics:
Kaps, Maps, SCAN?
Kaps: How many KB objects served per second
• The file server, transaction processing metric
• This is the OLD metric.
Maps: How many MB objects served per sec
• The Multi-Media metric
SCAN: How long to scan all the data
• The data mining and utility metric
And
• Kaps/$, Maps/$, TBscan/$
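The three metrics can be estimated from a device's latency, bandwidth, and capacity. The sketch below assumes a simple model that the slides do not spell out: each request costs one access latency plus the transfer time of the object, and a scan runs at full bandwidth. The class and method names are hypothetical. The inputs are the 1998 disk figures from the "For the Record" table in this talk (18 GB, 10 ms, 15 MB/s), and under this model they come out close to the table's disk entries for Kaps, Maps, and scan time.

```python
from dataclasses import dataclass

KB = 1e3
MB = 1e6
GB = 1e9


@dataclass
class Device:
    name: str
    capacity_bytes: float
    latency_s: float
    bandwidth_bytes_per_s: float

    def objects_per_second(self, object_bytes: float) -> float:
        # Assumed model: one access latency plus transfer time per object.
        return 1.0 / (self.latency_s + object_bytes / self.bandwidth_bytes_per_s)

    def kaps(self) -> float:
        """KB-sized objects served per second."""
        return self.objects_per_second(KB)

    def maps(self) -> float:
        """MB-sized objects served per second."""
        return self.objects_per_second(MB)

    def scan_seconds(self) -> float:
        """Time to read the whole device sequentially."""
        return self.capacity_bytes / self.bandwidth_bytes_per_s


disk = Device("disk", capacity_bytes=18 * GB, latency_s=10e-3,
              bandwidth_bytes_per_s=15 * MB)
print(f"{disk.name}: Kaps={disk.kaps():.0f} Maps={disk.maps():.2f} "
      f"scan={disk.scan_seconds():.0f} s")
```

Small objects are latency-bound and large objects are bandwidth-bound, which is why Kaps and Maps rank devices differently.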
For the Record
(good 1998 devices packaged in system)
http://www.tpc.org/results/individual_results/Dell/dell.6100.9801.es.pdf

                     DRAM     DISK     TAPE robot
Unit capacity (GB)   1        18       35 X 14
Unit price $         4000     500      10000
$/GB                 4000     28       20
Latency (s)          1.E-7    1.E-2    3.E+1
Bandwidth (MBps)     500      15       7
Kaps                 5.E+5    1.E+2    3.E-2
Maps                 5.E+2    13.04    3.E-2
Scan time (s/TB)     2        1200     70000
$/Kaps               9.E-11   5.E-8    3.E-3
$/Maps               8.E-8    4.E-7    3.E-3
$/TBscan             $0.08    $0.35    $211
[Figure: For the Record, log-scale bar chart of $/GB, bandwidth, Kaps, Maps, scan time, $/Kaps, $/Maps, and $/TBscan for DRAM, DISK, and TAPE robot]
How To Get Lots of Maps, SCANs
Parallelism: use many little devices in parallel.
1 Terabyte at 10 MB/s: 1.2 days to scan
1 Terabyte at 1,000 x parallel: 100 seconds SCAN.
Parallelism: divide a big problem into many smaller ones
to be solved in parallel.
Beware of the media myth
Beware of the access time myth
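The scan figures on this slide follow from dividing the data size by the aggregate bandwidth. A minimal sketch, assuming the data is spread perfectly evenly over the devices and each device sustains its full 10 MB/s (the function name is hypothetical):

```python
TB = 1e12
MB = 1e6


def scan_seconds(data_bytes: float, bandwidth_bytes_per_s: float,
                 devices: int = 1) -> float:
    """Scan time when `devices` identical devices each read an equal share."""
    return data_bytes / (bandwidth_bytes_per_s * devices)


serial = scan_seconds(1 * TB, 10 * MB)
parallel = scan_seconds(1 * TB, 10 * MB, devices=1000)
print(f"1 device: {serial / 86400:.1f} days; 1,000 devices: {parallel:.0f} s")
```

Real systems fall short of this ideal when the data is skewed or a shared bus or network saturates first.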
The Disk Farm On a Card
The 1 TB disc card (14")
An array of discs
Can be used as
• 100 discs
• 1 striped disc
• 10 Fault Tolerant discs
• ....etc
LOTS of accesses/second and bandwidth
Life is cheap, it’s the accessories that cost ya.
Processors are cheap, it’s the peripherals that cost ya
(a 10k$ disc card).
Tape Farms for Tertiary Storage
Not Mainframe Silos
One robot: 10K$, 14 tapes, 500 GB, 5 MB/s, 20$/GB, 30 Maps, Scan in 27 hours.
100 robots: 1M$, 50TB, 50$/GB, 3K Maps, 27 hr Scan
Many independent tape robots (like a disc farm)
Tape & Optical:
Beware of the Media Myth
Optical is cheap: 200 $/platter, 2 GB/platter
=> 100$/GB (2x cheaper than disc)
Tape is cheap: 30 $/tape, 20 GB/tape
=> 1.5 $/GB (100x cheaper than disc).
Tape & Optical Reality:
Media is 10% of System Cost
Tape needs a robot (10 k$ ... 3 M$ )
10 ... 1000 tapes (at 20GB each) => 20$/GB ... 200$/GB
(1x…10x cheaper than disc)
Optical needs a robot (100 k$ )
100 platters = 200GB ( TODAY ) => 400 $/GB
( more expensive than mag disc )
Robots have poor access times
Not good for Library of Congress (25TB)
Data motel: data checks in but it never checks out!
The Access Time Myth
The Myth: seek or pick time dominates
The reality:
(1) Queuing dominates
(2) Transfer dominates BLOBs
(3) Disk seeks often short
Implication: many cheap servers
better than one fast expensive server
• shorter queues
• parallel transfer
• lower cost/access and cost/byte
This is now obvious for disk arrays
This will be obvious for tape arrays
[Figure: access-time breakdown bars showing Wait, Seek, Rotate, and Transfer]
Penny Sort Ground Rules
http://research.microsoft.com/barc/SortBenchmark
 How much can you sort for a penny?
• Hardware and Software cost
• Depreciated over 3 years
• 1M$ system gets about 1 second,
• 1K$ system gets about 1,000 seconds.
• Time (seconds) = 946,080 / SystemPrice ($)
 Input and output are disk resident
 Input is
• 100-byte records (random data)
• key is first 10 bytes.
 Must create output file
and fill with sorted version of input file.
 Daytona (product) and Indy (special) categories
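The time budget follows from the depreciation rule: a penny buys 0.01$ worth of a system whose price is spread over three years. A small sketch under that reading of the rules (the names are hypothetical, and a 365-day year is assumed, which gives the 946,080 constant):

```python
SECONDS_PER_3_YEARS = 3 * 365 * 24 * 3600  # 94,608,000 seconds


def penny_time_budget_s(system_price_dollars: float,
                        budget_dollars: float = 0.01) -> float:
    """Seconds of machine time that `budget_dollars` buys on the given system."""
    return budget_dollars * SECONDS_PER_3_YEARS / system_price_dollars


for price in (1_000_000, 1_000, 1_107):
    print(f"{price:>9,}$ system: {penny_time_budget_s(price):8.1f} s per penny")
```

For the 1107$ PennySort machine this gives a budget of roughly 855 seconds, which the 820-second run reported in this talk fits within.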
PennySort
 Hardware
• 266 MHz Intel PPro
• 64 MB SDRAM (10ns)
• Dual Fujitsu DMA 3.2GB EIDE disks
 Software
• NT workstation 4.3
• NT 5 sort
 Performance
• sort 15 M 100-byte records (~1.5 GB)
• Disk to disk
• elapsed time 820 sec
—cpu time = 404 sec
PennySort Machine (1107$) cost breakdown:
• cpu 32%
• Disk 25%
• board 13%
• Memory 8%
• Other 22% (Cabinet + Assembly 7%; Network, Video, floppy 9%; Software 6%)
Sort Speed Doubles Every Year
Recent Results
 NOW Sort: 9 GB on a cluster of 100 UltraSparcs in 1 minute
 MilleniumSort: 16x Dell NT cluster: 100 MB in 1.8 Sec (Datamation)
 Tandem/Sandia Sort: 68 CPU ServerNet
1 TB in 47 minutes
 Rumor of IBM Sort: 7000 cpu Blue Pacific
1 TB in 1024 seconds (17 minutes). 10 Mrps (1GBps)