Microsoft SQL Server, Scalability, & Database Research Jim Gray

Microsoft SQL Server,
Scalability, &
Database Research
Jim Gray
Researcher
Microsoft Corporation
Jim Gray, Research and Microsoft SQL Server
Microsoft Research http://research.Microsoft.com/~gray/talks/
1
3 October 1999
Chicago, Ill.
Outline
Summary of what you heard. (10 min)
The database scene in general. (10 min)
Scaleability: Farms, Clones, Parts & Packs (15 min)
Microsoft DB research focus. (15 min)
• TerraServer (design and ops).
• RAGS.
• Data Mining
Q&A (10 min)
Organizations Are Going Online
Building a digital nervous system.
Inexpensive hardware means huge
databases are possible.
But, we are drowning in data.
Databases help organize information.
Microsoft’s goal:
• Information at your fingertips.
• Make it easy to capture, manage, and analyze information.
Microsoft SQL Server 7 Goals
Three goals: Easy, Scalability, Data Warehousing
Easy:
Dynamic self management
Multi-site management
Operation Scripting
Job scheduling and execution
Alert/response management
Scriptable Install+upgrade
DBA profiling/tuning tools
Unicode
English Language Query
Integrated with NT Security
Integrated with NT files
Scalability
Win9x/NTW version
Dynamic row-level locking
Improved query optimizer
Intra-query parallelism
VLDB improvements
Replication improvements
Distributed query
High Availability Clusters
Scale Down to Windows 95-98
Full function (same as NTW)
Integration with Access 97
MSDE in Office2000 and MSDN
WinCE version demonstrated
Replication
Transactional and Merge
Remote update
ODBC and OLE DB subscribers
Wizards
Performance
[Figure: replication topology. Labels: Publisher; Distributor; Subscribers (including OS 390 DB2, DB2, VSAM, CICS); Updating Subscriber (immediate updates); 2PC, RPC.]
Query Processor Enhancements
Focus on Complex Queries
Parallelism
Improved scan, fetch, & sort
Smart hash & merge join
Large joins & grouping
Better query optimization
Multi-index operations
Automatic statistics maintenance
Distributed Query
Heterogeneous Query
Parallel Query
SMP & Disk Parallelism
[Figure: parallel aggregation. Local Agg. (+) steps read 50,000 rows from the disks and produce 4 x 50 rows; a Global Agg. (+) step combines them into a 50-row result. Computed per group: # of emp. and total inc.]
Plus Distributed
Plus Hash Join (fanciest on the planet)
Plus Optimized Partitioned views
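The local/global aggregation pattern on this slide can be sketched in a few lines of Python. This is a minimal illustration, not SQL Server's implementation: the function names and the (group, income) row layout are assumptions made for the example, and the partitions are processed one after another here rather than on separate CPUs or disks.

```python
from collections import defaultdict


def local_aggregate(rows):
    """Aggregate one partition: group -> [employee count, total income]."""
    acc = defaultdict(lambda: [0, 0.0])
    for group, income in rows:
        acc[group][0] += 1
        acc[group][1] += income
    return dict(acc)


def global_aggregate(partials):
    """Combine the small per-partition results into the final answer."""
    out = defaultdict(lambda: [0, 0.0])
    for partial in partials:
        for group, (count, total) in partial.items():
            out[group][0] += count
            out[group][1] += total
    return dict(out)


def parallel_group_by(partitions):
    """Run the local step on every partition, then the global step once."""
    return global_aggregate(local_aggregate(p) for p in partitions)
```

The point of the split is that each local step shrinks its input to one row per group, so the global step only has to merge a handful of rows per partition.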
Distributed Heterogeneous Queries
Data Fusion / Integration
Join spreadsheets, databases, directories, text DBs, etc.
Any source that exposes OLE DB interfaces
SQL Server as gateway, even on the desktop
[Figure: the SQL 7.0 Query Processor connected to data sources. Labels: Directory Service; Database (DB2, VSAM, Oracle, …); Spreadsheet; Photos; Mail; Maps; Documents and the Web.]
Utilities
The Key to LARGE Databases
Auto-Repair
Index creation
Backup
• Fuzzy
• Parallel
• Incremental
• Restartable
~2x faster than 6.5
DBCC
• not required,
• a good practice
• 5x - 100x faster
Recovery
• Fast
• File granularity
Reorganize
• shrinks file
• reclusters file
[Figure: bar chart of recovery time (secs, 0 to 60) against # of indices (1 to 4), SQL Server 6.5 vs. SQL Server 7.0.]
Data Warehousing
Warehousing Framework
Visual data modeler
Microsoft repository
Data transformation services (DTS)
Plato & Dcube - Multi Dimensional Data Cubes
English query 2.0
Data Warehouse / Data Analysis
Data Transformation Services to get data into the warehouse
CUBE (OLE/DB OLAP) to analyze data
[Figure labels: Operational Data; Extract & Load; Data Warehouse Storage; OLAP.]
Plato and Data Cube and HOLAP
[Figure: data cube built from a source table. Labels: Sum; By Year; By Make; By Make & Year; By Color; By Color & Year; RED, WHITE, BLUE.]
[Figure: “Plato” architecture. Labels: User 1 and User 2 client apps; MD SQL; “Plato” server; Designer; SQL; ROLAP; Dcube; Partition 1, Partition 2, Partition 3; Europe, USA, Asia.]
English Query
Easy
Scalable
Data Warehousing
“Shiloh” The Next SQL Server
Shiloh (H1’00) - Strengthen Position
• Data Warehousing leadership
   • Materialized Views
   • Cascading Referential Integrity (#1 requested user-group feature)
   • XML support
• Scalability
   • WinCE support
   • W2K VLM (36 and 64 bit)
   • Multi-instance support
Yukon – Next Big Step
• Scalability (Clusters, Partitions)
• Programmability
• Ease of Use (Self Tuning, Auto Config)
Outline
Summary of what you heard. (10 min)
The database scene in general. (10 min)
Scaleability: Farms, Clones, Parts & Packs (15 min)
Microsoft DB research focus. (15 min)
• TerraServer (design and ops).
• RAGS.
• Data Mining
Q&A (10 min)
Info Capture
You can record everything you see or hear or read.
What will you do with it?
How will you organize & analyze it?
Most data will never be seen
Analysis and summarization are key technologies
Per lifetime:
• Video: 8 PB (10GBph)
• Audio: 30 TB (10KBps)
• Read or write: 8 GB (words)
See: http://www.lesk.com/mlesk/ksg97/ksg.html
[Figure: storage scale from Kilo through Mega, Giga, Tera, Peta, Exa, and Zetta to Yotta. Marked points: A Book; A Photo; A Movie; All books (words); All Books MultiMedia; Everything Recorded!]
Data Tidal Wave
Seagate 47GB drive @ 783$ (= 1.7 ¢/MB)
• 100 GB penny per MB drive coming in 2000
10 $/GB = 10 k$/ Terabyte!
• “Everyone” can afford one
What’s a terror bite?
• If you sell ten billion items a year (e.g., Wal-Mart)
• And you record 100 bytes on each one
• Then you get a TeraByte/year
Where will the terror bytes come from?
• Multimedia (like the TerraServer) and...
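The arithmetic behind these figures is easy to check. The short sketch below is only an illustration; it assumes decimal units (1 GB = 1,000 MB, 1 TB = 10^12 bytes), and the function names are invented for the example.

```python
TERABYTE = 10 ** 12  # bytes, decimal units (an assumption of this sketch)


def cents_per_mb(price_dollars, capacity_gb):
    """Drive price in cents per megabyte."""
    return price_dollars * 100 / (capacity_gb * 1000)


def dollars_per_terabyte(dollars_per_gb):
    """Scale a per-gigabyte price up to a terabyte."""
    return dollars_per_gb * 1000


def terabytes_per_year(items_per_year, bytes_per_item):
    """Data volume generated by recording a fixed record per item sold."""
    return items_per_year * bytes_per_item / TERABYTE


print(round(cents_per_mb(783, 47), 1))           # the 47 GB drive at 783$
print(dollars_per_terabyte(10))                  # 10 $/GB
print(terabytes_per_year(10_000_000_000, 100))   # ten billion items, 100 bytes each
```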
Reducing Data’s Cost-of-Ownership
Self-Managing data
Cost of ownership:
One admin/TB (100K$ vs 10K$)
Admin cost exceeds storage cost.
SQL 7:
Suggests indices
Migrates data away from end of file
Truncates file
Someday:
Automatically move files to balance disks
Online defragmentation & restructuring
Online physical redesign
OBJECT RELATIONAL
The Next Great DBMS Wave
All DB vendors have added objects to DB
Microsoft is adding DBs to Objects
Integration with COM+
Gives user-defined types and objects
Plug-ins will be Billion dollar industry
• Blades for SQL Server razor
Why Is XML Important?
Self-describing data
Data stream in a typical interface…
“ABC47-Z”, “100”, “STL”, “C”, “3”, “28”
Same data stream in XML…
<INVENTORY>
<PART_NUM>ABC47-Z</PART_NUM>
<QUANTITY>100</QUANTITY>
<WAREHOUSE>STL</WAREHOUSE>
<ZONE>C</ZONE>
<AISLE>3</AISLE>
<BIN>28</BIN>
</INVENTORY>
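Because the XML version names every field, a receiver can pick values out by tag instead of by position. A minimal sketch using Python's standard xml.etree.ElementTree module; the helper name parse_inventory is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

INVENTORY_XML = """<INVENTORY>
<PART_NUM>ABC47-Z</PART_NUM>
<QUANTITY>100</QUANTITY>
<WAREHOUSE>STL</WAREHOUSE>
<ZONE>C</ZONE>
<AISLE>3</AISLE>
<BIN>28</BIN>
</INVENTORY>"""


def parse_inventory(text):
    """Return the record as a dict keyed by element name."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


record = parse_inventory(INVENTORY_XML)
print(record["PART_NUM"], record["QUANTITY"], record["BIN"])
```

With the positional stream, the reader has to know in advance that the second value is a quantity and the sixth a bin; here the data says so itself.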
table.xsl
bar.xsl
art.xsl
XML Applications
Exposing Software as a “Service”
• Websites without UI’s
• Exposed services with common scheme
• Integration points at the enterprise, value-chain, workgroup, desktop and intelligent gizmo “levels”
B2B value chains
• Uses XML to transmit a wide range of data to a broad set of stakeholders (regulatory agencies, suppliers, customers, etc.).
• Leverage for prior efforts like EDI
• BizTalk a key industry effort in this regard
XML: BizTalk Framework
www.biztalk.org
[Figure: services exchanging XML messages and XML documents. Labels: XML schema Library; MVS CICS; SAP R/3; JD Edwards; Order Processing Service; Another Service; Service Interface; Browser; Client Apps; New Form Factors.]
Outline
Summary of what you heard. (20 min)
The database scene in general. (10 min)
Scaleability: Farms, Clones, Parts & Packs (10 min)
Microsoft DB research focus. (15 min)
• TerraServer (design and ops).
• RAGS.
• Data Mining
Q&A (15 min)
Terminology for scaleability
Farms of servers:
• Clones: identical
   Scaleability + availability
• Partitions:
   Scaleability
• Packs
   Partition availability via fail-over
[Figure: a Farm containing Clones, Partitions, and Packs.]
Unpredictable Growth
The TerraServer Story:
• We expected 5 M hits per day
• We got 50 M hits on day 1
• We peak at 15-20 M hpd on a “hot” day
• Average 5 M hpd after 1 year
Most of us cannot predict demand
• Must be able to deal with NO demand
• Must be able to deal with HUGE demand
An Architecture for Internet Services?
Need to be able to add capacity
• New processing
• New storage
• New networking
Need continuous service
• Online change of all components (hardware and software)
• Multiple service sites
• Multiple network providers
Need great development tools
• Change the application several times per year.
• Add new services several times per year.
Premise: Each Site is a Farm
Buy computing by the slice (brick):
• Rack of servers + disks.
Grow by adding slices
• Spread data and computation to new slices
Two growth styles:
• Clones: anonymous servers
• Parts+Packs: Partitions fail over within a pack
In both cases, remote farm for disaster recovery
[Figure: 1997 Microsoft.Com Farm. Network diagram covering Building 11, MOSWest, the European Data Center, and the Japan Data Center. Server groups shown: www.microsoft.com, home.microsoft.com, premium.microsoft.com, register.microsoft.com, search.microsoft.com, support.microsoft.com, activex.microsoft.com, cdm.microsoft.com, FTP.microsoft.com, msid.msn.com, register.msn.com; Staging Servers, DMZ Staging Servers, Internal WWW, FTP Servers, FTP and HTTP Download Servers, Download Replication, SQL Reporting, Live SQL Servers, SQL Consolidators, SQLNet, Feeder LAN, Admin LAN. Network elements: FDDI Rings (MIS1 to MIS4), Switched Ethernet, Primary and Secondary Gigaswitch, Routers. Internet links as labeled: 2 OC3 (45Mb/Sec Each), 2 Ethernet (100 Mb/Sec Each), 13 DS3 (45 Mb/Sec Each).]
Scaleable Systems
Scale UP and Scale OUT
Everyone does both.
Choice is
• Size of a brick
• Clones or partitions
• Size of a pack
Everyone scales out
What’s the Brick?
1M$/slice
• IBM S390?
• Sun E 10,000?
100 K$/slice
• Wintel 8X
10 K$/slice
• Wintel 4x
1 K$/slice
• Wintel 1x
Clones: Availability+Scalability
Some applications are
• Read-mostly
• Low consistency requirements
• Modest storage requirement (less than 1TB)
Examples:
• HTTP web servers (IP sprayer/sieve + replication)
• LDAP servers (replication via gossip)
• App/compute servers or firewalls
Replicate app at all nodes (clones)
Spray requests across nodes.
Grow by adding clones
Fault tolerance: stop sending to dead clone.
Growth: add a clone.
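The clone pattern (spray requests, skip dead clones, grow by adding one) fits in a small round-robin router. This is a hedged sketch, not any product's load balancer: the Sprayer class and its method names are hypothetical, and health checking is reduced to explicit mark_down/mark_up calls.

```python
class Sprayer:
    """Round-robin request router over a set of identical clones."""

    def __init__(self, clones):
        self.clones = list(clones)
        self.up = set(self.clones)
        self._next = 0

    def mark_down(self, clone):
        """Fault tolerance: stop sending to a dead clone."""
        self.up.discard(clone)

    def mark_up(self, clone):
        """Resume sending to a clone that has recovered."""
        if clone in self.clones:
            self.up.add(clone)

    def add_clone(self, clone):
        """Growth: add a clone."""
        self.clones.append(clone)
        self.up.add(clone)

    def route(self):
        """Return the next live clone, skipping any that are down."""
        for _ in range(len(self.clones)):
            clone = self.clones[self._next % len(self.clones)]
            self._next += 1
            if clone in self.up:
                return clone
        raise RuntimeError("no clone is up")
```

Because every clone holds the same application and data, the router needs no knowledge of the request itself; any live clone will do.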
Facilities Clones Need
Automatic replication
• Applications (and system software)
• Data
Automatic request routing
• Spray or sieve
Management:
• Who is up?
• Update management & propagation
• Application monitoring.
Clones are very easy to manage:
• Rule of thumb: 100’s of clones per admin
Partitions for Scalability
Clones are not appropriate for some apps.
• Stateful apps do not replicate well
• High update rates do not replicate well
• Huge DBs (disk too expensive to clone)
Examples
• Email / chat / …
• Databases
Partition state among servers
Scalability (online):
• Partition split/merge
• Partitioning must be transparent to client.
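One way to keep partitioning transparent is to put a partition map between the client and the servers, so that a split changes the map and not the client. The sketch below uses range partitioning over sorted keys; the RangePartitionMap class and its method names are assumptions made for illustration, and moving the data itself is left out.

```python
import bisect


class RangePartitionMap:
    """Maps each key to the server that owns its key range."""

    def __init__(self, first_server):
        self.bounds = []               # sorted split keys
        self.servers = [first_server]  # always len(bounds) + 1 entries

    def server_for(self, key):
        """Clients call this; they never see the partition boundaries."""
        return self.servers[bisect.bisect_right(self.bounds, key)]

    def split(self, split_key, new_server):
        """Keys >= split_key in the affected range now go to new_server."""
        if split_key in self.bounds:
            raise ValueError("partition already split at this key")
        i = bisect.bisect_right(self.bounds, split_key)
        self.bounds.insert(i, split_key)
        self.servers.insert(i + 1, new_server)
```

A merge would be the reverse operation: remove one bound and one server entry. Hash partitioning is an equally valid choice; ranges are used here only because splits are easy to show.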
Partitioned (aka Clustered) Apps
Mail servers
• Perfectly partitionable
Business Object Servers
• Partition by set of objects.
Parallel Databases
Transparent access to partitioned tables
Parallel Query
Packs for Availability
Each partition may fail (independent of others)
Partitions migrate to new node via fail-over
• Fail-over in seconds
Pack: the nodes supporting a partition
• VMS Cluster
• Tandem Process Pair
• SP2 HACMP
• Sysplex™
• WinNT MSCS (wolfpack)
Cluster In A Box now commodity
Partitions grow in packs.
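Fail-over within a pack can be pictured as reassigning a dead node's partitions to the surviving nodes. This sketch is illustrative only and does not model MSCS or any of the systems listed: the Pack class, its methods, and the least-loaded placement rule are all assumptions.

```python
class Pack:
    """A group of nodes that back one another up for a set of partitions."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.alive = set(self.nodes)
        self.owner = {}  # partition -> node currently serving it

    def place(self, partition, node):
        if node not in self.alive:
            raise ValueError(f"{node} is not an available node")
        self.owner[partition] = node

    def _least_loaded(self):
        candidates = [n for n in self.nodes if n in self.alive]
        if not candidates:
            raise RuntimeError("no surviving node in the pack")
        load = {n: 0 for n in candidates}
        for node in self.owner.values():
            if node in load:
                load[node] += 1
        return min(candidates, key=lambda n: load[n])

    def fail(self, node):
        """Migrate every partition owned by the failed node; return the moves."""
        self.alive.discard(node)
        moved = {}
        for partition, owner in list(self.owner.items()):
            if owner == node:
                target = self._least_loaded()
                self.owner[partition] = target
                moved[partition] = target
        return moved
```

In a real pack the surviving node also needs access to the failed node's storage, which is why these systems rely on shared or replicated disks.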
What Parts+Packs Need
Automatic partitioning (in dbms, mail, files,…)
• Location transparent
• Partition split/merge
• Grow without limits (100x10TB)
Simple failover model
• Partition migration is transparent
• MSCS-like model for services
Application-centric request routing
Management:
• Who is up?
• Automatic partition management (split/merge)
• Application monitoring.
Services on Clones & Partitions
Application provides a set of services
If cloned:
• Services are on subset of clones
If partitioned:
• Services run at each partition
System load balancing routes request to
• Any clone
• Correct partition.
• Routes around failures.
Farm pairs: Always Up
Two farms
Changes from one sent to other
When one farm fails other provides service
Masks
• Hardware/Software faults
• Operations tasks (reorganize, upgrade, move)
• Environmental faults (power fail)
Availability for a simple web site
Clones for availability
Packs for availability
[Figure labels: Web Clients; Load Balance; Front End; Web File Store; SQL Temp State; SQL Database.]
Farm Scale Out Scenarios
The FARM: Clones and Packs of Partitions
[Figure labels: Web Clients; Load Balance; Cloned Front Ends (firewall, sprayer, web server); Cloned Packed file servers (Web File Store A, Web File Store B, replication); SQL Temp State; SQL Database; Packed Partitions: Database Transparency (SQL Partition 1, SQL Partition 2, SQL Partition 3).]
Reliable, Scalable, Modular
[Figure labels: Clients; Network Load Balancing (Clones); Component Load Balancing (COM+) (Clones); Cluster Service (Pack); IIS Web Server; Application Servers with COM+ Components; Data Servers (SQL, Exchange, File, or other IP based services); node numbers running up to 32, 8, and 4.]
Talk 2 (if there is time)
Scalability: COM+ progress
serving 1,000-statement ASPs (servlets)
Poor SMP Scaleability on IIS4 NT4
Big improvements from standard Transaction Processing tricks
Shift from 4x200 MHz to 8x450 MHz
Out of Proc (safe execution) now much faster than In Proc was on IIS4
[Figure: bar chart of SPS, servlets per second (ASPs served per second by IIS, 1,000 statement VBscript), 0 to 450. Bars for 1P, 2P, 4P, and 8P machines; NT4 InProc and OOP; W2K B1, RC1, and RC2, InProc and OOP.]
Scaleability:
So, What about the death of NT/Alpha?
Two simultaneous Compaq TPC-C numbers
Intel Profusion
• NT/SQL/COM+
• 550 MHz
• 8 Processors
• 4 GB memory
• 40,368 TPM-C @ 18.46 $/tpmC
• 745 K$ 5-year cost
• Avail: 12/31/99
Alpha
• Unix/Sybase/Tuxedo
• 700 MHz
• 8 Processors
• 16 GB memory
• 42,437 TPM-C @ 55.45 $/tpmC
• 2.35 M$ 5-year cost
• Avail: 10/18/99
200% more expense for 5% more performance?
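The closing question can be checked from the numbers on the slide. The sketch below is plain arithmetic with invented function names; note that the slide's 5-year costs are rounded, so the recomputed $/tpmC for the Alpha system lands near, not exactly on, the published 55.45.

```python
def dollars_per_tpmc(five_year_cost, tpmc):
    """Price/performance: total 5-year cost divided by throughput."""
    return five_year_cost / tpmc


def percent_more(a, b):
    """How much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100


intel_cost, intel_tpmc = 745_000, 40_368
alpha_cost, alpha_tpmc = 2_350_000, 42_437

print(round(dollars_per_tpmc(intel_cost, intel_tpmc), 2))
print(round(dollars_per_tpmc(alpha_cost, alpha_tpmc), 2))
print(round(percent_more(alpha_cost, intel_cost)))   # extra expense, percent
print(round(percent_more(alpha_tpmc, intel_tpmc)))   # extra performance, percent
```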
Outline
Summary of what you heard. (10 min)
The database scene in general. (10 min)
Scaleability: Farms, Clones, Parts & Packs (15 min)
Microsoft DB research focus. (15 min)
• TerraServer (design and ops).
• RAGS.
• Data Mining
Q&A (10 min)
The TerraServer
http://www.terraserver.microsoft.com/
Database & application UI
Concept: User navigates an ‘almost seamless’ image of earth
Coverage: Range from 70ºN to 70ºS
today: 35% U.S., 1% outside U.S.
Source Imagery:
• 4 TB 1 sq meter/pixel Aerial (USGS - 60,000 46Mb B&W - 151Mb Color IR files)
• 1 TB 1.56 meter/pixel Satellite (Spin-2 - 2400 300 Mb B&W)
Display Imagery: 200x200 pixel images, subsample to build image pyramid
Store 5x compressed data
Nav Tools:
• 1.5 m place names
• “Click-on” Coverage map
• Expedia & Virtual Globe
• Pick of the week
[Figure: image pyramid levels: 200x200 m tile; .4 x .4 km browse; .8 x .8 km 8m thumbnail; 1.6 x 1.6 km “city view”.]
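Building an image pyramid by subsampling, then cutting each level into fixed-size tiles, can be sketched as follows. This is a toy illustration on small integer grids, not the TerraServer loader: averaging 2x2 blocks is an assumed subsampling method, and the function names are invented for the example.

```python
def subsample(image):
    """Halve the resolution by averaging each 2x2 block of pixels.

    `image` is a list of equal-length rows with even width and height.
    """
    h, w = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) // 4
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]


def build_pyramid(image, levels):
    """Return [full resolution, 1/2, 1/4, ...] with `levels` entries."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(subsample(pyramid[-1]))
    return pyramid


def cut_tiles(image, tile):
    """Cut an image into tile x tile squares keyed by (column, row)."""
    tiles = {}
    for ty in range(0, len(image), tile):
        for tx in range(0, len(image[0]), tile):
            tiles[(tx // tile, ty // tile)] = [
                row[tx:tx + tile] for row in image[ty:ty + tile]
            ]
    return tiles
```

With fixed-size tiles at every level, each zoom level serves the same number of pixels per request, and a tile can be fetched by its (level, column, row) key.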
Software: Classic 3 Tier Design
[Figure labels: Web Client (HTML browser, Java Viewer); The Internet; Fire wall; TerraServer Web Site (Internet Information Server 4.0, Internet Information Server 5.0, Active Server Pages, MTS, Image Server, Microsoft Site Server EE, Image Delivery Application); ADO; TerraServer DB (SQL Server 7, Terra-Server Stored Procedures); Image Provider Site(s) (SQL Server 7).]
Logical Schema
[Figure: TerraServer schema diagram. Labels: Load Mgmt; Load Job; Job; Scale; SourceMeta; Image; Image Type; Image Search; Pyramid; Imagery; Gazetteer; Country Name; State Name; Place Name; Small Place Name; Place Type; Famous Place; Famous Category; Search; External Geo; External Link; External Group; Terra Database; TerraAdmin; Admin.]
TerraServer File Group Layout
Convert 324 disks to 28 RAID5 sets plus 28 spare drives
Make 4 NT volumes (RAID 50), 595 GB per volume
Build 30 20GB files on each volume
DB is File Group of 120 files
[Figure: the four volumes E:, F:, G:, H:.]
Hardware
[Figure labels: Internet; DS3; 100 Mbps Ethernet Switch; Map Site and SPIN-2 servers; Web Servers; database server.]
2.9 TB Database Server
AlphaServer 8400 8x400.
10 GB RAM
324 StorageWorks disks
10 drive tape library (STC Timber Wolf DLT7000)
Load, Backup & Recovery
Backup
• STK 9710 Tape robot
• SQL Server Backup & Restore +
• Legato Networker
• Fast, incremental, differential, online
• Clocked at 80 MBps (peak) (~ 200 GB/hr)
Restore
• Fast, incremental (file oriented), not online.
Backup and Recovery Performance
Data Bytes Backed Up             1.2 TB
Total Time                       7.25 Hours
Number of Tapes Consumed         27
Total Tape Drives                10
Data ThroughPut                  168 GB/Hour
Average ThroughPut Per Device    16.8 GB/Hour
Average Throughput Per Device    4.97 MB/Sec
NTFS Logical Volumes             2
BAD OLD Load
[Figure: the original load pipeline. Labels: DLT Tape; “tar”; NT; \Drop’N’; \Images; DoJob; LoadMgr (several); DB; Wait 4 Load; Backup; ImgCutter. Hardware: two AlphaServer 4100s; AlphaServer 8400; 100mbit EtherSwitch; 60 4.3 GB Drives; Enterprise Storage Array (ESA) with three sets of 108 9.1 GB Drives; STC DLT Tape Library.]
Load steps:
10: ImgCutter
20: Partition
30: ThumbImg
40: BrowseImg
45: JumpImg
50: TileImg
55: Meta Data
60: Tile Meta
70: Img Meta
80: Update Place
New Image Load and Update
[Figure labels: DLT Tape; “tar”; Image Cutter; Cut & Load Scheduling System (Active Server Pages); Metadata; Load DB; Dither; Merge; Image Pyramid From base; TerraLoader; ODBC Tx; TerraServer SQL DBMS.]
After a Year:
1 TB of data, 750 M records
2.3 billion Hits
2.0 billion DB Queries
1.7 billion Images sent
368 million Page Views
99.93% DB Availability
3rd design now Online
Built and operated by team of 4 people
[Figure: TerraServer Daily Traffic, Jun 22, 1998 thru June 22, 1999. Daily counts (0 to 30M) of Sessions, Hit, Page View, DB Query, Image.]
[Figure: Operations. Total time in hours (0 to 8640) and down time in hours:minutes (0:00 to 6:00); categories Up, Scheduled, HW+Software.]
What TerraServer Shows
Can serve huge databases on Internet
for about a penny a page view
mostly phone bill (!).
Advertising pays more than a penny a page.
Commodity tools do scale fairly far.
A few people (3 developers, 1 operator)
using power tools
can build an impressive web site
Outline
Summary of what you heard. (20 min)
The database scene in general. (10 min)
Scaleability: Packs & Mobs (10 min)
Microsoft DB research focus. (15 min)
• TerraServer (design and ops).
• RAGS.
• Data Mining
Q&A (15 min)
Automatic Testing
60% of Microsoft R&D is testing.
What can research do to help?
• beyond joining the 500,000 Win2K beta testers
Test generation robot:
• Make up SQL queries
• Send them to SQL Server, Oracle, DB2, Informix,…
• If answer is the same, great, if not there is a problem
Also good for stress tests
Found MANY bugs in our products (all fixed).
Found MANY bugs in others’ products.
Very valuable tool.
[Figure: table of test cases comparing row counts and errors returned by four systems (W, X, Y, Z). Annotations: “All four agree 84%”; “W,X, and agree 95%”; “Problem with intermediate table.”]
MSR-TR-98-21 Massive Stochastic Testing of SQL, Slutz, Don
http://research.microsoft.com/scripts/pubDB/pubsasp.asp?RecordID=175
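The test robot's core loop (generate a query, run it on more than one system, compare answers) can be sketched in miniature. This is not RAGS itself: here the two "systems" are Python's built-in sqlite3 module and a hand-written Python filter standing in for a second engine, the queries are restricted to single-column predicates, and all names are invented for the example.

```python
import random
import sqlite3

COLUMNS = ("a", "b", "c")
ROWS = [(i, (i * 7) % 11, (i * 3) % 5) for i in range(50)]
OPS = {
    "<": lambda x, y: x < y,
    "=": lambda x, y: x == y,
    ">": lambda x, y: x > y,
}


def random_predicate(rng):
    """Make up a query: a random (column, operator, constant) filter."""
    return rng.choice(COLUMNS), rng.choice(sorted(OPS)), rng.randrange(0, 12)


def run_sqlite(conn, pred):
    col, op, val = pred  # col and op come from the fixed lists above
    sql = f"SELECT a, b, c FROM t WHERE {col} {op} ? ORDER BY a"
    return conn.execute(sql, (val,)).fetchall()


def run_reference(pred):
    """Second 'engine': evaluate the same filter directly in Python."""
    col, op, val = pred
    idx = COLUMNS.index(col)
    return [row for row in ROWS if OPS[op](row[idx], val)]


def differential_test(trials=200, seed=0):
    """Return the predicates on which the two engines disagree."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", ROWS)
    mismatches = []
    for _ in range(trials):
        pred = random_predicate(rng)
        if run_sqlite(conn, pred) != run_reference(pred):
            mismatches.append(pred)
    conn.close()
    return mismatches
```

An empty result means the engines agreed on every generated query; any entry is a query worth a closer look, which is the "if not there is a problem" step on the slide.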
Some Tera-Byte Databases
The Web: 1 TB of HTML
TerraServer 1 TB of images
Several other 1 TB (file) servers
Hotmail: 20 TB of email
Sloan Digital Sky Survey: 40 TB raw, 2 TB cooked
EOS/DIS (picture of planet each week)
• 15 PB by 2007
Federal Clearing house: images of checks
• 15 PB by 2006 (7 year history)
Nuclear Stockpile Stewardship Program
• 10 Exabytes (???!!)
[Figure: scale bar running Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta.]
Data Mining
Find interesting structure (patterns, relationships) in data
• Prediction
• Segmentation (clustering)
• Dependency modeling (find distribution)
• Summarization
• Trend and change detection and modeling
Allow user to state the query in terms of the business logic
• User does not speak statistics or SQL
Use data to build predictors
• regression, classification, segmentation etc.
Generate summaries and reports for insight
• find “easy to describe” segments in data automatically
• find segments not known to analyst
Data Mining: Microsoft SiteServer Commerce 3.0
Http://www.holtoutlet.com/outlet4
Intelligent Cross-sell
Based on:
• Historical sales baskets in stores
• Contents of current shopper basket
• Browsing behavior of shopper
Predict: ranking of products in store likely to be most interesting to shopper.
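A very simple stand-in for this kind of predictor ranks products by how often they appeared in past baskets together with what the shopper already holds. The sketch below is an assumption-laden illustration, not the Site Server algorithm: it uses raw co-occurrence counts, ignores browsing behavior, and the function names are invented.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(baskets):
    """Count how often each ordered pair of products shared a basket."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def rank_cross_sell(baskets, current):
    """Rank products not in the current basket by co-occurrence with it."""
    counts = cooccurrence(baskets)
    in_basket = set(current)
    scores = Counter()
    for (a, b), n in counts.items():
        if a in in_basket and b not in in_basket:
            scores[b] += n
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ordered]


history = [
    ["tent", "stove"],
    ["tent", "stove", "lamp"],
    ["tent", "lamp"],
    ["tent", "stove"],
    ["boots"],
]
print(rank_cross_sell(history, ["tent"]))
```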
[Figure: chart of % captured of true targets (0% to 100%) against % mailed. Data labels: 100.0%, 98.5%, 94.8%, 68.5%, 56.9%, 43.8%, 34.5%, 25.5%, 6.7%, 5.3%, 1.3%, 0.6%, 0.3%, 0.2%, 0.1%.]
Mail to 25% and capture 40%
400% improved response!
Real data drawn from a Microsoft marketing example
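A point on a chart like this is computed by ranking customers by model score, mailing only the top fraction, and counting how many of the true targets fall inside it. The sketch below shows that calculation on made-up data; the function name and the example scores are assumptions, not the Microsoft marketing data.

```python
def captured_fraction(scores, is_target, fraction_mailed):
    """Share of all true targets reached by mailing the top-scored fraction."""
    ranked = sorted(zip(scores, is_target), key=lambda p: p[0], reverse=True)
    n_mail = round(len(ranked) * fraction_mailed)
    captured = sum(1 for _, target in ranked[:n_mail] if target)
    total = sum(1 for target in is_target if target)
    return captured / total if total else 0.0


# Hypothetical scores and outcomes for eight customers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
targets = [True, True, False, False, True, False, False, False]
print(captured_fraction(scores, targets, 0.25))
```

Mailing at random would capture about the same fraction of targets as the fraction mailed, so the gap between the two is what the model contributes.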
How do people use www.microsoft.com?
100M hits per day
14M users/week
[Figure labels: User browsing data; Data Mining (Clustering) Engine; X segments; Cluster Visualizer; Wizard.]
Outline
Summary of what you heard. (10 min)
The database scene in general. (10 min)
Scaleability: Farms, Clones, Parts & Packs (15 min)
Microsoft DB research focus. (15 min)
• TerraServer (design and ops).
• RAGS.
• Data Mining
Q&A (10 min)