Jim Gray
Senior Researcher
Microsoft Corporation
Basic operations
Reacting to unplanned events
Digital Nervous System
Executing on planned events
Competitive advantage
PCs
Internet
Video conferencing
Electronic commerce
Strengthening Democracy
Unfiltered messages to citizens
Learn about candidates, check judicial records, endorsements
Watch debates and speeches
In-depth investigation of policies and issues of interest
How did my representative vote on this issue?
Informed voters, higher participation
Provide citizen services through a single point of contact
Improve responsiveness and accuracy
Improve productivity
Use technology to improve education
Encourage innovation
50% of households have PCs
Most computer literate society
Largest Internet user
Technically savvy government
Flight Standards Service (AFS)
Reinventing Information Flow
Dick Gordon
AFS
Sets standards
Tests & certifies personnel
Inspects aircraft maintenance and operation
Backup server
HQ server
Re-engineered in 1995 to
reduce clerical tasks, make inspectors more productive
12 regional servers
Architecture
All MS Word docs stored in SQL Server
Templates come from HQ
Inspectors make certificates & reports
SQL Server replication propagates changes
Over 100 district servers
Direct Citizen Contact
Information at Your Fingertips
USGS has vast quantities of data on biology, geology, mapping, and water for
Scientists
Professionals
Citizens
Delivers to citizens via USGS web site
K-12 program on the Web http://www.usgs.gov/education
Over 10 million accesses per month from over 230,000 people
Estimate over 50% of info service is via web
Associate Director for Operations
The U.S. Geological Survey provides the Nation with reliable, impartial information to describe and understand the Earth.
minimize loss of life and property from natural disasters
manage water, biological, energy, and mineral resources
contribute to wise economic and physical development
enhance and protect the quality of life
Build a general-public-oriented browsing and retrieval capability for geospatial data on the Internet.
Increase the public’s access to and awareness of geospatial data.
Tom Barclay
Microsoft Research
Hedy Rossmeissl
Senior Program Advisor, USGS
Scaleup to Big Databases
Build a 1 TB SQL Server database
Data must be
1 TB
Unencumbered
Interesting to everyone everywhere
And not offensive to anyone anywhere
Loaded
1.5 M place names from Encarta World Atlas
3 M Sq Km from USGS (1 meter resolution)
2 M Sq Km from Russian Space agency (1.5 m)
On the web (world’s largest atlas)
Sell images with commerce server.
Microsoft TerraServer Background
Earth is 500 Tera-meters square
USA is 10 tm²
100 tm² of land between 70ºN and 70ºS
We have pictures of 6% of it
3 tsm from USGS
2 tsm from Russian Space Agency
Compress 5:1 (JPEG) to 1.5 TB.
Slice into 10 KB chunks
Store chunks in DB
Navigate with
Encarta™ Atlas
globe
gazetteer
StreetsPlus™ in the USA
Someday
multi-spectral image of everywhere once a day / hour
1.8x1.2 km² tile
10x15 km² thumbnail
20x30 km² browse image
40x60 km² jump image
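The sizing figures on this slide (5:1 JPEG compression to 1.5 TB, 10 KB chunks) imply a chunk count and a raw image volume. The sketch below works through that arithmetic. It assumes decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes) and treats 10 KB as an exact chunk size, neither of which the slide states.

```python
# Sizing sketch; decimal units and an exact 10 KB chunk size are assumptions.
KB = 10**3
TB = 10**12

COMPRESSED_BYTES = 1500 * 10**9   # "1.5 TB" after JPEG compression
COMPRESSION_RATIO = 5             # "Compress 5:1"
CHUNK_BYTES = 10 * KB             # "Slice into 10 KB chunks"


def chunk_count(total_bytes: int, chunk_bytes: int) -> int:
    """Number of fixed-size chunks needed to hold total_bytes (rounded up)."""
    return -(-total_bytes // chunk_bytes)


def raw_bytes(compressed_bytes: int, ratio: int) -> int:
    """Uncompressed size implied by a compressed size and a ratio."""
    return compressed_bytes * ratio


chunks = chunk_count(COMPRESSED_BYTES, CHUNK_BYTES)
raw = raw_bytes(COMPRESSED_BYTES, COMPRESSION_RATIO)

if __name__ == "__main__":
    print(f"chunks stored in the DB: {chunks:,}")
    print(f"implied raw imagery: {raw / TB:.1f} TB")
```

Under these assumptions the database holds on the order of 150 million chunk rows, which is why the imagery is stored as many small rows rather than a few large objects.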
(DOQ)
US Geological Survey
4 Tera Bytes
Most data not yet published
Based on a CRADA
Microsoft TerraServer makes data available.
1x1 meter
4 TB
Continental
US
New Data
Coming
USGS “DOQ”
Russian Space Agency (Sovinformsputnik)
SPIN-2
(Aerial Images is Worldwide Distributor)
1.5 Meter Geo Rectified imagery of (almost) anywhere
Almost equal-area projection
Declassified satellite photos (from 200 km)
More data coming (1 m)
Selling imagery on Internet.
Putting 2 tm² onto Microsoft TerraServer.
SPIN-2
Microsoft
BackOffice
http://www.TerraServer.Microsoft.com/
SPIN-2
navigate by coverage map to White House
Download image
Buy imagery from USGS
Navigate by name to Venice
Buy SPIN2 image & Kodak photo
Pop out to Expedia street map of Venice
Mention that DB will double in next 18 months (2x USGS, 2X SPIN2)
Segue back to Jim Gray
[Figure: TerraServer hardware configuration. Labels: Internet, DS3, 100 Mbps Ethernet switch, map server, site servers, web servers, SPIN-2, STK 9710 DLT tape library, AlphaServer 8400 (8 x 440MHz Alpha cpus, 10 GB DRAM), Enterprise Storage Array with shelves of 48 9 GB drives.]
1TB Database Server
AlphaServer 8400 4x400. 10 GB RAM
324 StorageWorks disks
10 drive tape library
(STC Timber Wolf DLT7000 )
Compaq AlphaServer 8400
8x400Mhz Alpha cpus
10 GB DRAM
324 9.2 GB StorageWorks Disks
3 TB raw, 2.4 TB of RAID5
STK 9710 tape robot (4 TB)
WindowsNT 4 EE, SQL Server 7.0
[Figure: TerraServer software architecture. Labels: web client (HTML, Java viewer, browser); the Internet; TerraServer web site (Internet Information Server 4.0, Active Server Pages, MTS, image server); TerraServer DB (SQL Server 7, Terra-Server stored procedures); Automap server (Internet Info Server 4.0, Microsoft Automap ActiveX Server); image provider site(s) (Internet Information Server 4.0, Microsoft Site Server EE, image delivery application, SQL Server 7).]
Backup and Recovery
STK 9710 Tape robot
Legato NetWorker™
SQL Server 7 Backup & Restore
Clocked at 80 MBps (peak)
(~ 200 GB/hr)
SQL Server Enterprise Mgr
DBA Maintenance
SQL Performance Monitor
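The backup rates quoted above can be checked with a short calculation. This sketch converts the 80 MBps peak into GB per hour and estimates how long a 1 TB backup takes at the quoted ~200 GB/hr. It assumes decimal units and a constant rate, and treats ~200 GB/hr as the sustained figure, which the slide does not say explicitly.

```python
# Backup-window arithmetic; decimal units and a constant rate are assumptions.

def gb_per_hour(mb_per_second: float) -> float:
    """Convert a transfer rate in MB/s to GB/hr (1 GB = 1,000 MB)."""
    return mb_per_second * 3600 / 1000


def hours_to_copy(size_gb: float, rate_gb_per_hour: float) -> float:
    """Time to move size_gb at a steady rate."""
    return size_gb / rate_gb_per_hour


peak_rate = gb_per_hour(80)            # slide: "Clocked at 80 MBps (peak)"
hours_for_1tb = hours_to_copy(1000, 200)  # slide: "~ 200 GB/hr"

if __name__ == "__main__":
    print(f"peak rate: {peak_rate:.0f} GB/hr")
    print(f"1 TB at 200 GB/hr: {hours_for_1tb:.1f} hours")
```

The peak rate works out higher than the ~200 GB/hr figure, which is consistent with the latter being an average over a whole backup run.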
Microsoft TerraServer File Group Layout
Convert 324 disks to 28 RAID5 sets plus 28 spare drives
Make 4 WinNT volumes
(RAID 50)
595 GB per volume
Build 30 20GB files on each volume
DB is File Group of 120 files
E: F: G: H:
Incremental load of 4 more TB in next 18 months
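The layout above can be enumerated to confirm the totals: 4 volumes times 30 files gives the 120-file file group, and 120 files of 20 GB give 2.4 TB. The sketch below does this in Python. The file-name pattern is purely hypothetical, since the slide names only the drive letters.

```python
# File-group layout sketch. Drive letters and sizes come from the slide;
# the file names are hypothetical placeholders.
VOLUMES = ["E:", "F:", "G:", "H:"]
FILES_PER_VOLUME = 30
FILE_SIZE_GB = 20


def file_group_layout(volumes, files_per_volume, prefix="terra"):
    """List one path per database file, spread evenly across the volumes."""
    return [
        f"{volume}\\{prefix}{index:02d}.ndf"
        for volume in volumes
        for index in range(1, files_per_volume + 1)
    ]


layout = file_group_layout(VOLUMES, FILES_PER_VOLUME)
total_gb = len(layout) * FILE_SIZE_GB

if __name__ == "__main__":
    print(f"{len(layout)} files, {total_gb / 1000:.1f} TB in the file group")
    print(layout[0], "...", layout[-1])
```

Thirty 20 GB files come to a nominal 600 GB per volume, slightly above the 595 GB per volume quoted on the slide, so the slide's file size is presumably rounded.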
[Figure: TerraServer image load process. Labels: 100mbit EtherSwitch; DLT tapes read with “tar” and NT Backup; \Drop’N’ and \Drop’N’\Images directories; two AlphaServer 4100s; ESA with 60 4.3 GB drives; load DB; LoadMgr (DoJob, Wait 4 Load); ImgCutter.]
10: ImgCutter
20: Partition
30: ThumbImg
40: BrowseImg
45: JumpImg
50: TileImg
55: Meta Data
60: Tile Meta
70: Img Meta
80: Update Place
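The numbered load steps above read like an ordered pipeline that a load manager walks through for each job. The sketch below shows one way such a step table could drive a resumable job. The step numbers and names are taken from the slide; the `next_step` and `run_job` functions are hypothetical illustrations, not the actual LoadMgr logic.

```python
# Step numbers and names as listed on the slide; the runner is hypothetical.
LOAD_STEPS = {
    10: "ImgCutter",
    20: "Partition",
    30: "ThumbImg",
    40: "BrowseImg",
    45: "JumpImg",
    50: "TileImg",
    55: "Meta Data",
    60: "Tile Meta",
    70: "Img Meta",
    80: "Update Place",
}


def next_step(current=None):
    """Return the step number that follows `current`, or None when finished.

    Passing None returns the first step.
    """
    for step in sorted(LOAD_STEPS):
        if current is None or step > current:
            return step
    return None


def run_job(last_completed=None):
    """Return the names of the steps still to run, in order.

    `last_completed` lets a job resume after a failure without redoing work.
    """
    visited = []
    step = next_step(last_completed)
    while step is not None:
        visited.append(LOAD_STEPS[step])
        step = next_step(step)
    return visited


if __name__ == "__main__":
    print(run_job())       # full load
    print(run_job(55))     # resume after "Meta Data"
```

Gaps in the numbering (10, 20, 30, ...) leave room to insert steps later, as 45 and 55 appear to have been.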
[Figure: database server storage. Labels: STK DLT tape library; Enterprise Storage Array with three sets of 108 9.1 GB drives; AlphaServer 8400.]
1 billion transactions
100 million web hits
• Scale up: to large SMP nodes
• Scale out: to clusters of SMP nodes
4 terabytes of data
1.8 million mail messages
http://access.ncsa.uiuc.edu/CoverStories/SuperCluster/super.html
National Center for Supercomputing Applications
University of Illinois @ Urbana
512 Pentium II cpus, 2,096 disks, SAN
Compaq + HP +Myricom + WindowsNT
A Super Computer for 3M$
Classic Fortran/MPI programming
DCOM programming model
Heterogeneous & legacy systems
Year 2000 problem
Security and Standards
High Availability
Total Cost Of Ownership
Provide software building blocks
For efficient digital nervous systems
For information at your fingertips
For continued industry innovation
Make the PC easier and simpler
Listen to customers
Public and private cooperation
Sean Murphy
Senior Systems Engineer
Microsoft
Keeping You Mission Critical
Applications Running...
Interoperability With Everyone
File & Print Services
Macintosh
NetWare
Unix
Terminal Server
SNA Server
Integrated Security
Security Interop
UNIX + NetWare
LDAP Directory
HTTP, XML
SNMP, POP3, IMAP4..
BackOffice Client
Support
Macintosh (AppleTalk)
NetWare 3.x, 4.x (IPX)
Banyan Vines
OS/2
Unix
Networking
TCP/IP
SNA
IPX/SPX
AppleTalk
Vines
DECnet
LanMan
Year 2000 Compliance
Microsoft customer commitment
www.Microsoft.com/Year2000
Y2K certification
Win95 / 98
NT4 / 5
Office (Word, Excel, Access,…)
BackOffice (SQL, Exchange, SNA,..)
Compliance tests complete by Sept 1998
Downloads available from web site.
Federal Standards Support
Windows NT C2 Certification
FIPS 140-1 Certification for MS Crypto API
Fortezza Support
Browser, Web Server, Etc
X.509 Certificate Server for PKI (Public Key Infrastructure)
SQL Server 7 C2 Certification
Microsoft Exchange DMS
High Assurance Messaging
Total Support From the beginning
Fortezza
MSP4.0 version DMS 2.0b
Medium Assurance Messaging
(COTS)
KMS (Key Management Server)
X.509 v3 certificates, MS Certificate Server integration
S/MIME
Futures
For NT 4.0
Security Configuration Editor
NTLM / PKI Integration
Services For UNIX
For NT 5.0
Kerberos Client Authentication
Encrypting File System
IPSec
Thin Client Windows
Windows GUI and apps on any platform
Windows-to-Windows solution
Lower total cost of ownership
Microsoft
Terminal
Server
Thin Client
SW or device
Windows-based terminal
Net PC
Workstation or Desktop PC
DOS, MAC, UNIX clients
(via Citrix plug-in)
Microsoft Message Queue
Microsoft Transaction Server
Internet Information Server
Windows NT Server
Integrated Messaging & Groupware
Supports all the popular standards
POP3, IMAP4, LDAP, X.400, SNMP, S/MIME, …
Complete Collaboration Environment
Chat and NetMeeting
Web-based Collaboration
Workflow Event Scripting and Routing Wizard
Integration with NT security, Office, and BackOffice.
Lowest cost: Own, Manage, & Admin
Single seat admin, Sites, Proactive monitors, Auto re-routing
Most popular Groupware server
Outsells Lotus Notes
More details later today
The Most Popular SQL System
SQL Server 6.5 integrated with NT
GUI admin and tools
Client/server stored procedures
Integrated security, performance monitor
Replication (publish/distribute/subscribe)
SQL Server 6.5 Enterprise Edition
High Availability: failover.
English Language Query
Large Memory support (3GB)
Windows NT5.0
Reduced Total Cost of Ownership
Plug & Play
Remote operations (MMC and scripting)
Active Directory
Intelli-Mirroring
Hierarchical Storage Management
High Availability Windows NT:
Failover with Microsoft Cluster Server
[Figure: failover with Microsoft Cluster Server. Labels: browser; Server 2; web site and database services; web site files and database files.]
Downtime costs U.S. business $4 billion/year*
On average, one period of downtime costs:
$140,000 in retail industry
$450,000 in securities industry
Mission-critical application support**
1992, 12% of applications ran 24 hours
1996, 28% of applications will run 24 hours
* Strategic Research Division of Find/SVP, 1992
**Standish Group, 1992
Windows NT Server
Protection against application faults
Preemptive multitasking, protected memory
“Instrumented” monitoring, logging, alerting
Protection against loss of data
Journalled, recoverable file system
Disk mirroring, RAID striping with parity
Protection against loss of power
Uninterruptible Power Supply (UPS) support
Protection against loss of a logon server
Replicated Directory & Backup logon servers
Current NT Server High-Availability
& Clustering Solutions
Vendor: Product
Amdahl: EnVista Central Server
Compaq: Recovery Server
Digital: Digital Clusters for NT
Fujitsu/ICL: High-Availability Manager
Hewlett Packard: MC/ServiceGuard
Marathon: MIAL
NCR: LifeKeeper
NSI: Double-Take
Netframe: ClusterServer
Octopus: Octopus A.S.O.
Stratus: RADIO
Tandem: CAS
Veritas: FirstWatch
Vinca: StandBy Server
Microsoft Cluster Server (Windows NT Server Enterprise Edition)
[Figure: Microsoft Cluster Server configuration. Labels: client PCs, printers, Server 1, Server 2, “heartbeat” connection, server storage.]
[Figure sequence: Microsoft Cluster Server failover. Labels: Report; eService instances on Server 1 and Server 2 together, then on Server 2 only, on Server 1 only, and on Server 2 only.]
Compaq
Alpha WKS blah blah
Bingcrosby
Abbott Costello
Raid Array Drives X:,Y:,Z:
Compaq Pentium
Pro Servers blah blah
WinNT + SQLServer Progress in Transaction Processing
100% better per year!
And now 16,257 TPM-C
More coming
Price also improved
[Chart: tpmC, on an axis from 0 to 14,000, for Windows NT Server 3.51, NT 4.0 + SQL 6.5, and WinNT & SQL Enterprise Edition, with a 95th percentile market requirement marker.]
Price Performance
Improved 100%/year for last 3 years
Software is ~10% of costs (30% on UNIX)
UNIX hardware and software has a 6x premium
[Chart: $ cost of each component (processor, disk, software, net, total/10) for Sun Oracle at 52 ktpmC @ 134$/tpmC and HP + NT4 + SQL Server at 16.2 ktpmC @ 33$/tpmC.]
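TPC-C price/performance is the total priced system cost divided by throughput in tpmC, so the two results quoted on this slide imply total system prices. The sketch below recovers them and compares the two $/tpmC figures. Only the throughput and $/tpmC values come from the slide; the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TpccResult:
    """A TPC-C result as quoted on the slide (names here are illustrative)."""
    system: str
    tpmc: int                # throughput, transactions per minute (C)
    dollars_per_tpmc: int    # price/performance

    @property
    def total_cost(self) -> int:
        # price/performance = total cost / tpmC, so invert it
        return self.tpmc * self.dollars_per_tpmc


sun = TpccResult("Sun + Oracle", 52_000, 134)
hp = TpccResult("HP + NT4 + SQL Server", 16_200, 33)

# How many times more each unit of throughput costs on the UNIX system.
price_ratio = sun.dollars_per_tpmc / hp.dollars_per_tpmc

if __name__ == "__main__":
    for result in (sun, hp):
        print(f"{result.system}: ${result.total_cost:,} total")
    print(f"$/tpmC ratio: {price_ratio:.2f}x")
```

The comparison is per unit of throughput; the two systems are not the same size, so the total prices are not directly comparable.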
SQL Server 7: Easy & Powerful
Easy
Scalability
Data Warehousing
Dynamic self management
Multi-site management
Alert/response management
Job scheduling and execution
Scriptable management
profiling/tuning tools
Fully Unicode
English Language Query
Integrated text search engine
(fewer knobs)
Desktop & Workgroups
Auto Configure Engine / Dynamic Disk/memory
Reduce Learning Curve, Increase Productivity
SelfManaging SQLAgent, Wizards, “Task Pads”
Large Servers
Deploy/manage hundreds of SQL Servers
Lower TCO for Large Environments
MultiServer Operations/ “Lights-out” Environment
Built-in GUI
data/schema design
data query & edit
integrated with programming tools
SQL Server Profiler
Selected server events and trace criteria
“Capture” output to screen or replay
SQL Server Expert
Analyzes actual server usage history
Makes recommendations to improve performance
Recommends Index design
Recommends operations procedures
Multi-Site Management
Admin servers from one place
Automate simple stuff
Wizards for common stuff
Manage arrays of servers
operations, security,…
Replication
Import/export
Interface is scriptable
COM object model
Script with Java, VB, ...
Scheduling and Multi-step jobs
Wizards galore (over 50 at last count)
MS Access as a query interface
Built-in data access tools (integrated with tools)
Graphical show plan
Create a Database
Scheduled Backup
Create a Maintenance Plan
Create a Scheduled Job
Create an Alert
Security Wizard
Import Data to SQL Server
Export Data From SQL
Clustering (Wolfpack)
Index Tuning Wizard
Web Assistant
Register Servers
Configure Replication
Create Publication
Create Pull Subscription
Create Push Subscription
Replication Partitioning
Create an Index
Create Stored Procedure
Create a View
More to come...
Transactional and Merge
Remote update
ODBC and OLE DB subscribers
Wizards
Performance
[Figure: replication topology. Labels: Publisher; Distributor; Subscribers, including OS 390 hosts (VSAM, DB2, CICS); Updating Subscriber (immediate updates) using 2PC, RPC.]
SMP & Disk Parallelism
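Plus Distributed
Plus Hash Join
Plus Optimized Partitioned views
[Figure: parallel aggregation. 50,000 rows on 4 disks are scanned in parallel; each local aggregation (# of emp. per group, total inc. per group) yields 50 rows; the global aggregation combines the 4 x 50 rows into a 50-row result.]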
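The local/global aggregation pattern on this slide can be sketched in a few lines: each worker aggregates its own partition into per-group partial results, and a final step merges the partials. This is a minimal illustration in Python, not SQL Server's executor. The row generator, the round-robin partitioning, and the use of threads are all assumptions made for the example.

```python
import random
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def local_aggregate(rows):
    """Per-partition pass: (employee count, total income) for each group."""
    acc = defaultdict(lambda: [0, 0])
    for group, income in rows:
        acc[group][0] += 1
        acc[group][1] += income
    return {group: tuple(values) for group, values in acc.items()}


def global_aggregate(partials):
    """Merge the per-partition results by adding counts and totals."""
    acc = defaultdict(lambda: [0, 0])
    for partial in partials:
        for group, (count, total) in partial.items():
            acc[group][0] += count
            acc[group][1] += total
    return {group: tuple(values) for group, values in acc.items()}


def parallel_group_by(rows, workers=4):
    """Split rows into `workers` partitions, aggregate locally, then merge."""
    partitions = [rows[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(local_aggregate, partitions))
    return global_aggregate(partials)


def make_rows(n=50_000, groups=50, seed=0):
    """Synthetic (group, income) rows; sizes mirror the slide's figures."""
    rng = random.Random(seed)
    return [(rng.randrange(groups), rng.randrange(20_000, 100_000))
            for _ in range(n)]


if __name__ == "__main__":
    result = parallel_group_by(make_rows())
    print(len(result), "groups;", sum(c for c, _ in result.values()), "rows")
```

Because counts and sums combine by addition, the merge step only ever sees about 4 x 50 partial rows rather than 50,000 input rows, which is what makes the plan scale across processors and disks.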
Distributed Heterogeneous Queries
Data Fusion / Integration
Join spread sheets, databases, directories,
Text DBs etc.
Any source that exposes OLE DB interfaces
SQL Server as gateway, even on the desktop
SQL 7.0
Query
Processor
Directory
Service
Database
(DB2, VSAM, Oracle, …)
Spreadsheet
Photos
Maps
Documents and the Web
Scalability
Easy
Scalability
Data Warehousing
Win95 Win98 version
Dynamic row-level locking
Improved query optimizer
Intra-query parallelism
64-bit support
Replication
Distributed query
High Availability Clusters
Full function (same as NTW)
Self managing
Many tools
Integration with next MS Access
Great for embedded apps
Seagate 47GB drive @ 3k$
100 GB penny per MB drive coming in 2000
10 $/GB = 10 k$/ Terabyte! (in y2k)
Everyone can afford one
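The price points on this slide are easier to compare once they are all expressed in dollars per gigabyte. The following sketch does the conversion, assuming decimal units (1 GB = 1,000 MB, 1 TB = 1,000 GB); the variable names are illustrative.

```python
# Storage price arithmetic; decimal units are an assumption.
MB_PER_GB = 1000
GB_PER_TB = 1000


def dollars_per_gb(price_dollars: float, capacity_gb: float) -> float:
    """Unit price of a drive in $/GB."""
    return price_dollars / capacity_gb


# Slide: "Seagate 47GB drive @ 3k$"
seagate_per_gb = dollars_per_gb(3000, 47)

# Slide: "100 GB penny per MB drive" -> price in dollars at 1 cent per MB
penny_drive_price = 100 * MB_PER_GB * 1 // 100
penny_per_gb = dollars_per_gb(penny_drive_price, 100)

# Slide: "10 $/GB = 10 k$/ Terabyte"
dollars_per_tb = penny_per_gb * GB_PER_TB

if __name__ == "__main__":
    print(f"47 GB drive: ${seagate_per_gb:.2f}/GB")
    print(f"penny-per-MB drive: ${penny_drive_price:,} each, ${penny_per_gb:.0f}/GB")
    print(f"terabyte at that price: ${dollars_per_tb:,.0f}")
```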
What’s a terror bite?
If you sell ten billion items a year (e.g., Wal-Mart)
And you record 100 bytes on each one
Then you get a Terabyte
Where will the terabytes come from?
Multimedia (like the TerraServer) and...
Multi Media:
Very Large Data Bases
Photo is 100 KB, not 100 B
So, photo DBs are 1,000x larger
Examples:
Scanned documents
Photo records of products/people/places
Surveillance
Scientific monitoring
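The terabyte arithmetic on these two slides (ten billion 100-byte records, and photos that are 100 KB instead of 100 B) can be written out directly. A small sketch, assuming decimal units; the constant names are illustrative.

```python
# Data-volume arithmetic from the slides; decimal units are an assumption.
TB = 10**12

ITEMS_PER_YEAR = 10_000_000_000   # "ten billion items a year"
RECORD_BYTES = 100                # "100 bytes on each one"
PHOTO_BYTES = 100_000             # "Photo is 100 KB, not 100 B"


def dataset_bytes(records: int, bytes_per_record: int) -> int:
    """Total size of a data set with fixed-size records."""
    return records * bytes_per_record


record_db = dataset_bytes(ITEMS_PER_YEAR, RECORD_BYTES)
photo_db = dataset_bytes(ITEMS_PER_YEAR, PHOTO_BYTES)
growth = photo_db // record_db

if __name__ == "__main__":
    print(f"record DB: {record_db / TB:.0f} TB")
    print(f"photo DB:  {photo_db / TB:,.0f} TB ({growth:,}x larger)")
```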
Scale: Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta
Microsoft TerraServer
Sloan Digital Sky Survey: 40 TB raw, 2 TB cooked
EOS/DIS (picture of planet each week): 15 PB by 2007
Federal Reserve Clearing house: images of checks, 15 PB by 2006 (7 year history)
Nuclear Stockpile Stewardship Program: 10 Exabytes (???!!)
Easy
Scalability
Data Warehousing
Warehousing Framework
Visual data modeler
Microsoft repository
Data transformation services
(DTS)
Plato & Dcube - Multi-Dimensional Data Cubes
English query 2.0
Built-in text-index engine
Debates between MOLAP and ROLAP vendors obscure customer needs
Plato is the product that best supports MOLAP, ROLAP and Hybrid and offers the most seamless integration of all three
Users & apps only see cubes
[Figure: Plato architecture. Labels: data load, data access, persistent store, MD caches, user views.]
[Figure: cube aggregations and partitions. Labels: source table; aggregations by Year, by Make, by Make & Year, by Color & Year, and sum by Color (red, white, blue); Partition 1 (Europe), Partition 2 (USA), Partition 3 (Asia); designer, client apps, User 1 and User 2; “Plato” server; MD SQL, ROLAP, SQL.]
Plato Data Explosion Wizard
Aggregation Wizard finds the “80-20” rule in the data
The 20 percent of all possible pre-aggregations that provide 80 percent of the performance gain
Analyzes level counts for each dimension and parent-child ratios for each level
Independent of OLAP data model
Windows 9X
Microsoft Office
Visual Studio
Windows NT and
Microsoft BackOffice
Support and
Services
Microsoft’s Enterprise Focus
Internet
Simplicity
Scalability
Interoperability
Reliability
Microsoft is investing to make
Windows NT™
BackOffice™
Enterprise Ready
Key Enterprise Features
Powerful: do the job
Scalable: can handle the biggest jobs
Simple: Easy to build and operate
Economic: Lowest total cost of ownership
Inter-operate with other systems