IBM DB2 June 24th, 2002 Kenny Blair IBM Software Group


IBM DB2

June 24th, 2002

Kenny Blair

IBM Software Group blairk2@uk.ibm.com

Agenda

The IBM Software Group

DB2

History

RDBMS Comparison

Performance / Scalability

DB2 Features

Summary

The IBM Software Group

History

A History Lesson

1970's

IBM's E. F. Codd publishes the industry's first paper on relational database technology.

IBM's Don Chamberlin and Ray Boyce publish the paper, "SEQUEL: A Structured English Query Language"

IBM's Jim Gray publishes "Granularity of Locks and Degrees of Consistency in a Shared Data Base"

IBM's Pat Selinger writes "Access Path Selection in a Relational Database Management System."

1980's

1983 Oracle

1987 Forerunner to DB2 on OS/2

1990's

1993 DB2 for AIX

1998 DB2 first on Linux

DB2 Invention & Innovation Leadership

Strong Linkage with IBM Research

Invented the Relational Model & SQL

First RDBMS with Cost Based Optimization

First RDBMS with Object Extensions

First Federated RDBMS

First RDBMS with Java Support

First RDBMS with In-Memory Text Search

First RDBMS with Industry Std. Web Services

First RDBMS with SMP Support

First RDBMS with Query Rewrite

First RDBMS with Integrated OLAP & Mining

First RDBMS to Publish BI Benchmarks

First RDBMS to Publish Linux Benchmarks

First RDBMS Certified for Windows 2000

… and more

Data Management Patents, 1996 - 2000

IBM: 1,141

Oracle: 175

IBM DB2 Universal Database

Enterprise Servers

DB2 UDB for OS/390 and z/OS

DB2 for VM and VSE

DB2 UDB for OS/400

Clients

Personal: Win 95, 98, NT, 2000; Linux; OS/2

Workgroup: Win NT, 2000; Linux; AIX; Solaris; HP-UX; OS/2

DB2 Connect

Enterprise: Win NT, 2000, OS/2; AIX, HP-UX, Solaris; Linux, NUMA-Q

Enterprise - Extended: AIX; Solaris; Win NT, 2000; HP-UX; NUMA-Q; Linux

Everyplace: PalmOS; Win CE; EPOC-32; Linux; Win32

RDBMS Comparison

DB2 and Oracle

The same ...

Relational Concepts

Tables, rows, columns

Tablespaces, containers

Logs

Application programming

ODBC, JDBC, VB, Perl

Same DBA tasks

Manage database availability

Manage database space

Manage database performance

Manage database security

Help developers design/build new applications

DB2 and Oracle

...but different

Optimiser

Cost vs Rule based

Clustering Approach

Shared disk vs Shared nothing

Static and Dynamic SQL

No Redo logs

No versioning

Isolation Levels

Security Models

OCI vs CLI

Query Patroller

Performance and Scalability

Performance Leadership

Cross Platform, Cross Workload

Business Intelligence

#1 TPC-H @ 300GB (Unix)

#1 TPC-H @ 100GB (Linux)

#1 TPC-H @ 1TB (Windows)

#1 TPC-H @ 3TB (Windows)

ISVs

#1 SAP

Sales and Distribution (AIX)

Sales and Distribution (Linux)

Sales and Distribution Parallel (OS/390)

Advanced Planning & Optimization (AIX)

#1 PeopleSoft 8

CRM (AIX), FTP (AIX)

Financial Online (AIX)

GL 8 Batch (AIX)

#1 Baan

ERP 2 Tier (AIX), ERP 3 Tier (AIX)

#1 i2

Web

#1 ECPerf Java Benchmark

TPC-W @ 100,000 Items

[Chart: TPC-H Performance, 128 CPU Clusters; data labels 21,053 and 10,764]

[Chart: TPC-H Price/Performance, 128 CPU Clusters; data labels $291 and $1,250]

Chart legend:

DB2 7.2 (32 x 4-way ProLiant DL760 w/Pentium III Xeon)

Oracle 9i RAC (2 x 64-way Sun Starfire 10000 w/UltraSPARC II)

[Chart: SAP R/3 Sales & Distribution; IBM, Microsoft, Oracle; data label 25,560]

Reality Check: http://www.ibm.com/software/data/highlights

Results as of 02/06/02; TPC, TPC-C & TPC-H are registered trademarks of the Transaction Processing Performance Council

Optimizer Technology

Select Optimization

SET CURRENT QUERY OPTIMIZATION = 1

SELECT * FROM EMPLOYEE

WHERE SALARY>50000

0 = Minimal Optimization

1 = V1 Optimization

5 = V2 Optimization

9 = Maximum Optimization

Query Rewrite Transformations

Addition of implied predicates

General predicate pushdown

Existential subquery to join transformation

Redundant join elimination

Convert quantified predicates to scalar subqueries

Conversion of OR to IN

Conversion of IN to Join

View Merging

DISTINCT Elimination
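
Two of the transformations listed above (OR to IN, and existential subquery to join) can be sanity-checked by confirming that the original and rewritten forms return the same rows. The following is a minimal sketch only: it uses Python's built-in sqlite3 module as a stand-in engine, not DB2, and the `employee` and `project` tables and their contents are hypothetical. It shows that the rewrites are semantically equivalent; it says nothing about how the DB2 optimizer performs them internally.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for an RDBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (empno INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    CREATE TABLE project  (projno INTEGER PRIMARY KEY, empno INTEGER);
    INSERT INTO employee VALUES
        (1, 'Smith', 'A00'), (2, 'Jones', 'B01'),
        (3, 'Adams', 'C01'), (4, 'Miller', 'D11');
    INSERT INTO project VALUES (10, 1), (11, 1), (12, 3);
""")


def rows(sql):
    """Run a query and return its rows in a deterministic order."""
    return sorted(conn.execute(sql).fetchall())


# Conversion of OR to IN: both predicates select the same employees.
or_form = rows("SELECT empno, name FROM employee "
               "WHERE dept = 'A00' OR dept = 'C01'")
in_form = rows("SELECT empno, name FROM employee "
               "WHERE dept IN ('A00', 'C01')")

# Existential subquery to join: DISTINCT keeps the join from returning
# one row per matching project.
exists_form = rows("""
    SELECT e.empno, e.name FROM employee e
    WHERE EXISTS (SELECT 1 FROM project p WHERE p.empno = e.empno)
""")
join_form = rows("""
    SELECT DISTINCT e.empno, e.name
    FROM employee e JOIN project p ON p.empno = e.empno
""")

print(or_form == in_form, exists_form == join_form)
```

The point of a rewrite phase is that the optimizer is free to pick whichever equivalent form is cheaper to execute, regardless of how the SQL was originally written.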

Optimizer Extensions

Removal of limits

RID List Sorting

Index ORing

Increase in execution plan analysis

Avoidance of Cartesian product

Improved join size estimation

Non-uniform distribution statistics

Fetch I/O Statistics

Different weights for random and sequential I/Os

Changes to lock intent hints

Variable CPU and I/O Cost estimation

Updateable Catalog Statistics

Performance Related - Hints

Hints in ORACLE

SELECT /*+ FULL(employee) */ name, address, age, waist_size FROM employee;

SELECT /*+ INDEX(employee emp_idx1) */
location, image, age, alcohol_consumption_rate
FROM employee
WHERE name = 'Kenny Blair';

Location      Image     Age  Alcohol_Consumption_Rate
------------  --------  ---  ------------------------
Cell Block H  Censored  N/A  20 ppd

Clustering

DB2: Shared nothing; each node has a subset of the data

Oracle: Shared disk; requires a distributed lock manager
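
The shared-nothing idea, where each node owns a disjoint subset of the rows, can be sketched with simple hash partitioning. This is an illustration under stated assumptions, not DB2 EEE's actual partitioning scheme: the CRC32 hash, the `node_for` and `partition` helpers, and the sample rows are all hypothetical.

```python
import zlib
from collections import defaultdict


def node_for(key, node_count):
    """Map a partitioning key to a node number (hypothetical CRC32 scheme)."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def partition(rows, key_field, node_count):
    """Assign every row to exactly one node based on its partitioning key."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key_field], node_count)].append(row)
    return dict(nodes)


# Hypothetical sample rows.
employees = [
    {"empno": 100, "name": "Smith"},
    {"empno": 200, "name": "Jones"},
    {"empno": 300, "name": "Adams"},
    {"empno": 400, "name": "Miller"},
    {"empno": 500, "name": "Bennett"},
]

placement = partition(employees, "empno", node_count=2)
for node, owned in sorted(placement.items()):
    print(node, [row["empno"] for row in owned])
```

Because a row's owner is a pure function of its key, a lookup by key can go straight to one node, and no node needs to coordinate locks on data it does not own. That is the contrast with a shared-disk design, where every node can reach every page and a distributed lock manager arbitrates access.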

Scalability Testing at the Teraplex

Double the data, double the hardware resources - then database operations should take about the same amount of time

Phase 1:

500 GB raw data

1 RS/6000 S80, 24 processors, 64 GB memory, 3 Sharks (ESS V1.0, model E20)

DB2 UDB EEE V7.1

Phase 2:

1 TB raw data

2 RS/6000 S80, 48 processors, 128 GB memory, 6 Sharks (ESS V1.0, model E20)

Connected by Gigabit Ethernet

DB2 UDB EEE V7.1

DB2 UDB EEE

Near Linear Scaling

[Diagram: two 24-way nodes, 500GB each]

Tests done on 1 node with 500GB user data

Tests repeated on 2 nodes with 1TB user data total

Overall Scalability > 95%

256 concurrent users

White Paper - http://www-3.ibm.com/software/data/pubs/papers/#eeescale
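
The scale-up criterion used in these tests (double the data, double the hardware, and elapsed time should stay about the same) reduces to a simple ratio. The sketch below is one common way to express it; the `scaleup_efficiency` function and the timings passed to it are hypothetical illustrations, not figures from the Teraplex tests.

```python
def scaleup_efficiency(baseline_time, scaled_time):
    """Ratio of elapsed time on the small configuration to elapsed time on
    the configuration with doubled data and doubled hardware.

    1.0 means perfectly linear scale-up; values below 1.0 mean the larger
    configuration took longer than the baseline.
    """
    if baseline_time <= 0 or scaled_time <= 0:
        raise ValueError("elapsed times must be positive")
    return baseline_time / scaled_time


# Hypothetical timings (any consistent unit): an operation takes 100 units
# on one node with 500GB and 104 units on two nodes with 1TB.
efficiency = scaleup_efficiency(100.0, 104.0)
print(f"{efficiency:.1%}")
```

Under this definition, a claim such as "overall scalability > 95%" means the doubled configuration needed at most about 5% more elapsed time than the baseline.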

[Chart: Database Build (Load, Runstats, Create Index, Build ASTs), 500 GB vs 1 TB; data labels 508, 507, 404, 438, 173, 177, 83, 84]

[Chart: Query Performance (Query), 500GB vs 1 TB]

Parallel Performance and Scalability

Memory Management

Multiple Large Bufferpools

MPP Parallel Support

Cluster

Massively Parallel Processor (MPP)

Enhanced SMP Parallel Support

Symmetric Multiprocessor (SMP)

Uniprocessor

Parallel Transaction

[Diagram: four SQL statements, one per CPU]

Parallel Query

[Diagram: one SQL statement spread across four CPUs]

DB2 UDB Terabyte Club

What Consultants Are Saying...

Meta Group

"The greater than 2TB DW club really belongs to NCR/Teradata and IBM RS/6000 SP with DB2 EEE (though EEE can run on other hardware clustering platforms such as Sun Cluster)"

Philip Dawson, Over the Warehouse Walls, February 2001

"IBM has mounted an assault on high end DW with some success",

Philip Dawson, Over the Warehouse Walls presentation, October 2001

Giga Group

"Competition between IBM and NCR intensifies in the data warehousing area ... selected areas exist in which DB2 is in the lead as of late 2000."

Giga Group, November 2000

"IBM is on track with DB2 UDB Enterprise Extended Edition technically"

Terilyn Palanca, November 2001

DB2 Features

Supporting All Forms of Electronic Data

Universal Data: Large Objects (LOBs), User-Defined Types (UDTs)

Business Rules: Declarative RI, Check Constraints, Defaults, Triggers

Business Logic: User-Defined Functions (UDFs), Stored Procedures

Optimized SQL: Recursive SQL, Common Table Expressions, Outer Join, Table Functions

DB2 Extenders: User-Defined Types (UDTs), Text, Image, Audio, Video, XML, Net Search, Spatial

[Diagram labels: Image (QBIC); Audio; Documents (Contextual Search); Video (Play, Seek)]

DataJoiner

DOS/Windows

AIX

OS/2

HP-UX

Solaris

Macintosh

Single-DBMS

Image

DRDA AR

3270

WWW

Replication

Apply others

Transparency

Global Optimization

DataJoiner

DB2

DB2 for MVS

DB2 for VSE & VM

DB2 for OS/400

DB2 PE

DB2 for OS/2

DB2 for AIX

DB2 for NT

DB2 for HP-UX or DB2 for Solaris

VSAM

IMS

Oracle

Oracle Rdb

Sybase

Microsoft SQL Server

Informix

other Relational or Non-Relational DBMSs

Federated Database Function

"Live", high-performance retrieval of data from multiple DBs

Application sees a single database

R/O support for DB2 and Oracle databases

Federated Database Support/Relational Connect

Subset of functionality available in DataJoiner

Other data sources can be accessed using Table Functions

Broad source selection (e.g. OLE DB Table Functions)

Static access to data using built-in warehousing capability

CREATE NICKNAME O_EMP FOR DB2OS390.J15USER3.EMP

CREATE NICKNAME S_OFFICE FOR ORACLE.J15USER1.OFFICE

SELECT O_EMP.EMPNAME, S_OFFICE.OFFICENO
FROM O_EMP, S_OFFICE
WHERE O_EMP.EMPNO = S_OFFICE.EMPNO

EMPNAME  OFFICENO
-------  --------
Smith    C200
Jones    C202
Adams    C204
Miller   C206
Bennett  C208

A table EMP exists on DB2 for OS/390. Owner is J15USER3.

EMPNO  EMPNAME
-----  -------
100    Smith
200    Jones
300    Adams
400    Miller
500    Bennett

A table OFFICE exists on Oracle. Owner is J15USER1.

EMPNO  OFFICENO
-----  --------
100    C200
200    C202
300    C204
400    C206
500    C208
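
The federated join above can be imitated on a single machine to see what "application sees a single database" means in practice. This is a rough sketch under stated assumptions, not DB2 federation: it uses Python's built-in sqlite3 module, two attached in-memory databases (`db2_src` and `ora_src`, both hypothetical names) stand in for the remote DB2 for OS/390 and Oracle servers, and temporary views play the role of the nicknames. The table contents are the ones shown on the slide.

```python
import sqlite3

# One connection plays the federated server; two attached in-memory
# databases stand in for the remote data sources (hypothetical names).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS db2_src")
conn.execute("ATTACH DATABASE ':memory:' AS ora_src")

conn.execute("CREATE TABLE db2_src.EMP (EMPNO INTEGER PRIMARY KEY, EMPNAME TEXT)")
conn.execute("CREATE TABLE ora_src.OFFICE (EMPNO INTEGER PRIMARY KEY, OFFICENO TEXT)")

conn.executemany(
    "INSERT INTO db2_src.EMP VALUES (?, ?)",
    [(100, "Smith"), (200, "Jones"), (300, "Adams"),
     (400, "Miller"), (500, "Bennett")],
)
conn.executemany(
    "INSERT INTO ora_src.OFFICE VALUES (?, ?)",
    [(100, "C200"), (200, "C202"), (300, "C204"),
     (400, "C206"), (500, "C208")],
)

# Temporary views act as local aliases for the remote tables, loosely
# analogous to CREATE NICKNAME in the slide.
conn.execute("CREATE TEMP VIEW O_EMP AS SELECT EMPNO, EMPNAME FROM db2_src.EMP")
conn.execute("CREATE TEMP VIEW S_OFFICE AS SELECT EMPNO, OFFICENO FROM ora_src.OFFICE")

# The application query names only the aliases, never the sources.
result = conn.execute("""
    SELECT O_EMP.EMPNAME, S_OFFICE.OFFICENO
    FROM O_EMP, S_OFFICE
    WHERE O_EMP.EMPNO = S_OFFICE.EMPNO
    ORDER BY O_EMP.EMPNO
""").fetchall()

for empname, officeno in result:
    print(empname, officeno)
```

The design point carried over from the slide is that the join is written once against local names; where each table physically lives is a matter of configuration rather than application code.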

Managing Data in External Files

DB2 Data Links Manager provides comprehensive control over external data:

Integrity

Access control

Coordinated backup and recovery

Transaction consistency

Applications

File API requests

DB2 File Manager

Medical Table

EKG 1960

Gene 1983

Protein 1995

Genome 1997

Datalinks

Intranet / Internet Support

[Diagram: Web Browser; Java Applications; Java Applets / Servlets; Internet Server; Intranet]

Application Portability for Intranets and Internet

SQLJ Support

JDBC API

Java Client Applications

Connect via Runtime client

Excellent for intranet applications

Java Applets

Excellent for Internet applications

Can be started as an NT Service

Java Support at the Server

Stored Procedures

User-Defined Functions (UDFs)

Perl Interface

DB2 Server

Stored Procedures

UDFs

Stored Procedure Builder

GUI development environment

Integrated with MS Visual Studio and IBM VisualAge for Java

Web Applications

WWW

DB2 Connect

TCP/IP

DB2 Family

Relational Connect

WebSphere Application Server

Enterprise Java Beans

Java Server Pages

Servlets

Net.Data

NSAPI / ISAPI support

FASTCGI

Non-IBM

Non-Relational

Legacy Server(s)

Web Browsers

Data Warehouse Center

Register and access data sources

DB2, Oracle, Microsoft, Sybase, Informix, flat file sources, and others

Model, automate, and monitor processes

Schedules, triggers, dependencies, retries, notifications

Define extraction and transformation steps

Over 100 built-in transformations leveraging SQL

Define data movement and warehouse population

Full refresh and incremental data movement

Create generic schema models

Ability to design logical star schema as source for OLAP Server

Manage and interchange metadata

Standards-based, OMG CWMI

DB2 UDB Support for Business Intelligence Applications

Star Join

CUBE

ROLLUP

Parallel Query

On-Line Analytical Processing

Advanced Cost-Based Optimizer

Improved Indexing for Faster Queries

Query Re-write: GUI-Generated "Ugly" SQL to Optimized SQL

Automatic Summary Tables

Replicated Tables

Index Smart Guide

Bi-directional Index

[Diagram labels: Kingdom, Phylum, Genus; PC; Server]

DB2 OLAP Server Starter Kit

Integrated OLAP capability

Easy to use interface to build and manage OLAP applications

Hyperion analytic engine combined with the power of DB2

Leverage hundreds of existing Essbase applications

Easy to install and use

Components

DB2 UDB OLAP Server

OLAP spreadsheet plug-ins for Excel and Lotus 1-2-3

Integration Server

[Chart: Treatment Type; By Year (1995, 1996, 1997, 1998); By Tissue Type; Protein Class; Disease Life Cycle]

Intelligent Miner

Data-Driven Discovery for Competitive Advantage

Client

7 Techniques:

10 algorithms

(neural networks and non-neural networks)

Associations

Patterns

Clusters

Classifications

Prediction

Time sequences

Step-wise

Polynomial regression

Server

Business

Analyst

Data

Analyst

Applications

Java Admin

GUI

Results GUI

Environment/Result API

Data Mining

Kernel

PreProcessing

Data Access API

DB2

DataJoiner

Flat Files

Informix

Sybase

SQL Server

Oracle

Management

The Control Centre

Windows-like front end

Icons represent other callable tools

Ability to remotely manage other DB2s

Built in replication administration

Ability to drive all DBA tasks

Menu Bar

Tool Bar

Objects Pane

Contents Pane Tool Bar

Contents Pane

Components of Control Centre

Command Centre

Interactive GUI interface for executing OS, DB, SQL commands. Provides result viewing, storage + retrieval of commands, SQL explain + interaction with Script Centre for loading/saving scripts.

Script Centre

Facilitates the creation, modification, storage + execution of OS, DB and SQL scripts. Also provides ability to schedule when scripts should run and execution completion action.

Alert Centre

Provides alerts when "counters" reach or breach a defined threshold. Ability to raise alerts and run scripts/commands or display messages.

Journal

Shows pending, running and job history for a system. Maintains a history of system admin functions and DB2 messages and enables the rescheduling/enablement/disablement of jobs.

License Centre

Shows installed DB2 editions and current license status. Provides connection and statistics details for proper license management!

Stored Procedure Builder

RAD GUI tool for developing DB2 SQL and Java stored procedures. Enables creation, testing, debugging and registration of DB2 Stored Procedures.

Tools Settings

Configure the DB2 GUI tools and some of their settings

Information Centre

DB2 Central Library! Provides the user with quick access to the DB2 online documentation.

Components of Control Centre ctd

Command Line Processor (CLP)

Window-based entry of OS, SQL and system commands, with results display.

Visual Explain

Graphical display of query optimisation and cost information with visual drill down.

Client Configuration Assistant (CCA)

Tool to enable communications information gathering and registration for remote connections, including ability to search network for DB2 Servers.

Performance Monitor

Powerful online monitoring of DB2 Objects and tasks. Provides detailed information on buffer pools, sorts, locks, I/O, CPU activity, connections, SQL activity, etc in graphical or text based format.

Wizards

DB2 "Helpers" that guide new users and DBAs, step by step, through some common DB2 tasks

Add Database

Backup Database

Create Database

Restore Database

Create Table

Create Tablespace

Create Index

Performance Tuning

SQL Assist

You don't wanna' do it like that.....

...you wanna' do it like this!

Working with table objects via the Control Center

Integrated Tools

Administrator

Web Control Center (Navigator) and Utilities

Performance Monitor

Performance Configuration Wizard

Network Configuration Wizard

Parallel Load

Job Scheduler

Governor

Data Reorg

DB2 Server

Data Replication

Capture

Apply

DB2 Connect Gateway

DB2 Server

Developers

Command Center (GUI CLP)

Visual Explain

DB2 Extenders

Wizards

Wizards walk you through a process step by step

They ask goal oriented questions and then do the work for you

Database Performance

Create Database

Create Index

Stored Procedure

Client Configuration

Create Table

Create Tablespace

Backup

Restore

SQL Assist

DB2 Visual Explain

SQL Statements

SELECT

EMPNO

FIRSTNAME

LASTNAME

FROM

DBA.EMP

Access Plan Graph

Statement Node Model View

NWN (11)

Help

ISCAN (10)

PK_S_SUPPKEY

SUPPLIERS

PK_S_SUPPKEY

ISAN (9)

SCAN (8)

SUPPLIERS TEMP (7)

NWN (6)

SCAN (1) FILTER (5)

PARTSUPP GROUP_BY (3)

SCAN (2)

LINEITEM

SCAN (4)

PARTS

Help

CARDINALITY

CATALOG STATISTICS

PREDICATE INFO

I/O COSTS

CPU COSTS

Control Center enabled

Graphical presentation of Explain output

Detailed optimizer information

High Availability Cluster Support

[Diagram: cluster configurations showing fail-over to Node 1: Idle, Active Standby, Mutual Takeover, Others]

Summary

Partner Momentum

Channel Revenue: 44%

Partners: 16,000

Applications: 26,000

[Charts: growth over '94 to '01; earlier data labels 0%, 2,100 and 4,700]

50% YTY Increase in ISV Sales

DB2 Certifications Up 150%

Over 80K DB2 FastPath for DBA Downloads

Meeting the Demand for Skill

Technology

Built-in productivity

Optimization

Productivity tools

Wizards

Control Center

DB Tools...

SMART Databases

Self-managing

Self-tuning

Self-administering

"…We have one database administrator. We would have needed three times that many [DBAs], at least, to run Oracle…"

Customer Quote, BusinessWeek Online, Nov. 2001

Companies need to focus on their core competencies.

People

Certifications up 41%

DBA cross training

Over 70,000 downloads

DB2 Skills Plus Network

DB2 Scholars Programs

>4,000 Universities

Training Institutions

Skill demand is outpacing supply.

"… DB2 efficiencies yield an overall reduction in the work effort of 6% for OLTP systems, 15% for large OLTP systems, 20% for Internet-enabled databases, and 18% for data warehousing."

DB2 vs. Oracle8i: D.H. Brown, Total Cost of Ownership, December, 2000

Partners

16,000+ partners

26,000+ appls.

IBM Service and Support

Includes

One year program service

Multiple upgrade options

IBM development defect support

Other Services

IBM Education

CF281--Fast Path to DB2 UDB for Experienced DBAs

IBM Consulting

Database Migration

IBM Technical Conference

AIX Leadership Conference

DB2 Technical Conference

User Groups - IDUG, SHARE, GUIDE

Database Migrations to DB2 UDB

Migrations at major companies started in 1999

Mantech - Database Software Migration Workbench http://www.mssc-mantech.com/homepage/sqlannounce.asp

Ports: TSQL, Database Objects etc.....

[Chart: migrations to DB2 UDB by source (Non-relational, Informix, Sybase, Microsoft, Oracle), 1996 to 1999]

DB2 UDB Summary

Competitive Advantage

Significant Savings

Technology Leadership

Performance

Scalability

Accessibility

Openness

Bringing it all Together

Leverage Investments

Applications

Data

Best of Breed Solutions

Investment Protection

Service and Support

V8...

DB2 - powering e-business solutions

1995-1996

Optimization

Rich objects

Cross platform

Cluster parallelism

JDBC

Java Stored Procedures

Java UDFs

Multimedia extenders

Digital libraries

SAP, PeopleSoft

SQLJ

Web-based Control Center

Intel parallelism

O/R enhancements

Spatial extender

Data Links

Web appl. servers

OLAP

Text mining

BI partnerships

Commerce partnerships

2000 e-business enablement

XML integration

In-memory database options

Federated data access

Business intelligence

Warehouse Management

Integrated OLAP tools

Query perf. & mgmt.

Spatial Support

Migration enhancements

Solution focused - CRM, SCM, ERP, e-commerce, BI

Integration with IBM's Application Framework for e-business

1997

Web access

Control Center

SMP parallelism

Sysplex parallelism

Data mining

Replication

Baan

+++

1999

Java Stored Procedure Builder

Java-based Control Center

LDAP

OLE DB object access

Web AD and Management

Portability enhancements

Integrated analysis

Pervasive computing

Linux, Sequent

Futures

Growth in memory

Self-managing databases

Historical data modeling

Federated data management

Database / application server morphing

XML comes of age

Questions ...

... or more coffee
