Clustering Technology For Scaleability Jim Gray Microsoft Research

Clustering Technology For Scaleability

Jim Gray

Microsoft Research http://www.research.Microsoft.com/~Gray

The Answer:

BOTH SMP and Cluster?

[Figure labels: SMP; Super Server; Departmental Server; Personal System]

Grow Up with SMP: 4xP6 is now standard

Grow Out with Cluster: cluster has inexpensive parts

[Figure label: Cluster of PCs]

Clusters being built

■ Teradata: 500 nodes (50k$/slice)

■ Tandem, VMScluster: 150 nodes (100k$/slice)

■ Intel: 9,000 nodes @ 55M$ (6k$/slice)

■ IBM: 512 nodes @ 100M$ (200k$/slice)

■ PC clusters (bare-handed) at dozens of nodes: web servers (MSN, PointCast, …), DB servers

■ KEY TECHNOLOGY HERE IS THE APPS.

– Apps distribute data

– Apps distribute execution
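
One way to picture "apps distribute data" and "apps distribute execution" is hash partitioning followed by a per-node computation whose partial results are merged. The sketch below is only an illustration under stated assumptions: the node names, the `owner` hash rule, and the helper functions are hypothetical and are not taken from any of the systems listed above.

```python
import zlib
from collections import defaultdict

NODES = ["node0", "node1", "node2", "node3"]  # hypothetical node names


def owner(key: str) -> str:
    """Pick the node that stores `key`, using a stable hash of the key."""
    return NODES[zlib.crc32(key.encode("utf-8")) % len(NODES)]


def distribute_data(records):
    """Partition (key, value) records across the nodes."""
    partitions = defaultdict(dict)
    for key, value in records:
        partitions[owner(key)][key] = value
    return partitions


def distribute_execution(partitions, func):
    """Run `func` against each node's local partition, then merge the partial results."""
    partials = {node: func(data) for node, data in partitions.items()}
    return sum(partials.values()), partials


records = [(f"customer{i}", i * 10) for i in range(100)]
partitions = distribute_data(records)
total, partials = distribute_execution(partitions, lambda data: sum(data.values()))
print(total, {node: len(data) for node, data in partitions.items()})
```

Here the loop over partitions runs in one process; in a real cluster each call to `func` would run on the node that owns the partition.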

So, What’s New?

■ When slices cost 50k$, you buy 10 or 20.

■ When slices cost 5k$, you buy 100 or 200.

■ Manageability, programmability, and usability become key issues (total cost of ownership).

■ PCs are MUCH easier to use and program.
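
The arithmetic behind these bullets can be checked in a few lines. The sketch below uses only the prices and node counts quoted on these slides; the function names are hypothetical. It shows that the hardware budget stays roughly constant while the node count grows tenfold, which is why management cost per node becomes the issue.

```python
def budget_k_dollars(slice_cost_k, node_counts):
    """Total hardware spend in k$ for each candidate node count."""
    return [slice_cost_k * n for n in node_counts]


def per_slice_k_dollars(total_m_dollars, nodes):
    """Price per slice in k$, given a total price in M$ and a node count."""
    return round(total_m_dollars * 1000 / nodes, 1)


expensive = budget_k_dollars(50, (10, 20))    # 50k$ slices, 10 or 20 of them
commodity = budget_k_dollars(5, (100, 200))   # 5k$ slices, 100 or 200 of them
print(expensive, commodity)

# Per-slice prices for the Intel and IBM clusters listed earlier.
print(per_slice_k_dollars(55, 9000), per_slice_k_dollars(100, 512))
```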

So, What’s New?

■ PCs create a virtuous cycle

Vicious Cycle: No Customers!

[Figure: repeating labels "New MPP & New OS" and "New App"]

Virtuous Cycle:

Standards allow progress and investment protection

[Figure labels: Standard OS & Hardware; Apps; Customers]

What is Wolfpack?

■ A consortium of 60 HW & SW vendors (everybody who is anybody)

■ A set of APIs for clustering and fault tolerance

■ An enhancement to NT™ Server (in beta test)

■ Key concepts

– System: a particular node

– Cluster: a collection of systems working together

– Resource: a hardware or software module

– Resource dependency: one resource needs another

– Resource group: fails over as a unit; dependencies do not cross group boundaries
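
The key concepts above can be modeled as a small data structure. This is a minimal sketch, not the Wolfpack API: the `ResourceGroup` class, its methods, and the resource names are all hypothetical. It enforces the rule that dependencies stay inside a group, derives an order in which resources could be brought online, and moves the whole group as a unit.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class ResourceGroup:
    """Hypothetical model of a resource group hosted on one system (node)."""

    def __init__(self, name, node):
        self.name = name
        self.node = node
        self.depends_on = {}  # resource -> set of resources it needs

    def add_resource(self, resource, needs=()):
        self.depends_on[resource] = set(needs)

    def validate(self):
        """Reject any dependency on a resource that is not in this group."""
        members = set(self.depends_on)
        for resource, needs in self.depends_on.items():
            outside = needs - members
            if outside:
                raise ValueError(f"{resource} depends on {sorted(outside)} outside {self.name}")

    def online_order(self):
        """Order in which resources can come online: dependencies first."""
        self.validate()
        return list(TopologicalSorter(self.depends_on).static_order())

    def fail_over(self, target_node):
        """The group moves as a unit, so only the group's node changes."""
        self.node = target_node
        return self.online_order()


group = ResourceGroup("sql_group", node="system_a")
group.add_resource("disk")
group.add_resource("ip_address")
group.add_resource("network_name", needs=["ip_address"])
group.add_resource("database", needs=["disk", "network_name"])
order = group.fail_over("system_b")
print(group.node, order)
```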

What is Wolfpack?

[Figure: Wolfpack architecture. Labels: Cluster Management Tools; Cluster API DLL; RPC; Cluster Service; Database Manager; Global Update Manager; Event Processor; Node Manager; Failover Mgr; Communication Manager; Resource Monitors; Resource Management Interface; Physical Resource DLL; Logical Resource DLL; App Resource DLL; Non Aware App; Cluster Aware App; Other Nodes. Resource calls: Open, Online, IsAlive, LooksAlive, Offline, Close]
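
The six resource calls named in the figure (Open, Online, LooksAlive, IsAlive, Offline, Close) can be sketched as an interface that a resource monitor polls. The code below is a Python analogue for illustration only: the class names, method signatures, and the `poll` helper are hypothetical and do not reproduce the actual resource DLL entry points. It assumes LooksAlive is a cheap check and IsAlive a more thorough one.

```python
from abc import ABC, abstractmethod


class Resource(ABC):
    """Hypothetical analogue of the six entry points a resource exposes."""

    @abstractmethod
    def open(self): ...

    @abstractmethod
    def online(self): ...

    @abstractmethod
    def looks_alive(self) -> bool: ...  # cheap, frequent check

    @abstractmethod
    def is_alive(self) -> bool: ...     # thorough, less frequent check

    @abstractmethod
    def offline(self): ...

    @abstractmethod
    def close(self): ...


class DemoResource(Resource):
    """Toy resource whose health is just a flag, for demonstration."""

    def __init__(self):
        self.state = "closed"
        self.healthy = True

    def open(self):
        self.state = "offline"

    def online(self):
        self.state = "online"

    def looks_alive(self):
        return self.state == "online"

    def is_alive(self):
        return self.state == "online" and self.healthy

    def offline(self):
        self.state = "offline"

    def close(self):
        self.state = "closed"


def poll(resource, thorough):
    """One monitor poll: confirm with is_alive if asked to, or if the cheap check fails."""
    if thorough or not resource.looks_alive():
        return resource.is_alive()
    return True


res = DemoResource()
res.open()
res.online()
res.healthy = False  # a fault the cheap check cannot see
print(poll(res, thorough=False), poll(res, thorough=True))
```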

Cluster Advantages

■ Clients and servers made from the same stuff

– Inexpensive: built with commodity components

■ Fault tolerance

– Spare modules mask failures

■ Modular growth

– Grow by adding small modules

■ Parallel data search

– Use multiple processors and disks
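
Parallel data search can be illustrated by scanning several partitions at once and concatenating the hits. In this sketch, threads stand in for the cluster's processors and in-memory lists stand in for disks; the function names and data are hypothetical, and in CPython threads will not speed up a CPU-bound scan, so treat it as a picture of the structure rather than a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, predicate):
    """Scan one partition (one 'disk') and return the matching rows."""
    return [row for row in partition if predicate(row)]


def parallel_search(partitions, predicate):
    """Scan all partitions concurrently, then merge hits in partition order."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        per_partition = list(pool.map(lambda p: search_partition(p, predicate), partitions))
    return [row for hits in per_partition for row in hits]


# Four partitions of 250 rows each, as if spread over four nodes.
partitions = [list(range(start, start + 250)) for start in range(0, 1000, 250)]
hits = parallel_search(partitions, lambda row: row % 250 == 0)
print(hits)
```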

Single System Image: Is It Important?

■ Yes, if you don’t have it you fail

– parallel MPPs vs Tandem, Teradata, VAXcluster.

■ NUMA & Cluster:

– Some things are farther away.

– Must program in parallel to utilize multiple CPUs, disks, wires

■ OS, DBMS, TP monitor, Web Server, ORB give transparency: load balance data and programs.

■ Administrator, Programmer, User

– do not want to know about program & data location

What Happens When a Component Fails?

■ Redundant disk or path: configure around it.

■ Non-redundant software: restart.

■ Non-redundant hardware: migrate software to surviving nodes.

■ Fault detection: 1 ms to 10 sec.

■ Failover: 0.1 sec to 1 min.

■ This is standard in Tandem, Teradata, VMScluster.
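
The detect-then-migrate sequence above can be sketched as a heartbeat timeout followed by moving work to surviving nodes. This is a simulation under stated assumptions, not how Tandem, Teradata, or VMScluster implement it: the `Cluster` class, the 1-second timeout, the node and group names, and the least-loaded placement rule are all hypothetical, and time is passed in explicitly instead of being read from a clock.

```python
class Cluster:
    """Hypothetical heartbeat-based failure detector with simple failover."""

    def __init__(self, nodes, timeout):
        self.timeout = timeout                         # seconds of silence tolerated
        self.last_heartbeat = {n: 0.0 for n in nodes}  # node -> time last heard from
        self.groups = {}                               # software group -> hosting node

    def heartbeat(self, node, now):
        self.last_heartbeat[node] = now

    def failed_nodes(self, now):
        """Fault detection: nodes silent for longer than the timeout."""
        return [n for n, t in self.last_heartbeat.items() if now - t > self.timeout]

    def fail_over(self, now):
        """Migrate groups from failed nodes to the least-loaded surviving node."""
        failed = set(self.failed_nodes(now))
        survivors = [n for n in self.last_heartbeat if n not in failed]
        moved = {}
        if not survivors:
            return moved
        for group, node in sorted(self.groups.items()):
            if node in failed:
                target = min(
                    survivors,
                    key=lambda n: sum(1 for host in self.groups.values() if host == n),
                )
                self.groups[group] = target
                moved[group] = target
        return moved


cluster = Cluster(["a", "b", "c"], timeout=1.0)
cluster.groups = {"sql": "a", "web": "a", "mail": "b"}
cluster.heartbeat("b", now=1.0)
cluster.heartbeat("c", now=1.0)   # node "a" stays silent
moved = cluster.fail_over(now=1.5)
print(cluster.failed_nodes(now=1.5), moved)
```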

What are Support Costs?

■ Cluster lowers support costs by

– masking failures (instant repair via spare modules)

– allowing online maintenance and upgrades.

■ Commodity parts are much cheaper

– 10$/MIPS vs 10,000$/MIPS

– 1k$/OS vs 30k$/month/OS

■ Modern OSs are easier to install, configure, manage

– GUI

– Self-tuning

– Online and task-based help

– Built-in wizards