EXADATA: deployment story

advertisement

Сергей Щукин shchukin@devexperts.com

EXADATA: deployment story

Agenda

1.

Introduction

2.

Our Exadata

3.

First Issues

4.

Preparation for the Migration

5.

Migration

6.

Performance Comparison

7.

Issues Afterwards

Сергей Щукин shchukin@devexperts.com

Сергей Щукин shchukin@devexperts.com

Introduction

Early history

• Started as a storage subsystem – former name SAGE

• Originally designed for very large databases

• Fast data access: symbiosis of HW and SW

• Increase the channel or decrease the data flow?

Introduction

Main Exadata’s secret

Data Processing on Storage layer

1.

Less data is transferred over the Network

2.

Less resources used on DB servers

Сергей Щукин shchukin@devexperts.com

Introduction

Versions’s history

• V1

Released in 2008

• HP h/w + Oracle s/w (Database Machine)

• No Flash Cache

• Data Warehouse oriented

• V2

• Announced at OOW 2009

• Partnership between Sun and Oracle

Added 384GB of SSD + s/w changes

More than Data Warehouse

• X2-2

Announced at OOW 2010

Same as V2 except CPU (2x6 Cores, V2 used 4 Cores)

X2-8 used per Unit: 8 CPU x 8Cores + 1TB memory,

OLTP or mixed workload oriented

Сергей Щукин shchukin@devexperts.com

Сергей Щукин shchukin@devexperts.com

Introduction

Existing configurations

V2 X2-2

DB

Server 8x ProLiant DL

360 G5

V1

Storage DB Storage

14xProLiant

DL180 G5

Sun ?

Sun ?

DB

8xSun x4170

M2

Storage

14x Sun x4270 M2

DB

2xSun x4800

X2-8

Storage

14x Sun x4270 M2

CPU 2x 4Core

Xeon E5430

2,66 GHz

RAM

Disks

Info

32GB

2x 4Core

Xeon E5430

4Core ?

8GB

12xSAS

450GB

Added

384GB SSD

Flash

Cache

2x 6Core

Xeon X5670

2,93 GHz

2x 6Core

Xeon L5640

96GB 24GB

12xSAS

600GB

512MB

Controller cache+

384GB SSD

8x 8Core

Xeon

X7560

2.26Ghz

1tb

2x 6Core

Xeon L5640

24GB

12xSAS

600GB

512MB

Controller cache+

384GB SSD

Introduction

Existing configurations

3-D model links http://oracle.com.edgesuite.net/producttours/3d/exadata22/index.html

http://oracle.com.edgesuite.net/producttours/3d/exadata28/index.html

Physical parameters

Connectivity

Сергей Щукин shchukin@devexperts.com

Introduction

Existing configurations

DB server: Sun Fire X4170 M2

Сергей Щукин shchukin@devexperts.com

Introduction:

Existing configurations

Storage server: Sun Fire X4270 M2

Сергей Щукин shchukin@devexperts.com

Introduction

Exadata’s features

• Offloading

Smart Scan

Column projection

Predicate filtering

Storage Indexes

Hybrid Columnar Compression (decompression)

Encryption/Decription

Datafile initialization

RMAN (db block change tracking) offload

• Smart Flash Cache

• Parallel Operations

• Resource management

Сергей Щукин shchukin@devexperts.com

Сергей Щукин shchukin@devexperts.com

Our Exadata

System’s description

• Order processing system, Forex market

• OLTP database + Reporting database

• DB version: 10gR2 EE RAC, OS: OEL 5.5 x86_64

• Tpm - 1000

• DB Size – OLTP(1TB, logs per day: up to 70GB), Reporting (2TB)

• Users online (up to 10 000) + orders issued 1000 per minute

• Streams environment

Сергей Щукин shchukin@devexperts.com

Our Exadata

Shipment details

Date Event

21-APR-2011 Purchase of Exadata is approved by Client

27-MAY-2011 Exadata arrived with problem power supply

07-JUL-2011 Power supply was changed

19-JUL-2011

20-JUL-2011

Primary setup was completed

Testing was started

Days passed

36 days

11 days

12 days

1 day

Сергей Щукин shchukin@devexperts.com

Our Exadata

Primary setup

• We got pre-deploy network setup document from Oracle

• System engineers prepared network according the document (address space, dns, etc)

• Particular ports were dedicated on switches for Exadata

• Exadata arrived and was plugged in

• Checks

Our Exadata

What we got

• Software versions

• 11.2.0.2 BP8 DB servers

• 11.2.2.3.2 Cells

• ASM layout

• Local disks (LVM)

• Precreated database with DBFS

OFA

Сергей Щукин shchukin@devexperts.com

• ASM layout

Our Exadata

What we got

Сергей Щукин shchukin@devexperts.com

First issues

No place for copying backup

DBFS

Local LVM

NFS

Attach external storage

Сергей Щукин shchukin@devexperts.com

Сергей Щукин shchukin@devexperts.com

First issues

Slow NFS

1.

Was: 1 MB/sec

2.

Fixed after putting Exadata and Storage server into the same vlan

First issues

Restrictions in customization

Any changes in SW and HW(except switches) are prohibited

Сергей Щукин shchukin@devexperts.com

Preparation for the Migration

Real Application Testing (RAT)

Сергей Щукин shchukin@devexperts.com

Description

• Capture & Replay

• EE option

• Playback from 11.1.0.6

• 9i & 10g Capture only.

Patch is required

• Cost:11500 $ per processor

+ 2530 support

Restrictions

• Streams are skipped

Direct path load of data from external files

• Flashback queries

• Distributed transactions

Non SQL-based object access

Preparation for the Migration

Real Application Testing (RAT)

Streams workload is not captured

Сергей Щукин shchukin@devexperts.com

Сергей Щукин shchukin@devexperts.com

Source DB

Upgrade

Destination DB

Upgrade to any from release release

What patch you need to apply? Download Information

9.2.0.8.0

>=11.1.0.7.0

9.2.0.8.0 + one-off patch 9373986 One-off patch can be downloaded

+ one-off patch

8712466 on top of 11.1.0.7.0

or requested from MOS

9.2.0.8.0

>=11.2.0.2.0

9.2.0.8.0 + one-off patch 9373986 One-off patch can be downloaded and 11.2.0.2.0 + patch 11870615 from MOS

Comments

8712466 is a merge patch on top of

11.1.0.7.0

One-off patch can be downloaded from MOS

10.2.0.4.0

>=11.1.0.7.0

10.2.0.4.0

10.2.0.5.0

11.1.0.7.0

>=11.2.0.2.0

>=11.2.0.2.0

>=11.2.0.2.0

10.2.0.4.0 patchset+ one-off patch 10239989 + one-off patch

8712466 on top of 11.1.0.7.0

10.2.0.4.0 Patchset + one-off patch 10239989 and 11.2.0.2.0 + patch 11870615

10.2.0.5.0 Patchset + one-off patch 9373986 and

11.2.0.2.0 + one-off patch

11870615

11.1.0.7.0 + one-off patch

8712466 + 9373986 and 11.2.0.2.0 + one-off patch

11870615

Patchset can be downloaded from

MOS

Patchset can be downloaded from

MOS

Functionality already exists in

Patchset, download from Metalink

Functionality already exists in

Patchset, download from MOS

One-off patch can be downloaded or One-off patch can be downloaded or requested from MOS requested from MOS

One-off patch can be downloaded or One-off patch can be downloaded or requested from MOS requested from MOS

MOS 560977.1

Сергей Щукин shchukin@devexperts.com

Preparation for the Migration

Capture (10.2.0.4.4 -> 11.2.0.2)

• Get one-off patch 10239989 compatible with 10.2.0.4.4. Apply it

• Activate capture feature on a source system (wrrenbl.sql)

• Create filter (every time): e xec DBMS_WORKLOAD_CAPTURE.ADD_FILTER ( fname =>

'filter_name', fattribute => 'USER', fvalue => user)

• Start Capture: exec DBMS_WORKLOAD_CAPTURE.START_CAPTURE (name =>

'test_capture', dir => 'rat', duration => 1200, capture_sts

=> TRUE, sts_cap_interval => 300)

• Export AWR : exec dbms_workload_capture.export_awr (capture_id => )

• Useful views : DBA_WORKLOAD_CAPTURES DBA_WORKLOAD_FILTERS

Сергей Щукин shchukin@devexperts.com

Preparation for the Migration

Replay

• Copy files, Process them once exec DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE(capture_dir=>

'RAT')

• Restore consistent db copy, reset time

• Initialize Replay exec DBMS_WORKLOAD_REPLAY.initialize_replay (replay_name =>

'test_replay', replay_dir => 'RAT'); exec DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION( !!!

) exec DBMS_WORKLOAD_REPLAY.prepare_replay (synchronization

=> TRUE);

• Calibrate WRC clients wrc mode=calibrate replaydir=/opt/oracle11g/rat

• Connect WRC clients wrc system/... mode=replay replaydir=/opt/oracle11g/rat

• Start Replay exec DBMS_WORKLOAD_REPLAY.start_replay;

• Analyze diffs in reports dbms_workload_capture.IMPORT_AWR

dbms_workload_repository.awr_diff_report_text

Сергей Щукин shchukin@devexperts.com

Preparation for the Migration

Bug in Playback

• Worked fine for Reporting DB (3 hours)

• Better DB Time (lower: 855.3 vs 138.7)

• Better CPU (faster, less cpu time: 11000sec vs 3600sec)

• Better IO reads (sinlge block read 13ms vs 0.6ms, 6,5ms vs 1ms)

Worse IO write , especially db file parallel write (4,5ms vs 20ms)

Better interconnect

• For Trading DB gave ora-00600 after 1 hour of working

ORA-00600: internal error code, arguments: [1433], [60], [],

[], [], [], [], [], [], [], [], [] CKPT (ospid: 4327): terminating the instance due to error 469

System state dump requested by (instance=1, osid=4327

(CKPT)), summary=[abnormal instance termination].

• Oracle made a patch after 2,5 months

Сергей Щукин shchukin@devexperts.com

Migration

Official methods

• Physical

• Pros

• Simple

Less downtime

• Cons

• Data structure can’t be changed (extent’s size, partitioning, HCC)

• Unnecessary data migrates as well

• Logical

• Pros

• One can change data structure

• Old data can be skipped

• More flexible in terms of changing the platform

• Cons

• More complicated

• Involve more downtime

Migration

Official methods

Physical

Сергей Щукин shchukin@devexperts.com

http://www.oracle.com/technetwork/database/features/ availability/xmigration-11-133466.pdf

Migration

Official methods

Logical

Сергей Щукин shchukin@devexperts.com

Сергей Щукин shchukin@devexperts.com

Migration

Our method (10gR2 -> 11gR2)

• Backup & Restore database

• Manual applying of the archivelogs

• Open in new DB with RESETLOGS and UPGRADE

Сергей Щукин shchukin@devexperts.com

Migration

Difficulties

• Streams

• Test runs with rollback

• 10gR2 standby in Amazon

• Need to run utlu112i.sql on the Prod DB with the following rollback

Сергей Щукин shchukin@devexperts.com

Migration

Changes made by utlu112i.sql

Unsupported

Unsupported set transaction read write;

Unsupported

Unsupported commit;

Unsupported

ALTER TABLE registry$database ADD (tz_version NUMBER); set transaction read write;

Unsupported commit; update "SYS"."OBJ$" set "OBJ#" = '898', "DATAOBJ#" = '898', "TYPE#"

"FLAGS" = '0', "OID$" = NULL, "SPARE1" = '6', "SPARE2" = '2' where " set transaction read write;

ALTER PUBLIC SYNONYM DBA_REGISTRY_DATABASE COMPILE; update "SYS"."SYN$" set "NODE" = NULL, "OWNER" = 'SYS', "NAME" = 'DB delete from "SYS"."DEPENDENCY$" where "D_OBJ#" = '7533' and "D_TIMES update "SYS"."OBJ$" set "OBJ#" = '7533', "DATAOBJ#" = NULL, "TYPE#" and "CTIME" = TO_DATE('09-AUG-11', 'DD-MON-RR') and "MTIME" = TO_DAT insert into "SYS"."DEPENDENCY$"("D_OBJ#","D_TIMESTAMP","ORDER#","P_O commit;

Сергей Щукин shchukin@devexperts.com

Migration

Step-by-step for Rep database

1.

Backup both databases (Trad & Rep)

2.

Restore Rep copy on Exadata (over NFS). Keep it synchronized

3.

Stop applications

4.

Disable synchronization with Standby

5.

Build Streams catalog (to be able to rewind changes applied on new REP copy)

6.

Create guaranteed restore point on Rep database

7.

Execute utlu112i.sql

on Rep db

8.

Apply all logs from Rep db to Exadata’s copy

9.

Open with RESETLOGS and UPGRADE Rep copy on Exadata. Run upgrade scripts

10.

Direct Streams to the new Rep copy on Exadata

11.

Direct application to the new Rep db

Rollback

1.

Flashback Rep db to the guaranteed restore point

2.

Rewind Streams on Trading db and re-apply changes on Reporting db

Go Live

1.

Upgrade Standby to the 11gR2. Synchronize it

Performance comparison

AWR diffs (RAT)

Сергей Щукин shchukin@devexperts.com

Performance comparison

AWR diffs (RAT)

• CPU Time: 11000sec vs 3600sec

• Sinlge block read 13ms vs 0.6ms

• Multiblock read 6,5ms vs 1ms

• DB file parallel write 4,5ms vs 20ms WORSE

• Log file sync 33ms vs 17ms

• CF parallel read 15ms vs 4ms

• CF sequential read 7.4ms

vs 1.5ms

Сергей Щукин shchukin@devexperts.com

Performance comparison

Improved/degraded events

1 week comparison

Сергей Щукин shchukin@devexperts.com

Сергей Щукин shchukin@devexperts.com

Performance comparison

Our own observations

1.

DB time/CPU time slightly better

2.

Amount of User IO increased (buffer cache 115GB->15G), but avg time decreased

3.

Half SQL performs better, another half performs worse

4.

Hard to analyze

1.

DB Caches decreased

2.

HW changed

3.

SW changed (plans)

4.

Exadata features hard to use (Smart Scan requires full scans)

5.

Not all the databases were migrated

Performance comparison

Redo writes to SSD

Exadata Smart Flash Log

Version 11.2.2.4 at the cell level

• Version 11.2.0.3 at the DB level

(when it comes out) or 11.2.0.2 BP11

Enabled by default

Сергей Щукин shchukin@devexperts.com

Сергей Щукин shchukin@devexperts.com

Performance comparison

HCC for historical DB

• Good for Historical data, that doesn’t change

• Requires direct inserts

• Doesn’t require additional license on Exadata

• Decompression on Storage server (Offloading)

• Our example gave 8.7 times compression (compress for archive high)

• Can be estimated by DBMS_COMPRESSION.GET_COMPRESSION_RATIO

Сергей Щукин shchukin@devexperts.com

Issues afterwards

Data corruption on Standby

Tue Nov 01 16:33:41 2011

Errors in file

/opt/orabase/diag/rdbms/reports/nlrcs/trace/nlrcs_mrp0_31928.trc

(incident=44233):

ORA-00600: internal error code, arguments: [3020], [8], [1938230],

[35492662], [], [], [], [], [], [], [], []

ORA-10567: Redo is inconsistent with data block (file# 8, block# 1938230, file offset is 3644022784 bytes)

ORA-10564: tablespace DATA_TS

ORA-01110: data file 8: '+DATA/reports/datafile/data_ts.270.740684989'

ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 113755

• SR with Severity 1 status!

• No root causes within 3 months!!

Issues afterwards

Bug in statistic’s gathering

Сергей Щукин shchukin@devexperts.com

• Bug 12865902 NOWAIT lock requests could hang (like Parallel Queries may hang "enq:

TS - contention") in RAC

Fixed in 11.2.0.2 BP13

Q/A

Сергей Щукин shchukin@devexperts.com

Download