Oracle RAC From Dream To Production

In RAC We Trust
ORACLE - From Dream To Production
BGOUG – Gabrovo 22.04.2005
„Let someone know”
Plamen Zyumbyulev
Presentation Goals
 Describe the major steps in RAC implementation
 Show and explain the main problems and obstacles
 Show and explain solutions and workarounds
 Give some practical ideas and variations
 Try to look at the problems from different angles
Non-Goals
 Explain RAC concepts and fundamentals
 Show ALL aspects of RAC
Agenda
 Introduction
 RAC Installation
 High Availability Configuration
 Testing and Tuning RAC
 Implementing RAC in production
Introduction
RAC Installation
 Problems
 Solutions
Problems
 Need to evaluate and purchase cluster hardware
 Need of knowledgeable OS, High Availability,
Network and Storage professionals
 All this takes time
Solutions
 RAC on Single Node
 RAC on Single VMware Node
 RAC on Multiple VMware Nodes
 RAC and Network Block Device
 Other Solutions
All of the solutions presented here are for testing purposes
ONLY. These configurations are not certified or supported
by Oracle Support Services
RAC on Single Node
 Why not?
 Metalink Note:241114.1 - Step-By-Step Installation of
RAC on Linux - Single Node (Oracle9i 9.2.0 with OCFS)
 Key Points
 No need for “fencing” configuration
 No need for a clustered file system or raw devices
 No need for multiple Oracle homes (ORACLE_HOME)
RAC on Single Node (cont’d)
 Key Points (cont’d)
 One oracle user with two or more profiles, one for every instance (e.g. .rac1, .rac2, …)
zyumix:/# su - oracle
oracle@zyumix:~$ . rac1
oracle@zyumix:~$ echo $ORACLE_SID
rac1
zyumix:/# su - oracle
oracle@zyumix:~$ . rac2
oracle@zyumix:~$ echo $ORACLE_SID
rac2
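Such a per-instance profile can be a tiny shell fragment. A minimal sketch follows; the file location, ORACLE_HOME path and variable list are illustrative assumptions, not the presenter's actual files:

```shell
# Hypothetical contents of a per-instance profile file (e.g. rac1).
# Sourcing it switches the current session to the rac1 instance.
cat > /tmp/rac1.profile <<'EOF'
export ORACLE_SID=rac1
export ORACLE_HOME=/u01/app/oracle/product/9.2.0   # assumed path
export PATH="$ORACLE_HOME/bin:$PATH"
EOF

# Source it, as the transcript above does with ". rac1":
. /tmp/rac1.profile
echo "$ORACLE_SID"    # prints: rac1
```

A second file with ORACLE_SID=rac2 gives the second instance its own environment under the same oracle user.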
RAC on Single Node (cont’d)
 Key Points (cont’d)
 Oracle Universal Installer needs Clusterware software in order to install the RAC option.
 Disadvantages
RAC on Single Node (cont’d)
[Diagram: client-side load balancing in front of Listeners ‘A’ and ‘B’; server-side load balancing between Server A / Instance ‘A’ and Server B / Instance ‘B’, both serving one shared Database]
RAC on Single VMware Node
 Even easier !!!
 The Oracle-on-Linux VMware Cookbook http://www.oracle.com/technology/tech/linux/vmware/cookbook/index.html
An easy, hands-on, step-by-step guide describing how to install VMware, Linux (RHEL/SLES) and Oracle RAC (again on a single node)
VMware Workstation (90-day free eval; registration required)
RHEL3 (not free), SLES8 (not free)
 Disadvantages
RAC on Multiple VMware Nodes
 VMware GSX/ESX Server permits the sharing of plain disks with multiple virtual machines running on the same host, provided the disk in question is a SCSI disk.
 This approach is powerful but complex: you can create elaborate environments – multiple NICs, switches, disks etc.
 Now there are multiple nodes, although virtual ones
 Disadvantages
RAC and Network Block Device
 This solution allows you to build a scalable and highly available database system using only common Intel PCs connected to an Ethernet network.
 In this solution, a standard shared disk subsystem is replaced by a native Linux technology – Network Block Device (NBD) – that maps remote files to local block devices (e.g. /dev/nb0) over a TCP/IP network. One computer (not necessarily a Linux machine) serves as data storage for all cluster nodes (Linux machines) instead of an expensive disk array.
RAC and Network Block Device
(cont’d)
 With NBD support compiled into the kernel, Linux can use a remote server as one of its block devices. Every time the client computer wants to read /dev/nb0, it sends a request to the server via TCP, which replies with the data requested.
 The remote resource doesn’t need to be a whole disk or even a partition. It can be a file.
RAC and Network Block Device
(cont’d)
[Diagrams: a typical shared-storage configuration vs. a simple NBD configuration]
RAC and Network Block Device
(cont’d)
 Installation
Both client and server machines run RHEL3
 Download the source from http://nbd.sourceforge.net/
 As root do:
bunzip2 nbd-2.7.3.tar.bz2
tar -xvf nbd-2.7.3.tar
cd nbd-2.7.3
./configure
make
make install
RAC and Network Block Device
(cont’d)
 Creating a new empty file at the NBD server
[root@rac2 root]# dd if=/dev/zero of=/u01/oradata/rac/system.01 count=300 bs=1M
300+0 records in
300+0 records out
314572800 bytes transferred in 1.683993 seconds (186801738 bytes/sec)
[root@rac2 root]#
 Running the NBD server
Syntax: nbd-server <port> <filename>
[root@rac2 root]# nbd-server 4101 /u01/oradata/rac/system.01
[root@rac2 root]#
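The dd-and-serve pattern extends naturally to several datafiles, one NBD port each. A minimal sketch, with directory, file names, sizes and ports as assumptions; nbd-server is only invoked when the binary is actually installed:

```shell
# Create empty datafiles and export each one on its own NBD port.
DATADIR=${DATADIR:-./nbd-demo}    # e.g. /u01/oradata/rac on a real server
SIZE_MB=${SIZE_MB:-1}             # e.g. 300 for the 300 MB file above
port=${BASE_PORT:-4101}

mkdir -p "$DATADIR"
for name in system.01 users.01; do
    # same dd invocation as on the slide, just parameterized
    dd if=/dev/zero of="$DATADIR/$name" bs=1M count="$SIZE_MB" 2>/dev/null
    # one nbd-server process per file, each listening on its own port
    if command -v nbd-server >/dev/null 2>&1; then
        nbd-server "$port" "$DATADIR/$name"
    fi
    port=$((port + 1))
done
ls -l "$DATADIR"
```

Each exported file then appears on the clients as a separate /dev/nb<n> device.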
RAC and Network Block Device
(cont’d)
 NBD client
The NBD client must be run as root (because of the kernel parts of NBD). Before starting the NBD client you have to install the Linux kernel NBD module.
 Installing the nbd module – RHEL3
[root@rac3 root]# rpm -Uvh kernel-unsupported-2.4.21-4.EL.i686.rpm
warning: kernel-unsupported-2.4.21-4.EL.i686.rpm: V3 DSA signature: NOKEY, key ID db42a60e
Preparing...                ########################################### [100%]
   1:kernel-unsupported     ########################################### [100%]
[root@rac3 root]#
RAC and Network Block Device
(cont’d)
 NBD client (cont’d)
 Loading the nbd module
[root@rac3 dev]# lsmod | grep nbd
[root@rac3 dev]# modprobe nbd
[root@rac3 dev]# lsmod | grep nbd
nbd                    16388   0 (unused)
[root@rac3 dev]#
 Running the nbd client
Syntax: nbd-client <data server> <port> /dev/nb<n>
[root@rac3 dev]# nbd-client rac2 4101 /dev/nb0
Negotiation: ..size = 307200KB
bs=1024, sz=307200
[root@rac3 dev]#
RAC and Network Block Device
(cont’d)
 Now the block devices are configured and it is possible to access the remote data. Oracle Real Application Clusters needs raw access to the shared disk subsystem, so mapping raw devices to block devices is needed. This can be done with the standard raw command.
Syntax: raw /dev/raw/raw<N> /dev/<blockdev>
[root@rac3 root]# raw /dev/raw/raw1 /dev/nb0
/dev/raw/raw1: bound to major 43, minor 0
[root@rac3 root]#
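The client-side steps (module load, NBD connect, raw binding) can be collected into one script. A hedged sketch: the server name, ports and device numbering are assumptions, and because the real commands need root and a running nbd-server, it only prints them unless RUN=1:

```shell
# Print (or, with RUN=1, execute) the client-side NBD/raw setup.
SERVER=${SERVER:-rac2}

run() {                           # execute only when RUN=1, else echo
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi
}

run modprobe nbd                  # load the kernel NBD module
n=0
for port in 4101 4102; do         # one port per datafile on the server
    run nbd-client "$SERVER" "$port" "/dev/nb$n"
    # RAC needs raw access, so bind a raw device over each block device
    run raw "/dev/raw/raw$((n + 1))" "/dev/nb$n"
    n=$((n + 1))
done
```

With RUN unset it prints the command sequence, which makes the plan easy to review before running it as root on a real node.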
Other Solutions
 RAC and FireWire
 Build Your Own Oracle RAC 10g Cluster on Linux and FireWire
http://www.oracle.com/technology/pub/articles/hunter_rac10g.html
 RAC and NFS
 Locking
 Caching
 Write-through cache
HA Configuration
Simplified RAC schema
[Diagram: clients connect over the NET to the HA System and its shared Database]
Maximum Availability
Architecture
[Diagram: Maximum Availability Architecture – Application Servers at the Primary Site and the Secondary Site, connected by a Dedicated Network, with Data Guard shipping data between the sites]
Extended Distance Clusters
[Diagram: Extended Distance Clusters – Application Servers at the Primary and Secondary Sites, one RAC database stretched over a Dedicated Network and a virtualization storage layer]
What about the Quorum Server???
Extended Distance Clusters (cont’d)
 Resolving Distance Problems
 Application partitioning
gc_files_to_locks
 Wavelength Division Multiplexing
Dense Wavelength Division Multiplexing (DWDM)
Coarse Wavelength Division Multiplexing (CWDM)
 ACTIVE_INSTANCE_COUNT
*.active_instance_count = 1
*.cluster_database_instances = 2
Testing and Tuning RAC
 Introduction
 RAC testing steps
 Functional Application Tests
 RAC High Availability tests
 Scalability tests
 Digging into RAC performance problems
Introduction
 Testing isn’t trivial!!!
 Classical testing/tuning methods
 Always tune single instance first!
 Specific RAC issues
 RAC aware tools
RAC testing steps
 Functional Application Tests
 RAC High Availability tests
 Be aware of the timeouts!!!
 Scalability tests
RAC testing steps (cont’d)
 Scalability tests (cont’d)
Patterns of application scalability
[Chart: performance (TPS, 1/response time) of one user vs. load (# users, size of tables) – nearly static, linear, exponential and constrained patterns]
RAC testing steps (cont’d)
 Scalability tests (cont’d)
Performance – TPS (for all users)
 Good scalability
[Chart: TPS vs. # of concurrent users (4 to 100), y-axis up to 7000; the 2-node RAC curve climbs well above the single-node curve]
RAC testing steps (cont’d)
 Scalability tests (cont’d)
Performance – TPS (for all users)
 Problem!!! Possible disk bottleneck
[Chart: TPS vs. # of concurrent users (4 to 100), y-axis up to 7000; the 2-node RAC curve gains little over the single-node curve]
RAC testing steps (cont’d)
 Scalability tests (cont’d)
Performance – TPS (for all users)
 Problem!!! Possible interconnect bottleneck
[Chart: TPS vs. # of concurrent users (4 to 100), y-axis up to 10000; the single-node curve ends up above the 2-node RAC curve]
Digging into RAC performance
problems
 The interconnect and the shared storage are the two most likely performance problem areas in RAC
 Interconnect “speed”
 Throughput
 Latency
Average latency of a consistent block request: AVG CR BLOCK RECEIVE TIME should typically be about 15 milliseconds, depending on your system configuration and volume
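That average can be computed from v$sysstat (9i statistic names; the time statistic is in centiseconds, hence the x10). A sketch that simply prints the query when sqlplus is not on the PATH:

```shell
# Average CR block receive time in ms, derived from v$sysstat.
SQL="SELECT (t.value * 10) / b.value AS avg_cr_receive_ms
       FROM v\$sysstat b, v\$sysstat t
      WHERE b.name = 'global cache cr blocks received'
        AND t.name = 'global cache cr block receive time';"

if command -v sqlplus >/dev/null 2>&1; then
    echo "$SQL" | sqlplus -s "/ as sysdba"
else
    echo "$SQL"        # no Oracle client here; just show the query
fi
```

Values far above the ~15 ms guideline on a loaded system point at the interconnect.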
Digging into RAC performance
problems (cont’d)
 Interconnect types

Measurement          SMP BUS   Memory    Myrinet   Sun SCI   Gigabit    Hyper     Infiniband
                               Channel                       Ethernet   Fabric
Latency (μs)         0.5       3         7 to 9    10        100        20        < 10
CPU overhead         low       low       low       N/A       high       low       low
Bandwidth (MB/sec)   > 500     > 100     ~250      ~70       100        400       1000
Digging into RAC performance
problems (cont’d)
 cluster_interconnects parameter
It provides Oracle with information about additional cluster interconnects available for use. It can be used to load-balance the interconnect traffic across different physical interconnects, thus increasing interconnect bandwidth.
When you set CLUSTER_INTERCONNECTS in cluster configurations, the interconnect high availability features are not available. In other words, an interconnect failure that is normally unnoticeable would instead cause an Oracle cluster failure, as Oracle still attempts to access the network interface which has gone down.
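As an illustration only (the instance names and IP addresses are invented), such a setting in the init.ora might look like this; multiple addresses for one instance are separated by colons:

```
# Hypothetical init.ora fragment - pin each instance's interconnect
# traffic to two private networks (addresses are made up)
rac1.cluster_interconnects = "192.168.0.1:192.168.1.1"
rac2.cluster_interconnects = "192.168.0.2:192.168.1.2"
```

Remember the trade-off described above: with this set, failover between the listed interfaces is no longer handled for you.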
Digging into RAC performance
problems (cont’d)
 STATSPACK reports
 The STATSPACK report shows statistics ONLY for the node or instance on which it was run
 Top 5 Timed Events
 Global Cache Service and Global Enqueue Service
 Note:135714.1 Script to Collect RAC Diagnostic Information (racdiag.sql)
Digging into RAC performance
problems (cont’d)
BAD PERFORMANCE

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~
                                                          % Total
Event                                 Waits     Time (s)  Ela Time
--------------------------------- ----------- ----------- --------
global cache cr request                34,568         958     31.44
buffer busy global CR                   6,513         620     20.35
db file sequential read                64,214         455     14.92
latch free                             13,542         453     14.88
buffer busy waits                      10,971         295      9.69
GOOD PERFORMANCE

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~
                                                          % Total
Event                                 Waits     Time (s)  Ela Time
--------------------------------- ----------- ----------- --------
latch free                             10,969         666     51.28
buffer busy waits                      15,379         159     12.23
CPU time                                              149     11.47
PL/SQL lock timer                          51         105      8.05
db file sequential read                25,163          96      7.36
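Spotting interconnect-related waits in a spooled report can be automated with standard text tools; the sp_report.lst file and sample lines below are fabricated for illustration:

```shell
# Flag global cache events that appear after the Top 5 Timed Events header.
cat > sp_report.lst <<'EOF'
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~
global cache cr request                34,568         958     31.44
db file sequential read                64,214         455     14.92
EOF

awk '/^Top 5 Timed Events/ { top = 1 }
     top && /global cache/ { print "interconnect-related wait:", $0 }' sp_report.lst
```

Seeing global cache events dominate the Top 5 list, as in the BAD report above, is the cue to dig into the Global Cache Service statistics next.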
Digging into RAC performance
problems (cont’d)
Global Cache Service - Workload Characteristics
-----------------------------------------------
                                                   BAD   GOOD
Ave global cache get time (ms):                   11.8    2.2
Ave global cache convert time (ms):               51.7   11.2
Ave build time for CR block (ms):                  0.7    0.0
Ave flush time for CR block (ms):                  0.2    0.2
Ave send time for CR block (ms):                   0.0    0.2
Ave time to process CR block request (ms):         0.9    0.4
Ave receive time for CR block (ms):                1.6    0.4
Ave pin time for current block (ms):               0.2    0.2
Ave flush time for current block (ms):             0.0    0.0
Ave send time for current block (ms):              0.1    0.1
Ave time to process current block request (ms):    0.3    0.3
Ave receive time for current block (ms):          33.4    7.5
Global cache hit ratio:                            9.5    3.9
Ratio of current block defers:                     0.0    0.0
% of messages sent for buffer gets:                6.7    2.5
% of remote buffer gets:                           1.8    0.7
Ratio of I/O for coherence:                        1.2    1.3
Ratio of local vs remote work:                     4.3    4.4
Ratio of fusion vs physical writes:                0.0    0.0
Digging into RAC performance
problems (cont’d)
Global Enqueue Service Statistics
---------------------------------
                                                      BAD   GOOD
Ave global lock get time (ms):                        0.2    0.0
Ave global lock convert time (ms):                    0.0    0.0
Ratio of global lock gets vs global lock releases:    1.2    1.1

GCS and GES Messaging statistics
--------------------------------
                                                      BAD   GOOD
Ave message sent queue time (ms):                    16.5    1.7
Ave message sent queue time on ksxp (ms):            29.4    2.5
Ave message received queue time (ms):                 1.9    0.3
Ave GCS message process time (ms):                    0.1    0.1
Ave GES message process time (ms):                    0.1    0.0
% of direct sent messages:                           49.7   63.4
% of indirect sent messages:                         50.3   36.6
% of flow controlled messages:                        0.0    0.0
Implementing RAC in production
 Smooth transition from single instance to RAC
 Change ORACLE_HOME
 Relinking the RAC Option ON/OFF
 CLUSTER_DATABASE = TRUE/FALSE
 Start/stop the second instance
 Start gradual movement of clients from one instance to another
Relinking the RAC Option
1. Log in as the Oracle software owner and shut down all database instances on all nodes in the cluster.
2. cd $ORACLE_HOME/rdbms/lib
3. make -f ins_rdbms.mk rac_on (or rac_off)
   If this step did not fail with fatal errors, proceed to step 4.
4. make -f ins_rdbms.mk ioracle
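The steps above can be wrapped in a small function; a sketch under the assumption that ORACLE_HOME is set and the instances are already down:

```shell
# Relink the RAC option on or off; usage: relink_rac rac_on|rac_off
relink_rac() {
    mode=$1
    cd "$ORACLE_HOME/rdbms/lib" || return 1
    # step 3: stop here on fatal errors, as the procedure instructs
    make -f ins_rdbms.mk "$mode" || return 1
    # step 4: relink the oracle binary
    make -f ins_rdbms.mk ioracle
}

# Example (on a real cluster node, as the Oracle software owner):
# relink_rac rac_on
```

The `|| return 1` guards encode the "did not fail with fatal errors" check between steps 3 and 4.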
Reference
 Metalink Note:211177.1 RAC Survival Kit: Rac On / Rac Off - Relinking the RAC Option
 Metalink Note:183340.1 Frequently Asked Questions About the CLUSTER_INTERCONNECTS Parameter in 9i
 Metalink Note:135714.1 Script to Collect RAC Diagnostic Information (racdiag.sql)
 http://www.fi.muni.cz/~kripac/orac-nbd/
 The Oracle-on-Linux VMware Cookbook
http://www.oracle.com/technology/tech/linux/vmware/cookbook/index.html
 Build Your Own Oracle RAC 10g Cluster on Linux and FireWire
http://www.oracle.com/technology/pub/articles/hunter_rac10g.html
Thank You
zyumbyulev@mobiltel.bg