Oracle9i Real Application Clusters (RAC) on HP-UX

Authors: Rebecca Schlecht, Rainer Marekwia
HP/Oracle Cooperative Technology Center (http://www.hporaclectc.com/)
Co-Author: Sandy Gruver, HP Alliance Team US
Version: 9.2d
Contents:
1. Module Objectives
2. Overview: What is Oracle9i Real Application Clusters?
3. Oracle 9i Real Application Clusters – Cache Fusion technology
4. New HP Cluster Interconnect technology
5. HP/Oracle Hardware and Software Requirements
5.1 General Notes
5.2 System Requirements
5.3 HP-UX Operating System Patches
5.4 Kernel Parameters
5.5 Asynchronous I/O
6. Configure the HP/Oracle 9i Real Application Cluster
6.1 Hardware configuration (Hardware planning, Network and disk layout)
6.2 Configure logical volumes
6.3 Configure HP ServiceGuard cluster
6.4 Create a user who will own the Oracle RAC software
6.5 Install Oracle Software
6.6 Create an Oracle9i RAC database using SQL scripts
6.7 Create an Oracle9i RAC database using DBCA
7. Change to an Oracle9i RAC environment from Single Instance
8. Example of an RAC enabled init.ora file
9. Use HMP Protocol over HyperFabricII
1. Module Objectives
Purpose
This module focuses on what Oracle9i Real Application Clusters (RAC) is and how it can
be properly configured on HP-UX to tolerate failures with minimal downtime. Oracle9i
Real Application Clusters is an important Oracle9i feature that addresses high availability
and scalability issues.
Objectives
Upon completion of this module, you should be able to:

- Describe what Oracle9i Real Application Clusters is and how it can be used
- Understand the hardware requirements
- Understand how to configure the HP cluster and create the raw devices
- Examine and verify the cluster configuration
- Understand the creation of the Oracle9i Real Application Clusters database using DBCA
- Understand the steps to migrate an application to Oracle9i Real Application Clusters
2. Overview: What is Oracle9i Real Application Clusters?
Oracle9i Real Application Clusters is a computing environment that harnesses the
processing power of multiple, interconnected computers. The Oracle9i Real Application
Clusters software and a collection of hardware known as a "cluster" unite the processing
power of each component into a single, robust computing environment. A cluster
generally comprises two or more computers, or "nodes."
In Oracle9i Real Application Clusters (RAC) environments, all nodes concurrently
execute transactions against the same database. Oracle9i Real Application Clusters
coordinates each node's access to the shared data to provide consistency and integrity.
Oracle9i Real Application Clusters serves as an important component of robust high
availability solutions. A properly configured Oracle9i Real Application Clusters
environment can tolerate failures with minimal downtime.
Oracle9i Real Application Clusters is also applicable for many other system types. For
example, data warehousing applications accessing read-only data are prime candidates
for Oracle9i Real Application Clusters. In addition, Oracle9i Real Application Clusters
successfully manages increasing numbers of online transaction processing systems as
well as hybrid systems that combine the characteristics of both read-only and read/write
applications.
Harnessing the power of multiple nodes offers obvious advantages. If you divide a large
task into sub-tasks and distribute the sub-tasks among multiple nodes, you can complete
the task faster than if only one node did the work. This type of parallel processing is
clearly more efficient than sequential processing. It also provides increased performance
for processing larger workloads and for accommodating growing user populations.
Oracle9i Real Application Clusters can effectively scale your applications to meet
increasing data processing demands. As you add resources, Oracle9i Real Application
Clusters can exploit them and extend their processing powers beyond the limits of the
individual components.
From a functional perspective, RAC is equivalent to single-instance Oracle. What the
RAC environment does offer are significant improvements in availability,
scalability and reliability.
In recent years, the requirement for highly available systems, able to scale on demand,
has fostered the development of more and more robust cluster solutions. Prior to
Oracle9i, HP and Oracle, with the combination of Oracle Parallel Server and HP
ServiceGuard OPS edition, provided cluster solutions that led the industry in
functionality, high availability, management and services. Now with the release of Oracle
9i Real Application Clusters (RAC) with the new Cache Fusion architecture based on an
ultra-high bandwidth, low latency cluster interconnect technology, RAC cluster solutions
have become more scalable without the need for data and application partitioning.
The information contained in this document covers the installation and configuration of
Oracle Real Application Clusters in a typical environment: a two-node HP cluster
running the HP-UX operating system.
3. Oracle 9i Real Application Clusters – Cache Fusion
technology
Oracle 9i cache fusion utilizes the collection of caches made available by all nodes in the
cluster to satisfy database requests. Requests for a data block are satisfied first by a local
cache, then by a remote cache before a disk read is needed. Similarly, update operations
are performed first via the local node and then the remote node caches in the cluster,
resulting in reduced disk I/O. Disk I/O operations are only done when the data block is
not available in the collective caches or when an update transaction performs a commit
operation.
Oracle 9i cache fusion thus provides Oracle users with an expanded database cache for
queries and updates, along with reduced disk I/O synchronization, which speeds up
database operations overall.
However, the improved performance depends greatly on the efficiency of the inter-node
message passing mechanism, which handles the data block transfers between nodes.
The efficiency of inter-node messaging depends on three primary factors:

- The number of messages required for each synchronization sequence. Oracle 9i's Distributed Lock Manager (DLM) coordinates the fast block transfer between nodes with two inter-node messages and one intra-node message. If the data is in a remote cache, an inter-node message is sent to the Lock Manager Daemon (LMD) on the remote node. The DLM and Cache Fusion processes then update the in-memory lock structure and send the block to the requesting process.
- The frequency of synchronization (the less frequent, the better). The cache fusion architecture reduces the frequency of inter-node communication by dynamically migrating locks to a node that shows a frequent access pattern for a particular data block. This dynamic lock allocation increases the likelihood of local cache access, thus reducing the need for inter-node communication. At a node level, a cache fusion lock controls access to data blocks from other nodes in the cluster.
- The latency of inter-node communication. This is a critical component in Oracle 9i RAC as it determines the speed of data block transfer between nodes. An efficient transfer method must use minimal CPU resources and support both high availability and highly scalable growth without bandwidth constraints.
4. New HP Cluster Interconnect technology

- HyperFabric
HyperFabric is a high-speed cluster interconnect fabric that supports both the industry-standard TCP/UDP over IP and HP's proprietary Hyper Messaging Protocol (HMP). HyperFabric extends the scalability and reliability of TCP/UDP by providing transparent load balancing of connection traffic across multiple network interface cards (NICs) and transparent failover of traffic from one card to another without invoking MC/ServiceGuard. The HyperFabric NIC incorporates a network processor that implements HP's Hyper Messaging Protocol and provides lower latency and lower host CPU utilization for standard TCP/UDP benchmarks over HyperFabric when compared to Gigabit Ethernet. Hewlett-Packard released HyperFabric in 1998 with a link rate of 2.56 Gbps over copper. In 2001, Hewlett-Packard released HyperFabric 2 with a link rate of 4.0 Gbps over fiber and compatibility with the copper HyperFabric interface. Both HyperFabric products support clusters of up to 64 nodes.

- HyperFabric Switches
Hewlett-Packard provides the fastest cluster interconnect via its proprietary HyperFabric switches, the latest product being HyperFabric 2, a new set of hardware components with fiber connectors that enable a low-latency, high-bandwidth system interconnect. With fiber interfaces, HyperFabric 2 provides higher speed (up to 4 Gbps in full duplex) over longer distances (up to 200 meters). HyperFabric 2 also provides excellent scalability by supporting up to 16 hosts via point-to-point connectivity and up to 64 hosts via fabric switches. It is backward compatible with previous versions of HyperFabric and available on IA-64 and PA-RISC servers.

- Hyper Messaging Protocol (HMP)
Hewlett-Packard, in cooperation with Oracle, has designed a cluster interconnect product specifically tailored to meet the needs of enterprise-class parallel database applications. HP's Hyper Messaging Protocol significantly expands on the feature set provided by TCP/UDP by providing a true Reliable Datagram model for both remote direct memory access (RDMA) and traditional message semantics. Coupled with OS bypass capability and the hardware support for protocol offload provided by HyperFabric, HMP provides high bandwidth, low latency and extremely low CPU utilization with an interface and feature set optimized for business-critical parallel applications such as Oracle 9i RAC.
5. HP/Oracle Hardware and Software Requirements
5.1 General Notes
Oracle9i Database Release 2 Documentation for HP 9000 Series HP-UX:

- A95979_02 (PDF/HTML): Oracle9i Real Application Clusters / Real Application Clusters Guard I Configuration Guide for UNIX Systems: AIX-Based Systems, Compaq Tru64 UNIX, HP 9000 Series HP-UX, Linux Intel, and Sun Solaris
- A96167-01 (PDF/HTML): Oracle9i Installation Guide Release 2 for UNIX Systems: AIX-Based Systems, Compaq Tru64 UNIX, HP 9000 Series HP-UX, Linux Intel, and Sun Solaris
- A97297-01 (PDF/HTML): Oracle9i Administrator's Reference for UNIX Systems: AIX-Based Systems, Compaq Tru64 UNIX, HP 9000 Series HP-UX, Linux Intel, and Sun Solaris
- A97350-02 (PDF/HTML): Oracle9i Release Notes Release 2 (9.2.0.1.0) for HP 9000 Series HP-UX
For additional information about the Oracle9i Release Notes Release 1 (9.0.1) for HP
9000 Series HP-UX (Part Number A90357-01), see
http://docs.oracle.com/HTML_Storage/a90357/toc.htm

- Each node uses the HP-UX 11.x operating system software. Issue the command "uname -r" at the operating system prompt to verify the release being used.
- Oracle9i RAC is only available in a 64-bit flavour. To determine whether you have a 64-bit configuration on an HP-UX 11.0 installation, enter the following command:
  $ /bin/getconf KERNEL_BITS
- Oracle9i 9.0.2 RAC is supported by ServiceGuard OPS Edition 11.13 and 11.14.
- Starting with ServiceGuard OPS Edition 11.13, a 16-node 9i RAC cluster is supported with SLVM (B5161FA).
- Please note that beginning with version A.11.14.01, which is an Itanium-only release, the product structure of ServiceGuard Extension for RAC (SGeRAC, formerly ServiceGuard OPS Edition) has changed. SGeRAC is now an add-on product to the MC/ServiceGuard offering, meaning that it can simply be installed on top of an existing MC/ServiceGuard installation. This new structure will be introduced for PA-RISC with the next A.11.15 release of MC/ServiceGuard.
- Software mirroring with HP-UX MirrorDisk/UX with SLVM is supported in a 2-node configuration only.
- Support for the HP HyperFabric product is provided.
- A total of 127 RAC instances per cluster is supported.
5.2 System Requirements

- RAM: Minimum 256 MB. Use the following command to verify the amount of memory installed on your system:
  # /usr/sbin/dmesg | grep "Physical:"
- Swap space: Minimum 1 x RAM or 400 MB, whichever is greater. Use the following command to determine the amount of swap space configured on your system (requires root privileges):
  # /usr/sbin/swapinfo -a
- CD-ROM drive: A CD-ROM drive capable of reading CD-ROM disks in the ISO 9660 format with RockRidge extensions.
- Available disk space: 3 GB.
- Temporary disk space: The Oracle Universal Installer requires up to 512 MB of free space in the /tmp directory.
- Operating system: HP-UX version 11.0 or 11i (11.11). To determine whether you have a 64-bit configuration on an HP-UX 11.0 or HP-UX 11i installation, enter the following command:
  # /bin/getconf KERNEL_BITS
  To determine your current operating system information, enter the following command:
  # uname -a
- JRE: Oracle applications use JRE 1.1.8.
- JDK: Oracle HTTP Server Powered by Apache uses JDK 1.2.2.05.
- X-library symbolic links: Due to a known HP bug (Doc. id. KBRC00003627), the default HP-UX 64-bit operating system installation does not create a few required X-library symbolic links. These links must be created manually before starting the Oracle9i installation. To create these links, you must have superuser privileges, as the links are created in the /usr/lib directory. After enabling superuser privileges, run the following commands to create the required links:
  # cd /usr/lib
  # ln -s /usr/lib/libX11.3 libX11.sl
  # ln -s /usr/lib/libXIE.2 libXIE.sl
  # ln -s /usr/lib/libXext.3 libXext.sl
  # ln -s /usr/lib/libXhp11.3 libXhp11.sl
  # ln -s /usr/lib/libXi.3 libXi.sl
  # ln -s /usr/lib/libXm.4 libXm.sl
  # ln -s /usr/lib/libXp.2 libXp.sl
  # ln -s /usr/lib/libXt.3 libXt.sl
  # ln -s /usr/lib/libXtst.2 libXtst.sl
- X Server system software. (Refer to the installation guide for more information on the X Server system and emulator issues.)
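These checks can be combined into a small script run as root on every node. The sketch below simply prints the values verified above so they can be compared against the requirements; the script name is hypothetical and bdf is used only to show free space in /tmp:

#!/usr/bin/sh
# check_sysreq.sh -- print the values needed to verify the system requirements above
echo "== Installed RAM =="
/usr/sbin/dmesg | grep "Physical:"
echo "== Configured swap =="
/usr/sbin/swapinfo -a
echo "== Free space in /tmp (OUI needs up to 512 MB) =="
bdf /tmp
echo "== OS release and kernel word size (must report 64) =="
uname -a
/bin/getconf KERNEL_BITS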
5.3 HP-UX Operating System Patches
11.0 (64bit):

- March 2003 HP-UX patch bundle
- PHCO_23963
- PHCO_24148
- PHCO_23919
- PHKL_23226
- PHNE_24034
- PHSS_23440
- HyperFabric driver 11.00.12 (HP-UX 11.0) (required only if your system has an older HyperFabric driver version)

11i (64bit):

- March 2003 HP-UX patch bundle
- Patches requested by the Oracle release notes:
  PHKL_25506
  PHSS_26946   HP aC++ -AA runtime libraries (aCC A.03.37)
  PHSS_28436   ld(1) and linker tools cumulative patch
  PHNE_27745   HyperFabric B.11.11.0[0-2] cumulative patch
  PHSS_27725   MC/SG 11.14

Patch history:
  PHKL_25506 included in DART-59-12-02 SUPPORT-PLUS-1202
  PHSS_26946 <- PHSS_24638                 HP aC++ AA runtime libraries (aCC A.03.37)
  PHSS_28436 <- PHSS_26560 <- PHSS_26263   ld(1) and linker tools cumulative patch
  PHNE_27745 <- PHNE_27144                 HyperFabric B.11.11.0[0-2] cumulative patch
  PHSS_27725 <- PHSS_27246                 MC/SG 11.14
Java patches (22-Nov-2002):

  PHCO_24402   libc cumulative header file patch
  PHCO_26061   s700_800 11.11 kernel configuration commands patch
  PHCO_26331   mountall cumulative patch
  PHCO_26466   initialised TLS, Psets, Mutex performance
  PHCO_27434   libc cumulative patch
  PHKL_24751   preserve IPSW W-bit and GR31 lower bits
  PHKL_25233   select(2) and poll(2) hang
  PHKL_25468   eventport (/dev/poll) pseudo driver
  PHKL_25729   signals, threads enhancement, Psets Enablement
  PHKL_25993   thread nostop for NFS, rlimit max value fix
  PHKL_25994   thread NOSTOP, Psets Enablement
  PHKL_26468   priority inversion and thread hang
  PHKL_27091   core PM, vPar, Psets cumulative patch
  PHKL_27094   Psets Enablement patch, slpq1 perf
  PHKL_27096   VxVM, EMC, Psets&vPar, slpq1, earlyKRS
  PHKL_27278   required for large heap on SDK 1.4; VM-JFS ddlock, mmap, thread perf, user limits
  PHKL_27317   thread NOSTOP, Psets, thread abort
  PHKL_27502   MO 4k sector size; FIFO; EventPort, perf
  PHNE_26388   ONC/NFS general release/performance patch
  PHNE_27063   cumulative ARPA Transport patch
  PHSS_27425   required for SDK 1.4 64-bit X/Motif runtime
Optional Patch: For DSS applications running on machines with more than 16 CPUs, we
recommend installation of the HP-UX patch PHKL_22266. This patch addresses
performance issues with the HP-UX Operating System.
HP provides patch bundles at
http://www.software.hp.com/SUPPORT_PLUS
Individual patches can be downloaded from
http://itresourcecenter.hp.com/
To determine which operating system patches are installed, enter the following
command:
# /usr/sbin/swlist -l patch
To determine if a specific operating system patch has been installed, enter the following
command:
# /usr/sbin/swlist -l patch patch_number
To determine which operating system bundles are installed, enter the following
command:
# /usr/sbin/swlist -l bundle
5.4 Kernel Parameters
- KSI_ALLOC_MAX = (NPROC * 8)
  Defines the system-wide limit of queued signals that can be allocated.
- MAXDSIZ = 1073741824 bytes
  Refers to the maximum data segment size for 32-bit systems. Setting this value too low may cause processes to run out of memory.
- MAXDSIZ_64BIT = 2147483648 bytes
  Refers to the maximum data segment size for 64-bit systems. Setting this value too low may cause processes to run out of memory.
- MAXSSIZ = 134217728 bytes
  Defines the maximum stack segment size in bytes for 32-bit systems.
- MAXSSIZ_64BIT = 1073741824 bytes
  Defines the maximum stack segment size in bytes for 64-bit systems.
- MAXSWAPCHUNKS = (available memory)/2
  Defines the maximum number of swap chunks, where SWCHUNK is the swap chunk size (1 KB blocks). SWCHUNK is 2048 by default.
- MAXUPRC = (NPROC - 5)
  Defines the maximum number of user processes.
- MSGMAP = (MSGTQL + 2)
  Defines the maximum number of message map entries.
- MSGMNI = NPROC
  Defines the number of message queue identifiers.
- MSGSEG = (NPROC * 4)
  Defines the number of segments available for messages.
- MSGTQL = NPROC
  Defines the number of message headers.
- NCALLOUT = (NKTHREAD + 1)
  Defines the maximum number of pending timeouts.
- NCSIZE = ((8 * NPROC + 2048) + VX_NCSIZE)
  Defines the Directory Name Lookup Cache (DNLC) space needed for inodes. VX_NCSIZE is 1024 by default.
- NFILE = (15 * NPROC + 2048)
  Defines the maximum number of open files.
- NFLOCKS = NPROC
  Defines the maximum number of file locks available on the system.
- NINODE = (8 * NPROC + 2048)
  Defines the maximum number of open inodes.
- NKTHREAD = (((NPROC * 7) / 4) + 16)
  Defines the maximum number of kernel threads supported by the system.
- NPROC = 4096
  Defines the maximum number of processes.
- SEMMAP = (SEMMNI + 2)
  Defines the maximum number of semaphore map entries.
- SEMMNI = (SEMMNS / 2)
  Defines the maximum number of semaphore sets in the entire system.
- SEMMNS = (NPROC * 2) * 2
  Sets the number of semaphores in the system. The default value of SEMMNS is 128, which is, in most cases, too low for Oracle9i software.
- SEMMNU = (NPROC - 4)
  Defines the number of semaphore undo structures.
- SEMVMX = 32768
  Defines the maximum value of a semaphore.
- SHMMAX = available physical memory
  Defines the maximum allowable size of one shared memory segment. The SHMMAX setting should be large enough to hold the entire SGA in one shared memory segment; a low setting can cause creation of multiple shared memory segments, which may lead to performance degradation.
- SHMMNI = 512
  Defines the maximum number of shared memory segments in the entire system.
- SHMSEG = 32
  Defines the maximum number of shared memory segments one process can attach.
- VPS_CEILING = 64
  Defines the maximum System-Selected Page Size in kilobytes.
Note: These are minimum kernel requirements for Oracle9i. If you have previously tuned
your kernel parameters to levels equal to or higher than these values, continue to use the
higher values. A system restart is necessary for kernel changes to take effect.
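To compare the current settings against the table before making changes through SAM, the kernel tunables can be listed with kmtune. A minimal check (a sketch; kmtune reports parameter names in lower case, and the grep pattern only covers a subset of the table):

# /usr/sbin/kmtune | grep -i -E "nproc|maxdsiz|semmn|shmmax|shmmni|shmseg"
# /usr/sbin/kmtune -q shmmax     (query a single parameter)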
5.5 Asynchronous I/O
1. Create the /dev/async character device
# /sbin/mknod /dev/async c 101 0x0
# chown oracle:dba /dev/async
# chmod 660 /dev/async
2. Configure the async driver in the kernel using SAM
(Kernel Configuration => Drivers; the driver is called 'asyncdsk'),
then generate a new kernel and reboot.
3. Set the HP-UX kernel parameter max_async_ports using SAM. max_async_ports limits
the maximum number of processes that can concurrently use /dev/async. Set this
parameter to the sum of 'processes' from init.ora plus the number of background processes.
If max_async_ports is reached, subsequent processes will use synchronous I/O.
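For example, with processes = 200 in init.ora and roughly 15 Oracle background processes per instance (illustrative numbers only), max_async_ports should be set to at least 215.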
4. Set the HP-UX kernel parameter aio_max_ops using SAM. aio_max_ops limits the
maximum number of asynchronous I/O operations that can be queued at any time. Set this
parameter to the default value (2048) and monitor it over time using glance.
6. Configure the HP/Oracle 9i Real Application Cluster
6.1 Hardware configuration (Hardware planning, Network and disk layout)
Hardware Planning
In order to provide a high level of availability, a typical cluster uses redundant system
components, for example two or more systems and two or more independent disk
subsystems. This redundancy eliminates single points of failure.
The nodes in an Oracle9i RAC cluster are HP 9000 systems with similar memory
configuration and processor architecture. A node can be any Series 800 model. It is
recommended that both nodes be of similar processing power and memory capacity.
An RAC cluster must have:

- Two or more nodes
- A redundant high-speed interconnect between the nodes (e.g. HyperFabric, HyperFabric switches)
- Redundant network components (primary and standby LAN)
- Redundant disk storage or a RAID 0/1 configuration for disk mirroring
- A dedicated heartbeat LAN (heartbeat traffic is also carried on the primary and standby LAN)
- Redundant power supplies
Network and Disk Layout
Draw a diagram of your cluster using information gathered from these two sets of
commands. You’ll use this information later in configuring the system, the logical
volumes and the cluster.
1. Use the LAN commands
# lanscan
# ifconfig lanX
# netstat
to determine the number of LAN interfaces on each node, and the names, addresses,
and subnet information of each LAN card.
2. Use the IO command
# ioscan -fnCdisk
to find the disks connected to each node. Note the type of disks installed. List the
hardware addresses and device file names of each disk. Also note which are
shared between nodes.
Network Planning
Minimally, a 9i RAC cluster requires three distinct subnets:

o Dedicated cluster heartbeat LAN
o Dedicated Distributed Lock Manager (DLM) LAN
o User/Data LAN, which will also carry a secondary heartbeat
Because the DLM is now integrated into the Oracle9i kernel, the DLM will use the IP
address associated with the default host name.
The network should be configured in the /etc/rc.config.d/netconf file. Any time you
change the LAN configuration, you need to stop the network and restart it:
# /sbin/rc2.d/S340net stop
# /sbin/rc2.d/S340net start
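For illustration only, an interface entry in /etc/rc.config.d/netconf looks roughly like the fragment below; the interface name, index and addresses are placeholders and must match your own lanscan output:

INTERFACE_NAME[1]="lan1"
IP_ADDRESS[1]="192.168.10.1"
SUBNET_MASK[1]="255.255.255.0"
BROADCAST_ADDRESS[1]=""
DHCP_ENABLE[1]=0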
DLM requires a high speed network to handle high bandwidth network traffic. In
the Oracle literature this is referred to as the host interconnect. We recommend
using either Hyperfabric or Gigabit Ethernet for this network.
Remote copy(rcp) needs to be enabled for both the root and oracle accounts on all
nodes to allow remote copy of cluster configuration files.
There are two ways to enable rcp for root. You can choose the one that fits your
site’s security requirements. Include the following lines in either the .rhosts file in
root’s home directory or in the /etc/cmcluster/cmclnodelist file:
node1name root
node2name root
To enable remote copy (rcp) for Oracle include the following lines in the .rhosts
file in the oracle user’s home directory:
node1name oracle
node2name oracle
where node1name and node2name are the names of the two systems in the cluster
and oracle is the user name of the Oracle software owner. Note that rcp only works
if a password has been set for the respective user (root and oracle).
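Before continuing, it is worth confirming that remote copy and remote shell work in both directions for both accounts. A quick check from node1name (a sketch; remsh is the HP-UX remote shell, and the /tmp file names are placeholders):

# rcp /etc/hosts node2name:/tmp/rcp_test_root
# remsh node2name ls -l /tmp/rcp_test_root
# su - oracle
$ rcp /etc/hosts node2name:/tmp/rcp_test_oracle
$ remsh node2name ls -l /tmp/rcp_test_oracle

Repeat the same checks from node2name towards node1name.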
6.2 Configure logical volumes
General Recommendations
When disk drives were 1 or 2 GB at maximum, the usual wisdom was to do the following:

- Place redo logs and database files onto different drives
- Ensure that data and indexes were on separate spindles
- Spread the I/O load across as many disk devices as possible
Today, with the greatly increased capacity of a single disk mechanism (up to 181 GB
drives on an XP512) and much faster I/O rates and transfer speeds, these rules must be
revisited.
The real reason for these rules of thumb was to make sure that the I/O load resulting from
an Oracle database would wind up being fairly well spread across all the disk
mechanisms. Before the advent of large capacity disk drives housed in high performance
storage systems, if the same disk drive wound up hosting two or more fairly active
database objects, performance could deteriorate rapidly, especially if any of these objects
needed to be accessed sequentially.
Today, in the era of huge disk arrays, the concept of “separate spindles” is a bit more
vague, as the internal structure of the array is largely hidden from the view of the system
administrator. The smallest independent unit of storage in an XP array is substantially
larger than 1 or 2 GB, which means you have far fewer “spindles” to play with, at a time
when there are more database objects (tables, indexes, etc) to “spread”, so it won’t be
possible to keep all the objects separate. The good news is that the architecture of the XP
array is much more tolerant of multiple simultaneous I/O streams to/from the same disk
mechanism than the previous generation of individual small disks.
Given all these advances in the technology, we have found it best to use a simple method
for laying out an Oracle database on an XP array (under HP-UX) with volume manager
striping of all of the database objects across large numbers of disk mechanisms. The
result is to average out the I/O to a substantial degree. This method does not guarantee
the avoidance of disk hotspots, but we believe it to be a reasonable “first pass” which can
be improved upon with tuning over time. It’s not only a lot faster to implement than a
customized one-object-at-a-time layout, but we believe it to be much more resistant to the
inevitable load fluctuations which occur over the course of a day, month, or year.
The layout approach that we are advocating might be described as "Modified
Stripe-Everything-Across-Everything". Our goal is to provide a simple method which will
yield good I/O balance, yet still provide some means of manual adjustment. Oracle
suggests the same strategy; their name for this strategy is SAME (Stripe and Mirror
Everything).
XP basics: an XP512 can be configured with one to four pairs of disk controller modules
(ACPs). Each array group is controlled by only one of these ACP pairs (it is in the
domain of only one ACP pair). Our suggestion is that you logically “separate” the XP’s
array groups into four to eight sets. Each set should have array groups from all the ACP
domains. Each set of array groups would then be assigned to a single volume group. All
LUNs in the XP array will have paths defined via two distinct host-bus adapters; the
paths should be assigned within each volume group in such a fashion that their primary
path alternates back and forth between these two host-bus adapters. The result of all this:
each volume group will consist of space which is ‘stripable’ across multiple array groups
spread across all the ACP pairs in the array, and any I/O done to these array groups will
be spread evenly across the host-bus adapters on the server.
LVM Steps
1. Disks need to be properly initialized before being added into volume groups by the
pvcreate command. Do the following step for all the disks (LUNs) you want to configure
for your 9i RAC volume group(s):
# pvcreate -f /dev/rdsk/cxtydz   (where x=instance, y=target, and z=unit)
2. Create the volume group directory with the character special file called group:
# mkdir /dev/vg_rac
# mknod /dev/vg_rac/group c 64 0x060000
Note: The minor numbers for the group file should be unique among all the
volume groups on the system.
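To confirm that the chosen minor number (0x060000 in the example above) is not already in use, list the group files of the existing volume groups; each entry shows the major number (64) and the minor number already assigned:

# ls -l /dev/*/group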
3. Create PV-LINKs and extend the volume group:
# vgcreate /dev/vg_rac /dev/dsk/c0t1d0 /dev/dsk/c1t0d0   (primary path ... secondary path)
# vgextend /dev/vg_rac /dev/dsk/c1t0d1 /dev/dsk/c0t1d1
Continue with vgextend until you have included all the needed disks for the
volume group(s).
4. Create logical volumes for the 9i RAC database with the command
# lvcreate -i 10 -I 1024 -L 100 -n Name /dev/vg_rac
-i: number of disks to stripe across
-I: stripe size in kilobytes
-L: size of the logical volume in MB
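For example, following the naming convention from the table in step 5 below, the 400 MB SYSTEM raw volume striped across ten disks could be created as follows (stripe width and size are illustrative and must fit your volume group; replace db_name with your database name):

# lvcreate -i 10 -I 1024 -L 400 -n db_name_raw_system_400 /dev/vg_rac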
5. Logical Volume Configuration
It is necessary to define raw devices for each of the following categories of files.
The Oracle Database Configuration Assistant (DBCA) will create a seed database
expecting the following configuration.
By following the naming convention described in the table below, raw partitions
are identified with the database and the raw volume type (the data contained in the
raw volume). Raw volume size is also identified using this method. Note : In the
sample names listed in the table, the string db_name should be replaced with the
actual database name, thread is the thread number of the instance, and lognumb is
the log number within a thread.
Create a raw device for:                 | File size       | Sample name                  | Comments
SYSTEM tablespace                        | 400 MB          | db_name_raw_system_400       |
USERS tablespace                         | 120 MB          | db_name_raw_user_120         |
TEMP tablespace                          | 100 MB          | db_name_raw_temp_100         |
An undo tablespace per instance          | 500 MB          | db_name_thread_raw_undo_500  | thread = instance number
OEMREPO                                  | 20 MB           | db_name_raw_oemrepo_20       | optional: required for the Oracle Enterprise Manager repository
INDX tablespace                          | 70 MB           | db_name_raw_indx_70          |
TOOLS tablespace                         | 12 MB           | db_name_raw_tools_12         |
DRSYS tablespace                         | 90 MB           | db_name_raw_dr_90            | optional: only required for Oracle Text
First control file                       | 110 MB          | db_name_raw_control01_110    |
Second control file                      | 110 MB          | db_name_raw_control02_110    |
Two ONLINE redo log files per instance   | 120 MB per file | db_name_thread_lognumb_120   | thread = instance number
spfile.ora                               | 5 MB            | db_name_raw_spfile_5         |
srvmconfig                               | 100 MB          | db_name_raw_srvmconf_100     |
ODM                                      | 100 MB          | db_name_raw_odm_100          | optional: only required for Oracle Data Mining
CWMLITE                                  | 100 MB          | db_name_raw_cwmlite_100      | optional: only required for OLAP
XDB                                      | 100 MB          | db_name_raw_xml_100          | optional: only required for XML DB
EXAMPLE                                  | 160 MB          | db_name_raw_examples_160     | optional: only required for the Example schemas
Note: Automatic Undo Management requires an undo tablespace per instance
therefore you would require a minimum of 2 tablespaces as described above.
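The full set of raw logical volumes can be created with a small shell loop. The sketch below assumes a database name of "rac", two instances, a four-way stripe and the sizes from the table; the optional volumes are omitted, and the exact names must still match the convention above and what DBCA expects, so treat it purely as an illustration:

#!/usr/bin/sh
# create the mandatory raw logical volumes for database "rac" in /dev/vg_rac
for LV in rac_raw_system_400 rac_raw_user_120 rac_raw_temp_100 \
          rac_1_raw_undo_500 rac_2_raw_undo_500 \
          rac_raw_indx_70 rac_raw_tools_12 \
          rac_raw_control01_110 rac_raw_control02_110 \
          rac_1_log1_120 rac_1_log2_120 rac_2_log1_120 rac_2_log2_120 \
          rac_raw_spfile_5 rac_raw_srvmconf_100
do
    SIZE=${LV##*_}          # the trailing field of each name is its size in MB
    /usr/sbin/lvcreate -i 4 -I 1024 -L $SIZE -n $LV /dev/vg_rac
done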
6. It is recommended best practice to create symbolic links for each of these raw
files on all systems of your RAC cluster.
# cd /oracle/RAC/ (directory where you want to have the links)
# ln -s /dev/vg_rac/rac_raw_system_400 system
# ln -s /dev/vg_rac/rac_raw_user_120 user
etc.
7. Check to see if your volume groups are properly created and available:
# strings /etc/lvmtab
# vgdisplay -v /dev/vg_rac
8. Change the permission of the database volume group vg_rac to 777, change the
permissions of all raw logical volumes to 660 and the owner to oracle:dba.
# chmod 777 /dev/vg_rac
# chmod 660 /dev/vg_rac/r*
# chown oracle:dba /dev/vg_rac/r*
9. Export the volume group:
De-activate the volume group:
# vgchange -a n /dev/vg_rac
Create the volume group map file:
# vgexport -v -p -s -m mapfile /dev/vg_rac
Copy the mapfile to all the nodes in the cluster:
Syntax:
# rcp mapfile system_name:target_directory
Example: # rcp MyMAPfile nodeB:/tmp/scripts
10. Import the volume group on the second node in the cluster
Create a volume group directory with the character special file called group:
# mkdir /dev/vg_rac
# mknod /dev/vg_rac/group c 64 0x060000
Note: The minor number has to be the same as on the other node.
Import the volume group:
# vgimport -v -s -m mapfile /dev/vg_rac
# chmod 777 /dev/vg_rac
# chmod 660 /dev/vg_rac/r*
# chown oracle:dba /dev/vg_rac/r*
Note: The minor number has to be the same as on the other node.
Check to see if devices are imported:
# strings /etc/lvmtab
6.3 Configure HP ServiceGuard Cluster
After all the LAN cards are installed and configured, and the RAC volume group and the
cluster lock volume group(s) are configured, you can start the cluster configuration. The
following sequence is very important. However, if the RAC volume groups are unknown
at this time, you should be able to configure the cluster minimally with a lock volume
group.
At this time, the cluster lock volume group should have been created. Since in this
cookbook we configure one volume group for the entire RAC cluster vg_rac, we used
vg_rac for the lock volume as well.
1. Activate the lock disk on the configuration node ONLY. Lock volume can only be
activated on the node where the cmapplyconf command is issued so that the lock disk can
be initialized accordingly.
# vgchange -a y /dev/vg_rac
2. Create a cluster configuration template:
# cmquerycl -n nodeA -n nodeB -v -C /etc/cmcluster/rac.asc
3. Edit the cluster configuration file (rac.asc).
Make the necessary changes to this file for your cluster. For example, change the
ClusterName, and adjust the heartbeat interval and node timeout to prevent unexpected
failovers due to DLM traffic.
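The entries most commonly adjusted in rac.asc are sketched below; the parameter names follow the MC/ServiceGuard ASCII configuration format, but the cluster name, the timing values (in microseconds) and the shared volume group are examples only and must be chosen for your own cluster:

CLUSTER_NAME            rac_cluster
HEARTBEAT_INTERVAL      1000000       # 1 second
NODE_TIMEOUT            8000000       # 8 seconds, raised to avoid false failovers under DLM load
OPS_VOLUME_GROUP        /dev/vg_rac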
4. Check the cluster configuration:
# cmcheckconf -v -C rac.asc
5. Create the binary configuration file and distribute the cluster configuration to all the
nodes in the cluster:
# cmapplyconf -v -C rac.asc
Note: the cluster is not started until you run cmrunnode on each node or cmruncl.
6. De-activate the lock disk on the configuration node after cmapplyconf
# vgchange -a n /dev/vg_rac
7. Start the cluster and view it to be sure it is up and running. See the next section
for instructions on starting and stopping the cluster.
Starting the Cluster
1. Start the cluster from any node in the cluster
# cmruncl -v
Or, on each node
# cmrunnode -v
2. Make all RAC volume groups and Cluster Lock volume groups sharable and cluster
aware (not packages) from the cluster configuration node. This has to be done only once.
# vgchange -S y -c y /dev/vg_rac
3. Then on all the nodes, activate the volume group in shared mode in the cluster.
This has to be done each time when you start the cluster.
# vgchange -a s /dev/vg_rac
4. Check the cluster status:
# cmviewcl -v
How to shut down the cluster (not needed here)
1. Shut down the 9i RAC instances (If up and running)
2. On all the nodes, deactivate the volume group in shared mode in the cluster:
# vgchange -a n /dev/vg_rac
3. Halt the cluster from any node in the cluster:
# cmhaltcl -v
4. Check the cluster status:
# cmviewcl -v