Oracle9i Real Application Clusters (RAC) on HP-UX

Authors: Rebecca Schlecht, Rainer Marekwia, HP/Oracle Cooperative Technology Center (http://www.hporaclectc.com/)
Co-Author: Sandy Gruver, HP Alliance Team US
Version 9.2d

Contents:
1. Module Objectives
2. Overview: What is Oracle9i Real Application Clusters?
3. Oracle9i Real Application Clusters - Cache Fusion technology
4. New HP Cluster Interconnect technology
5. HP/Oracle Hardware and Software Requirements
   5.1 General Notes
   5.2 System Requirements
   5.3 HP-UX Operating System Patches
   5.4 Kernel Parameters
   5.5 Asynchronous I/O
6. Configure the HP/Oracle 9i Real Application Cluster
   6.1 Hardware configuration (hardware planning, network and disk layout)
   6.2 Configure logical volumes
   6.3 Configure HP ServiceGuard cluster
   6.4 Create a user who will own the Oracle RAC software
   6.5 Install Oracle software
   6.6 Create an Oracle9i RAC database using SQL scripts
   6.7 Create an Oracle9i RAC database using DBCA
7. Change to an Oracle9i RAC environment from Single Instance
8. Example of an RAC-enabled init.ora file
9. Use HMP Protocol over HyperFabricII

1. Module Objectives

Purpose
This module focuses on what Oracle9i Real Application Clusters (RAC) is and how it can be properly configured on HP-UX to tolerate failures with minimal downtime. Oracle9i Real Application Clusters is an important Oracle9i feature that addresses high availability and scalability issues.

Objectives
Upon completion of this module, you should be able to:
- Describe what Oracle9i Real Application Clusters is and how it can be used
- Understand the hardware requirements
- Understand how to configure the HP cluster and create the raw devices
- Examine and verify the cluster configuration
- Understand the creation of the Oracle9i Real Application Clusters database using DBCA
- Understand the steps to migrate an application to Oracle9i Real Application Clusters

2. Overview: What is Oracle9i Real Application Clusters?

Oracle9i Real Application Clusters is a computing environment that harnesses the processing power of multiple, interconnected computers. Oracle9i Real Application Clusters software and a collection of hardware known as a "cluster" unite the processing power of each component into a single, robust computing environment. A cluster generally comprises two or more computers, or "nodes."

In Oracle9i Real Application Clusters (RAC) environments, all nodes concurrently execute transactions against the same database. Oracle9i Real Application Clusters coordinates each node's access to the shared data to provide consistency and integrity.

Oracle9i Real Application Clusters serves as an important component of robust high availability solutions. A properly configured Oracle9i Real Application Clusters environment can tolerate failures with minimal downtime.

Oracle9i Real Application Clusters is also applicable to many other system types. For example, data warehousing applications accessing read-only data are prime candidates for Oracle9i Real Application Clusters. In addition, Oracle9i Real Application Clusters successfully manages increasing numbers of online transaction processing systems as well as hybrid systems that combine the characteristics of both read-only and read/write applications.

Harnessing the power of multiple nodes offers obvious advantages. If you divide a large task into sub-tasks and distribute the sub-tasks among multiple nodes, you can complete the task faster than if only one node did the work.
This type of parallel processing is clearly more efficient than sequential processing. It also provides increased performance for processing larger workloads and for accommodating growing user populations. Oracle9i Real Application Clusters can effectively scale your applications to meet increasing data processing demands. As you add resources, Oracle9i Real Application Clusters can exploit them and extend their processing power beyond the limits of the individual components.

From a functional perspective, RAC is equivalent to single-instance Oracle. What the RAC environment offers is significant improvement in terms of availability, scalability and reliability.

In recent years, the requirement for highly available systems, able to scale on demand, has fostered the development of more and more robust cluster solutions. Prior to Oracle9i, HP and Oracle, with the combination of Oracle Parallel Server and HP ServiceGuard OPS edition, provided cluster solutions that led the industry in functionality, high availability, management and services. Now, with the release of Oracle9i Real Application Clusters (RAC) and its new Cache Fusion architecture, based on an ultra-high bandwidth, low latency cluster interconnect technology, RAC cluster solutions have become more scalable without the need for data and application partitioning.

The information contained in this document covers the installation and configuration of Oracle Real Application Clusters in a typical environment: a two-node HP cluster running the HP-UX operating system.

3. Oracle9i Real Application Clusters - Cache Fusion technology

Oracle9i Cache Fusion utilizes the collection of caches made available by all nodes in the cluster to satisfy database requests. Requests for a data block are satisfied first by a local cache, then by a remote cache, before a disk read is needed. Similarly, update operations are performed first via the local node and then via the remote node caches in the cluster, resulting in reduced disk I/O. Disk I/O operations are only done when the data block is not available in the collective caches or when an update transaction performs a commit operation. Oracle9i Cache Fusion thus provides Oracle users with an expanded database cache for queries and updates, with reduced disk I/O synchronization, which overall speeds up database operations.

However, the improved performance depends greatly on the efficiency of the inter-node message passing mechanism, which handles the data block transfers between nodes. The efficiency of inter-node messaging depends on three primary factors:

- The number of messages required for each synchronization sequence. Oracle9i's Distributed Lock Manager (DLM) coordinates the fast block transfer between nodes with two inter-node messages and one intra-node message. If the data is in a remote cache, an inter-node message is sent to the Lock Manager Daemon (LMD) on the remote node. The DLM and Cache Fusion processes then update the in-memory lock structure and send the block to the requesting process.

- The frequency of synchronization (the less frequent the better). The Cache Fusion architecture reduces the frequency of inter-node communication by dynamically migrating locks to a node that shows a frequent access pattern for a particular data block. This dynamic lock allocation increases the likelihood of local cache access, thus reducing the need for inter-node communication. At a node level, a Cache Fusion lock controls access to data blocks from other nodes in the cluster.
- The latency of inter-node communication. This is a critical component in Oracle9i RAC, as it determines the speed of data block transfer between nodes. An efficient transfer method must utilize minimal CPU resources and support high availability as well as highly scalable growth without bandwidth constraints.

4. New HP Cluster Interconnect technology

HyperFabric
HyperFabric is a high-speed cluster interconnect fabric that supports both the industry-standard TCP/UDP over IP and HP's proprietary Hyper Messaging Protocol (HMP). HyperFabric extends the scalability and reliability of TCP/UDP by providing transparent load balancing of connection traffic across multiple network interface cards (NICs) and transparent failover of traffic from one card to another without invocation of MC/ServiceGuard. The HyperFabric NIC incorporates a network processor that implements HP's Hyper Messaging Protocol and provides lower latency and lower host CPU utilization for standard TCP/UDP benchmarks over HyperFabric when compared to Gigabit Ethernet.

Hewlett-Packard released HyperFabric in 1998 with a link rate of 2.56 Gbps over copper. In 2001, Hewlett-Packard released HyperFabric 2 with a link rate of 4.0 Gbps over fiber and compatibility with the copper HyperFabric interface. Both HyperFabric products support clusters of up to 64 nodes.

HyperFabric Switches
Hewlett-Packard provides the fastest cluster interconnect via its proprietary HyperFabric switches, the latest product being HyperFabric 2, a new set of hardware components with fiber connectors that enables a low-latency, high-bandwidth system interconnect. With fiber interfaces, HyperFabric 2 provides higher speed (up to 4 Gbps full duplex) over longer distances (up to 200 meters). HyperFabric 2 also provides excellent scalability by supporting up to 16 hosts via point-to-point connectivity and up to 64 hosts via fabric switches. It is backward compatible with previous versions of HyperFabric and is available on IA-64 and PA-RISC servers.

Hyper Messaging Protocol (HMP)
Hewlett-Packard, in cooperation with Oracle, has designed a cluster interconnect product specifically tailored to meet the needs of enterprise-class parallel database applications. HP's Hyper Messaging Protocol significantly expands on the feature set provided by TCP/UDP by providing a true reliable datagram model for both remote direct memory access (RDMA) and traditional message semantics. Coupled with OS bypass capability and the hardware support for protocol offload provided by HyperFabric, HMP provides high bandwidth, low latency and extremely low CPU utilization with an interface and feature set optimized for business-critical parallel applications such as Oracle9i RAC.
5. HP/Oracle Hardware and Software Requirements

5.1 General Notes

Oracle9i Database Release 2 documentation for HP 9000 Series HP-UX:
- Oracle9i Real Application Clusters Real Application Clusters Guard I Configuration Guide for Unix Systems: AIX-Based Systems, Compaq Tru64 UNIX, HP 9000 Series HP-UX, Linux Intel, and Sun Solaris (A95979-02)
- Oracle9i Installation Guide Release 2 for UNIX Systems: AIX-Based Systems, Compaq Tru64 UNIX, HP 9000 Series HP-UX, Linux Intel, and Sun Solaris (A96167-01)
- Oracle9i Administrator's Reference for UNIX Systems: AIX-Based Systems, Compaq Tru64 UNIX, HP 9000 Series HP-UX, Linux Intel, and Sun Solaris (A97297-01)
- Oracle9i Release Notes Release 2 (9.2.0.1.0) for HP 9000 Series HP-UX (A97350-02)

For additional information, see the Oracle9i Release Notes Release 1 (9.0.1) for HP 9000 Series HP-UX (Part Number A90357-01) at http://docs.oracle.com/HTML_Storage/a90357/toc.htm

Each node uses the HP-UX 11.x operating system software. Issue the command "uname -r" at the operating system prompt to verify the release being used. Oracle9i RAC is only available in a 64-bit flavour. To determine if you have a 64-bit configuration on an HP-UX 11.0 installation, enter the following command:
# /bin/getconf KERNEL_BITS

- Oracle9i 9.0.2 RAC is supported by ServiceGuard OPS Edition 11.13 and 11.14.
- Starting with ServiceGuard OPS Edition 11.13, 9i RAC clusters of up to 16 nodes are supported with SLVM (B5161FA).
- Please note that beginning with version A.11.14.01, which is an Itanium-only release, the product structure of ServiceGuard Extension for RAC (SGeRAC; formerly ServiceGuard OPS Edition) has changed. SGeRAC is now an add-on product to the MC/ServiceGuard offering, meaning that it can simply be installed on top of an existing MC/ServiceGuard installation. This new structure will be introduced for PA-RISC with the next A.11.15 release of MC/ServiceGuard.
- Software mirroring with HP-UX MirrorDisk/UX and SLVM is supported in a two-node configuration only.
- Support for the HP HyperFabric product is provided.
- A total of 127 RAC instances per cluster is supported.

5.2 System Requirements

- RAM: minimum 256 MB. Use the following command to verify the amount of memory installed on your system:
  # /usr/sbin/dmesg | grep "Physical:"
- Swap space: minimum 1 x RAM or 400 MB, whichever is greater. Use the following command to determine the amount of swap space configured on your system (requires root privileges):
  # /usr/sbin/swapinfo -a
- CD-ROM drive: a drive capable of reading CD-ROM disks in the ISO 9660 format with RockRidge extensions.
- Available disk space: 3 GB.
- Temporary disk space: the Oracle Universal Installer requires up to 512 MB of free space in the /tmp directory.
- Operating system: HP-UX version 11.0 or 11i (11.11). To determine if you have a 64-bit configuration on an HP-UX 11.0 or HP-UX 11i installation, enter:
  # /bin/getconf KERNEL_BITS
  To determine your current operating system information, enter:
  # uname -a
- JRE: Oracle applications use JRE 1.1.8.
- JDK: Oracle HTTP Server Powered by Apache uses JDK 1.2.2.05.
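The individual checks above can be run one at a time; as a convenience, the following is a minimal shell sketch that gathers them in one pass on a node. The 256 MB, 400 MB and 512 MB thresholds come from the requirements above; the use of bdf to check free space in /tmp is an assumption (it is the usual HP-UX equivalent of df), and the exact output formats vary between HP-UX releases.

#!/usr/bin/sh
# Quick pre-installation check, run as root on each node.
echo "== OS release and kernel width =="
uname -a
/bin/getconf KERNEL_BITS             # should report 64 for Oracle9i RAC

echo "== Physical memory (requirement: at least 256 MB) =="
/usr/sbin/dmesg | grep "Physical:"

echo "== Swap space (requirement: 1 x RAM or 400 MB, whichever is greater) =="
/usr/sbin/swapinfo -a

echo "== Free space in /tmp (installer needs up to 512 MB) =="
bdf /tmp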
Due to a known HP bug (Doc. ID KBRC00003627), the default HP-UX 64-bit operating system installation does not create a few required X-library symbolic links. These links must be created manually before starting the Oracle9i installation. You must have superuser privileges, as the links are created in the /usr/lib directory. After enabling superuser privileges, run the following commands to create the required links:

# cd /usr/lib
# ln -s /usr/lib/libX11.3 libX11.sl
# ln -s /usr/lib/libXIE.2 libXIE.sl
# ln -s /usr/lib/libXext.3 libXext.sl
# ln -s /usr/lib/libXhp11.3 libXhp11.sl
# ln -s /usr/lib/libXi.3 libXi.sl
# ln -s /usr/lib/libXm.4 libXm.sl
# ln -s /usr/lib/libXp.2 libXp.sl
# ln -s /usr/lib/libXt.3 libXt.sl
# ln -s /usr/lib/libXtst.2 libXtst.sl

X Server system software is also required. (Refer to the installation guide for more information on the X Server system and emulator issues.)

5.3 HP-UX Operating System Patches

11.0 (64-bit):
- March 2003 HP-UX patch bundle
- PHCO_23963
- PHCO_24148
- PHCO_23919
- PHKL_23226
- PHNE_24034
- PHSS_23440
- HyperFabric driver 11.00.12 (HP-UX 11.0); required only if your system has an older HyperFabric driver version

11i (64-bit):
- March 2003 HP-UX patch bundle
- Patches requested by the Oracle release notes:
  PHKL_25506   (included in DART-59-12-02 SUPPORT-PLUS-1202)
  PHSS_26946   HP aC++ -AA runtime libraries (aCC A.03.37)
  PHSS_28436   ld(1) and linker tools cumulative patch
  PHNE_27745   HyperFabric B.11.11.0[0-2] cumulative patch
  PHSS_27725   MC/ServiceGuard 11.14

Patch history:
  PHSS_26946 <- PHSS_24638                 HP aC++ -AA runtime libraries (aCC A.03.37)
  PHSS_28436 <- PHSS_26560 <- PHSS_26263   ld(1) and linker tools cumulative patch
  PHNE_27745 <- PHNE_27144                 HyperFabric B.11.11.0[0-2] cumulative patch
  PHSS_27725 <- PHSS_27246                 MC/ServiceGuard 11.14

Java patches (22-Nov-2002):
  PHCO_24402   libc cumulative header file patch
  PHCO_26061   s700_800 11.11 kernel configuration commands patch
  PHCO_26331   mountall cumulative patch
  PHCO_26466   initialised TLS, Psets, Mutex performance
  PHCO_27434   libc cumulative patch
  PHKL_24751   preserve IPSW W-bit and GR31 lower bits
  PHKL_25233   select(2) and poll(2) hang
  PHKL_25468   eventport (/dev/poll) pseudo driver
  PHKL_25729   signals, threads enhancement, Psets enablement
  PHKL_25993   thread nostop for NFS, rlimit max value fix
  PHKL_25994   thread NOSTOP, Psets enablement
  PHKL_26468   priority inversion and thread hang
  PHKL_27091   core PM, vPar, Psets cumulative patch
  PHKL_27094   Psets enablement patch, slpq1 perf
  PHKL_27096   VxVM, EMC, Psets & vPar, slpq1, earlyKRS
  PHKL_27278   required for large heap on SDK 1.4; VM-JFS ddlock, mmap, thread perf, user limits
  PHKL_27317   thread NOSTOP, Psets, thread abort
  PHKL_27502   MO 4k sector size; FIFO; EventPort, perf
  PHNE_26388   ONC/NFS general release/performance patch
  PHNE_27063   cumulative ARPA Transport patch
  PHSS_27425   required for SDK 1.4 64-bit X/Motif runtime

Optional patch: for DSS applications running on machines with more than 16 CPUs, we recommend installation of the HP-UX patch PHKL_22266, which addresses performance issues with the HP-UX operating system.

HP provides patch bundles at http://www.software.hp.com/SUPPORT_PLUS
Individual patches can be downloaded from http://itresourcecenter.hp.com/

To determine which operating system patches are installed, enter the following command:
# /usr/sbin/swlist -l patch
To determine if a specific operating system patch has been installed, enter the following command:
# /usr/sbin/swlist -l patch patch_number
To determine which operating system bundles are installed, enter the following command:
# /usr/sbin/swlist -l bundle

5.4 Kernel Parameters

Kernel Parameter   Setting                            Purpose
KSI_ALLOC_MAX      (NPROC * 8)                        Defines the system-wide limit of queued signals that can be allocated.
MAXDSIZ            1073741824 bytes                   Maximum data segment size for 32-bit processes. Setting this value too low may cause processes to run out of memory.
MAXDSIZ_64BIT      2147483648 bytes                   Maximum data segment size for 64-bit processes. Setting this value too low may cause processes to run out of memory.
MAXSSIZ            134217728 bytes                    Maximum stack segment size in bytes for 32-bit processes.
MAXSSIZ_64BIT      1073741824 bytes                   Maximum stack segment size in bytes for 64-bit processes.
MAXSWAPCHUNKS      (available memory)/2               Maximum number of swap chunks, where SWCHUNK is the swap chunk size (1 KB blocks). SWCHUNK is 2048 by default.
MAXUPRC            (NPROC - 5)                        Maximum number of user processes.
MSGMAP             (MSGTQL + 2)                       Maximum number of message map entries.
MSGMNI             NPROC                              Number of message queue identifiers.
MSGSEG             (NPROC * 4)                        Number of segments available for messages.
MSGTQL             NPROC                              Number of message headers.
NCALLOUT           (NKTHREAD + 1)                     Maximum number of pending timeouts.
NCSIZE             ((8 * NPROC + 2048) + VX_NCSIZE)   Directory Name Lookup Cache (DNLC) space needed for inodes. VX_NCSIZE is 1024 by default.
NFILE              (15 * NPROC + 2048)                Maximum number of open files.
NFLOCKS            NPROC                              Maximum number of file locks available on the system.
NINODE             (8 * NPROC + 2048)                 Maximum number of open inodes.
NKTHREAD           (((NPROC * 7) / 4) + 16)           Maximum number of kernel threads supported by the system.
NPROC              4096                               Maximum number of processes.
SEMMAP             (SEMMNI + 2)                       Maximum number of semaphore map entries.
SEMMNI             (SEMMNS / 2)                       Maximum number of semaphore sets in the entire system.
SEMMNS             (NPROC * 2) * 2                    Number of semaphores in the system. The default value of 128 is, in most cases, too low for Oracle9i software.
SEMMNU             (NPROC - 4)                        Number of semaphore undo structures.
SEMVMX             32768                              Maximum value of a semaphore.
SHMMAX             Available physical memory          Maximum allowable size of one shared memory segment. SHMMAX should be large enough to hold the entire SGA in one shared memory segment; a low setting can cause creation of multiple shared memory segments, which may lead to performance degradation.
SHMMNI             512                                Maximum number of shared memory segments in the entire system.
SHMSEG             32                                 Maximum number of shared memory segments one process can attach.
VPS_CEILING        64                                 Maximum system-selected page size in kilobytes.

Note: These are minimum kernel requirements for Oracle9i. If you have previously tuned your kernel parameters to levels equal to or higher than these values, continue to use the higher values. A system restart is necessary for kernel changes to take effect.
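To compare the running kernel against the table above, the tunables can be queried with kmtune (SAM displays the same values). This is only a sketch: the parameter subset is illustrative, on HP-UX the tunable names are lower case even though the table lists them in upper case, and output formatting differs slightly between 11.0 and 11i.

# Query a few of the tunables listed above (run as root on each node)
for p in nproc semmns semmni shmmax shmmni shmseg maxdsiz_64bit
do
    /usr/sbin/kmtune -q $p
done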
5.5 Asynchronous I/O

1. Create the /dev/async character device:
   # /sbin/mknod /dev/async c 101 0x0
   # chown oracle:dba /dev/async
   # chmod 660 /dev/async
2. Configure the asynchronous disk driver into the kernel using SAM (Kernel Configuration => Drivers; the driver is called 'asyncdsk'). Generate the new kernel and reboot.
3. Set the HP-UX kernel parameter max_async_ports using SAM. max_async_ports limits the maximum number of processes that can concurrently use /dev/async. Set this parameter to the sum of the 'processes' value from init.ora plus the number of background processes. If max_async_ports is reached, subsequent processes will use synchronous I/O.
4. Set the HP-UX kernel parameter aio_max_ops using SAM. aio_max_ops limits the maximum number of asynchronous I/O operations that can be queued at any time. Set this parameter to the default value (2048) and monitor it over time using glance.
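After the reboot, the following minimal sketch can be used to confirm that the device file, the asyncdsk driver and the two tunables are in place. The grep against /stand/system assumes the kernel was built from that file (the SAM default); adjust the path if your kernel configuration lives elsewhere.

# Verify the asynchronous I/O setup (run as root on each node)
ls -l /dev/async                     # expect a character device, major 101, owner oracle:dba, mode 660
grep -i asyncdsk /stand/system       # driver should appear in the kernel configuration file
/usr/sbin/kmtune -q max_async_ports
/usr/sbin/kmtune -q aio_max_ops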
6. Configure the HP/Oracle 9i Real Application Cluster

6.1 Hardware configuration (hardware planning, network and disk layout)

Hardware Planning
In order to provide a high level of availability, a typical cluster uses redundant system components, for example two or more systems and two or more independent disk subsystems. This redundancy eliminates single points of failure. The nodes in an Oracle9i RAC cluster are HP 9000 systems with similar memory configuration and processor architecture. A node can be any Series 800 model. It is recommended that both nodes be of similar processing power and memory capacity.

An RAC cluster must have:
- Two or more nodes
- Redundant high-speed interconnect between the nodes (e.g. HyperFabric adapters and HyperFabric switches)
- Redundant network components (primary and standby LAN)
- Redundant disk storage or a RAID 0/1 configuration for disk mirroring
- A dedicated heartbeat LAN (heartbeat traffic is also carried on the primary and standby LAN)
- Redundant power supplies

Network and Disk Layout
Draw a diagram of your cluster using information gathered from the following two sets of commands. You will use this information later when configuring the system, the logical volumes and the cluster.

1. Use the LAN commands
   # lanscan
   # ifconfig lanX
   # netstat
   to determine the number of LAN interfaces on each node, the names and addresses of each LAN card, and the subnet information.
2. Use the I/O command
   # ioscan -fnCdisk
   to find the disks connected to each node. Note the type of disks installed, and list the hardware addresses and device file names of each disk. Also note which disks are shared between nodes.

Network Planning
Minimally, a 9i RAC cluster requires three distinct subnets:
- A dedicated cluster heartbeat LAN
- A dedicated Distributed Lock Manager (DLM) LAN
- A user/data LAN, which will also carry a secondary heartbeat

Because the DLM is now integrated into the Oracle9i kernel, the DLM will use the IP address associated with the default host name. The network should be configured in the /etc/rc.config.d/netconf file. Any time you change the LAN configuration, you need to stop the network and restart it:
# /sbin/rc2.d/S340net stop
# /sbin/rc2.d/S340net start

The DLM requires a high-speed network to handle high-bandwidth network traffic. In the Oracle literature this is referred to as the host interconnect. We recommend using either HyperFabric or Gigabit Ethernet for this network.

Remote copy (rcp) needs to be enabled for both the root and oracle accounts on all nodes to allow remote copying of cluster configuration files. There are two ways to enable rcp for root; choose the one that fits your site's security requirements. Include the following lines in either the .rhosts file in root's home directory or in the /etc/cmcluster/cmclnodelist file:
node1name root
node2name root
To enable remote copy (rcp) for oracle, include the following lines in the .rhosts file in the oracle user's home directory:
node1name oracle
node2name oracle
where node1name and node2name are the names of the two systems in the cluster and oracle is the user name of the Oracle software owner. Note that rcp only works if a password has been set for the respective user (root and oracle).
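Before moving on, it is worth confirming that remote copy actually works in both directions for both accounts. The check below is illustrative only: nodeA and nodeB are placeholder host names, and remsh is used simply to list and remove the copied file on the remote side.

# Run as oracle on nodeA (repeat as root, and in the opposite direction from nodeB)
rcp /etc/hosts nodeB:/tmp/rcp_test_$LOGNAME
remsh nodeB ls -l /tmp/rcp_test_$LOGNAME
remsh nodeB rm /tmp/rcp_test_$LOGNAME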
6.2 Configure Logical Volumes

General Recommendations
When disk drives were at most 1 or 2 GB, the usual wisdom was to do the following:
- Place redo logs and database files onto different drives
- Ensure that data and indexes were on separate spindles
- Spread the I/O load across as many disk devices as possible

Today, with the greatly increased capacity of a single disk mechanism (up to 181 GB drives on an XP512) and much faster I/O and transfer rates, these rules must be revisited. The real reason for these rules of thumb was to make sure that the I/O load resulting from an Oracle database would wind up being fairly well spread across all the disk mechanisms. Before the advent of large-capacity disk drives housed in high-performance storage systems, if the same disk drive wound up hosting two or more fairly active database objects, performance could deteriorate rapidly, especially if any of these objects needed to be accessed sequentially.

Today, in the era of huge disk arrays, the concept of "separate spindles" is a bit more vague, as the internal structure of the array is largely hidden from the view of the system administrator. The smallest independent unit of storage in an XP array is substantially larger than 1 or 2 GB, which means you have far fewer "spindles" to play with, at a time when there are more database objects (tables, indexes, etc.) to "spread", so it will not be possible to keep all the objects separate. The good news is that the architecture of the XP array is much more tolerant of multiple simultaneous I/O streams to and from the same disk mechanism than the previous generation of individual small disks.

Given all these advances in the technology, we have found it best to use a simple method for laying out an Oracle database on an XP array (under HP-UX), with volume manager striping of all of the database objects across large numbers of disk mechanisms. The result is to average out the I/O to a substantial degree. This method does not guarantee the avoidance of disk hotspots, but we believe it to be a reasonable "first pass" which can be improved upon with tuning over time. It is not only a lot faster to implement than a customized one-object-at-a-time layout, but we believe it to be much more resistant to the inevitable load fluctuations which occur over the course of a day, month, or year.

The layout approach that we are advocating might be described as "Modified Stripe-Everything-Across-Everything". Our goal is to provide a simple method which will yield good I/O balance, yet still provide some means of manual adjustment. Oracle suggests the same strategy; their name for it is SAME (Stripe and Mirror Everything).

XP basics: an XP512 can be configured with one to four pairs of disk controller modules (ACPs). Each array group is controlled by only one of these ACP pairs (it is in the domain of only one ACP pair). Our suggestion is that you logically "separate" the XP's array groups into four to eight sets. Each set should have array groups from all the ACP domains. Each set of array groups would then be assigned to a single volume group. All LUNs in the XP array will have paths defined via two distinct host-bus adapters; the paths should be assigned within each volume group in such a fashion that their primary path alternates back and forth between these two host-bus adapters. The result of all this: each volume group will consist of space which is stripeable across multiple array groups spread across all the ACP pairs in the array, and any I/O done to these array groups will be spread evenly across the host-bus adapters on the server.
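As an illustration of the alternating primary-path idea (the exact commands appear again in the LVM steps below), consider two LUNs that are each visible through two host-bus adapters. The device file names here are placeholders; substitute the paths reported by ioscan on your system, and note that this assumes pvcreate and the /dev/vg_rac/group node from the LVM steps below are already in place.

# LUN1 is visible as /dev/dsk/c0t0d1 (HBA c0) and /dev/dsk/c1t0d1 (HBA c1);
# LUN2 as /dev/dsk/c0t0d2 and /dev/dsk/c1t0d2. The first path listed becomes the
# primary PV link, so alternate which HBA comes first:
# vgcreate /dev/vg_rac /dev/dsk/c0t0d1 /dev/dsk/c1t0d1   # LUN1: primary path via c0
# vgextend /dev/vg_rac /dev/dsk/c1t0d2 /dev/dsk/c0t0d2   # LUN2: primary path via c1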
LVM Steps

1. Disks need to be properly initialized with the pvcreate command before they are added to volume groups. Do the following for all the disks (LUNs) you want to configure for your 9i RAC volume group(s):
   # pvcreate -f /dev/rdsk/cxtydz
   (where x = instance, y = target, and z = unit)

2. Create the volume group directory with the character special file called group:
   # mkdir /dev/vg_rac
   # mknod /dev/vg_rac/group c 64 0x060000
   Note: The minor number for the group file must be unique among all the volume groups on the system.

3. Create PV links and extend the volume group (the first device file listed for each LUN is the primary path, the second the secondary path):
   # vgcreate /dev/vg_rac /dev/dsk/c0t1d0 /dev/dsk/c1t0d0
   # vgextend /dev/vg_rac /dev/dsk/c1t0d1 /dev/dsk/c0t1d1
   Continue with vgextend until you have included all the disks needed for the volume group(s).

4. Create logical volumes for the 9i RAC database with the command
   # lvcreate -i 10 -I 1024 -L 100 -n Name /dev/vg_rac
   -i: number of disks to stripe across
   -I: stripe size in kilobytes
   -L: size of the logical volume in MB

5. Logical Volume Configuration
   It is necessary to define raw devices for each of the following categories of files. The Oracle Database Configuration Assistant (DBCA) will create a seed database expecting the following configuration. By following the naming convention described in the table below, raw partitions are identified with the database and the raw volume type (the data contained in the raw volume). Raw volume size is also identified using this method.
   Note: In the sample names listed in the table, the string db_name should be replaced with the actual database name, thread is the thread number of the instance, and lognumb is the log number within a thread.

   Create a raw device for:               File size         Sample name                    Comments
   SYSTEM tablespace                      400 MB            db_name_raw_system_400
   USERS tablespace                       120 MB            db_name_raw_user_120
   TEMP tablespace                        100 MB            db_name_raw_temp_100
   An undo tablespace per instance        500 MB            db_name_thread_raw_undo_500    thread = instance number
   OEMREPO                                20 MB             db_name_raw_oemrepo_20         optional: required for Oracle Enterprise Manager repository
   INDX tablespace                        70 MB             db_name_raw_indx_70
   TOOLS tablespace                       12 MB             db_name_raw_tools_12
   DRSYS tablespace                       90 MB             db_name_raw_dr_90              optional: only required for Oracle Text
   First control file                     110 MB            db_name_raw_control01_110
   Second control file                    110 MB            db_name_raw_control02_110
   Two ONLINE redo log files per instance 120 MB per file   db_name_thread_lognumb_120     thread = instance number
   Spfile.ora                             5 MB              db_name_raw_spfile_5
   Srvmconfig                             100 MB            db_name_raw_srvmconf_100
   ODM                                    100 MB            db_name_raw_odm_100            optional: only required for Oracle Data Mining
   CWMLITE                                100 MB            db_name_raw_cwmlite_100        optional: only required for OLAP
   XDB                                    100 MB            db_name_raw_xml_100            optional: only required for XML DB
   EXAMPLE                                160 MB            db_name_raw_examples_160       optional: only required for the Example schemas

   Note: Automatic Undo Management requires an undo tablespace per instance; you would therefore require a minimum of two undo tablespaces as described above.

6. It is recommended best practice to create symbolic links for each of these raw files on all systems of your RAC cluster (a worked example of steps 4 to 6 follows the end of this list):
   # cd /oracle/RAC/     (the directory where you want to have the links)
   # ln -s /dev/vg_rac/rac_raw_system_400 system
   # ln -s /dev/vg_rac/rac_raw_user_120 user
   etc.

7. Check to see if your volume groups are properly created and available:
   # strings /etc/lvmtab
   # vgdisplay -v /dev/vg_rac

8. Change the permissions of the database volume group vg_rac to 777, then change the permissions of all raw logical volumes to 660 and the owner to oracle:dba:
   # chmod 777 /dev/vg_rac
   # chmod 660 /dev/vg_rac/r*
   # chown oracle:dba /dev/vg_rac/r*

9. Export the volume group:
   De-activate the volume group:
   # vgchange -a n /dev/vg_rac
   Create the volume group map file:
   # vgexport -v -p -s -m mapfile /dev/vg_rac
   Copy the map file to all the nodes in the cluster:
   Syntax:  # rcp mapfile system_name:target_directory
   Example: # rcp MyMAPfile nodeB:/tmp/scripts

10. Import the volume group on the second node in the cluster:
    Create a volume group directory with the character special file called group:
    # mkdir /dev/vg_rac
    # mknod /dev/vg_rac/group c 64 0x060000
    Note: The minor number has to be the same as on the other node.
    Import the volume group and set the permissions:
    # vgimport -v -s -m mapfile /dev/vg_rac
    # chmod 777 /dev/vg_rac
    # chmod 660 /dev/vg_rac/r*
    # chown oracle:dba /dev/vg_rac/r*
    Check to see if the devices are imported:
    # strings /etc/lvmtab
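Before moving on to the cluster configuration, here is a minimal worked sketch of steps 4 to 6 for a hypothetical database named rac: it creates two of the raw volumes from the table, striped across 10 disks, and links them under /oracle/RAC. The sizes and names follow the table above; the stripe width, database name and link directory are illustrative assumptions to adapt to your own layout.

# Run as root on the node where /dev/vg_rac is currently active
lvcreate -i 10 -I 1024 -L 400 -n rac_raw_system_400 /dev/vg_rac
lvcreate -i 10 -I 1024 -L 120 -n rac_raw_user_120 /dev/vg_rac

# LVM also creates the character (raw) device files /dev/vg_rac/rrac_raw_*,
# which the chmod/chown on /dev/vg_rac/r* in step 8 operates on.
mkdir -p /oracle/RAC
cd /oracle/RAC
ln -s /dev/vg_rac/rac_raw_system_400 system
ln -s /dev/vg_rac/rac_raw_user_120 user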
6.3 Configure HP ServiceGuard Cluster

After all the LAN cards are installed and configured, and the RAC volume group and the cluster lock volume group(s) are configured, you can start the cluster configuration. The following sequence is very important. However, if the RAC volume groups are unknown at this time, you should be able to configure the cluster minimally with a lock volume group. At this point the cluster lock volume group should already have been created. Since in this cookbook we configure one volume group, vg_rac, for the entire RAC cluster, we use vg_rac for the lock volume as well.

1. Activate the lock disk on the configuration node ONLY. The lock volume can only be activated on the node where the cmapplyconf command is issued, so that the lock disk can be initialized accordingly:
   # vgchange -a y /dev/vg_rac
2. Create a cluster configuration template:
   # cmquerycl -n nodeA -n nodeB -v -C /etc/cmcluster/rac.asc
3. Edit the cluster configuration file (rac.asc). Make the necessary changes to this file for your cluster. For example, change the cluster name, and adjust the heartbeat interval and node timeout to prevent unexpected failovers due to DLM traffic.
4. Check the cluster configuration:
   # cmcheckconf -v -C rac.asc
5. Create the binary configuration file and distribute the cluster configuration to all the nodes in the cluster:
   # cmapplyconf -v -C rac.asc
   Note: the cluster is not started until you run cmrunnode on each node or cmruncl.
6. De-activate the lock disk on the configuration node after cmapplyconf:
   # vgchange -a n /dev/vg_rac
7. Start the cluster and view it to be sure it is up and running. See the next section for instructions on starting and stopping the cluster.

Starting the Cluster
1. Start the cluster from any node in the cluster:
   # cmruncl -v
   Or, on each node:
   # cmrunnode -v
2. Make all RAC volume groups and cluster lock volume groups shareable and cluster aware (not packages) from the cluster configuration node. This has to be done only once:
   # vgchange -S y -c y /dev/vg_rac
3. Then, on all the nodes, activate the volume group in shared mode in the cluster. This has to be done each time you start the cluster:
   # vgchange -a s /dev/vg_rac
4. Check the cluster status:
   # cmviewcl -v
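A short sanity check after starting the cluster: confirm from one node that the cluster is up and that vg_rac is activated in shared mode on every node. This is a sketch only; nodeA and nodeB are placeholder host names, and the exact vgdisplay wording may differ between HP-UX releases.

# Run from either node once the cluster has been started
cmviewcl -v                                      # both nodes should be reported as running
for node in nodeA nodeB
do
    echo "== $node =="
    remsh $node /usr/sbin/vgdisplay /dev/vg_rac   # VG status should show the group activated (shared)
done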
How to Shut Down the Cluster (not needed here)
1. Shut down the 9i RAC instances (if they are up and running).
2. On all the nodes, deactivate the volume group in shared mode in the cluster:
   # vgchange -a n /dev/vg_rac
3. Halt the cluster from any node in the cluster:
   # cmhaltcl -v
4. Check the cluster status:
   # cmviewcl -v
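For convenience, the documented shutdown sequence can be wrapped in a small script run from one node. This is only a sketch of the commands above: it assumes the 9i RAC instances have already been shut down on every node, that remsh is enabled for root, and that nodeA and nodeB are placeholders for your node names.

#!/usr/bin/sh
# Sketch of the shutdown sequence in steps 2 to 4 above, run from one node as root.
for node in nodeA nodeB
do
    remsh $node /usr/sbin/vgchange -a n /dev/vg_rac   # step 2: deactivate the shared VG on each node
done
cmhaltcl -v        # step 3: halt the cluster (issued once, from any node)
cmviewcl -v        # step 4: verify the cluster status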