Oracle9i RAC Architecture

High Availability and Scalability Technologies: An Oracle9i RAC Solution
Presented by: Arquimedes Smith
Oracle9i RAC Architecture
Real Application Clusters (RAC) is a powerful new feature of the Oracle9i
Database that can greatly enhance an application’s scalability and availability.
An Oracle9i RAC database has two or more instances accessing a shared database
through cluster technology. A cluster is a group of machines (or nodes) that work
together to perform the same task. To support this architecture, two or more
machines that host the database instances are linked by a high-speed interconnect
to form the cluster. The interconnect is a physical network used as a means of
communication between the nodes of the cluster.
Objectives:
• Identify the components of hardware clusters and the Oracle9i Real Application Clusters architecture.
• Identify the functions of the Global Cache Service, the Global Enqueue Service, and the Global Resource Directory.
• Describe the concepts and architecture of Cache Fusion.
• Define when and how dynamic resource remastering occurs.
• Explain database recovery for Cache Fusion.
Real Application Clusters Concepts
Oracle9i Real Application Clusters allow multiple instances to execute against the same
database. The typical installation involves a cluster of servers with access to the same disks.
The nodes that actually run instances form a proper subset of the cluster. The cluster nodes
are connected by an interconnect that allows them to share disks and run the Cluster Group
Services (CGS), the Global Cache Service, and the Global Enqueue Service.
A node is defined as the collection of processors, shared memory and disks that runs an
instance. A node may have more than one CPU, in either an SMP or a NUMA
configuration. The node monitor is part of the vendor-provided cluster management
software (CMS) that monitors the health of processes running in the cluster, and is used by
CGS to control the membership of instances in Real Application Clusters.
A node-to-instance mapping defines which instances run on which nodes. For example, it
can specify that instance RACA runs on host 1, and instance RACB runs on host 2. This
mapping is stored in text configuration files on UNIX, and in the registry on Windows.
Benefits of Real Application Clusters
By using multiple instances running on their own nodes against the same database, Real
Application Clusters provide the following advantages over single instance databases:
• Applications have higher availability because they can be accessed from any instance: an instance failure on one node does not prevent work from continuing on one or more surviving instances.
• More database users can be supported when a single node reaches its capacity to support additional sessions.
• Some processing, particularly operations that can be executed with parallel components, is completed faster when the work is spread across multiple nodes.
• Work can scale (more work can be completed in the same amount of time) when each instance can be optimized to support a maximum workload.
Block Transfers in Real Application Clusters
In Oracle8i, Consistent Read Cache Fusion was introduced to Oracle Parallel Server. This
removed the need to force disk writes to satisfy queries and greatly simplified application
design for Oracle Parallel Server. Consistent Read Cache Fusion allowed a read consistent
block image to be transferred from the buffer cache of the writing instance to the cache of
the reading instance using the cluster interconnect.
This contrasted with previous releases that required the current block to be forced to disk
for the reading instance, followed, in most cases, by forcing writes of rollback segment
blocks to produce a read consistent copy of the block in the reading instance. Consistent
Read Cache Fusion was also known as Cache Fusion, Phase 1, or Write-Read Cache
Fusion. The latter name refers to the fact that blocks changed in one instance (write
activity) have a read consistent image sent across the interconnect to satisfy a query (read
activity).
In Oracle9i, Real Application Clusters replace Oracle Parallel Server and provide full
Cache Fusion. With Cache Fusion, modified blocks currently in the buffer cache of one
instance are transferred to another instance across the cluster interconnect rather than by
forcing disk writes of the blocks. This is true for blocks required for changes by the second
instance (write-write transfers) as well as for queries on the second instance (write-read
transfers). The mechanism also allows read-read and read-write transfers, which reduces
the need to read blocks from disk.
Cache Fusion Model
The Global Resource Directory contains the current status of resources shared by the
instances. Its contents are maintained by the Global Cache and Global Enqueue
Services using messages containing sufficient information to ensure that the current
block image can be located. These messages also identify block copies being retained
by an instance for use by the recovery mechanisms and include sequence information to
identify the order of changes made to that block since it was read from disk.
The mode in which a block resource is held (NULL, Shared, or Exclusive), as well as
the resource role (local or global) and past image history, are maintained by the Global
Resource Directory. Details of resource roles and past images are provided later in this
lesson. The Global Cache Service is responsible for assigning resource modes and
roles, updating their status, locating the most current block image when necessary, and
informing holders of past images when those images are no longer needed.
The information managed and maintained by the Global Resource Directory allows
Real Application Clusters to minimize the time taken to return to normal processing
following an instance failure or cluster reconfiguration. Additionally, this information
allows the LMS process to migrate a block resource master from its original location.
The Global Cache Service migrates the block resource record to the requesting instance
when a single instance appears to make exclusive, but repeated, use of the block.
Global Cache Service Resource Modes
A block resource can be held by an instance in one of three modes:
• NULL (N): The NULL mode is the default status for each instance. It indicates that the instance does not currently hold the resource in any mode.
• Shared (S): The shared resource mode is required for an instance to read the contents of a block, typically to satisfy a query. Multiple instances can hold a shared mode resource on the same block concurrently.
• Exclusive (X): The exclusive resource mode allows an instance to change the contents of a block covered by the resource. When an instance holds a resource in exclusive mode, all other instances must hold it in NULL mode.
Global Cache Service Resource Roles
Global Cache Service block resources are held in either a local or a global role. When a
resource is held with a local role, it behaves very similarly to a PCM lock in Oracle
Parallel Server. That is, an exclusive mode resource can be held by only one instance at a
time, and no other instance can then hold that resource in shared or exclusive mode, whereas
multiple instances can hold the resource in shared mode concurrently. Also, the Global
Resource Directory does not have to retain any information about a resource being held in
NULL mode by an instance.
Global roles allow these restrictions to be broken. Specifically, an instance can use a
global resource for a consistent read while it is concurrently held in exclusive mode by
another instance. Also, two instances can hold dirty copies of a block concurrently
although only one of them can have the resource in exclusive mode. Clean blocks held
only in shared mode on multiple instances do not need a global role resource.
Also, global block resource information can be stored in the Global Resource Directory to
manage the history of block transfers even if the resource mode is NULL. With local
resources, the Global Cache Service discards resource allocation information for instances
which downgrade a resource to NULL mode.
Fast Real Application Clusters Reconfiguration
Many cluster hardware vendors use a disk-based quorum system that allows each node to
determine which other nodes are currently active members of the cluster. These systems
also allow a node to remove itself from the cluster or to remove other nodes from the
cluster. The latter is accomplished through a type of voting system, managed through the
shared quorum disk, that allows nodes to determine which will remain active if one or
more of them become disconnected from the cluster interconnect.
Real Application Clusters implement a similar disk-based system to determine which
instances are currently active and which are not. The system uses a heartbeat mechanism
to perform frequent checks against the disk to determine the status of each instance, and
also uses a voting system to exclude instances that can no longer be reached by one or more
active instances.
At each heartbeat, every member instance gives its impression of the other members’
availability. If they all agree, nothing further is done until the next heartbeat. If two or
more instances report a different instance configuration from each other (for example,
because the cluster interconnect is broken between a pair of nodes), then one member
arbitrates among the different membership configurations. Once this configuration is
tested, the arbitrating instance uses the shared disk to publish the proposed configuration
to the other instances. All active instances then examine the published configuration and,
if necessary, terminate themselves.
Background Processes
Although some of the background processes used by Real Application Clusters have the
same names as those in Oracle Parallel Server, their functions and activities are different
from those in previous Oracle cluster-enabled software releases. These differences are
discussed below.
• BSP: The Block Server Process was introduced in Oracle8i Parallel Server to support Consistent Read Cache Fusion. It does not exist in a Real Application Clusters database, where these activities are performed by the LMS process.
• LMS: This process was known as the Lock Manager Server Process in Oracle Parallel Server. The new LMS process in Real Application Clusters executes as the Global Cache Service Process.
• LMON: This process was known as the Lock Manager Monitor in Oracle Parallel Server. While Real Application Clusters databases use the same process name, the process is defined as the Global Enqueue Service Monitor.
• LMD: There was a Lock Manager process, called LMD0, in Oracle Parallel Server, with the zero at the end of the name implying that multiple copies of the process may be allowed. This process is not used by Real Application Clusters; a new process, called LMD, executes as the Global Enqueue Service.
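To check which of these background processes are running on an instance, you can query the V$BGPROCESS view. The following query is a minimal sketch; the exact process names (for example, LMS0 versus LMS, LMD0 versus LMD) and the name patterns used in the filter vary by release and platform:
SELECT name, description
FROM   v$bgprocess
WHERE  paddr != '00'                      -- running processes have a nonzero process address
AND   (name LIKE 'LM%' OR name LIKE 'LCK%');   -- name patterns are assumptions for illustration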
[Figure: Oracle9i RAC architecture with two instances, RAC1 and RAC2. Each instance has its own SGA (Global Resource Directory, database buffer cache, shared pool with data dictionary and library caches, Java pool, large pool, streams pool, and redo log buffer), PGA areas with private SQL areas and cursors, and background processes (LMS, LMON, LMD, LCK, DBWR, LGWR, CKPT, and others). Both instances share the data files and control files, and each writes to its own redo log files.]
Background processes in RAC
The background processes of a single instance also run in a RAC configuration; each
instance has its own set of these background processes.
The following characteristics are unique to a RAC implementation as opposed to a single
instance configuration:
• RAC is a configuration with multiple instances of Oracle running on many nodes.
• Multiple instances of Oracle share a single physical database.
• Multiple instances reside on different nodes and communicate with each other via a cluster interconnect.
• Instances may join and leave the cluster dynamically, provided the number stays within the MAX_INSTANCES value defined in the parameter file.
• Instances share a common database that comprises common data files and control files.
• Each instance participating in the clustered configuration has its own redo log files, rollback segments, and undo tablespaces (see the example query after this list).
• All instances participating in the clustered configuration can simultaneously execute transactions against the common shared database.
• Instances participating in the clustered configuration communicate via the cluster interconnect using a technology called Cache Fusion.
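For example, you can see the separate redo threads and the instance to which each is mapped by querying V$THREAD. This is only a sketch; the exact column set varies slightly between releases:
SELECT thread#, instance, status, enabled
FROM   v$thread;     -- one row per redo thread, showing the instance name that uses it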
Shared Initialization Parameter Files
In previous Oracle cluster-enabled software releases, you had to have a separate
initialization file for each instance in order to assign different parameter values to the
instances. However, certain parameters had to have the same value for every instance in a
clustered database. To simplify the management of the instance-specific and common
database parameters, the IFILE parameter was commonly used. This parameter would
point to a file containing the common parameter values and was included in each of the
individual instance parameter files.
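As an illustration of that older approach, each instance’s file might contain its own settings plus an IFILE entry pointing to the shared file. The file names and paths below are hypothetical:
# initPROD1.ora -- instance-specific file (names and paths are hypothetical)
instance_name   = PROD1
thread          = 1
instance_number = 1
ifile           = /shared/admin/initPROD_common.ora   # common parameters for all instances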
In Oracle9i, you can store the parameters for all the instances belonging to an Oracle Real
Application Clusters database in a single file. This simplifies the management of the
instances because you only have one file to maintain. It is also easier to avoid making
mistakes, such as changing a value in one instance’s file but not in another’s, if all the
parameters are listed in one place.
To allow values for different instances to share the same file, parameter entries that are
specific to a particular instance are prefixed with the instance name using a dot notation.
For example, to assign different sort area sizes to two instances, PROD1 and PROD2, you
could include the following entries in your parameter file:
prod1.sort_area_size = 1048576
prod2.sort_area_size = 524288
Shared Initialization Parameter File
When you put the parameters for all your instances in a single initialization file, you need
this file to be available to the process that starts up each instance. If your instances are
started automatically as part of the system startup routines, you need a copy of the file on
each node in the cluster. However, in Oracle9i, you can store the parameters for all the
instances in a special binary file known as a server parameter file (SPFILE). By storing
your SPFILE on a shared cluster disk partition, the parameter file becomes available to
any node in the cluster. Therefore, you only need to keep and maintain one copy of this
file.
The cluster in this example consists of two nodes. Node 1 defines /hdisk1 as its
$ORACLE_HOME directory and uses the name ORAC1 for its instance. Node 2 also uses
/hdisk1 for its $ORACLE_HOME and has ORAC2 as its instance name. A partition on
the raw device, called /dev/rdisk1/spfile, holds the binary file, the SPFILE.
Each node has an instance-specific initialization file configured, containing just one entry:
SPFILE = /dev/rdisk1/spfile
This entry simply points to the raw partition holding the server parameter file containing
the entries that are common to both instances as well as the instance-specific entries.
The example shows two of the entries in SPFILE. These entries contain parameters that
follow the naming standard suggested for instances: using the SID, defined at the
operating system level, for the INSTANCE_NAME. Because instance names are unique to
each instance, they use dot notation to include the instance name with the parameter name:
orac1.instance_name = orac1
orac2.instance_name = orac2
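As a sketch of how such a shared server parameter file might be created and maintained, you could build the SPFILE from an existing text parameter file and then record instance-specific values with the SID clause. The SPFILE location matches the example above; the source PFILE path is hypothetical:
CREATE SPFILE = '/dev/rdisk1/spfile'
  FROM PFILE = '/hdisk1/dbs/initorac.ora';        -- hypothetical source text parameter file

ALTER SYSTEM SET sort_area_size = 1048576
  SCOPE = SPFILE SID = 'orac1';                   -- writes an orac1-prefixed entry to the SPFILE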
Instance Naming
In addition to the instance name assigned at the operating system level, using the
ORACLE_SID environment variable, Oracle Parallel Server instances could be named with
the INSTANCE_NAME parameter. However, there were limitations with this identification
method:
• The INSTANCE_NAME values did not have to be unique in different instances of the same database.
• On some platforms, the ORACLE_SID value could be the same for all instances of the same database.
This meant that the instance names could not be used reliably by management tools to
identify a particular instance.
In Oracle9i, each instance of a Real Application Clusters database is required to have a
unique name assigned with the SID. Unique instance names enable system-management
tools to identify instances to the user with the instance names, with the assurance that these
names are unique. Unique names also allow the instances associated with the same database
to share an initialization file through the use of the SID as a parameter prefix, as described
earlier.
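Because the names and numbers are unique, a query such as the following, a sketch using the standard GV$INSTANCE view, identifies every instance of the clustered database from any node:
SELECT inst_id, instance_number, instance_name, host_name
FROM   gv$instance
ORDER  BY instance_number;    -- one row per running instance in the cluster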
Unique Instance Numbers
Oracle Parallel Server instances chose an unused instance number if a multi-instance
database instance started up without specifying a value for the INSTANCE_NUMBER
parameter. For example, if neither of the parameter files in a two-instance cluster database
included an INSTANCE_NUMBER parameter, then the first instance started would become
instance one and the second to start would become instance two. The instance numbers
depended solely on the startup order and either instance could be identified as one or as
two.
Further, if the THREAD parameter were given a value for each of these instances, the instance
numbers and thread numbers for the instances could be different. This could be confusing
when querying the dynamic performance tables and trying to manage the two instances.
In Real Application Clusters, the default value for the INSTANCE_NUMBER parameter is
set to 1. If you try to start two instances without specifying non-default values for one of
them, the second instance will not start and you will receive an error message. To start
successfully, each instance is required to specify a unique number in its
INSTANCE_NUMBER parameter.
Although this requirement for unique instance numbers will not prevent you from using
different values for the INSTANCE_NUMBER and THREAD parameters for an instance,
you are more likely to change them both if you are editing the parameter file.
Instance Names and Numbers
There are three different databases shown in the example below: PROD, DEV, and TEST.
PROD has four instances, PROD1, PROD2, PROD3, and PROD4, each running on one
of the four available nodes. DEV has two instances, DEV1 and DEV2, running on node A
and node C respectively. Similarly, TEST has two instances, TEST1 on node B and
TEST2 on node D.
The recommended naming and numbering for these instances would be as follows:
1. Number the redo threads for each instance 1, 2, 3, 4, and so on. That is, your redo thread
numbers should start at 1 and increment by 1. In this example, each of the eight instances is
assigned a thread number in this way.
2. Set the ORACLE_SID to be the database name plus its redo thread number as a suffix. For
example, on node A, you would use ORACLE_SID = PROD1 for the PROD database
instance and ORACLE_SID = DEV1 for the DEV database instance.
3. Set the THREAD parameter to match the thread number you chose for the instance and
which you should have reflected in the instance’s ORACLE_SID value. For example, the
THREAD value for the PROD database instance on node C should be 3.
4. Set the INSTANCE_NUMBER parameter value to be the same as the THREAD parameter
value for each instance. For example, the INSTANCE_NUMBER for the PROD database
instance on node C should also be set to 3, the value assigned to THREAD in item 3.
5. Set the INSTANCE_NAME parameter for each instance to match the SID name in the
parameter file. For example, the INSTANCE_NAME value for the TEST database instance
on node D should be TEST2.
Following these recommendations, the parameter file for the DEV database would contain the
following entries:
dev1.thread = 1
dev2.thread = 2
dev1.instance_number = 1
dev2.instance_number = 2
dev1.instance_name = dev1
dev2.instance_name = dev2
SRVCTL Commands
SRVCTL is a tool that lets you manage your Real Application Clusters environment from
the command line. It replaces and extends the capabilities of the Oracle Parallel Server
tool, called OPSCTL, which supported only the START and STOP subcommands. In
comparison to OPSCTL, the START and STOP subcommands have been extended with
additional options and new subcommands have been added to support more functionality.
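Typical invocations look like the following sketch. Option names changed between releases, so treat these lines as illustrative rather than exact Oracle9i syntax, and check srvctl -help (or the documentation for your release) for the precise options. The database and instance names reuse the earlier PROD example:
# illustrative only; exact subcommands and flags depend on the release
srvctl start database -d PROD
srvctl stop instance -d PROD -i PROD2
srvctl status database -d PROD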