Jack Cai, Oracle
Oracle Real Application Clusters 10g (RAC) is much easier to use than previous releases. Numerous enhancements
have been made to improve manageability and usability. For example, the full stack installation option, the automatic
storage manager, the policy based workload manager, and the much enhanced Oracle Enterprise Manager are all
geared towards making RAC appeal to a broader market. One of the key components in this effort is the
enhancement to cluster diagnosability and verification.
The diagnostics improvements of Oracle Database 10g make diagnosing cluster databases as easy as diagnosing single
instance databases. The flexibility of clusters does not get in the way of diagnosability. The verification improvements
will help users eliminate problems with cluster installations. Starting with the Oracle9i Database and continuing with
Oracle Database 10g, RAC has been constantly enhanced with improvements to provide full stack diagnosability,
monitoring and verification capabilities. The goal is to prevent mistakes from happening before RAC is installed and
to pinpoint root causes of runtime problems as fast as possible. Ultimately, these improvements will make RAC much
easier to manage and will reduce unnecessary database downtime.
In this paper, we will go over the key enhancements, including tracing instrumentation, tools for first pass analysis,
trace data analysis and clusterization of current diagnostic and tracing tools. We will also go over the cluster
verification framework that ensures a proper system configuration for RAC.
The intent of this paper is to showcase the systematic approach Oracle is taking towards diagnostics and verification,
and to demonstrate how much easier Oracle Real Application Clusters is to use and manage. However, users should
be cautioned against attempting to use the diagnostics/debugging tools on their own. Oracle recommends that users
only run the diagnostics facilities under the direction of Oracle support personnel.
Diagnostics of Oracle Database is getting better. Oracle9i Real Application Clusters laid a solid foundation for cluster
diagnostics. A new DIAG daemon process was introduced to handle diagnostics specific tasks, and a memory based
mechanism was introduced to capture trace logs. These two changes greatly enhanced the performance of trace
(diagnostics information) generation. They reduced system overhead for trace generation while increasing trace
flexibility and versatility. Trace instrumentation was mainly implemented for RAC related system components in
Oracle9i. Oracle Real Application Clusters 10g extends trace instrumentation to more generic database system
components and, at the same time, adds new capabilities: clusterization of oradebug and hang analyzer, a new offline
trace loader, and a new trace navigation tool.
The goal of the diagnostic enhancements is to enable sufficient information to be generated for first pass failure
analysis with minimal overhead to the database server. The diagnostic architecture in Oracle9i and Oracle 10g makes
the goal much more attainable. Figure 1 shows the overall diagnostics architecture.
This architecture enables trace processing to incur very low overhead. The DIAG daemon process was introduced in
Oracle9i Database to manage all diagnostics related activities, acting as a broker between online debug tools and
regular database processes. All debugging commands are issued through the DIAG daemon on the same node to reach
their intended targets. This DIAG daemon then coordinates with the DIAG daemons on the other nodes of the same
cluster to complete the commands. Activities such as setting trace levels, archiving the in-memory trace logs to files,
and taking memory/crash dumps are done by the DIAG daemons. Because the DIAG daemons run independently of each
database instance’s normal operations, they do not interfere with the database server and add very little overhead
to it.
Also, the separation of trace generation and trace archiving makes trace generation much more efficient.
All trace information is written into in-memory buffers within the System Global Area (SGA), the shared memory
area of the Oracle Database server, instead of being written out to files directly. Offline tools then transform the
archived logs into human readable formats, load them into a database for query, or display them in the GUI of the
Trace Navigation tool.
Figure 1. Real Application Clusters Diagnostics Architecture. Components shown: Instance 1, Instance 2 and Instance 3, each with a DIAG process; on-line tools; trace files; off-line tools (Trace Loader and Trace Navigation).
At a more detailed level, as shown in Figure 2, all trace information generated is written into circular buffers within
the SGA. The information can be queried with normal SQL through X$ views, namely X$TRACE and X$TRACE_EVENTS.
Trace information is maintained on a per process basis: each process has its own circular memory buffer to store
its trace information.
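The per-process circular buffer idea can be illustrated with a minimal Python sketch. This is not Oracle's implementation: the class name `ProcessTraceBuffer`, the `trace` helper and the tiny capacity are assumptions chosen for illustration. It only shows the key property of a circular buffer, namely that once the buffer is full the oldest records are overwritten by the newest.

```python
from collections import deque


class ProcessTraceBuffer:
    """Fixed-size circular buffer holding the most recent trace records of one process."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest record automatically when full.
        self.records = deque(maxlen=capacity)

    def write(self, event_id, opcode, payload):
        self.records.append((event_id, opcode, payload))

    def snapshot(self):
        # Copy of the current contents, oldest record first.
        return list(self.records)


# One buffer per process id (hypothetical layout, for illustration only).
buffers = {}


def trace(pid, event_id, opcode, payload, capacity=4):
    buffers.setdefault(pid, ProcessTraceBuffer(capacity)).write(event_id, opcode, payload)


# Process 32 writes six records into a buffer of four; the first two are overwritten.
for i in range(6):
    trace(32, 12345, 1, f"record {i}")
trace(33, 12345, 2, "other process")
```

Because each process writes only to its own buffer, no process has to wait for another when emitting a trace record, which is consistent with the low overhead described above.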
Figure 2. Single Instance View
The tracing mechanism is very flexible. All trace information is “event” based. Here an event refers to anything of
interest to be traced. For example, an event can be a group of activities related to memory allocation/deallocation, a
group of actions related to a particular SQL command, or a CacheFusion activity. Simply put, an event can be
anything a programmer wants it to be in this context. There are 1000 event ids; each id can have up to 256 operation
codes (opcodes) and up to 256 levels of output detail. Setting the trace level to 0 allows a
database server to generate a minimal amount of trace in production.
The low overhead and flexibility of this architecture allow more detailed trace information to be generated in a
production environment. Developers now have more leeway to generate critical trace information for diagnosing a
problem without worrying too much about the performance impact on the database server. This makes it much easier
to add trace instrumentation to more and more critical components of the database server. Should something occur,
Oracle support has immediate access to the trace log and can start the diagnostics process
immediately. The level 0 trace should provide enough information for first pass failure diagnostics in
most cases, greatly reducing the need to reproduce problems. When further information is needed, the trace level can be
set higher so that the database server generates more diagnostics information. The overhead to the database server
will be higher, but only for the events and processes traced. Since trace control is on a per event and per process basis,
the overhead is usually within a manageable range.
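A small Python sketch may help make per event, per process trace control concrete. It is an illustration under stated assumptions, not the server's actual logic: the `TraceControl` class is hypothetical, and it assumes that a record tagged with a given level is emitted only when that level does not exceed the level configured for its (event, process) pair, with level 0 as the always-on default.

```python
DEFAULT_LEVEL = 0  # assumed default: minimal tracing is always on


class TraceControl:
    """Hypothetical table of trace levels keyed by (event id, process id)."""

    def __init__(self):
        self.levels = {}

    def enable(self, event_id, pid, level):
        if not 0 <= level <= 255:
            raise ValueError("level must be in 0..255")
        self.levels[(event_id, pid)] = level

    def disable(self, event_id, pid):
        # Fall back to the default level for this event and process.
        self.levels.pop((event_id, pid), None)

    def should_emit(self, event_id, pid, record_level):
        configured = self.levels.get((event_id, pid), DEFAULT_LEVEL)
        return record_level <= configured


control = TraceControl()
control.enable(12345, 32, 5)  # raise the level for one event on one process only
```

With this layout, raising the level for event 12345 on process 32 leaves every other event and process at level 0, which is why the extra overhead stays confined to what is being traced.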
The following sections discuss more details about trace instrumentation, trace loader and trace navigation.
With Oracle Database 10g, cluster trace instrumentation covers more critical components of the database
server and is more customer focused. Oracle9i RAC trace instrumentation focused on activities related to RAC
specific components, such as the interconnect and CacheFusion. Oracle Database 10g expands trace instrumentation to
more generic RDBMS components. To better understand the most likely cause of problems reported by customers,
Oracle development worked closely with Oracle Support to conduct a detailed analysis of customer problem reports.
We found that the majority of issues reported were concentrated on a very small list of components. Oracle focuses
its trace instrumentation efforts on these components so that commonly encountered problems are addressed more
Trace management is very simple. A single command ALTER TRACING within the sqlplus tool is used to issue
related tracing commands, including setting trace details and archiving trace logs. All commands follow a simple
>ALTER TRACING <command>
where <command> can be ENABLE <event-string>, DISABLE <event-spec>, or FLUSH <proc-spec>.
For example, to turn on trace for event 12345 at level 5 for process id 32, one can issue the command:
>alter tracing enable 12345:5:32
To disable background tracing, one can issue the command:
>alter tracing disable 12345:5:BGS
To archive trace logs related to process 32, one can issue the command:
>alter tracing flush 32
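The event strings in the examples above appear to be three colon-separated fields: event id, level, and process. The following Python sketch parses strings of that shape. The field order is an assumption read off the examples (event 12345, level 5, process 32, or the symbolic name BGS); the `EventSpec` type and `parse_event_spec` function are hypothetical helpers, not part of any Oracle tool.

```python
from typing import NamedTuple


class EventSpec(NamedTuple):
    event_id: int
    level: int
    process: str  # a process id such as "32", or a symbolic name such as "BGS"


def parse_event_spec(spec):
    """Split an 'event:level:process' string, as used in the examples above."""
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected event:level:process, got {spec!r}")
    event_id, level, process = parts
    return EventSpec(int(event_id), int(level), process)


enable_spec = parse_event_spec("12345:5:32")
background_spec = parse_event_spec("12345:5:BGS")
```

A script that wraps the tracing commands could use such a parser to validate an event string before passing it on.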
Debugging tools, specifically oradebug and hang analyzer, are clusterized in Oracle Database 10g so that they can
perform cluster wide debugging. Rather than treating a cluster as a collection of unrelated views from the instances,
they view a single system image of the entire cluster.
The clusterization of oradebug allows Oracle support analysts to capture information related to the whole cluster.
oradebug commands can be directed to the entire RAC cluster or just a subset of instances. This is critical when
debugging in a cluster. Since problems occurring on one instance may stem from issues on other instances within the
cluster, the ability to correlate them with a single cluster view is very important in helping to find the source of a
Similarly, the hang analyzer is clusterized. With this capability, server hangs or starvation can be detected much faster. For
example, a session may be blocked on one node because of a resource constraint on another node within
the cluster. The clusterized hang analyzer detects this easily.
As shown below in Figure 3, the trace loader takes the archived trace logs from all instances of a cluster database,
converts them to text formatted files, arranges them according to cluster time sequence, and loads them into a separate
Oracle database for archiving. Note that the trace loader operates in an offline mode, i.e., it does not interact with the
database server that generates the trace logs.
Figure 3. Offline Trace Loader. Stages shown: raw trace files (in binary or text format), data conversion, converted trace files (in text format), data loading, target database.
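Arranging logs from several instances in cluster time sequence is essentially a merge of already sorted streams. The Python sketch below illustrates that step only. The record layout `(timestamp, instance, message)` and the function name are assumptions, and it presumes that each instance's log is already ordered by a timestamp that is comparable across the cluster.

```python
import heapq


def merge_by_cluster_time(*instance_logs):
    """Merge per-instance logs, each sorted by timestamp, into one sequence.

    Each record is a (timestamp, instance, message) tuple (hypothetical layout).
    """
    return list(heapq.merge(*instance_logs, key=lambda record: record[0]))


instance1 = [(100, 1, "lock requested"), (130, 1, "lock granted")]
instance2 = [(110, 2, "lock held"), (120, 2, "lock released")]
merged = merge_by_cluster_time(instance1, instance2)
```

`heapq.merge` reads its inputs lazily, so the same approach works on large log files when the inputs are iterators rather than lists.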
One main benefit of archiving trace logs into the database with trace loader is that users can search trace information
using the powerful query capabilities of the Oracle Database server, a much more desirable tool compared with using
plain text editors. In addition, the database archive can be used as a repository to track issues related to a particular
cluster. Finally, the log database can be used as a platform independent repository for various clusters.
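To illustrate why a queryable repository beats a text editor, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for the Oracle repository. The table name `trace_log`, its columns, and the sample rows are all hypothetical; the actual schema created by the trace loader is not described in this paper.

```python
import sqlite3

# In-memory database standing in for the trace repository (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trace_log ("
    "ts INTEGER, instance INTEGER, pid INTEGER, event_id INTEGER, message TEXT)"
)
sample_rows = [
    (100, 1, 32, 12345, "lock requested"),
    (110, 2, 40, 12345, "lock held"),
    (115, 2, 41, 20000, "buffer flushed"),
    (130, 1, 32, 12345, "lock granted"),
]
conn.executemany("INSERT INTO trace_log VALUES (?, ?, ?, ?, ?)", sample_rows)


def events_for(connection, event_id):
    """Return all records of one event across instances, in time order."""
    cursor = connection.execute(
        "SELECT ts, instance, message FROM trace_log "
        "WHERE event_id = ? ORDER BY ts",
        (event_id,),
    )
    return cursor.fetchall()


lock_history = events_for(conn, 12345)
```

A single query returns the history of one event across all instances, something that would otherwise require searching several text files by hand.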
Trace loader is simple to use. It is a command line tool that asks for input source and database login credentials if
loading to a database repository is requested. In the example below, the trace loader appends converted logs from
f1.trw into the repository identified by scott/tiger.
%trcldr mode=append ifile=f1.trw userid=scott/tiger
The trace navigation tool greatly helps Oracle development as well as support personnel to track down problems. It
makes analyzing trace logs much easier than poring over a collection of text messages. The GUI works
directly on the trace log files for the entire cluster. It arranges trace logs in cluster wide time sequence. It provides two
main options for displaying trace log data. The first one is the Coloring and Query pane as shown in
Figure 4. The second one is the Wait-For Tree pane as shown in Figure 5.
In the Coloring and Query pane, trace log display can be colored to differentiate various attributes or protocols that
are being tracked. Users can specify or designate a color for a specific protocol within a trace log archive. Trace logs
with the same color can be stepped through sequentially, either backward or forward, across multiple trace log files.
Users can prune the trace logs to show traces with only the desired colors. There are well defined queries that can be
executed based on the keyword selected. The query results can be colored as well. For example, users can check
which phase a particular protocol is in, where in the trace it begins and ends, and its memory usage from the queries.
Figure 4. Trace Navigation Tool: Coloring and Query
The wait-for tree pane gives a visual presentation of resource dependencies. It makes it much easier to identify a
resource that is blocking a group of processes and the process that holds that resource. It presents dependencies in a
tree-like representation. This is especially useful in diagnosing starvation or deadlocks. Figure 5 shows a group
of processes in instances 2, 3 and 4 that are blocked on some processes in instance 1, which in turn are blocked on
some processes in instances 2, 3 and 4.
Figure 5. Wait-for Tree
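The situation in Figure 5, where processes wait on each other in a circle, is a cycle in a wait-for graph. The Python sketch below shows one standard way to find such a cycle with a depth-first search. It illustrates the concept only and is not the algorithm used by the hang analyzer or the trace navigation tool; the process labels are made up.

```python
def find_cycle(waits_for):
    """Return one cycle in a wait-for graph as a list of nodes, or None.

    waits_for maps a process to the list of processes it is blocked on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for blocker in waits_for.get(node, ()):
            state = color.get(blocker, WHITE)
            if state == GREY:
                # Reached a node already on the current path: a cycle.
                return path[path.index(blocker):] + [blocker]
            if state == WHITE:
                found = visit(blocker)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(waits_for):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical processes labelled "instance:process".
blocked = {"i2:p7": ["i1:p3"], "i1:p3": ["i2:p7"]}
deadlock = find_cycle(blocked)
```

A chain without a cycle, such as one process waiting on another that is not itself blocked, indicates a plain block or starvation rather than a deadlock.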
Cluster verification is another key element in making RAC easier to use and manage. The framework will eliminate
many RAC installation problems caused by incorrect underlying cluster system configurations. Because cluster systems
are still not as common as regular servers and users are less familiar with them, a large percentage of RAC
installation and configuration issues are attributed to incorrect configuration of the underlying cluster hardware and
system software. Some problems prevent proper RAC installation; others show up later as RAC problems. For example,
there was one case where the public network was mistakenly configured as the private interconnect of a cluster. RAC
installation went well, and RAC even functioned well in a production environment with a light workload. When the
workload started to increase, however, the database server encountered various performance issues. Once the
interconnect configuration problem was finally corrected, RAC worked just as well with a heavy load. Issues like this
cause a lot of stress for both users and Oracle Support and, more critically, create unnecessary system downtime.
The cluster verification framework is designed to address issues like those above as well as other issues related to
cluster configuration at both the system level and the database level. It takes a systematic approach to checking and
verifying proper cluster configuration at all levels of the complete product stack. The approach of cluster verification
is to weed out configuration mistakes through pre and post checks of installation and configuration steps, and
ongoing checking of database server components.
As shown in Figure 6, the cluster verification framework is utilized by the cluvfy command line tool or by other tools
such as the Oracle Universal Installer and various configuration tools through a set of verification APIs. Layer 2
verification APIs are task-oriented whereas layer 1 APIs are used to carry out verification actions. Vendors’ APIs and
OSD (Operating System Dependent) APIs are invoked whenever appropriate to carry out platform level verification.
The command line tool will carry out verification tasks according to profiles that are defined by customers or by
defaults defined by Oracle. It can be run independently of the database.
Figure 6. Cluster Verification Framework Architecture. Elements shown: Cluster Verification Utility (command line tool); Layer 2 cluster verification APIs (cluster verification tasks); Layer 1 cluster verification APIs; Vendor API; cvu libs; Operating System.
The cluster verification framework verifies the full stack. It checks all components that make up the cluster database,
including cluster hardware, storage, network interconnect, as well as various components of the cluster database itself.
Verification is non-intrusive. It will not alter the state of the system being verified, so it is safe to run verification
tasks to check the health of a system at any time. Even though most problems happen at initial installation, ongoing
verification is still very important for situations such as the one mentioned earlier, as well as configuration changes
after initial installations.
To be able to perform verification effectively at the system level, Oracle is working very closely with platform vendors
to leverage their verification capabilities. A vendor independent API layer, similar to the OSD layer, will be created to
call all the platform dependent capabilities.
Verification tasks are grouped into two verification categories, the stage verification and the component verification.
Both categories of verification will call on verification tasks to perform verification according to their respective
Stage refers to the various steps that are performed during a complete RAC deployment process, for example, the
setup of interconnect, the setup of storage system, or the layout of datafiles. Stage verification refers to the cluster
verification tasks performed prior to the start of each stage, and the verification tasks after the completion of each
stage to ensure each deployment step is done properly. The pre-stage verification performs a well-defined set of
checks to be carried out before the installation process enters a stage. It verifies that all the pre-requisite steps for the
stage have been done successfully and verifies that the cluster is in the desired state. The post-stage verification
performs a well-defined set of checks to be carried out after the installation process completes a stage. It verifies that
the set of operations in the stage have been accomplished, and it verifies that the cluster is left in a consistent state for
the next stage.
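The pre-stage and post-stage pattern can be sketched in a few lines of Python. This is an illustration of the idea only: `run_stage`, the check functions, and the dictionary that stands in for cluster state are all hypothetical and do not correspond to the verification APIs. Note that the checks only read the state, in keeping with the non-intrusive nature of verification.

```python
def run_stage(name, pre_checks, action, post_checks, state):
    """Run one deployment stage guarded by pre-stage and post-stage checks."""
    failed = [check.__name__ for check in pre_checks if not check(state)]
    if failed:
        return {"stage": name, "status": "pre-check failed", "failed": failed}
    action(state)  # the deployment step itself; the checks never modify state
    failed = [check.__name__ for check in post_checks if not check(state)]
    if failed:
        return {"stage": name, "status": "post-check failed", "failed": failed}
    return {"stage": name, "status": "ok", "failed": []}


# Hypothetical checks and stage action for a storage setup stage.
def interconnect_configured(state):
    return state.get("interconnect") == "private"


def storage_visible(state):
    return state.get("storage") == "mounted"


def setup_storage(state):
    state["storage"] = "mounted"


cluster = {"interconnect": "private"}
result = run_stage(
    "storage setup", [interconnect_configured], setup_storage, [storage_visible], cluster
)
```

If a pre-stage check fails, the stage action is never run, so a misconfigured prerequisite is reported before it can cause a harder-to-diagnose failure later.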
Examples of stage verification are:
Hardware and OS setup
Cluster File System setup
Cluster Manager setup
Cluster Services setup
RDBMS installation
Oracle Net Configuration
Database Configuration
Node Addition
Storage Addition
Network Modification
Verification also treats RAC as a collection of coherent components, such as a server, an interconnect, a storage
system, the Oracle software stack, the OS software stack, etc. Component verification performs checks on these
components individually. If a component is an aggregate component of sub-components, verification will be
performed on those sub-components. Component verification checks various aspects of a component, including
availability and integrity.
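Verifying an aggregate component by verifying its sub-components is a natural recursion over a tree. The Python sketch below shows that shape. The `Component` class, the component names, and the hard-coded check results are assumptions made for illustration, not the framework's actual component model.

```python
class Component:
    """Hypothetical component: either a leaf with a check, or an aggregate."""

    def __init__(self, name, check=None, children=()):
        self.name = name
        self.check = check  # callable returning True or False, for leaf components
        self.children = list(children)

    def verify(self):
        """Return the names of all failing leaf components under this one."""
        if self.children:
            failures = []
            for child in self.children:
                failures.extend(child.verify())
            return failures
        return [] if self.check() else [self.name]


# Made-up component tree with one failing sub-component.
cluster = Component("cluster", children=[
    Component("interconnect", check=lambda: True),
    Component("storage", children=[
        Component("volume A", check=lambda: True),
        Component("volume B", check=lambda: False),
    ]),
])
failures = cluster.verify()
```

Reporting the failing leaf rather than the aggregate points the user at the specific sub-component that needs attention.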
The verification framework includes a diagnostic mode. Under this mode, it will attempt to pinpoint the source of the
failure when a verification task fails. This mode is available for both stage and component verifications.
Examples of command line tool usage:
%cluvfy stage -post HWOS
This command performs a post-stage check for the hardware and OS stage.
%cluvfy comp sys -p RDBMS
This command performs verification on system components for RDBMS installation.
The diagnostics facilities are mainly aimed at helping Oracle support and development organizations resolve customer
problems faster. They improve system availability by reducing downtime. Users should not attempt to change trace
levels or issue debug commands without supervision from Oracle support, especially on production systems, as doing so
may unnecessarily impact database server performance and potentially cause other problems.
The cluster verification framework is part of Oracle Database 10g, but the command line tool will be made available in a
maintenance release after Oracle Database 10g becomes generally available. DBAs can, and are encouraged to, use the
cluvfy tool to perform verification on their own.
Real Application Clusters is getting easier to use and manage. One important aspect is the enhancement of the
diagnostics and verification capabilities. The systematic approach to managing diagnostics and verification should
reduce the time customers spend resolving problems with Oracle Support, and should help customers eliminate
problems due to installation/configuration issues.