Database

DIAGNOSTICS AND CLUSTER VERIFICATION OF ORACLE REAL APPLICATION CLUSTERS 10g

Jack Cai, Oracle

INTRODUCTION

Oracle Real Application Clusters 10g (RAC) is much easier to use than previous releases. Numerous enhancements improve manageability and usability. For example, the full stack installation option, the automatic storage manager, the policy based workload manager, and the much enhanced Oracle Enterprise Manager are all geared towards making RAC appeal to a broader market. One of the key components in this effort is the enhancement to cluster diagnosability and verification. The diagnostics improvements of Oracle Database 10g make diagnosing cluster databases as easy as diagnosing single instance databases. The flexibility of clusters does not get in the way of diagnosability. The verification improvements will help users eliminate problems with cluster installations.

Starting with the Oracle9i Database and continuing with Oracle Database 10g, RAC has been steadily enhanced to provide full stack diagnosability, monitoring and verification capabilities. The goal is to catch configuration mistakes before RAC is installed and to pinpoint root causes of runtime problems as fast as possible. Ultimately, these improvements will make RAC much easier to manage and will reduce unnecessary database downtime.

In this paper, we will go over the key enhancements, including tracing instrumentation, tools for first pass analysis, trace data analysis and clusterization of current diagnostic and tracing tools. We will also go over the cluster verification framework that ensures a proper system configuration for RAC.

The intent of this paper is to showcase the systematic approach Oracle is taking towards diagnostics and verification, and to demonstrate how much easier Oracle Real Application Clusters is to use and manage. However, users should be cautioned against attempting to use the diagnostics/debugging tools on their own.
Oracle recommends that users only run the diagnostics facilities under the direction of Oracle support personnel.

DIAGNOSTICS ENHANCEMENTS

Diagnostics of the Oracle Database continue to improve. Oracle9i Real Application Clusters laid a solid foundation for cluster diagnostics. A new DIAG daemon process was introduced to handle diagnostics specific tasks, and a memory based mechanism was introduced to capture trace logs. These two changes greatly enhanced the performance of trace (diagnostics information) generation. They reduced system overhead for trace generation while increasing trace flexibility and versatility. Trace instrumentation was mainly implemented for RAC related system components in Oracle9i. Oracle Real Application Clusters 10g extends trace instrumentation to more generic database system components and adds new capabilities: clusterization of oradebug and hang analyzer, a new offline trace loader, and a new trace navigation tool.

CLUSTER DIAGNOSTICS ARCHITECTURE

The goal of the diagnostic enhancements is to enable sufficient information to be generated for first pass failure analysis with minimal overhead to the database server. The diagnostic architecture in Oracle9i and Oracle Database 10g makes the goal much more attainable. Figure 1 shows the overall diagnostics architecture. This architecture enables trace processing to incur very low overhead. The DIAG daemons do not interfere with the normal operations of the database server.

The DIAG daemon process was introduced in Oracle9i Database to manage all diagnostics related activities, acting as a broker between online debug tools and regular database processes. All debugging commands are issued through the DIAG daemon on the same node to reach their intended targets. This DIAG daemon then coordinates with DIAG daemons on other nodes of the same cluster to complete the commands.
Activities such as setting trace levels, archiving the in-memory trace logs to files, and taking memory/crash dumps are done by the DIAG daemons. This way, the normal operations of the database server are not affected, since the DIAG daemons run independently of each database instance's normal operations, resulting in very little overhead to the database server. Also, the separation of trace generation and trace archiving makes trace generation much more efficient and faster.

This architecture captures trace information in in-memory buffers within the System Global Area (SGA), the shared memory of the Oracle Database server, instead of writing it out to files directly. Offline tools then transform the archived logs into human readable formats, load them into a database for query, or display them with the GUI of the Trace Navigation tool.

Figure 1. Real Application Clusters Diagnostics Architecture (figure labels: Instance 1, Instance 2 and Instance 3, each with an SGA and a DIAG process; on-line tools: oradebug, X$ views; off-line tools: trace loader, trace navigation tool; trace files)

At a more detailed level, as shown in Figure 2, all trace information generated is written into circular buffers within the SGA. The information can be queried with normal SQL through the X$ views X$TRACE and X$TRACE_EVENTS. Trace information is maintained on a per process basis. Each process has its own circular memory buffer to store the trace information.

Figure 2. Single Instance View

The tracing mechanism is very flexible. All trace information is “event” based. Here an event refers to anything of interest to be traced. For example, an event can be a group of activities related to memory allocation/deallocation, a group of actions related to a particular SQL command, or a CacheFusion activity.
Simply put, an event can be anything a programmer wants it to be in this context. There are 1000 event ids; each id can have up to 256 operation codes (opcodes) and up to 256 levels of output detail. Setting the trace level to 0 lets a database server generate a minimal amount of trace in production.

The low overhead and flexibility of this architecture allow more detailed trace information to be generated in a production environment. Developers now have more leeway to generate critical trace information for diagnosing a problem without worrying too much about the performance impact on the database server. This makes it much easier to add trace instrumentation to more and more critical components of the database server. Should a problem occur, Oracle support has immediate access to the trace log and can start the diagnostics process right away. The level 0 trace should provide enough information for first pass failure diagnostics in most cases, greatly reducing the need to reproduce problems.

When further information is needed, the trace level can be set higher so that the database server generates more diagnostics information. The overhead to the database server will be higher, but only for the events and processes traced. Since trace control is on a per event and per process basis, the overhead is usually within a manageable range.

The following sections discuss trace instrumentation, the trace loader and trace navigation in more detail.

CLUSTER TRACE INSTRUMENTATION

With Oracle Database 10g, cluster trace instrumentation covers more critical components of the database server and is more customer focused. Oracle9i RAC trace instrumentation focuses on activities related to RAC specific components, such as the interconnect and CacheFusion. Oracle Database 10g expands trace instrumentation to more generic RDBMS components.
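The tracing mechanism described under CLUSTER DIAGNOSTICS ARCHITECTURE (one circular in-memory buffer per process, with events gated by trace level) can be illustrated with a minimal Python sketch. This is not Oracle's implementation: the class and method names are hypothetical, and the gating rule (a record is kept when its level does not exceed the level enabled for its event id, with level 0 always on) is an assumption made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceRecord:
    event_id: int   # which event the record belongs to
    opcode: int     # operation code within the event
    level: int      # detail level of this record
    message: str


class ProcessTraceBuffer:
    """Hypothetical per-process circular trace buffer (illustration only)."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest record once capacity is reached,
        # which gives circular-buffer behaviour.
        self._records = deque(maxlen=capacity)
        # event_id -> enabled level; events not listed trace at level 0 only
        # (assumed default).
        self._levels = {}

    def enable(self, event_id, level):
        self._levels[event_id] = level

    def disable(self, event_id):
        self._levels.pop(event_id, None)

    def trace(self, event_id, opcode, level, message):
        """Store the record if its level is enabled; return True if stored."""
        if level <= self._levels.get(event_id, 0):
            self._records.append(TraceRecord(event_id, opcode, level, message))
            return True
        return False

    def flush(self):
        """Hand the buffered records to an archiver and empty the buffer."""
        records = list(self._records)
        self._records.clear()
        return records


if __name__ == "__main__":
    buf = ProcessTraceBuffer(capacity=3)
    buf.trace(12345, 1, 0, "minimal trace, always kept")
    buf.trace(12345, 1, 5, "detailed trace, dropped until level 5 is enabled")
    buf.enable(12345, 5)
    buf.trace(12345, 1, 5, "detailed trace, now kept")
    for record in buf.flush():
        print(record)
```

Because the buffer is bounded, tracing never grows memory use, and archiving only happens on an explicit flush, which mirrors the separation of trace generation from trace archiving described in the architecture section.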
To better understand the most likely causes of problems reported by customers, Oracle development worked closely with Oracle Support to conduct a detailed analysis of customer problem reports. We found that the majority of issues reported were concentrated in a very small list of components. Oracle focuses its trace instrumentation efforts on these components so that commonly encountered problems are addressed more quickly.

Trace management is very simple. A single command, ALTER TRACING, issued within the sqlplus tool, covers all tracing operations, including setting trace details and archiving trace logs. All commands follow a simple syntax:

>ALTER TRACING <command>

<command> can be ENABLE <event-string>, DISABLE <event-spec> or FLUSH <proc-spec>.

For example, to turn on trace for event 12345 at level 5 for process id 32, one can issue the command:

>alter tracing enable 12345:5:32

To disable background tracing, one can issue the command:

>alter tracing disable 12345:5:BGS

To archive trace logs related to process 32, one can issue the command:

>alter tracing flush 32

CLUSTERIZATION OF DEBUGGING TOOLS

Debugging tools, specifically oradebug and hang analyzer, are clusterized in Oracle Database 10g so that they can perform cluster wide debugging. Rather than treating a cluster as a collection of unrelated views from the instances, they work from a single system image of the entire cluster.

The clusterization of oradebug allows Oracle support analysts to capture information related to the whole cluster. oradebug commands can be directed to the entire RAC cluster or to just a subset of instances. This is critical when debugging in a cluster. Since problems occurring on one instance may stem from issues on other instances within the cluster, the ability to correlate them in a single cluster view is very important in finding the source of a problem.

Similarly, hang analyzer is clusterized.
With this capability, server hangs or starvation can be detected much faster. For example, a session that is blocked on one node can be caused by resource constraints on another node within the cluster. This can be easily detected with the clusterized hang analyzer.

OFFLINE TRACE LOADER

As shown in Figure 3, the trace loader takes the archived trace logs from all instances of a cluster database, converts them to text formatted files, arranges them according to cluster time sequence, and loads them into a separate Oracle database for archiving. Note that the trace loader operates in an offline mode, i.e., it does not interact with the database server that generates the trace logs.

Figure 3. Offline Trace Loader (figure labels: raw trace files in binary or text format; data conversion; converted trace files in text format; data loading; trace repository; target database)

One main benefit of archiving trace logs into the database with the trace loader is that users can search trace information using the powerful query capabilities of the Oracle Database server, which is far more effective than using plain text editors. In addition, the database archive can be used as a repository to track issues related to a particular cluster. Finally, the log database can be used as a platform independent repository for various clusters.

The trace loader is simple to use. It is a command line tool that asks for the input source and, if loading to a database repository is requested, the database login credentials. In the example below, the trace loader appends converted logs from f1.trw into the repository identified by scott/tiger.

%trcldr mode=append ifile=f1.trw userid=scott/tiger

TRACE NAVIGATION TOOL

The trace navigation tool greatly helps Oracle development as well as support personnel to track down problems. It makes analyzing trace logs much easier than poring over a collection of text messages.
The GUI works directly on the trace log files for the entire cluster. It arranges trace logs in cluster wide time sequence. It provides two main options for displaying trace log data. The first is the Coloring and Query pane, shown in Figure 4. The second is the Wait-For Tree pane, shown in Figure 5.

In the Coloring and Query pane, the trace log display can be colored to differentiate the attributes or protocols that are being tracked. Users can designate a color for a specific protocol within a trace log archive. Trace logs with the same color can be stepped through sequentially, either backward or forward, across multiple trace log files. Users can prune the trace logs to show only traces with the desired colors. There are well defined queries that can be executed based on the keyword selected. The query results can be colored as well. For example, users can query which phase a particular protocol is in, where in the trace it begins and ends, and its memory usage.

Figure 4. Trace Navigation Tool: Coloring and Query

The Wait-For Tree pane gives a visual presentation of resource dependencies as a tree. It makes it much easier to identify a resource that is blocking a group of processes and the process that holds that resource. This is especially useful in diagnosing starvation or deadlocks. Figure 5 shows a group of processes in instances 2, 3 and 4 that are blocked on some processes in instance 1, which in turn are blocked on some processes in instances 2, 3 and 4.

Figure 5. Wait-for Tree

CLUSTER VERIFICATION FRAMEWORK

Cluster verification is another key element in making RAC easier to use and manage. The framework will eliminate many RAC installation problems caused by incorrect underlying cluster system configurations.
Since cluster systems are still not as common as regular servers and users are less familiar with them, a large percentage of RAC installation and configuration issues are attributed to incorrect configuration of the underlying cluster hardware and system software. Some problems prevent proper RAC installation; others show up later as RAC problems. For example, there was one case where the public network was incorrectly configured as the private interconnect of a cluster. The RAC installation went well, and the database even functioned well in a production environment with a light workload. When the workload started to increase, however, the database server encountered various performance issues. When the interconnect configuration problem was finally corrected, RAC worked just as well with a heavy load. Issues like this cause a lot of stress for both users and Oracle Support and, more critically, create unnecessary system down time.

The cluster verification framework is designed to address issues like those above as well as other issues related to cluster configuration at both the system level and the database level. It takes a systematic approach to checking and verifying proper cluster configuration at all levels of the complete product stack. The approach of cluster verification is to weed out configuration mistakes through pre and post checks of installation and configuration steps, and through ongoing checking of database server components.

As shown in Figure 6, the cluster verification framework is used by the cluvfy command line tool and, through a set of verification APIs, by other tools such as the Oracle Universal Installer and various configuration tools. Layer 2 verification APIs are task-oriented, whereas layer 1 APIs are used to carry out verification actions. Vendors' APIs and OSD (Operating System Dependent) APIs are invoked whenever appropriate to carry out platform level verification.
The command line tool will carry out verification tasks according to profiles that are defined by customers or according to defaults defined by Oracle. It can be run independently of the database.

Figure 6. Cluster Verification Framework Architecture (figure labels: Cluster Verification Utility (command line tool); Profiles; Layer 2 Cluster Verification APIs; Public Verification APIs (cluster verification tasks); Layer 1 Cluster Verification APIs; Other tools; OSD; Vendor API; cvu libs; Operating System)

The cluster verification framework verifies the full stack. It checks all components that make up the cluster database, including cluster hardware, storage, and network interconnect, as well as various components of the cluster database itself. Verification is non-intrusive. It does not alter the state of the system being verified, so it is safe to run verification tasks to check the health of a system at any time. Even though most problems happen at initial installation, ongoing verification is still very important for situations such as the one mentioned earlier, as well as for configuration changes after initial installation.

To be able to perform verification effectively at the system level, Oracle is working very closely with platform vendors to leverage their verification capabilities. A vendor independent API layer, similar to the OSD layer, will be created to call all the platform dependent capabilities.

Verification tasks are grouped into two categories: stage verification and component verification. Both categories call on verification tasks to perform verification according to their respective profiles.

STAGE VERIFICATION

A stage is one of the steps performed during a complete RAC deployment process, for example, the setup of the interconnect, the setup of the storage system, or the layout of datafiles.
Stage verification refers to the cluster verification tasks performed before the start of each stage and after the completion of each stage to ensure that each deployment step is done properly.

The pre-stage verification performs a well-defined set of checks before the installation process enters a stage. It verifies that all the prerequisite steps for the stage have been completed successfully and that the cluster is in the desired state.

The post-stage verification performs a well-defined set of checks after the installation process completes a stage. It verifies that the set of operations in the stage has been accomplished and that the cluster is left in a consistent state for the next stage.

Examples of stage verification are:

Hardware and OS setup
Cluster File System setup
Cluster Manager setup
Cluster Services setup
RDBMS installation
Oracle Net Configuration
Database Configuration
Node Addition
Storage Addition
Network Modification

COMPONENT VERIFICATION

Verification also treats RAC as a collection of coherent components, such as a server, an interconnect, a storage system, the Oracle software stack, the OS software stack, etc. Component verification performs checks on these components individually. If a component is an aggregate of sub-components, verification will be performed on those sub-components. Component verification checks various aspects of a component, including availability and integrity.

DIAGNOSTIC MODE

The verification framework includes a diagnostic mode. In this mode, it will attempt to pinpoint the source of the failure when a verification task fails. This mode is available for both stage and component verification.
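The stage verification and diagnostic mode ideas above can be sketched in a few lines of Python. This is an illustrative model only and not the cluvfy implementation or its API: the `Check`, `Stage` and `verify` names are hypothetical, and the lambda checks stand in for real read-only probes of the cluster.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Check:
    name: str
    run: Callable[[], bool]                       # read-only probe, True on pass
    diagnose: Optional[Callable[[], str]] = None  # optional failure explanation


@dataclass
class Stage:
    name: str
    pre: List[Check] = field(default_factory=list)   # run before the stage
    post: List[Check] = field(default_factory=list)  # run after the stage


def verify(checks: List[Check],
           diagnostic: bool = False) -> List[Tuple[str, Optional[str]]]:
    """Run every check and return (name, detail) for each failure.

    The detail is filled in only in diagnostic mode and only for checks
    that know how to explain their own failure.
    """
    failures = []
    for check in checks:
        if check.run():
            continue
        detail = None
        if diagnostic and check.diagnose is not None:
            detail = check.diagnose()
        failures.append((check.name, detail))
    return failures


if __name__ == "__main__":
    # Hypothetical checks for a hardware and OS setup stage.
    hwos = Stage(
        name="HWOS",
        pre=[Check("nodes reachable", lambda: True)],
        post=[Check("interconnect uses private network",
                    lambda: False,
                    lambda: "interconnect is bound to the public network")],
    )
    print(verify(hwos.pre))
    print(verify(hwos.post, diagnostic=True))
```

Keeping each check a pure probe that only reports pass or fail matches the non-intrusive property described earlier, so the same checks could be rerun at any time for ongoing verification.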
EXAMPLES OF COMMAND LINE TOOL cluvfy

Examples of command line tool usage:

%cluvfy stage -post HWOS

This command performs a post-stage check for the hardware and OS stage.

%cluvfy comp sys -p RDBMS

This command performs verification on system components for RDBMS installation.

HOW TO TAKE ADVANTAGE OF THESE ENHANCEMENTS

The diagnostics facilities are mainly aimed at helping Oracle support and development organizations resolve customer problems faster. They improve system availability by reducing down time. Users should not attempt to change trace levels or issue debug commands without supervision from Oracle support, especially on production systems, as doing so may unnecessarily impact database server performance and potentially cause other problems.

The cluster verification framework is part of Oracle Database 10g, but the command line tool will be made available in a maintenance release after Oracle Database 10g becomes generally available. DBAs can, and are encouraged to, use the cluvfy tool to perform verification on their own.

CONCLUSION

Real Application Clusters is getting easier to use and manage. One important aspect is the enhancement of the diagnostics and verification capabilities. The systematic approach to managing diagnostics and verification should reduce the time customers spend resolving problems with Oracle Support, and should help customers eliminate problems due to installation/configuration issues.

40248