Oracle8 on Dell PowerEdge Servers

Introduction: Selecting an appropriate hardware platform, server operating system, and database server to meet the needs of an organization requires a substantial investment of time, research, and analysis of data, applications, and user demands. Dell Computer Corporation provides a solution by delivering one of the best combinations of hardware and database: Oracle8 on PowerEdge servers. The next phase of the assessment is to select, from the line of available server products, the most appropriate PowerEdge server in terms of user workload, type of application, and availability requirements. The next decision concerns the most appropriate operating system and database server product options. Once these decisions have been made, the implementation and maintenance phase requires hardware and database tuning and optimization according to the needs of the application and the users.

This paper is intended for systems engineers, technical sales personnel, and technical support analysts. It investigates various Oracle8 server tuning and optimization issues. It can also be used as a guideline for choosing an appropriate combination of Windows NT (Windows NT Server or Windows NT Server Enterprise Edition) and Oracle8 (Oracle8 Server or Oracle8 Enterprise Edition) to run on a Dell PowerEdge server. We start with a brief comparison between Windows NT Server 4.0 and Windows NT Server Enterprise Edition (Windows NT/SE). We then compare Oracle8 Server with Oracle8 Enterprise Edition (Oracle8 EE) and describe some features of the Windows NT specific Oracle architecture. The rest of this document is dedicated to performance tuning and optimization issues for running Oracle8 databases on Dell PowerEdge servers under Windows NT.
Windows NT Server 4.0 and Windows NT Server Enterprise Edition: Windows NT Server Enterprise Edition (Windows NT/SE) is the latest addition to the Microsoft Windows NT operating system family of products. It builds on the existing features of Windows NT Server to provide extended scalability, interoperability, availability, and manageability. Windows NT Server, with its 32-bit architecture, allows a maximum of 4 GB of addressable memory. Of this 4 GB, 2 GB is dedicated to the kernel and only 2 GB is available to applications. Windows NT/SE, with its 4 GB Memory Tuning (4GT) feature, reduces the kernel memory to 1 GB, making up to 3 GB of memory available to applications. Also, Windows NT/SE is licensed for use on SMP servers with up to 8 processors, whereas standard Windows NT Server is licensed for only 4 processors on SMP machines.

The following chart compares various features of Windows NT Server and Windows NT Server Enterprise Edition:

Feature | NT Server | NT/SE
Multipurpose OS with file, print, Internet/intranet, and application support | Yes | Yes
Multithreaded microkernel architecture; 4 GB of total addressable memory | Yes | Yes
Maximum number of processors licensed for an SMP server | 4 | 8
Maximum amount of application memory (i.e., not assigned to kernel) | 2 GB | 3 GB
High-availability Microsoft Cluster Server (2-server cluster with automatic failover and single-system image management) | No | Yes
Standard cross-platform Win32 APIs, support for COM/DCOM object standards | Yes | Yes
Microsoft Transaction Server (component-based transaction monitor) | Yes | Yes (including high availability on a Microsoft Cluster Server cluster)
Microsoft Message Queue Server (high-performance store-and-forward queuing for reliable communication between distributed components) | Standard version | Enhanced version

Comparison between Oracle8 Server and Oracle8 EE:

Oracle8 Server: Oracle8 Server is equivalent to Oracle Workgroup Server in previous releases of Oracle.
It is intended for smaller, cost-effective implementations with easy-to-use functionality. Because Oracle8 Server and Oracle8 EE are based on the same code, it is easy to migrate from Oracle8 to Oracle8 EE if business needs expand in the future. Oracle8 is aimed at the workgroup and lower-end departmental needs of small businesses. Even in large operations, Oracle8 and Oracle8 EE can coexist and communicate transparently in a distributed environment.

Oracle8 EE: Oracle8 EE contains many high-end features that make it suitable for large, enterprise-level systems. It can support tens of thousands of users with virtually no limits on the type or amount of data stored. Its high-availability features can be used to support 7x24x52 operations. It also supports applications based on both the relational and the object-relational model. Some features of Oracle8 EE are available only as separately licensed options. The following options are available only for Oracle8 EE:

Option | Description
Objects Option | Allows data to be represented, accessed, manipulated, and stored as business objects. No CREATE TYPE, ALTER TYPE, or CREATE OR REPLACE TYPE statements are possible without this option.
Partitioning Option | Allows the definition of partitions of tables and indexes. This option is required for parallel index scans and parallel DML. No CREATE PARTITION statements are possible without this option.
Advanced Networking Option | Provides client/server and server/server network security using encryption and data integrity checking, as well as enhanced user authentication services.
Enterprise Manager Performance Packs | The performance packs, which include the Diagnostics Pack, Tuning Pack, and Change Management Pack, provide an advanced set of tools for managing database environments.
Parallel Server Option | Allows multiple nodes of a loosely coupled (shared disk) system to share access to a single database for increased scalability and security.

To find out which options are installed and available, run the following query:

    Svrmgr30> select * from V$OPTION;

The query returns a value of TRUE for each installed option and a value of FALSE for each option that is not available on your database server.

The following chart compares the features of Oracle8 EE with those of Oracle8 Server:

Feature | Oracle8 | Oracle8 EE | Comments
Enterprise Manager | Yes | Yes | Included with both Oracle8 EE and Oracle8 Server.
Performance Packs | No | Yes | Only the Diagnostics Pack comes free with the purchase of Oracle8 EE.
Advanced backup and recovery features | Yes | Yes | Includes server-managed backup and recovery, recovery backup for online backup, online recovery, and Legato Storage Manager.
Incremental backup and recovery | No | Yes |
Failsafe for Oracle8 | Yes | Yes | Comes as a separately installable product.
Bit-mapped indexes | No | Yes |
Parallel operations | No | Yes | Includes Parallel Execution, Parallel Load, Parallel Query, Parallel DML, Parallel Index Scans, Parallel Bitmap Star Joins, Parallel Index Build, and Parallel Analyze. These features require the Partitioning Option, which is available only with Oracle8 EE.
Programming interfaces | Yes | Yes | Includes Oracle Call Interface, Objects for OLE, ODBC driver, and Pro*C/C++.
Object features | Yes | Yes | Includes object references (REFs), object collections, nested tables, variable arrays, and object views.
Distributed features | Yes | Yes | Includes distributed queries, distributed transactions, two-phase commit, heterogeneous services, basic replication, read-only snapshots, subquery subsetting, primary-key-based snapshots, internal triggers, and replicated LOBs.
Advanced replication | No | Yes | Includes updateable snapshots, multimaster replication, conflict detection and resolution, replication manager, parallel propagation, and minimized communication.
Oracle Names | Yes | Yes |
Connection Manager | No | Yes |
Connection pooling | Yes | Yes |
Connection multiplexing | Yes | Yes |
Multiprotocol connectivity | No | Yes |
Security Server | Yes | Yes |
Advanced Queuing | No | Yes |
Reverse key indexes | Yes | Yes |
Password management | Yes | Yes |
Index-organized tables | Yes | Yes |
Stored procedures and triggers | Yes | Yes |
INSTEAD OF triggers | Yes | Yes |
External procedures | Yes | Yes |
National Language Support | Yes | Yes |
LOB support | Yes | Yes |
ConText Cartridge | Yes | Yes | Offers full-text retrieval.
Video Cartridge | Yes | Yes |
Image Cartridge | No | Yes | Provides image storage, retrieval, and image format conversion capabilities through object types.
VIR Cartridge | No | Yes | Visual Information Retrieval.
Time Series Cartridge | No | Yes | Provides storage and retrieval of timestamped data through object types.
Spatial Data Cartridge | No | Yes | Designed to store and retrieve spatial data easily for Geographic Information System users.

Oracle8 Database Features on NT:

    Maximum database block size: 16,384 bytes
    Maximum database blocks per Oracle datafile: 4 million
    Maximum Oracle datafile size: 64 GB
    Maximum number of Oracle datafiles per database: depends on the database block size
    Maximum database size: 4 x 10^15 bytes (4 petabytes)

Database Block Size | Maximum number of datafiles per database
2 KB | 20,000
4 KB | 40,000
8 KB and above | 65,536

Oracle8 Architecture Overview: Oracle8 for Windows NT is a 32-bit implementation written on the Microsoft Win32 API. Oracle on Windows NT runs as a single-process, multithreaded architecture and fully conforms to the Windows NT memory model. Fig. 1 shows a model of an Oracle8 instance.
The Oracle8 instance on Windows NT consists of the System Global Area (SGA) and background threads. All Oracle8 threads share the single Oracle8 process address space. The memory areas that make up the SGA are described below.

Major memory structures in SGA:

Database Buffer Cache: The database buffer cache holds copies of data blocks read from datafiles. All users concurrently connected to the system share access to the database buffer cache. An optimal number of database buffers reduces disk I/O and improves performance. The initSID.ora parameter DB_BLOCK_BUFFERS determines the size of the database buffer cache.

Redo Log Buffer: A circular buffer in the SGA containing information about changes made to the database. Its contents are flushed to the redo log files by the LGWR background thread. The initSID.ora parameter LOG_BUFFER sets the size of the redo log buffer in the SGA.

Shared Pool: The area in the SGA that contains memory constructs such as the data dictionary cache, library cache, and shared SQL area. This area holds a parsed form of SQL statements so that similar statements can be re-executed without reparsing. The parameter SHARED_POOL_SIZE determines the shared pool size in bytes.

Oracle8 Background Threads on NT: Following is a description of some important Oracle background threads on Windows NT:

Database Writer (DBWR) thread: The database writer is responsible for writing modified data blocks from the buffer cache to the datafiles. In the Oracle Performance Monitor utility, it is represented as thread number 3.

Log Writer (LGWR) thread: The log writer writes redo log entries from the redo log buffer to the online redo log files on disk. It is represented as thread number 4 in Performance Monitor.

Checkpoint (CKPT) thread: Optionally started to update the System Change Number in the datafiles and the control files. If this thread is not started, the LGWR thread assumes the responsibility of CKPT.
For better performance, this thread should be enabled via the initSID.ora parameter CHECKPOINT_PROCESS = TRUE.

System Monitor (SMON) thread: This thread performs instance recovery at instance startup. It also reclaims temporary segment space and coalesces contiguous areas of free space in datafiles. It is thread number 5 in Performance Monitor.

Process Monitor (PMON) thread: The process monitor cleans up abnormally terminated connections, rolls back uncommitted transactions, frees SGA resources, and releases locks held by terminated processes. It is shown as thread number 2 in Performance Monitor.

Recovery (RECO) thread: This thread resolves distributed transactions after a network or system failure. It is thread number 6 in Performance Monitor.

Archiver (ARCH) thread: This thread copies the online redo log files to another location on disk, specified in the initSID.ora file, from where they can be written to a tape device.

[Fig. 1: Model of an Oracle8 instance. INSTANCE: System Global Area (Instance Lock Area, Large Pool, UGA, I/O Buffer Area, Shared Pool with Library Cache, Data Dictionary Cache, and Shared SQL Area, Redo Log Buffer, Database Buffer Cache) and threads SMON, PMON, DBWR, CKPT, LGWR, RECO, ARCH, LMON, LMD0, LCKn, SNPnn, Snnnn, Pnnnn, Dnnnn, IOnnn, GMS, TRWR. SERVER and USER with PGA. DATABASE: datafiles (SYSTEM, USER, DATA, INDEX, TEMP, ROLLBACK), control files, redo log groups 1 and 2, archived logs. Other files: parameter file INIT.ORA, password file orapwSID, alert file, trace files.]

Oracle Storage Architecture: Fig. 2 depicts how Oracle stores data.

[Fig. 2: Oracle storage hierarchy: Database, Partitioned Tables/Indexes, Tablespaces, Datafiles, Segments, Extents, Database Blocks, Operating System Blocks.]

Oracle Database: An Oracle database is a collection of data that is treated as a unit. The general purpose of a database is to store and retrieve related information efficiently. An Oracle database has logical structures and physical structures.

Logical Database Structures: The logical database structure includes tablespaces, schema objects, segments, extents, and data blocks.
Following is a description of each of these logical database structures:

Tablespaces: The logical storage units of an Oracle database are called tablespaces. An Oracle database consists of one or more tablespaces. A tablespace in turn consists of one or more datafiles on disk. The combined size of a tablespace's datafiles is the total storage capacity of the tablespace.

[Figure: Database ORACLE with two tablespaces. The SYSTEM tablespace consists of datafiles Sys1.ora (50 MB) and Sys2.ora (30 MB); the USR_DATA tablespace consists of datafiles Usr1.ora (50 MB) and Usr2.ora (50 MB).]

Schema and Schema Objects: A schema is a collection of database objects such as tables, views, sequences, stored procedures, synonyms, indexes, clusters, and database links. Schema objects can belong to different tablespaces.

Data Blocks: The data block is the basic unit of data storage in an Oracle database. One data block consists of a specific number of bytes of physical database space on disk. The database block size should be a multiple of the operating system block size. The data block size is specified when an Oracle database is created and cannot be changed unless the database is recreated.

Extents: The next level of logical database space is called an extent. An extent is a specific number of contiguous data blocks.

Segments: The logical database storage level above an extent is called a segment. A segment is a collection of extents allocated for a certain database object. For example, table data, index data, rollback data, and temporary data are stored in data segments, index segments, rollback segments, and temporary segments, respectively.

Physical Database Structures: The physical structures of an Oracle database consist of datafiles, redo log files, and control files.

Datafiles: Every Oracle database has one or more physical datafiles. The datafiles contain all of the database's data. The data of logical database structures such as tables and indexes is physically stored in datafiles belonging to an Oracle database.
One or more datafiles form a logical unit of database storage called a tablespace. A datafile can belong to only one tablespace in a database.

Redo log files: Every Oracle database has a set of two or more redo log files. The primary function of the redo log files is to record all changes made to the data. After a system or media failure, the changes can be reapplied from the redo log files so that work is not lost.

Control Files: Every Oracle database has a control file. The control file contains entries that specify the physical structure of the database. It contains information such as the database name, the names and locations of the database's datafiles and redo log files, and the time stamp of database creation.

Major Contributors to Performance Bottlenecks on NT: Following is a description of four major resources on an NT system that need tuning in order to enhance system performance:

1) System Memory Tuning: In any production system, memory is limited. Processes contend for available memory, which in turn forces the operating system to perform paging and swapping. The main memory area that needs tuning on any Oracle system is the System Global Area (SGA).

Tuning the System Global Area: The SGA should always be contained in main memory. Otherwise, the database will not start. To view the total memory allocated to the SGA, type the following at the Server Manager prompt:

    Svrmgr30> show SGA;

The command returns the sizes, in bytes, of the following values:

1. Total System Global Area
2. Fixed Size
3. Variable Size
4. Database Buffers
5. Redo Buffers

Where:

    Total System Global Area = Fixed Size + Variable Size + Database Buffers + Redo Buffers

The 'Fixed Size' is determined by the installed Oracle products and options and does not change as long as you do not add or remove any of the installed products. The 'Variable Size' is determined at startup from initSID.ora parameters such as SHARED_POOL_SIZE, PROCESSES, SESSIONS, and TRANSACTIONS.
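The SGA total described above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not an Oracle tool: the function name and the sample byte counts are hypothetical, and the four inputs are assumed to be copied by hand from the output of show SGA.

```python
def total_sga_bytes(fixed_size, variable_size, database_buffers, redo_buffers):
    """Sum the four components reported by 'show SGA' (all values in bytes)."""
    return fixed_size + variable_size + database_buffers + redo_buffers


if __name__ == "__main__":
    # Hypothetical sample values, not taken from a real instance.
    total = total_sga_bytes(
        fixed_size=48_260,
        variable_size=10_000_000,
        database_buffers=409_600,
        redo_buffers=65_536,
    )
    print(f"Total System Global Area: {total} bytes")
```

Comparing the computed total with the 'Total System Global Area' line of show SGA is a quick way to confirm that the values were read correctly.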
The value of 'Total System Global Area' should always be less than the total physical memory available on your machine. The major contributors to the size of the SGA are the following initSID.ora parameters:

    DB_BLOCK_SIZE
    DB_BLOCK_BUFFERS
    LOG_BUFFER
    SHARED_POOL_SIZE

On Windows NT/SE, the total address space available to applications is 3 GB. Any physical memory beyond this value is wasted on Windows NT. The address space available to Win32 applications (including Oracle) is only 2 GB. The other 1 GB of memory is used for system DLLs. In any Oracle on NT implementation, the SGA, the memory taken by all client connections, and any other Win32 applications must fit into this 2 GB virtual address space. The next major release of Windows NT might increase this limit to 4 GB for Win32 applications.

The Oracle SGA exists as virtual memory. Because NT does not allocate physical memory to a process until it is actually used, Oracle does not have its SGA in real memory when it is started. This causes slower performance after startup, because page faults (reading from the page file into real memory in 4 KB pages) occur as pages of the SGA are allocated in physical RAM. To avoid this, set the initSID.ora parameter:

    PRE_PAGE_SGA = TRUE

This forces Oracle to touch all the pages of the SGA at startup so that they are all committed in physical memory. The result is a slower startup but better performance for regular Oracle activity.

Tuning the Database Buffer Cache: The size of the database buffer cache is determined by the initSID.ora parameter DB_BLOCK_BUFFERS together with the value of DB_BLOCK_SIZE:

    Database Buffer Cache (bytes) = DB_BLOCK_BUFFERS * DB_BLOCK_SIZE

This buffer holds copies of blocks of tables, indexes, and rollback segments. The hit ratio indicates what percentage of the requested data Oracle can find in memory.
The hit ratio is calculated by:

    Hit Ratio = (Logical Reads - Physical Reads) / Logical Reads

Where:

    Logical Reads = db block gets + consistent gets

The following query can be used to calculate the hit ratio as a percentage:

    Svrmgr30> select (((cur.value + con.value) - phy.value) / (cur.value + con.value)) * 100 "Hit Ratio"
              from V$SYSSTAT phy, V$SYSSTAT cur, V$SYSSTAT con
              where cur.name = 'db block gets'
                and con.name = 'consistent gets'
                and phy.name = 'physical reads';

If the Oracle Enterprise Manager Diagnostics Pack is available, you can use the Performance Manager tool to view the current hit ratio. The hit ratio should be greater than 80% during normal database operation. If you are using raw devices, the hit ratio should be above 90%. If the hit ratio is below 80%, increase the value of DB_BLOCK_BUFFERS.

Methodology to Increase DB_BLOCK_BUFFERS: Suppose you run the above query and find that your hit ratio is below 80%. Should you increase the value of DB_BLOCK_BUFFERS, and by how much? When the buffer cache is too large, it strains the total memory resources of your system, which in turn can result in paging and swapping. If the hit ratio is below 80%, you must increase DB_BLOCK_BUFFERS without impacting the overall memory resources. The following procedure helps you choose a value of DB_BLOCK_BUFFERS that attains an optimal hit ratio. In this example, suppose you want to increase the value of DB_BLOCK_BUFFERS from 200 to 250:

1. In the initSID.ora file, set DB_BLOCK_LRU_EXTENDED_STATISTICS = 50.

2. Restart the database and let it run under normal activity. There is some overhead associated with the collection of these statistics. Therefore, do not forget to remove the above entry from the initSID.ora file after tuning of the database buffer cache is done.

3. Determine the number of additional cache hits that would occur if you actually increased the value of DB_BLOCK_BUFFERS by querying the X$KCBRBH table:

    Svrmgr30> select sum(count) ach from X$KCBRBH where indx < 50;

Assume that the query returns a value of 3.
This number represents the additional cache hits that would be gained by adding the buffers to the cache.

4. After determining the additional cache hits gained by adding the buffers, calculate the impact on the cache hit ratio. Query the V$SYSSTAT view, subtracting the additional cache hits (substitute the value obtained in step 3 for ach) from the physical reads, and recalculate the cache hit ratio:

    Svrmgr30> select (((cur.value + con.value) - (phy.value - ach)) / (cur.value + con.value)) * 100 "Hit Ratio"
              from V$SYSSTAT phy, V$SYSSTAT cur, V$SYSSTAT con
              where cur.name = 'db block gets'
                and con.name = 'consistent gets'
                and phy.name = 'physical reads';

If the resulting hit ratio rises above 80%, add the extra buffers. If the new hit ratio stays the same, or there is no significant increase, experiment with other numbers of extra buffers.

Choosing Database Block Size: The default Oracle block size on Windows NT is 2 KB. The optimum block size for your system depends on the type of applications that you intend to run against the database. For OLTP applications, a smaller block size is better. Generally, for OLTP applications, the Oracle block size should be kept equal to the average row size in your database. The average row length can be found by computing statistics for the tables in your database. For DSS systems, on the other hand, which involve many full table scans, a larger block size such as 8 KB is beneficial and improves performance. A block size of 8 KB with a DB_FILE_MULTIBLOCK_READ_COUNT of 8 produces disk I/Os of 64 KB, which provide optimal performance for DSS systems on NT, because the maximum transfer size per I/O on NT is 64 KB. This results in reading 64 KB of data into the buffer cache for each read from the Oracle datafiles.
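The arithmetic in the two topics above (the recalculated hit ratio from step 4, and the size of a multiblock read) can be sketched in Python. This is an illustrative sketch under stated assumptions: the function names and sample statistics are hypothetical, the inputs are assumed to be values read from V$SYSSTAT and X$KCBRBH by hand, and the 64 KB limit is the figure quoted in the text.

```python
# Maximum transfer size per I/O on Windows NT, as stated in the text.
NT_MAX_TRANSFER_BYTES = 64 * 1024


def hit_ratio(db_block_gets, consistent_gets, physical_reads,
              additional_cache_hits=0):
    """Return the buffer cache hit ratio as a percentage.

    additional_cache_hits is the 'ach' value from X$KCBRBH; it is
    subtracted from the physical reads, as in step 4 of the procedure.
    """
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        raise ValueError("no logical reads recorded")
    misses = physical_reads - additional_cache_hits
    return 100.0 * (logical_reads - misses) / logical_reads


def multiblock_read_bytes(db_block_size, multiblock_read_count):
    """Size in bytes of one multiblock read during a full table scan."""
    return db_block_size * multiblock_read_count


if __name__ == "__main__":
    # Hypothetical statistics, not measurements from a real system.
    print(hit_ratio(2000, 8000, 2500))
    print(hit_ratio(2000, 8000, 2500, additional_cache_hits=600))
    size = multiblock_read_bytes(8192, 8)
    print(size, size <= NT_MAX_TRANSFER_BYTES)
```

With these made-up numbers the extra buffers would lift the ratio from below the 80% threshold to above it, which is the situation in which the procedure recommends adding them.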
For full table scans, reading 64 KB per I/O yields a significant performance gain.

Tuning the Log Buffer Area: The initSID.ora parameter LOG_BUFFER reserves space for the redo log buffer. As mentioned previously, this buffer contains information about data blocks that are being changed by users. If this buffer is too small, the LGWR thread has to flush its contents to the online redo log files continuously, which results in a lot of disk I/O. The LGWR thread always starts to write to disk when this buffer begins to fill. Usually the log buffer is small in comparison with the total SGA size, and a small increase can significantly enhance throughput. If the redo log buffer is too small, user processes have to wait for space in the redo log buffer. This wait can be monitored with the following query:

    Svrmgr30> select name, value from V$SYSSTAT
              where name = 'redo buffer allocation retries';

This query should return a value close to zero during normal database activity. Increase the size of the redo log buffer to bring this value as close to zero as possible.

On PowerEdge servers with multiple CPUs, you have the option of increasing the number of processes that can write to the redo log buffer simultaneously. The initSID.ora parameter LOG_SIMULTANEOUS_COPIES, which determines the number of redo copy latches, defaults to the number of CPUs on your machine. You should change this parameter and set it to:

    LOG_SIMULTANEOUS_COPIES = 2 * number of CPUs

This reduces contention on the redo copy latch.

Tuning the Shared Pool Area: The initSID.ora parameter SHARED_POOL_SIZE determines the shared pool size in bytes. This area in the SGA contains a parsed form of SQL statements so that similar statements can be re-executed without reparsing. This parameter should be made as large as possible without straining the physical memory limits of your system.
Remember that the total SGA should fit easily into real memory. To find out the amount of memory available in the shared pool, use the following query:

    Svrmgr30> select * from V$SGASTAT where name = 'free memory';

Other statistics about the shared pool can be obtained with the Performance Manager tool available in the Oracle Enterprise Manager Diagnostics Pack.

2) Tuning CPU Resources: The CPU is one of the most important resources of the computer system for which processes contend. CPU contention occurs when several applications try to acquire CPU resources concurrently. If the real memory on your system is limited, the CPU also spends considerable time paging or swapping. The Oracle Performance Monitor tool, or Oracle Performance Manager available on the Enterprise Manager Diagnostics Pack CD, can be used to monitor important CPU activity. Following are some of the important statistics that should be monitored on your server.

Processor: %Processor Time: The %Processor Time is the percentage of elapsed time that a processor is busy executing a non-idle thread. It can also be viewed as the fraction of time the processor spends doing useful work. This value should be less than 90%. If the value exceeds 90%, the CPU is a bottleneck on your system. This statistic can be obtained either from the Performance Manager application of the Oracle Enterprise Manager Diagnostics Pack or from Oracle Performance Monitor.

System: %Total Processor Time: On SMP machines, the %Total Processor Time should be less than 50%. It is the average percentage of time that all the processors on the system are busy executing non-idle threads. On a multiprocessor system, if all of the processors are always busy executing non-idle threads, this value is 100%. If one quarter of the processors are 100% busy, this value is 25%. On each processor this value is equal to the sum of %User Time and %Privileged Time.
This statistic can be obtained either from the Performance Manager application of the Oracle Enterprise Manager Diagnostics Pack or from Oracle Performance Monitor.

System: Processor Queue Length: This is the instantaneous length of the processor queue, in units of threads. All processors share a single queue in which threads wait for processor cycles. This length does not include threads that are currently executing. A sustained processor queue length greater than 2 generally indicates processor congestion. This statistic can also be monitored with either of the tools mentioned above.

3) Tuning Disk I/O: In any Oracle implementation, I/O must be evenly balanced across disk drives and disk controllers for optimal performance. Generally, disk striping improves performance for DSS systems, and a combination of striping and mirroring improves performance for OLTP systems. Many disk I/O issues are also solved by tuning your SQL statements to eliminate unnecessary reads from data blocks. The maximum transfer size on a Windows NT system is 64 KB per I/O. All reads and writes on NT are that size or smaller. From the Oracle perspective, a DB_BLOCK_SIZE of 8 KB with a DB_FILE_MULTIBLOCK_READ_COUNT of 8 gives a single disk I/O size of 64 KB, which is also equal to the SCSI controller data bus width and provides optimal I/O performance. There are two kinds of I/O activity in an Oracle system.

Sequential I/O: Redo Log Files: Oracle redo log files are always accessed sequentially. They should be placed on disks with minimal disk access other than by the LGWR thread. Also, the controller hosting the disk that contains the redo log files should not have hardware striping, striping with parity, or write-back cache enabled. For best performance and for data protection, Oracle needs to write to the redo log files without any delay or interruption. The best practice is to isolate the disks containing redo log files from any other kind of disk activity.
If archiving is enabled, the disks containing the archived redo log files should also be dedicated, to avoid contention for disk access between the LGWR and ARCH threads.

Random I/O: Datafiles: In any OLTP or DSS environment, Oracle datafiles are accessed randomly. The best combination of ease of use and I/O load balancing is hardware-based RAID0 (disk striping), which can be achieved with Dell's PowerEdge Expandable RAID Controller (PERC). The disk array controller ensures that the load is balanced equally across the disks in a stripe set. The total I/O for a single disk is the sum of its reads and writes. The total disk I/O should be monitored closely and should not exceed the I/O capacity of your disk. Fig. 3 shows a RAID0 configuration with an array controller.

[Fig. 3: RAID0 configuration. An array controller with cache connects the host machine to four disks. Data segments of n bytes each are striped across the disks: Disk1 holds segments 1 and 5, Disk2 segments 2 and 6, Disk3 segments 3 and 7, Disk4 segments 4 and 8.]

PowerEdge Expandable RAID Controller (PERC): Dell's PowerEdge Expandable RAID Controller resides on the PCI bus and supports two Ultra/Wide SCSI channels with data transfer rates of up to 40 MB/sec per channel. It is one of the most versatile RAID controllers available on the market today. Among other features, it allows RAID logical arrays to be expanded simply by installing additional physical drives. Two models of the PERC card are available. One model, Falcon, has a standard battery backup feature and has write-back cache enabled by default. The other, PERC 2/SC (Quartz II), does not have a battery backup and supports only write-through cache. One of the most significant features of the PERC card with respect to Oracle is the option of up to 32 MB of write-back cache. The PERC has a standard backup battery module.
If the voltage drops below a certain level or there is a power failure, the backup battery module provides the memory refresh cycles necessary to retain the contents of the cache. The backup battery module can provide memory refresh cycles for 17 to 72 hours, depending on the amount of cache installed. When the voltage returns to an acceptable level, the module switches the power source back to the controller card.

Write Back Cache: In this caching scheme, the controller responds with a "done" signal after the data has been written to the cache and before the write to disk is complete. In parity-based systems, there is increased read and write activity on all disks in a RAID volume, because multiple writes are actually performed for each write operation: the parity information has to be updated continuously. This causes performance degradation in parity-based systems such as RAID3 and RAID5. To avoid performance problems in parity-based systems, make sure that your disk controller has a large cache. The larger the cache, the better the performance for both reads and writes against the RAID set. The biggest risk with this caching scheme is losing the controller while its cache still holds data not yet written to disk. In the case of a disk failure or power failure, the backup battery module preserves the data until the disk is replaced or the power is restored, but loss of the controller itself remains a serious problem in this caching scheme. However, the PERC card has a very high reliability rating, and the probability of losing the controller is almost negligible. The maximum amount of time for which data is allowed to reside in the controller cache is also configurable through the BIOS configuration utility for the PERC.

Write Through Cache: On PERC 2/SC, the default is write-through cache.
The write through caching scheme causes the controller to write each write request directly to disk. The controller does not respond with a “done” signal until the data makes it to the disk. Although this causes slower performance with RAID3 and RAID5, it ensures data integrity and protection. With write back cache disabled, you can still take advantage of the controller cache for disk reads.
Selecting a Caching Scheme: If maximum performance is required, the cache should be set to write back. On the other hand, if you cannot afford even a minute chance of data loss, do not change the caching scheme to write back. You can change the cache policy of the PERC by using the BIOS configuration utility, PERC manager or Power Console. To change the cache policy, perform the following steps:
1. Press <Ctrl><M> at the prompt.
2. Select Objects.
3. Select Logical Drives and change the Write Policy to write-back or write-through.
Balancing I/O across Disks: There are several basic rules that must be followed when designing a physical database layout in order to avoid contention between database files. When investigating any I/O contention problem, make sure that the following rules are not being violated. The rules refer to the separation of files across drives.
Files should be placed on different I/O channels to avoid the possibility of the controller becoming an I/O bottleneck.
Place each online redo log file on a separate disk. If you are running your database in ARCHIVELOG mode, the online redo log files will be archived. During high database activity there might be contention between the ARCH and LGWR background threads for I/O.
Do not place online redo log files on the same disk as the Oracle datafiles. The online redo log files, which are accessed in a sequential, write-only fashion, should be isolated on their own controller volume.
Do not place redo log files on parity based RAID devices. You can either use RAID1 for hardware mirroring or use Oracle to mirror the files.
Spread your datafiles across as many disks as possible to balance random I/O.
For data warehousing applications, which feature large batch writes and a number of small queries, parity based systems like RAID5 are suitable. Small write operations perform very poorly in parity based systems, since the entire block and its associated parity have to be read and written for even a small write. On the other hand, read operations perform well in RAID5 configurations because the entire block is read into the controller cache each time data from the block is requested, minimizing the time required for subsequent reads from the same block. This behavior is ideal for DSS systems, which involve large full table scans.
For OLTP applications, which demand maximum throughput and involve a large number of small read and write operations, consider data striping without parity for maximum performance. For fault tolerance, use the combination of RAID0 + RAID1 as depicted in Fig. 4.
Place the datafiles containing your tables and the datafiles containing the indexes on separate disks to avoid contention during queries. (Keep in mind that each RAID device containing multiple disks is one logical drive.)
Separate the datafiles containing rollback segments from the table and index datafiles.
Separate the datafiles belonging to the temporary tablespaces from the table, index and rollback segment datafiles.
Separate the datafiles belonging to the SYSTEM tablespace from the rest of the datafiles in your database to optimize data dictionary access.
Use multiple page files on separate disks. Paging on NT is multithreaded, so multiple page files on separate disks with separate controllers can facilitate concurrent I/O and result in improved performance and overall throughput.
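The separation rules above lend themselves to a mechanical check. The following is a minimal sketch in Python, not an Oracle or Dell tool: the file kinds ("redo", "system", "table", "index", "rollback", "temp", "pagefile"), the function names, the drive letters and the file names are all hypothetical and chosen only for illustration. It flags pairs of files that the rules say should not share a disk, and redo logs placed on parity based volumes.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical labels for the kinds of files the rules talk about.
DATAFILE_KINDS = {"system", "table", "index", "rollback", "temp"}
PARITY_RAID = {"RAID3", "RAID5"}


def conflicts(kind_a, kind_b):
    """Return True if the rules say these two kinds should not share a disk."""
    pair = {kind_a, kind_b}
    if "redo" in pair:
        # Redo logs are kept apart from each other and from all datafiles.
        return pair <= DATAFILE_KINDS | {"redo"}
    if "pagefile" in pair:
        # Page files only need to be kept apart from other page files.
        return kind_a == kind_b
    # Datafiles of different kinds (tables, indexes, rollback, ...) are separated.
    return kind_a != kind_b


def check_layout(files, raid_level_by_disk):
    """files: list of (name, kind, disk). Returns a list of violation messages."""
    violations = []
    by_disk = defaultdict(list)
    for name, kind, disk in files:
        by_disk[disk].append((name, kind))
        level = raid_level_by_disk.get(disk)
        if kind == "redo" and level in PARITY_RAID:
            violations.append(
                f"{name}: redo log on parity-based {level} volume {disk}"
            )
    for disk, entries in sorted(by_disk.items()):
        for (name_a, kind_a), (name_b, kind_b) in combinations(entries, 2):
            if conflicts(kind_a, kind_b):
                violations.append(f"{name_a} and {name_b} share disk {disk}")
    return violations


# Illustrative layout; every name and drive letter here is made up.
layout = [
    ("redo01.log", "redo", "E:"),
    ("redo02.log", "redo", "F:"),
    ("system01.dbf", "system", "G:"),
    ("users01.dbf", "table", "H:"),
    ("indx01.dbf", "index", "H:"),  # tables and indexes on the same disk
    ("rbs01.dbf", "rollback", "I:"),
    ("temp01.dbf", "temp", "J:"),
]
raid = {"E:": "RAID1", "F:": "RAID1", "H:": "RAID0"}

for message in check_layout(layout, raid):
    print(message)
```

In this example only the table and index datafiles on H: are reported. A "disk" here means one logical drive, consistent with the note above that a RAID device containing multiple disks counts as a single logical drive.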
[Fig. 4: RAID0 + RAID1 configuration. Two array controllers with cache each stripe four disks (RAID0) into one logical disk. The NT operating system mirrors (RAID1) Logical Disk 1 onto Logical Disk 2, which is a copy of Logical Disk 1.]
Detecting Disk I/O Bottlenecks: Using Oracle’s performance monitor or, if you have the Enterprise Manager Diagnostic pack, Performance Manager, monitor the following statistics to detect disk I/O bottlenecks:
Logical Disk Transfers/sec: During peak database usage, monitor the number of disk transfers per second. This value is the sum of disk reads/sec and disk writes/sec, that is, the total disk I/O per second. Make sure that this value does not exceed the I/O capacity of the individual physical disks in the logical volume.
Logical Disk Queue Length: The logical disk queue length is the number of I/O requests outstanding on the logical volume at the time the performance data is collected. This is an instantaneous length, not an average over the time interval. If there is a sustained load on the disk volume, this value will be consistently high. A value consistently greater than 3 indicates a disk I/O bottleneck.
Physical Disk Transfers/sec: This reading should also be taken during peak database activity. This value reveals the number of I/Os for each physical disk, and is the sum of physical reads per second and physical writes per second. Make sure that the I/O rate does not exceed the physical disk recommendations.
Physical Disk Queue Length: Same as the logical disk queue length, except that it is for each physical disk in the logical disk volume. Make sure that this value also stays below 3 during normal database usage.
4) System Optimization: Windows NT Server is, by default, optimized for file services. For this reason, NT reserves memory for file system caching. Oracle, on the other hand, bypasses NT file system caching and uses its own caching scheme to read and write from datafiles.
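The monitoring guidance given under Detecting Disk I/O Bottlenecks can be expressed as a small check over collected samples. The sketch below is in Python and rests on stated assumptions: the `DiskSample` structure, the `find_bottleneck` function, the sample figures, and the capacity of 100 I/Os per second are hypothetical, and "consistently greater than 3" is interpreted here as at least 80% of the samples, which is an illustrative choice and not a figure from Oracle or Microsoft.

```python
from dataclasses import dataclass

# From the text: a queue length consistently above 3 indicates a bottleneck.
QUEUE_LENGTH_LIMIT = 3


@dataclass
class DiskSample:
    """One reading taken during peak database activity (hypothetical structure)."""
    reads_per_sec: float
    writes_per_sec: float
    queue_length: int  # instantaneous value, not an average

    @property
    def transfers_per_sec(self):
        # Transfers/sec is the sum of reads/sec and writes/sec.
        return self.reads_per_sec + self.writes_per_sec


def find_bottleneck(samples, disk_io_capacity, sustained_fraction=0.8):
    """Return a list of reasons the disk looks I/O bound (empty if none).

    disk_io_capacity is the rated I/Os per second of the disk, supplied by
    the caller. sustained_fraction is an assumed definition of "consistently".
    """
    if not samples:
        return []
    reasons = []
    peak = max(s.transfers_per_sec for s in samples)
    if peak > disk_io_capacity:
        reasons.append(
            f"peak transfers/sec {peak:.0f} exceeds capacity {disk_io_capacity:.0f}"
        )
    over = sum(1 for s in samples if s.queue_length > QUEUE_LENGTH_LIMIT)
    if over / len(samples) >= sustained_fraction:
        reasons.append(
            f"queue length above {QUEUE_LENGTH_LIMIT} in {over} of {len(samples)} samples"
        )
    return reasons


# Made-up readings for one disk.
samples = [
    DiskSample(40, 30, 2),
    DiskSample(55, 45, 5),
    DiskSample(60, 50, 6),
    DiskSample(58, 47, 4),
    DiskSample(62, 51, 5),
]

for reason in find_bottleneck(samples, disk_io_capacity=100):
    print(reason)
```

Because the queue length counter is instantaneous, a single high reading is not treated as a bottleneck; only a sustained run of high readings is. The same function can be applied to logical or physical disk samples, given the matching capacity figure.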
Optimizing NT for Networking Applications: For optimal database performance, NT should be tuned to maximize throughput for network applications. From Control Panel->Network->Services->Server->Properties, choose ‘Maximize throughput for network applications’. This option tunes the server cache for network applications, which suits client connections to the database. It frees memory held by the NT file cache so that the Oracle80 process working set can use more of it. The current working set for the Oracle80 process can be viewed in Task Manager under the Processes tab.
Lowering the Performance Boost for Foreground Applications: The Oracle process oracle80 runs as a service on Windows NT. In the default NT environment, foreground application threads get priority over other processes, so foreground applications take priority over Oracle threads. Lowering the performance boost for foreground applications through Control Panel->System->Performance changes this behavior.
Conclusion: Any Oracle implementation on NT is highly customizable according to the type of intended applications. Tuning your database and system involves rigorous analysis of data, type of application (DSS or OLTP), user population, throughput requirements, future scalability, availability, and the available hardware resources. This document only supplements and summarizes the vast resources of information available on performance tuning from the Oracle documentation and from the experiences of Oracle consultants. Tuning an Oracle database is not a one step process; it is iterative and involves close monitoring of user activity, types of transactions, hardware upgrades and the amount of data stored in your database. All of these factors play a very important role in the database tuning process, and they can change over time, so tuning parameters have to be monitored closely and adjusted as those changes occur.
Glossary:
Background Threads: Objects within the main Oracle80 process that execute program instructions. They enable the main process to run different parts of its program on different processors simultaneously.
Database Application: Program module used to interact with the database server. In a client server environment, the client executes the database application, which interacts with the database server.
Database Server: An information management system. It provides facilities to store large amounts of data and makes that data accessible to multiple users, who can manipulate it concurrently, while preserving security, integrity, failure recovery, and high performance.
Data Dictionary: Oracle internal tables and views used by the DBMS to store information about the database structures.
Decision Support System (DSS): Database application modules concerned primarily with running a large number of complex queries against a DBMS to ascertain useful business specific statistics.
Initialization parameter file: Text file that contains parameters and information to initialize an Oracle database and instance at startup.
Instance: Combination of the System Global Area (SGA) and the background threads. The Oracle instance has to be up and running before the database is started. Each Oracle instance has an instance ID associated with it, called the System ID or SID.
Online Transaction Processing (OLTP): Database applications that support online transaction processing, which involves a large number of small, concurrent data manipulation operations such as inserts, updates and deletes.
Oracle Database: A collection of data that is treated as a unit. The database has logical structures (tablespaces, schema objects, segments, data blocks) as well as physical structures (datafiles, redo log files, control files, parameter file).
Oracle Enterprise Manager (OEM): An Oracle tool with a comprehensive suite of systems management applications and a graphical interface to manage, tune, and diagnose Oracle products. Oracle performance packs, which consist of the Diagnostic pack, Tuning pack and Change Management pack, can be added to the basic OEM applications for enhanced functionality.
Oracle8 Server: An object-relational database management system (DBMS) that provides the comprehensive information management capabilities of a database server.
Oracle8 Server EE: An object-relational DBMS with enhanced features for enterprise level business needs, enabling it to support terabytes of data and tens of thousands of concurrent users.
Oracle Performance Manager: An Oracle tool, part of the Oracle Enterprise Manager Diagnostic pack, which can be used to monitor an Oracle server.
Oracle System Sizer for Dell PowerEdge Servers: A tool that helps you choose the Dell PowerEdge Server that best suits your application and database environment.
Paging: The process of moving infrequently used parts of a process’s working memory from physical memory to the page file on the hard disk, in 4KB pages.
Page File: A storage area on the hard disk allocated to contain infrequently used parts of a process’s working memory.
Physical Memory: The computer's primary storage area for program instructions and data. Each location in physical memory is identified by a number called a memory address.
RAID: Redundant Array of Inexpensive Disks.
RAID0: Disk striping without parity.
RAID1: Disk mirroring.
RAID3: Disk striping with parity (non-distributed).
RAID5: Disk striping with distributed parity.
RAID10: Combination of disk striping and mirroring.
Server Process: Oracle creates server processes (on NT, shadow processes) to handle requests from connected user processes.
SID: Uniquely identifies an Oracle instance.
Users need to specify the SID of the instance when connecting to a database opened by that instance.
Swapping: The process of moving the entire working set of a process out of physical memory to the swap file. Windows NT does not swap. When memory is limited, it uses paging to move infrequently used parts of a process’s working set from physical memory to the page file, in 4KB pages.
System Global Area (SGA): A shared memory region that contains data and control information for one Oracle instance.
User Process: The process created and maintained to execute the software code of an Oracle tool or an application program. On NT, the user process communicates with its dedicated shadow process on the server instead of directly accessing the SGA.
Virtual Memory: A method used by the Windows NT Server operating system to extend addressable memory beyond the physical RAM by using a page file on the hard disk drive.
Windows NT Server: Server operating system that provides facilities to run a range of server applications, including database servers. In a client server architecture, the Oracle DBMS runs on NT Server as a service to which multiple users can connect using client database applications.
Windows NT Server/E: Server operating system (Windows NT Server Enterprise Edition) with enhanced features that make it suitable for running enterprise level business and other server applications.
For More Information:
Oracle Corporation, Oracle8 Server Tuning Release 8.0, Oracle Corporation, 1997, Oracle Part Number A54638-01.
Oracle Corporation, Oracle8 Server Concepts Release 8.0, Volume 1, Oracle Corporation, 1997, Oracle Part Number A54646-01.
Oracle Corporation, Oracle8 Server Reference Release 8.0, Oracle Corporation, 1997, Oracle Part Number A54645-01.
Corey, Michael and Abbey, Michael, Oracle8 Tuning: Fine-Tune Oracle for Maximum Performance and Productivity, Oracle Press, 1998.
Aronoff, Eyal and Loney, Kevin, Advanced Oracle Tuning and Administration, Oracle Press, 1997.
1998 Dell Computer Corp. All rights reserved.
Dell and PowerEdge are registered trademarks of Dell Computer Corporation. Oracle is a registered trademark of Oracle Corporation. Windows and Windows NT are registered trademarks of Microsoft Corporation.