Comparing the High Availability Features of DB2 UDB 8.2 and SQL Server 2005 Authors: Michael Otey and Denielle Otey Published: April 2005 Summary: SQL Server 2005 is a better choice for high-availability than IBM UDB 8.2. SQL Server 2005 provides more high-availability options and they are easier to use than the high-availability features in DB2 UDB 8.2. This paper compares the high-availability features found in SQL Server 2005 and IBM UDB 8.2. Copyright This is a preliminary document and may be changed substantially prior to final commercial release of the software described herein. The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication. This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation. Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred. 2005 Microsoft Corporation. All rights reserved. Microsoft and ActiveX are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners. Table of Contents Introduction 1 Planned Downtime 1 Online Operations 1 Backup and Restore 2 Database Maintenance 4 Table and Index Reorganization Column Redefining 4 6 Dynamic Server Configuration Bulk Loading of Data 6 9 Dynamically Add Server Memory Unplanned Downtime Database Restore 11 11 12 DB2 UDB 8.2—Restoring a Database 12 SQL Server 2005—Restoring a Database Summary of Database Restore 13 15 Data Protection and Disaster Recovery 15 DB2 UDB 8.2—High Availability Disaster Recovery SQL Server 2005 Database Mirroring Conclusion 16 19 22 About the Authors 22 i Introduction One of the most important objectives for today’s database administrator (DBA) is to ensure that users have timely access to the information they need. The most sophisticated and powerful servers and the best application coding practices won’t make any difference to application users if they are unable to access the resources they need to do their jobs. If the database and servers are unavailable, the cost for the business can be huge, not only in terms of dollars and cents, but also in terms of reputation and the ability to provide goods or services. Many enterprises today require 24-hour a day, 7-day a week, 365-day of the year availability of their database resources, making the task of creating a high-availability environment a challenging and complex undertaking. Reducing downtime is an essential step to increasing availability. To create a highly available database environment, you need to evaluate the database’s capabilities so that you can address those issues that can impact both planned and unplanned downtime. This white paper compares the high-availability features and capabilities of DB2 UDB 8.2 and Microsoft® SQL Server™ 2005. Planned Downtime Routine database maintenance is an essential part of good database management and administration. A good operations plan will take into account the need for database maintenance actions, such as: Index maintenance: Creating, deleting, and reorganizing indexes. Performance tuning: Changing configuration parameters. Back ups: Regularly backing up data. Schema changes: Adding and dropping columns, changing the name and type of columns. Data changes: Reorganizing and defragmenting tables. Server changes: Adding memory as required. Online Operations Many routine database maintenance tasks have the potential to cause downtime, because these maintenance operations may interfere with applications that users need. For example, a particular maintenance operation might require that the database be restarted for the changes to take effect. Other operations might require exclusive access to certain database objects thereby locking out applications. The key to ensuring the uninterrupted availability of data while maintenance operations are being performed is to make sure that the operations are done online. This means that the maintenance operations should: Lock as few resources as possible. Ideally, the operation would not lock any resources at all. The only database objects that are locked should be just those objects that are directly affected by the maintenance operation. 1 Not require a server restart for the changes to take effect. In the next section we’ll look at some of the tools that can address the issue of planned downtime. We will look at tools for backup and restore, database maintenance, the ability to alter server configuration parameters, the effect of bulk loading data, and the ability to dynamically add RAM and disk storage to the server. Backup and Restore Planning and implementing a basic backup-and-recovery plan is critical to ensuring the availability of system data. When developing a backup and recovery plan, it’s important to compare the tradeoffs between the ability to provide complete, quick recovery and the amount of disk space that is required to store changes to the database. A good recovery plan should include practicing the restore operation to ensure that the plan you’ve decided on meets your organization’s needs. DB2 UDB 8.2—Backup and Restore DB2 UDB 8.2 offers three types of backup methods: Full, Incremental, and Delta (or Differential). Full backup. Saves the entire database and all included tablespaces. Incremental backup. Saves all of the database changes that have occurred since the previous incremental backup, plus all of the database data changes that have taken place since the last full backup. This method is also called a cumulative backup. Delta (or Differential) backup. This is a noncumulative backup. The Delta backup only contains database changes that have occurred since the last backup of any kind: Full, Incremental, or Delta. DB2 UDB supports online and offline backup-and-restore capabilities that can provide the complete recovery of a database. Offline backup requires that all users and applications be disconnected from the database that is being backed up, while online backup can take place with the users connected to the database. Offline backup and recovery uses circular transaction logging, which only supports basic database-level backup and restoration. Online backup and restore uses archive logging, which supports point-in-time recovery. Online backs occur at either the tablespace or the database level. SQL Server 2005—Backup and Restore SQL Server’s backup-and-restore capabilities depend on the recovery model that the server employs. SQL Server 2005 has three recovery models: Full, Simple, and BulkLogged. Full Recovery model. This is the most comprehensive recovery model. All data modifications are logged to the transaction log files and they remain there until a backup of the database or database log occurs. Of the three recovery models, the Full Recovery model requires the most space but also provides the most complete data protection. All data is logged and is recoverable to the specific point of failure. Recovery is performed by first restoring the last backup, then applying the log files up to the point of data corruption. SQL Server uses the Full Recovery model by default. 2 Simple Recovery model. This model uses the least logging overhead, as transactions are erased from the log immediately after they are committed to the database. With the Simple Recovery model, all data modifications made since the last backup are lost. The Bulk-Logged Recovery model. This model lies in the middle of these two extremes. It provides good performance and lower logging overhead than the Full Recovery model. With the Bulk-Logged Recovery model, some of the transactions requiring the highest amounts of storage such as CREATE INDEX, Bulk Load, SELECT INTO, WRITETEXT, and UPDATETEXT are not logged. In a recovery situation, these transactions are lost but typical OLTP database transactions are fully recoverable. After a recovery model for a database has been chosen, the next step is to decide on a database backup plan. Data may be backed up to disk or to tape or other media. The fastest method for backing up and recovering data is to perform disk backups. To protect against drive failures, disk backups should always be directed to a separate drive and, if possible, a separate controller from your database data. SQL Server supports three types of database backup: full, differential, and log backup. Full database backup. Creates a complete copy of the database, including tables, stored procedures, and all other objects contained in the database. This backup plan offers a known point from which to begin the restoration process. Differential backup. Copies only the database changes that have occurred since the last full database backup. Frequent differential backups can minimize the time it takes to bring your database up to the last current transaction. Log backup. Makes a backup file of all items in the transaction log. These may be applied after the last differential backup has been restored. All backups can be performed while the database is online and available. DB2 UDB 8.2—Point-in-time recovery The DB2 database allows point-in-time recovery of a database from a backup. To return a database to a previous point in time, you need to use archive logging. Then restore the database and roll forward through the logs to a point in time just before the data corruption occurred. All of the tables in the database are affected and recovered to the same point in time. There is no way to recover only a single table. SQL Server 2005—Point-in-time recovery SQL Server 2005 transactional point-in-time recovery allows an entire database to be restored to any given point in time. Transaction logs contain a serial record of all of the changes that have been made in a database since the transaction log was last backed up. You can use transaction log backups to recover a SQL Server database to any specific point in time. For example, if a user inadvertently deleted some records at 02:00, you could use SQL Server 2005 transaction log backup to recover the database to 01:59—just before the point at which the records were deleted. 3 When you restore a SQL Server transaction log, all of the transactions contained in the log are rolled forward. Each transaction log record not only contains the data changes that are made in the database, but also a timestamp that indicates when the transaction occurred. When the specified time is encountered or the end of the transaction log is reached during the restore operation, the database will be in the precise state that it was in at the time just prior to the application error. This enables you to quickly recover from the data corruption error. The unrestricted ability in SQL Server 2005to perform online backups makes it easier for the administrator to implement a backup plan with less impact to operations. While both DB2 UDB 8.2 and SQL Server 2005 both support point-in-time recovery of database objects, SQL Server 2005 has more flexible logging capabilities that provide the database administrator with more control over how much data is captured in the transaction logs. Database Maintenance The ability to perform routine maintenance on a database while it is online and available to users is critical to reducing or preventing planned downtime. The following table summarizes SQL Server 2005 and DB2 UDB 8.2 features for performing online server and database maintenance operations. On-line Operation SQL Server 2005 DB2 UDB 8.2 Table/Index Reorganization Yes Partial Dynamic Sever Configuration Yes Partial Schema/Column Redefinition Yes Partial Hot-Add Memory running on Windows Server Yes No Bulk Insert Row-level locking Row-level locking Table and Index Reorganization Table and index reorganization is a routine database maintenance task that affects highavailability. This section compares DB2 UDB online table and index reorganization with SQL Server 2005 online index reorganization. DB2 UDB 8.2—Table/index reorganization When an index on a table is created, you can specify whether online index defragmentation is allowed. Read/write access to the table is allowed during most of the index build process. Any changes to the table are synchronized to the new index, and write access to the table is temporarily blocked to allow index completion. 4 Two methods are provided by DB2 to reorganize a table: classic (or offline) and in-place (or online). The classic method of reorganizing a table is the faster method, and indexes are rebuilt automatically after the table reorganization takes place. The original table is reorganized into a shadow copy of the table, and when the operation is complete, the shadow copy replaces the original copy of the table. Read-only applications may access the original copy of the table, except during the last stages of the reorganization when the shadow copy replaces the original copy of the table and the indexes are rebuilt. During this time the table is unavailable. The length of time required for this last step depends on the size of the table and the number of indexes. Even though the offline method may be faster, there are a few drawbacks to this method. LOB or LONG data types are not automatically reorganized and the shadow copy of the table could take up to twice as much space as the original copy. Also, the reorganization can only be stopped by someone who has the authority to execute the FORCE command for the controlling application. If the reorganization fails, it needs to be restarted from the beginning on any node where it failed. The in-place method of reorganizing a table is slower and does not do an automatic index rebuild, but it does allow applications to access the table during the reorganization. This method allows the reorganization process to be halted and then resumed by anyone with the proper authority using the schema and table name. There are several drawbacks to this method. Only type-2 tables and tables without extended indexes can be reorganized online. In addition, online reorganization requires increased log space, as the reorganization logs all of its activities to allow recovery after an unexpected failure. SQL Server 2005—Index reorganization SQL Server 2005 provides a command option for index creation, alteration, and maintenance: CREATE INDEX with ONLINE. The online index option allows simultaneous modifications to the underlying table or index while the index is in the process of being rebuilt. In earlier versions of SQL Server, index tasks (such as rebuilding) held exclusive locks on the table data and associated indexes; this prevented queries or updates to the data until the index operation was complete. The online index feature eliminates the need for exclusive object locks because it keeps two copies of the index: one that the applications continue to use and a second, temporary copy that is used to rebuild the index. SQL Server 2005 maintains both copies of the indexes while the operation is being performed. When the rebuild is complete, the old index is dropped and the new index is put into place. 5 Column Redefining Altering tables and columns is a routine database maintenance task that can affect the availability of those tables. DB2 UDB 8.2—Column redefining DB2 permits redefining a table using the ALTER TABLE command, although there are limitations. You may add columns to a table and change the length of columns that are defined as VARCHAR. Any columns added to a table are appended to the end of the list of columns in the table. If you need to delete, rename, or change the type of a column in a table, the table will need to be dropped and recreated, or you may create a view on the table. So the entire table becomes unavailable when columns are being deleted, renamed or changed. SQL Server 2005—Column redefining SQL Server 2005 also uses the ALTER TABLE command to redefine columns in a table. The ALTER TABLE statement allows several changes to the structure of a table including changing of the table name, adding and deleting columns, moving column positions in the table, renaming columns, and changing the type of column. Changes specified in the ALTER TABLE statement take effect immediately. If the changes made to the table require modifications to the rows of the table, the ALTER TABLE command updates the rows in the table. ALTER TABLE obtains a brief schema modify lock on the table during the change. The modifications made to the table during the ALTER TABLE operation are logged and fully recoverable. Dynamic Server Configuration The ability to change database and server configuration properties without interrupting user activity is a vital capability for ensuring high-availability while performing routine maintenance. DB2 UDB 8.2—Dynamic Database Manager Configuration Parameters DB2 allows you to change certain configuration parameters in a DB2 database or instance while that database is running, accepting connections, or processing transactions. The following table lists the configuration parameters that may not be changed dynamically online. Note Changes to these parameters require that the database server has to be restarted for the changes to take effect. 6 agent_stack_sz Agent stack size agentpri Priority of agents aslheapsz Application support layer heap size audit_buf_sz Audit buffer size authentication Authentication type clnt_krb_plugin Client Kerberos plug-in clnt_pw_plugin Client userid-password plug-in datalinks Enable Data Links support dir_cache Directory cache support discover Discovery mode federated Federated database system support fenced_pool Maximum number of fenced processes group_plugin Group plug-in instance_memory Instance memory intra_parallel Enable intra-partition parallelism java_heap_sz Maximum Java interpreter heap size jdk_path SDK for Java installation path keepfenced Keep fenced process local_gssplugin GSS API plug-in for local instance level authorization max_connections Maximum number of client connections max_coordagents Maximum number of coordinating agents max_time_diff Maximum time difference among nodes maxagents Maximum number of agents maxcagents Maximum number of concurrent agents maxtotfilop Maximum total files open min_priv_mem Minimum committed private memory mon_heap_sz Database system monitor heap size nname NetBIOS workstation name num_initagents Initial number of agents in pool num_initfenced Initial number of fenced processes num_poolagents Agent pool size 7 numdb Maximum number of concurrently active databases including host and iSeries databases priv_mem_thresh Private memory threshold query_heap_sz Query heap size resync_interval Transaction resync interval rqrioblk Client I/O block size sheapthres Sort heap threshold spm_log_file_sz Sync point manager log file size spm_log_path Sync point manager log file path spm_max_resync Sync point manager resync agent limit spm_name Sync point manager name srvcon_auth Authentication type for incoming connections at the server srvcon_gssplugin_list List of GSS API plug-ins for incoming connections at the server srv_plugin_mode Server plug-in mode srvcon_pw_plugin Userid-password plug-in for incoming connections at the server svcename TCP/IP service name sysadm_group System administration authority group name sysctrl_group System control authority group name sysmaint_group System maintenance authority group name sysmon_group System monitor authority group name tm_database Transaction manager database name tp_mon_name Transaction processor monitor name tpname APPC transaction program name trust_allclnts Trust all clients trust_clntauth Trusted clients authentication 8 SQL Server 2005—Dynamic Server Configuration Parameters To facilitate online server and database maintenance, most SQL Server 2005 system properties can be configured dynamically. There are very few server parameters that require a server restart. They are listed in the following table. Important Changes to these parameters require that the database server has to be restarted for the changes to take effect. affinity mask Increase performance on symmetric multiprocessor (SMP) systems awe enabled Support up to a maximum of 64-GB of physical memory c2 audit mode Review attempts to access statements and objects fill factor How full to make each new index page lightweight pooling Excessive context switching reduction option locks Set maximum number of available locks max worker threads Configure number of worker threads available media retention Length of time to retain each backup medium open objects Set maximum number of db objects open at one time priority boost Set scheduling priority remote access Control executing SPs from remote servers scan for startup procs Scan for automatic execution of SPs set working set size Reserve physical memory space user connections Maximum number of simultaneous user connections SQL Server has a clear advantage over DB2 in the area of dynamic configuration. Bulk Loading of Data Bulk loading data into the database is a database maintenance activity that has the potential to impact the availability of the database. 9 Bulk Loading Data with DB2 UDB DB2 provides two primary utilities for performing bulk inserts of data into database tables: Import and Load. Load utility. This is a higher performance tool than the DB2 Import utility. It is designed primarily for moving large amounts of data into a newly created table. Prior to UDB version 8, the Load utility required exclusive access to the tablespace. In version 8, the Load utility was enhanced to allow concurrent access to tables in the tablespace. However, Load still places an exclusive lock on the table that is currently being loaded. Import utility. While not as high a performance tool as the Load utility, the Import utility is more flexible. Import supports running in either online mode or offline mode. Offline mode requires exclusive access to the target table. Online mode permits other users to have concurrent access to the table. Online mode enforces data integrity via row-level locking. Bulk Loading Data with SQL Server 2005 SQL Server 2005 provides two primary utilities that are used to perform bulk inserts: Bulk Copy Program (BCP) and SQL Server Integration Services. BCP. A command-line utility that can perform both data import and export functions between a SQL Server table and a flat file. BCP holds a table-level lock while it is performing bulk copy operations. Integration Services. A graphical tool for extract, transform, and load (ETL) processes that can both import and export data as well as perform data manipulations. Integration Services provides a complete workflow engine for performing sophisticated data transfers and transformations. By default, the Integration Services Bulk Insert task uses row-level locking and permits concurrent access to the target table by other users but can specify table locking for better performance. In addition to these two utilities, SQL Server 2005 also supports the Bulk INSERT Transact-SQL statement. The behavior of the Bulk Insert statement is controlled by the tablelockedonbulkload table option, which by default allows concurrent access, but this can be overridden by the command’s TABLOCK parameter. Key Observation: SQL Server 2005 has more options for bulk loading data into the database than does DB2 UDB 8.2. In addition, SQL Server Integration Services provides a more powerful and flexible ETL tool that is capable of performing bulk data loading while the database is online. 10 Dynamically Add Server Memory DB2 allows the dynamic resizing of a number of Database Manager and Database configuration parameters. DB2 UDB 8.2 enables dynamic changing for buffer pools, sort heap, and database heap without the need to restart the server or any of the databases in that instance. However, DB2 UDB 8.2 still has many parameters that require a server restart before they take effect. In addition, DB2 cannot dynamically add server memory while running on the Microsoft Windows Server System™. The SQL Server 2005 Hot-Plug Memory feature allows you to dynamically add RAM to your server system while the system is running. This feature requires support from the underlying operating system and hardware platform, so Microsoft Windows Server™ 2003 needs to be running on the system. SQL Server 2005 can dynamically recognize the additional RAM without a reboot. In addition to Hot-Plug Memory, SQL Server 2005 supports the dynamic configuration of CPU and I/O affinity. These values may be changed and the effects take place immediately, facilitating a high degree of availability as the server does not need to be restarted. Changes to Address Windowing Extensions (AWE) are also dynamic, so adjustments to the physical AWE size take effect at once. Key Observation The SQL Server 2005 ability to hot-add memory gives it a significant advantage over DB2 when running on Windows Server 2003. Unplanned Downtime Unplanned downtime (as opposed to planned downtime), refers to situations where data becomes unavailable to applications as a result of something failing or going wrong unexpectedly. Examples of events that cause unplanned downtime include: The server goes down because the database server software crashed or the operating system stopped responding or the hardware failed. The disk subsystem failed. A disaster wipes out the server and the disk subsystem. An administrator makes a mistake and drops tables. An application user makes a mistake and enters incorrect data or deletes data. This section discusses the features and functionality offered by both the databases to address unplanned downtime. 11 Database Restore When restoring the server after a database crash, in many cases backup and recovery will be the primary technology to recover the lost data and bring your server online. The specific steps and time required to restore a database varies based on a number of factors including the type of backups that you have performed and the size and activity of the database. One thing that you can count on is that the restore operation always takes longer than the backup operation. It is vital to complete the restore operation as quickly as possible in order to get the server back online. The time needed to complete the restoration process depends on several factors including the size of the database being restored, the database’s activity level, the backup media used, and whether the backup device employed data compression. The backup media can play a large role in the speed of the recovery process. Disk-based recovery is significantly faster than restoring from media. DB2 UDB 8.2—Restoring a Database The following procedure summarizes the steps required to perform a typical database restore operation in DB2 UDB 8.2 using the restore dialog boxes presented by the Control Center. To restore a database in DB2 UDB: 1. Confirm details of database to restore. 2. Select to restore the entire database or only tablespaces. 3. Select database the backup image to use. 4. Set containers for redirected restore. 5. Select restore only or roll forward restore. 6. Select the final state of the database. 7. Choose restore options. 8. Select performance options. 9. Set an operation schedule. 10. Restore the database. 12 SQL Server 2005—Restoring a Database The following procedure summaries the steps required to perform a typical database restore operation on SQL Server 2005 using SQL Server Management Studio. To restore a database in SQL Server 2005: 1. If the database is accessible, back up the current transaction log using the NO_TRUNCATE option. 2. Close all connections to the database. 3. In SQL Server Management Studio, open the Object Brower. Right-click Database, and select the Restore Database option. 4. Select the database to restore, the point in time, and the restore source. Clearly, restoring a database is easier with SQL Server than with DB2. SQL Server 2005—File Group Restore While you’ll typically want to restore the entire database in the event of a database failure, when only part of the database has been corrupted you don’t always need to restore the entire database. The SQL Server 2005 File Group Restore feature allows you to perform fine-grained restore operations so that you can selectively restore portions of a database. In SQL Server 2000, the level of availability was the database and all of the different filegroups that were used for that database’s disk storage needed to be intact before the database could be accessed following a restore operation. In SQL Server 2005, the unit of availability has been changed from the database to the filegroup. This change reduces the time it takes for the database to become available to users as well as enabling partial database restores. With SQL Server 2005, only the data currently being restored is unavailable. The data contained on other filegroups is accessible. SQL Server 2005 enables the restoration of a filegroup at a time, or even a page or group of pages at a time if the principal filegroup is ready. SQL Server 2005—Fast Recovery One of the factors that increases the time it takes for a database to become available is the time it takes to recover the saved transactions. With SQL Server 2000, each time a database is restored, that database must go through a period of recovery where any transactions that are in the log are applied to the data files. This portion of the recovery process is often called the redo phase as transactions are being reapplied. Next, all of the undo operations are performed and any incomplete transactions in the log are rolled back. The database is not available until all of the redo and undo operations are complete. If a production database went down while it was active, there could be a lengthy delay before that database was available again as all log entries are processed. SQL Server 2005 changes this behavior by introducing the Fast Recovery feature. Fast Recovery enables a database to become available immediately after the redo portion of the recovery process is complete. There is no need to wait for the undo portion to complete. Figure 1 illustrates the difference in availability between SQL Server 2000 and SQL Server 2005. 13 Figure 1: Fast Recovery With SQL Server 2005, when the database is restarted following a restore operation, all of the redo transactions in the log are processed and then the database becomes available while the undo entries are still being processed. For uncommitted transactions, SQL Server still holds the locks used by those transactions so the transactions will remain consistent although the database is in use. The net result is decreased downtime. Note DB2 doesn’t provide a Fast Restore capability. As a result the recovery of long running transactions can delay the availability of the database. SQL Server 2005—Database Snapshots Database snapshots are another SQL Server 2005 feature that can provide increased data availability. Database snapshots are essentially a read-only copy of a database at a specific point in time. Database snapshots are designed for reporting purposes or for creating a backup copy of the database at a certain point in time that you can use to revert a production database back to a prior state. When a database snapshot is created, SQL Server creates a metadata copy of the database at that particular point in time. Subsequent updates that are applied to the original database are not reflected in the database snapshot. An application can connect to a database snapshot exactly as if it were database. From the server’s standpoint, the creation of a database snapshot is an inexpensive operation as the server only uses metadata in conjunction to create the snapshot. A database snapshot doesn’t contain a complete copy of a database. The database snapshot shares the same data pages as the original database so it requires very little additional disk space. In fact, it only requires data storage for those data pages that are updated since the creation of the snapshot. 14 This feature uses copy-on-write technology. When a change is made to one of the data pages in the source database, a copy of that page is saved for the database snapshot. Then SQL Server updates the data page just as it would if no snapshot were in place. Note DB2 doesn’t provide a Database Snapshot capability. Summary of Database Restore SQL Server 2005 provides several advantages over DB2 UDB 8.2 in the features that are available to address unplanned downtime. SQL Server’s database recovery steps are simpler and more straight-forward, thereby reducing the possibility of inducing errors in the restore process. In addition, SQL Server 2005 provides several additional features like Fast Recovery and database snapshots that have to no DB2 equivalent. Data Protection and Disaster Recovery Availability is the primary goal for Information Technology workers, and one of the most important concerns in implementing a high-availability environment is providing protection against server and database failure. A variety of different factors can result in server and database failure. For instance, a failed disk drive is one of the most common causes of server downtime. Unplanned server downtime can also be caused by a software component like an operating system, application, or device driver fault that locks up the system. While protecting against server-wide events is one consideration, unplanned system downtime can also stem from problems at the database level. For example, an operator error could result in the deletion of production data or an application error could cause database corruption, rendering the application unusable. In this case, the server is still active but the database is not accessible. Both SQL Server 2005 and DB2 UDB 8.2 possess features that enable data protection and facilitate disaster recovery. Microsoft SQL Server 2005 provides a feature called Database Mirroring that protects against database failure by maintaining a copy of one or more databases on a remote server. Likewise, IBM’s DB2 UDB 8.2 has a feature called High Availability Disaster Recovery (HADR) that is designed to provide database protection by copying databases between multiple servers. In this section we will compare the SQL Server 2005 Database Mirroring capability with the DB2 UDB HADR feature. First we will provide a brief overview of how each feature works and then identify the strengths and limitations of each system. Next, we will look at the management features of the two features and compare the ease of configuration and setup. Then we’ll take a look at management capabilities, showing what needs to be done on the client in order to take advantage of the feature. Next we’ll look at the ability to switch between the main database and the standby/mirror database, and the ability to manually initiate a failover. 15 DB2 UDB 8.2—High Availability Disaster Recovery Available as a part of the DB2 UDB Enterprise Server Edition, the new High Availability Disaster Recovery (HADR) feature is IBM’s primary solution for disaster protection. To protect against server and database failures, HADR uses the TCP/IP protocol to send transactions from the source (primary) database to a standby database. Using TCP/IP to send transactions from the source to the standby server enables the primary and standby server to be located in geographically separated locations. If the primary server becomes unavailable, the remote standby database can automatically assume the role of the primary database. When the primary database is later available for use, you can initiate a failback to enable the primary server to reassume the role of principle database server. While the primary and the standby servers do not need to be identical, they must use the same bit size (32-bit or 64-bit). You can see an overview of the components used to implement HADR in the following figure. Figure 2: DB2 HADR HADR requires two servers: a primary or source server and a standby server. The primary server is designated as the principle database server. It provides database resources and handles the client connection for normal operations. The secondary server maintains a copy of the primary server’s mirrored database. HADR works by sending transactions from the primary server to the secondary server. It supports three nodes of operation: Synchronous, Near Synchronous, and Asynchronous. Synchronous mode. This mode provides the highest level of protection from data loss. In Synchronous mode, log writes are only considered successful if they have been written to the primary server and the primary server has received a successful write acknowledgement from the standby server. 16 Near Synchronous mode. In this mode, log writes are considered successful when they have been written to the primary server and the primary server has received an acknowledgement that the transaction has been written to main memory on the standby server. Asynchronous mode. This mode provides the lowest level of data protection but provides the best performance. In Asynchronous mode, transaction log writes are considered successful when they have been written to the primary server’s transaction log and to the primary server’s TCP/IP stack. To enable clients to automatically reconnect to the standby server, HADR provides an automatic client reroute feature. The automatic client reroute feature is implemented in the DB2 client layer, which means it requires no changes to the application. If the connection to the primary server fails, then DB2’s automatic client reroute feature makes one attempt to reconnect to the primary server. If the reconnect attempt fails, then the DB2 client layer issues a subsequent connection attempt to the standby database. Implementing DB2 UDB 8.2 HADR DB2 includes a wizard to help you set up and configure HADR on your databases. To access the wizard, start the Control Center and select a database. On the Tools menu, select Wizards. Then select Set Up High Availability Disaster Recovery (HADR) Databases. The HADR Wizard will begin and step you through the setup and configuration process. The following procedure summaries the steps required to implement HADR on DB2 UDB 8.2 using the HADR Wizard. To implement HADR: 1. Set up and configure archive logging for the primary database. 2. Select a standby system, instance, and database. 3. Copy backup image files. 4. Restore the standby database. 5. Copy objects. 6. Set communications parameters. 7. Configure databases for client reroute. 8. Specify synchronization mode for peer state log writing. 17 First, the primary database needs to be selected and archive logging needs to be configured. When a database is created, circular logging is enabled by default. To allow for online backups, tablespace backups, point-in-time recovery, and to continue HADR setup for the primary database, logging for the database needs to be changed and configured to archive logging. Next, you will need to set the number of primary and secondary log files and their respective sizes. Following this, you select a location for log files and then decide whether to have a mirrored copy of your log files saved to a separate location. To finish the change to archive logging from circular logging, a backup of the database needs to be performed. The wizard steps you through selecting a backup media and recommends options that affect the performance of this offline backup. The options recommended include buffer parallelism, number of buffers and buffer sizes, as well as whether to quiesce the database before backing up, and whether to compress the database image. After all choices are made, the database is taken offline and a full backup is performed. The next step in setting up HADR is to select a standby system, instance, and database. Here, the wizard prompts you with a list of DB2 systems from which you select and sign on to a standby system. You then need to select an instance name, or if one has not already been cataloged with your primary system, one may be added to the catalog by clicking the Add button. You will be prompted with the ‘Add Instance’ dialog box, where you input an instance name, instance node name, operating system, protocol, and protocol options. After a system name and instance name have been specified, the standby database initialization options need to be chosen. You have three choices for standby database initialization: use a backup to initialize the standby database, use an existing database as the standby, use SAN split mirror to initialize the standby database. The next step is to specify a backup image of the primary database to restore to the standby database. Once a backup image is selected, a restore to the standby database is executed. The Restore Database on Standby System dialog box prompts you to add a database alias name that allows applications running on your workstation to identify the database. You are also offered an option to copy the backup image to the standby system’s file directory or tape media before the restore of the database is performed to the standby system. Objects that are stored externally, such as user-defined functions and stored procedures, are not included in the backup image. The HADR wizard next prompts you to manually move these objects to the standby system separately. After all options have been set, a Summary dialog box is displayed so you can verify your selections and start HADR on the databases. A successful start up of HADR is shown in the following figure. 18 Figure 3: HADR Setup Complete SQL Server 2005 Database Mirroring Unlike HADR, SQL Server Database Mirroring provides automatic failover. The new SQL Server 2005 Database Mirroring feature provides protection from both server and database failure by mirroring the contents of a database on a secondary instance of SQL Server. Database Mirroring gives SQL Server 2005 a nearly instant standby capability by providing database-level failover. When Database Mirroring is enabled and the primary database fails, the mirrored database on the standby SQL Server system becomes available within a few seconds. Database Mirroring functions at the database level, not at the server level. In other words, it is used to protect a single database or it can be set up for multiple databases, but it is not designed to protect all of the databases and server functions on a SQL Server instance. Database Mirroring works with all of the standard hardware that supports SQL Server and it enables zero data loss in the event of a database failure. The secondary database will always be updated with the current transaction that’s being processed on the primary database server. Since it’s implemented over IP, there is virtually no distance limitation for Database Mirroring. The impact of running Database Mirroring to transaction throughput is zero to minimal. The following figure provides on overview of the components used to implement Database Mirroring with SQL Server 2005. 19 Figure 4: Database Mirroring Database Mirroring is implemented using three SQL Server instances. There is a primary server, a secondary server, and an optional witness. The primary server is the SQL Server system that is designated to be the principle database server, providing the database resources and handling the client connection for normal operations. The secondary server maintains a copy of the primary server’s mirrored database. It should be noted that the secondary (mirroring) server is not limited to just providing backup services. The secondary server can also be actively supporting other database applications. The role of the witness is to act as an independent third party whose responsibility is to cast the deciding vote as to which system assumes the role of the primary server. If the witness cannot contact the primary server, then it promotes the secondary server to become the new primary server. The witness is required for automatic failover. Database Mirroring has two modes: synchronous and asynchronous. Synchronous mode. Synchronous mode provides the maximum level of data protection as well as automatic failover. With synchronized mirroring, the principal server sends the transaction log data to the mirror server and waits for a response. The mirror server writes the data to its transaction log and returns an acknowledgement to the principal server. The principal server then commits the transaction after receiving the acknowledgment. If the primary server becomes unavailable and there is a witness present, then automatic failover is initiated. The mirror server assumes the role of principal, and the mirrored database becomes the new principal database. If there is no witness and the principle server fails, then the DBA must manually initiate a failover. 20 Asynchronous mode. Asynchronous mode provides higher performance than synchronous mode but also offers lower data protection. With asynchronous mode, the principal server first commits each new transaction and then forwards the transaction to the mirror server. The principal server does not wait for an acknowledgement from the mirror server. Because the primary server has already committed the transaction, the mirror server is always a small step behind the principal server. For availability, Database Mirroring works in conjunction with a feature in the Microsoft Data Access Components (MDAC) known as Transparent Client Redirection to enable client systems to seamlessly switch between the primary and secondary server. The MDAC middleware is aware of both the primary and the secondary servers. The MDAC middleware acquires the secondary server’s name when the application makes the initial connection to the primary server. If the connection to the primary server fails, then the MDAC software will make one attempt to reestablish a connection to the primary server. If that fails, then the MDAC middleware will automatically attempt to connect to the secondary server. Since the new Transparent Client Redirection feature is implemented in the MDAC middleware, no changes are required by the client application in order to take advantage of this capability. Implementing SQL Server 2005 Database Mirroring The following procedure summarizes the steps required to implement Database Mirroring on SQL Server 2005. To implement Database Mirroring: 1. Make sure that the primary database uses the Full Recovery model. 2. Backup the database on the primary SQL Server system. 3. Restore the database on the mirror server using the NORECOVER option. 4. Set up mirroring on the standby server using the ALTER DATABASE SET PARTNER statement. 5. Set up mirroring on the primary server using the ALTER DATABASE SET PARTNER statement. 6. Optionally set up mirroring on the witness server using the ALTER DATABASE SET PARTNER statement. Database Mirroring can be set up using either the SQL Server Management Studio or by using Transact-SQL. Before setting up database mirroring you need to make sure that a few prerequisites are taken care of. First, the SQL Server service (MSSQLServer) must be started using a domain user account. It can not be started under the Local System account because the Local System account is restricted from network access. Next, you need to make sure the database that will be mirrored is using the Full Recovery model. Database Mirroring can not be set up using either the Simple or the Bulk-Logged Recovery models. 21 As you can see, setting up SQL Server 2005 Database Mirroring feature is much simpler than setting up HADR. There are significantly fewer steps and setup difficulty is substantially less. In addition, when configured to use a witness, the SQL Server 2005 database mirror is able to perform automatic failover and the Transparent Client Redirection capability handles all of the client connectivity issues without requiring any code changes. HADR requires changes in the client applications to handle redirection to the standby server. In addition, HADR is only available as a standard component in the UDB Enterprise Server Edition. It is an extra add-on component for the other editions. SQL Server 2005’s Database Mirroring is included in both the Standard and Enterprise Editions of SQL Server 2005. Conclusion While both database systems have features that increase database availability, SQL Server 2005 is a better choice for high availability. SQL Server 2005 has more highavailability options than DB2 UDB 8.2 and those features are easier to use in SQL Server 2005. In the area of database backup, SQL Server 2005 provides more flexibility and gives the administrator greater control over the amount of data saved in the transaction logs. For routine database maintenance, SQL Server 2005 provides a greater degree of online server configuration, including the ability to hot-add RAM with no database downtime. In the area of database restore, SQL Server 2005 Fast Recovery enables quicker data availability than DB2. SQL Server 2005 Database Mirroring is significantly simpler to setup than HADR and it offers fully automated failover and is supplied as part of the SQL Server 2005 Standard Edition. About the Authors Michael Otey is Technical Director for Windows IT Pro Magazine and Senior Technical Editor for SQL Server Magazine. He is also President of TECA, Inc., a software development and consulting company that specializes in interoperability and database applications. Michael is the author of the SQL Server 2005 New Features Guide published by Osborne McGraw Hill. Denielle Otey is the Vice President of TECA, Inc., as well as a software consultant who has extensive experience designing, implementing, testing, and debugging software in C, Visual C++®, Visual Basic®, and Visual Studio® .NET. She also is the developer of several SQL Server utilities and co-author of ADO.NET: The Complete Reference published by Osborne McGraw Hill. This article was developed in partnership with A23 Consulting. 22