Setting Up a Hot Standby Database

Chris Lawson
Database Specialists, Inc.
www.dbspecialists.com
clawson@dbspecialists.com

Hot Standby Overview

[Diagram: DB1 (primary) serves its clients and sends archive logs to DB2 (standby), which serves read-only clients.]

Hot Standby provides a way for a second database to automatically track a primary database.

Hot Standby Overview (continued)

• Prior to 8i, a standby database could be created, but without the automated features in the 8i version.
• A hot standby database starts as a clone of the primary, using any hot or cold backup.
• In order to “keep up” with the primary, the standby performs two separate, ongoing tasks:
  – Receive and store archive logs from the primary over Net8.
  – Apply archive logs in proper order.

Modes of Operation

The standby database has two main modes of operation: Recovery or Read-only.

Modes of Operation: Recovery

[Diagram: DB1 (primary) writes archive logs locally (dest 1) and sends archive logs over Net8 (dest 2) to DB2 (standby), which stores them in its own archive log location.]

• Managed Recovery is the normal mode of operation. In this mode, the standby database looks for and applies each archive log as it is received. Once started, no DBA intervention is required.
• Manual Recovery may also be activated under some circumstances, namely whenever an archive log has been manually transferred to the standby server and needs to be applied. In manual recovery, the DBA starts database recovery.

Modes of Operation: Read-only

[Diagram: DB1 (primary) serves its clients and sends archive logs to DB2 (standby), which is open to read-only clients.]

• In read-only mode, the database is actually open to all users for inquiries.
• The archive logs continue to be transferred over Net8, but are not yet applied.
• Whenever the mode is changed back to recovery, log application resumes as the standby “catches up.”
• Note: Archive logs continue to be sent from the primary to the standby, regardless of which mode is in effect.

Advantages of Hot Standby

• It really works! Documentation is reasonably good.
• Fairly easy to set up: no special operating system or database options required.
• No special DBA training is required (in contrast to OPS).
• Activation of standby is not complicated, but be sure to document and test a procedure specific for your site.
• Standby database can actually be opened for queries, then return to recovery mode. This may facilitate off-loading large reports or other batch jobs, so that performance on primary database is not degraded.
• Standby database will track actual production very closely; it will typically “lag” by only one archive log file, perhaps a delay of only 15 minutes or so.
• Standby database is typically on a completely separate server and file system, providing safety if disaster strikes.

Disadvantages of Hot Standby

• Hot standby only provides limited load-balancing because all users (except for read-only users) must continue to use the primary.
• In contrast, OPS (Oracle Parallel Server) or replication allow use of multiple instances simultaneously.
• For databases with heavy transaction activity, there will be increased network traffic due to log transfer.
• If primary server crashes, and standby database needs to be activated, it may be impossible to access the last archive log on the primary. These transactions will be lost.
• Smaller redo logs will minimize this loss by increasing the frequency of log transfers.

Preliminary Setup

• Ensure primary database is in archive mode, and correctly writes archive logs.
• Ensure temp tablespace is marked as temporary.
• Make a standby control file to use as the starting point for the standby database, e.g.,
    alter database create standby controlfile as '/path';
• Copy over all .dbf files, standby control file and redo logs from DB1 server to DB2.

Setup Primary init.ora File

• Add entries to write a second set of archive logs; the destination is not a directory, but a TNS alias that matches the standby connection.
    log_archive_dest_2='SERVICE=ALIAS optional reopen=180'
    log_archive_dest_state_2=ENABLE

• Note:
  – reopen=180 means wait 180 seconds before re-attempting a failed archive.
  – optional means continue even if archival to the second destination fails.

Standby Control File Explanation

[Diagram: The control file copied from DB1 (primary) records data files on /u01/, but on DB2 (standby) the data files are on /u02/. The standby expects data files to be on /u01, but they aren't!]

• The primary control file cannot be used as-is, because the control file has .dbf and redo file locations for primary.
• Instead of creating a new control file, the primary control file is adapted for use by the standby.
• Without some type of correction, the standby will look in the wrong location for the redo and .dbf files.

Standby Control File Explanation (continued)

[Diagram: In the standby control file, db_file_name_convert maps each .dbf location to its new path, and log_file_name_convert maps each redo log location to its new path.]

• Several new init.ora parameters allow the standby database to translate directory paths from where files were located on the primary, to where they are on the standby.
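Before choosing the translation pairs, it can help to list the file locations the primary control file actually records. The following is a minimal sketch, assuming a SQL*Plus or Server Manager session on the primary; v$datafile and v$logfile are standard dynamic views, but the queries themselves are only an illustration and do not come from the slides.

```sql
-- Run on the primary database.
-- Each distinct directory in the output is a candidate for a
-- db_file_name_convert or log_file_name_convert pair on the standby.

-- Data file locations recorded in the control file
select name from v$datafile;

-- Online redo log member locations recorded in the control file
select member from v$logfile;
```

Any directory that appears here but is not covered by a translation pair will have to be corrected by hand on the standby.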
Standby Control File Explanation (continued)

For example, if the .dbf files are on /u01 on primary, then they could be translated to /u02 on standby:
    db_file_name_convert=('/u01','/u02')
The path for redo logs is similarly translated:
    log_file_name_convert=('/oradata1','/oradata2')

Configure Standby init.ora

Copy the primary init.ora to the standby and set up the following special parameters:

    db_name=[same as primary]
    lock_name_space=standby1
        Needed if primary & secondary share same host
    log_archive_dest_1="location=/u00/app/oracle/admin/sec/arch"
        Used for manual recovery of archive logs
    standby_archive_dest = /u00/app/oracle/admin/sec/arch
        Typically set same as previous parameter
    db_file_name_convert = ('/u01/oradata/prime','/u02/oradata/sec')
    log_file_name_convert = ('/u01/prime','/u02/sec')
        Corrects file locations since control file originated from primary

Prepare Standby Database

• If using password-file authentication, create password file for standby:
    orapwd file=orapw[SID]
  Note: Database Configuration Assistant will create init.ora file with REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE, which implies need for the above password file.
• Connect internal, then perform
    startup nomount;
• Perform
    alter database mount standby database;
• Set standby database in Managed (automatic) Recovery Mode:
    recover managed standby database;
  Note: Prompt will not return; documentation suggests “run on the main console.”
• Suggestion: Put last command above in script and run as nohup.

Checking Transfer of Archive Logs

• When the hot standby is working properly, two things are happening:
  1. Archive logs are being transferred; and
  2. These logs are being automatically applied.
• On the primary database, perform
    alter system switch logfile;
• A new archive log should appear within a few minutes in the standby database archive location.
• If no log appears, check the alert.log for the primary database for problems connecting to the standby.
• Also check v$archive_dest to confirm that all log destinations are enabled.

Checking Application of Archive Logs

On the standby database, review the last portion of the alert.log. As each log is applied, there should be a new entry listing the log number:

    Media Recovery Start: Managed Standby Recovery
    Media Recovery Log
    Media Recovery Waiting for thread 1 seq# 465
    Wed Jun 21 10:48:06 2000
    Media Recovery Log /u00/app/oracle/admin/db2/arch/arch_1_465.arc
    Media Recovery Waiting for thread 1 seq# 466
    Wed Jun 21 10:48:22 2000
    Media Recovery Log /u00/app/oracle/admin/db2/arch/arch_1_466.arc
    Media Recovery Log /u00/app/oracle/admin/db2/arch/arch_1_467.arc
    Media Recovery Waiting for thread 1 seq# 468

Checking Application of Archive Logs (continued)

If logs are not being applied, be sure that the expected archive log exists on the standby. If there is a “gap,” then the log should be manually copied to the standby server, and manual recovery performed. Once the gap is “plugged,” then the automatic recovery can be restarted.

Mode Change

The standby database mode can be switched back and forth at will:

Switch to Read-Only Mode
• First, cancel managed recovery:
    recover managed standby database cancel;
• Then, set to read-only:
    alter database open read only;

Switch back to Managed Recovery (this restarts the archive log application)
• First, confirm there are no sessions active;
• Then, resume automatic recovery:
    recover managed standby database;

When Disaster Strikes: Activating Standby Database

• Important! Opening standby database will terminate the standby recovery process.
• Reversal back to recovery processing is NOT possible, as an implicit resetlogs is performed upon activation.
• This is very similar to what is done in a database “clone”, running
    alter database open resetlogs;
• If primary still operational, eke out last archive log using
    alter system archive log current;
• Manually transfer archive log if necessary, putting in archive destination.
• Apply as many logs as are available using manual recovery:
    recover standby database;

Activating Standby Database (continued)

• Activate standby:
    alter database activate standby database;
    shutdown immediate;
    startup mount;
    alter database open read write;
• Prepare the new database for the archive mode (presumably).
• Take physical backup of the newly activated database.
• Set up new standby database, using the new physical backup.

Restarting Interrupted Log Transfer

If the standby database is briefly stopped, the archive log transfer from the primary may be interrupted, and the transfer error may need to be manually reset.

• Confirm standby database is once again in startup nomount state.
• On primary, confirm error in transfer status. Note failing dest_id:
    select dest_id, status, target, destination, error from v$archive_dest;

Restarting Interrupted Log Transfer (continued)

    DEST_ID  STATUS  TARGET   DESTINATION  ERROR
    -------  ------  -------  -----------  --------
    1        VALID   PRIMARY  /db1/arch
    2        ERROR   STANDBY  db2          ORA-xxxx

• On primary, reset archiving error (replace 'n' with number of failing destination). Note: Even though reopen is specified, log transfer appears to require resetting the error:
    alter system set log_archive_dest_state_n = enable;

Restarting Interrupted Log Transfer (continued)

• Perform log switch on primary and confirm that a new archive log appears at standby.
• Manually transfer any missing archive logs from primary to standby. Manually apply these logs:
    recover standby database;
• Return to automatic recovery:
    recover managed standby database;

Client Setup for Automatic Failover

• In tnsnames.ora, use the FAILOVER parameter. When set to ON, it instructs Net8, at connect time, to fail over to a different address if the first address fails. When set to OFF, it instructs Net8 to try one address.
    net_service_name=
      (description=
        (failover=on)
        (address=(protocol=tcp)(host=server1)(port=1521))
        (address=(protocol=tcp)(host=server2)(port=1521))
        (connect_data=(service_name=db1.acme.com)))

Client Setup for Automatic Failover (continued)

• Important: Do not set the GLOBAL_DBNAME parameter in the SID_LIST_listener_name section of the listener.ora. A statically configured global database name disables connect-time failover.
• Only multiple addresses (not connect_data) are specified, thereby requiring that the standby database(s) has the same SID or service_name.

Translation Complications

• Remember how the .dbf and log pathnames need to be translated using two “special” init.ora parameters.
• The standby database will look in a different directory using the new parameters as a “translator.”

Translation Complications (continued)

Problem: Files are typically not all in the same file system, but the “translation” parameter can only translate from one directory to one directory. How can files in the “other” directories be “fixed?”

For example:
    db_file_name_convert = ('/u01','/u02')

    Primary       Standby
    /u01/ data    /u02/ data    New parameter tells Standby to look in /u02
    /u03/ data    /u04/ data    Standby will still be looking in /u03 for these files

Translation Complications (continued)

Solution:
• On the (mounted) standby database, prior to beginning recovery, manually correct the file names that are not covered by the two init.ora parameters.
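One way to find the files that still need manual correction is to query the mounted standby for names outside the translated directories. This is a hedged sketch only: it assumes the example pairs db_file_name_convert=('/u01','/u02') and log_file_name_convert=('/oradata1','/oradata2') used earlier, and it assumes the names reported by v$datafile and v$logfile on the mounted standby already reflect the translation, which is worth verifying at your site.

```sql
-- Run on the mounted standby, before starting recovery.
-- The directory patterns below are taken from the earlier example
-- parameters and must be replaced with your own.

-- Data files whose recorded path is outside the translated directory
select file#, name
  from v$datafile
 where name not like '/u02/%';

-- Redo log members whose recorded path is outside the translated directory
select group#, member
  from v$logfile
 where member not like '/oradata2/%';
```

Each row returned is a file the two parameters did not cover and that needs a manual rename (data file) or a drop and re-add (redo log group).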
    Primary       Standby
    /u01/ data    /u02/ data    Parameter corrects these files
    /u03/ data    /u04/ data    Manually rename to '/u04…'

Translation Complications (continued)

• For a .dbf file, simply rename; for a redo log, drop the group, then add the group back into the desired directory:

  .dbf file:
    alter database rename file '/u03/user01.dbf' to '/u04/user01.dbf';

  redo log:
    alter database drop logfile group 5;
    alter database add logfile group 5 '/u05/redo05.log' size 20m;

Adding Datafiles to Primary Database

• Adding a datafile to the primary database generates redo that adds the datafile name only to the standby control file; the datafile must still be explicitly added to the standby database.
• The solution is simple, but not intuitive; so carefully review and test these special cases.
• First, add datafile to primary database as usual.
• Then, switch redo logs on the primary database to initiate redo archival to the standby database.

Adding Datafiles to Primary Database (continued)

• Recovery on the standby database will stop because the datafile does not exist. Standby alert log:

    WARNING! Recovering datafile 2 from a fuzzy file. If not the current file
    it might be an online backup taken without entering the begin backup command.
    Successfully added datafile 2 …

• To resolve, create the datafile on the standby database:
    alter database create datafile '/u02/oradata/test.dbf' as '/u02/oradata/test.dbf';
• Place the standby database in managed recovery mode:
    recover managed standby database;

Other Tips & Tricks

• The documented method of connecting to standby appears to be impossible. Resolution: Just use the usual way to connect to an idle instance:
    connect internal
  or
    connect / as sysdba
• Ensure that the init.ora parameter JOB_QUEUE_PROCESSES = 0. (This implies conflict with the Advanced Replication Option, which typically sets the parameter to 4. If the parameter is non-zero, then the standby mode change from read-only back to recovery will fail.)
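The JOB_QUEUE_PROCESSES setting can be checked from the standby instance before attempting a mode change. A minimal sketch, assuming a privileged session on the standby; v$parameter is a standard dynamic view, and the query is an illustration, not something taken from the slides.

```sql
-- Run on the standby instance (connect internal or connect / as sysdba).
-- The value should be 0 before switching from read-only back to recovery.
select name, value
  from v$parameter
 where name = 'job_queue_processes';
```

If the value is non-zero, correct it in the standby init.ora before relying on the mode switch.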
Other Tips & Tricks (continued)

• Finding which archive logs are needed to fill the “gaps” seems to be unduly complicated. Upon starting recovery, the standby database will request a particular log, so why bother figuring it out?
• IPC network connection parameters for the tnsnames.ora file are “pickier” in 8i. Now the “key” value must match on client and server. [Relevant only where primary and standby are on the same server.]

Useful References

• Oracle Magazine, May/June 1999, “Implementing an Automated Standby Database,” by Roby Sherman.
• Oracle Corporation, Oracle 8i Standby Database Concepts and Administration, Release 2 (8.1.6).

Contact Information

Chris Lawson
clawson@dbspecialists.com
http://www.dbspecialists.com

Database Specialists, Inc.
388 Market Street, Suite 400
San Francisco, CA 94111