Setting Up a Hot Standby Database

Chris Lawson
Database Specialists, Inc.
www.dbspecialists.com
clawson@dbspecialists.com
Hot Standby Overview
[Diagram: clients connect to DB1 (primary); archive logs flow from DB1 to DB2 (standby), which serves read-only clients.]
Hot Standby provides a way for a second
database to automatically track a primary
database.
Hot Standby Overview
(continued)
• Prior to 8i, a standby database could be created,
but without the automated features in the 8i
version.
• A hot standby database starts as a clone of the
primary, using any hot or cold backup.
• In order to “keep up” with the primary, the
standby performs two separate, ongoing tasks:
– Receive and store archive logs from the primary over
Net8.
– Apply archive logs in proper order.
Modes of Operation
The standby database has two
main modes of operation:
Recovery or Read-only
Modes of Operation: Recovery
[Diagram: DB1 (primary) writes archive logs locally (dest 1) and sends them over Net8 (dest 2) to DB2 (standby), which stores and applies them.]
• Managed Recovery is the normal mode of operation. In this mode, the
standby database looks for and applies each archive log as it is
received. Once started, no DBA intervention is required.
• Manual Recovery may also be activated under some circumstances, namely
whenever an archive log has been manually transferred to the
standby server and needs to be applied. In manual recovery, the DBA
starts database recovery.
Modes of Operation: Read-only
[Diagram: clients connect to DB1 (primary); archive logs flow to DB2 (standby), which is open to read-only clients.]
• In read-only mode, the database is actually open to all users for
inquiries.
• The archive logs continue to be transferred over Net8, but are not yet
applied.
• Whenever the mode is changed back to recovery, log application
resumes as the standby “catches up.”
• Note: Archive logs continue to be sent from the primary to the
standby, regardless of which mode is in effect.

Advantages of Hot Standby
• It really works! Documentation is reasonably good.
• Fairly easy to set up--no special operating system or database options
required.
• No special DBA training is required (in contrast to OPS).
• Activation of standby is not complicated--but be sure to document and
test a procedure specific for your site.
• Standby database can actually be opened for queries, then return to
recovery mode. This may facilitate off-loading large reports or other
batch jobs, so that performance on primary database is not degraded.
• Standby database will track actual production very closely--it will
typically “lag” by only one archive log file, perhaps a delay of only 15
minutes or so.
• Standby database is typically on a completely separate server and file
system, providing safety if disaster strikes.
Disadvantages of Hot Standby
• Hot standby only provides limited load-balancing because
all users (except for read-only users) must continue to use
the primary.
• In contrast, OPS (Oracle Parallel Server) or replication
allow use of multiple instances simultaneously.
• For databases with heavy transaction activity, there will be
increased network traffic due to log transfer.
• If primary server crashes, and standby database needs to be
activated, it may be impossible to access the last archive
log on the primary. These transactions will be lost.
• Smaller redo logs will minimize this loss by increasing the
frequency of log transfers.
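The relationship between redo log size and potential loss can be estimated with simple arithmetic. The sketch below assumes a steady redo generation rate, which real workloads rarely have; the function name and the 2 MB/minute figure are hypothetical and only illustrate the trade-off.

```python
def minutes_per_log_switch(redo_log_mb: float, redo_rate_mb_per_min: float) -> float:
    """Approximate minutes needed to fill one redo log at a steady redo rate."""
    if redo_log_mb <= 0 or redo_rate_mb_per_min <= 0:
        raise ValueError("size and rate must be positive")
    return redo_log_mb / redo_rate_mb_per_min


# Hypothetical workload: 2 MB of redo generated per minute.
for size_mb in (100, 30, 10):
    exposure = minutes_per_log_switch(size_mb, 2.0)
    print(f"{size_mb} MB redo log: up to about {exposure:.0f} minutes of work not yet archived")
```

Under these assumptions a 30 MB log matches the 15-minute lag mentioned earlier, and shrinking the log shortens the window at the cost of more frequent log switches.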
Preliminary Setup
• Ensure primary database is in archive mode, and
correctly writes archive logs.
• Ensure temp tablespace is marked as temporary.
• Make a standby control file to use as the starting
point for the standby database. e.g.,
alter database create standby controlfile as '/path';
• Copy over all .dbf files, standby control file and redo
logs from DB1 server to DB2.
Setup Primary init.ora File
• Add entries to write second set of archive logs; the
destination is not a directory, but a tns alias that
matches the standby connection.
log_archive_dest_2='SERVICE=ALIAS optional reopen=180'
log_archive_dest_state_2=ENABLE
• Note:
– reopen=180 means wait 180 seconds before re-attempting
failed archive.
– optional means continue even if archival to second
destination fails.
Standby Control File Explanation
[Diagram: on DB1 (primary), the control file points to data files under /u01. On DB2 (standby), the copied control file still expects the data files to be on /u01, but they are actually on /u02.]
• The primary control file cannot be used as-is, because the
control file has .dbf and redo file locations for primary.
• Instead of creating a new control file, the primary control file
is adapted for use by the standby.
• Without some type of correction, the standby will look in the
wrong location for the redo and .dbf files.
Standby Control File Explanation
[Diagram: the .dbf locations recorded in the standby control file pass through db_file_name_convert, and the redo log locations pass through log_file_name_convert, to give the new paths used on the standby.]
• Several new init.ora parameters allow the standby
database to translate directory paths from where
files were located on the primary, to where they
are on the standby.
Standby Control File Explanation
(continued)
For example, if the .dbf files are on /u01 on primary,
then they could be translated to /u02 on standby:
db_file_name_convert=('/u01','/u02')
The path for redo logs is similarly translated:
log_file_name_convert=('/oradata1','/oradata2')
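To make the translation concrete, here is a simplified model of what the two parameters do to a file name. It assumes the conversion behaves like replacing the first occurrence of the pattern in the path; the helper name `convert_name` and the sample file names are hypothetical, and Oracle's exact matching rules may differ.

```python
def convert_name(path: str, pattern: str, replacement: str) -> str:
    """Replace the first occurrence of `pattern` in `path`; unchanged if absent."""
    return path.replace(pattern, replacement, 1)


# Hypothetical file names as recorded in the control file from the primary.
primary_datafiles = ["/u01/oradata/system01.dbf", "/u01/oradata/users01.dbf"]
primary_redo = ["/oradata1/redo01.log"]

# db_file_name_convert=('/u01','/u02')
standby_datafiles = [convert_name(p, "/u01", "/u02") for p in primary_datafiles]
# log_file_name_convert=('/oradata1','/oradata2')
standby_redo = [convert_name(p, "/oradata1", "/oradata2") for p in primary_redo]

print(standby_datafiles)
print(standby_redo)
```

A path that does not contain the pattern comes back unchanged, which is why files in other file systems need separate handling.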
Configure Standby init.ora
Copy primary init.ora to standby and set up the following
special parameters:
db_name=[same as primary]
lock_name_space=standby1
Needed if primary & secondary share same host
log_archive_dest_1="location=/u00/app/oracle/admin/sec/arch"
Used for manual recovery of archive logs
standby_archive_dest = /u00/app/oracle/admin/sec/arch
Typically set same as previous parameter
db_file_name_convert = ('/u01/oradata/prime','/u02/oradata/sec')
log_file_name_convert = ('/u01/prime','/u02/sec')
Corrects file locations since control file originated from primary
Prepare Standby Database
• If using password-file authentication, create password file for
standby: orapwd file=orapw[SID]
Note: Database Configuration Assistant will create init.ora file with
REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE, which implies need
for the above password file.
• Connect internal, then perform startup nomount;
• Perform alter database mount standby database;
• Set standby database in Managed (automatic) Recovery Mode
recover managed standby database;
Note: Prompt will not return; documentation suggests “run on the
main console.”
• Suggestion: Put last command above in script and run as nohup.
Checking Transfer of
Archive Logs
• When the hot standby is working properly, two things are
happening:
1. Archive logs are being transferred; and
2. These logs are being automatically applied
• On the primary database, perform
alter system switch logfile;
• A new archive log should appear within a few minutes in the
standby database archive location.
• If no log appears, check the alert.log for the primary database
to check for problems connecting to the standby.
• Also check v$archive_dest to confirm that all log destinations
are enabled.
Checking Application of
Archive Logs
On the standby database, review the last portion of the
alert.log. As each log is applied, there should be a new
entry listing the log number:
Media Recovery Start: Managed Standby Recovery
Media Recovery Log
Media Recovery Waiting for thread 1 seq# 465
Wed Jun 21 10:48:06 2000
Media Recovery Log /u00/app/oracle/admin/db2/arch/arch_1_465.arc
Media Recovery Waiting for thread 1 seq# 466
Wed Jun 21 10:48:22 2000
Media Recovery Log /u00/app/oracle/admin/db2/arch/arch_1_466.arc
Media Recovery Log /u00/app/oracle/admin/db2/arch/arch_1_467.arc
Media Recovery Waiting for thread 1 seq# 468
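Reading the alert.log by eye works, but the check is easy to script. The sketch below assumes the alert.log lines look exactly like the excerpt above; the function name `summarize_recovery` is hypothetical, and other Oracle versions may word these messages differently.

```python
import re

# Patterns taken from the alert.log excerpt; the wording is an assumption
# about this Oracle release and may need adjusting elsewhere.
APPLIED = re.compile(r"^Media Recovery Log (\S+)$")
WAITING = re.compile(r"^Media Recovery Waiting for thread (\d+) seq# (\d+)$")


def summarize_recovery(alert_lines):
    """Return (applied log paths, (thread, seq) currently awaited or None)."""
    applied, waiting = [], None
    for raw in alert_lines:
        line = raw.strip()
        match = APPLIED.match(line)
        if match:
            applied.append(match.group(1))
            waiting = None  # the previous wait has been satisfied
            continue
        match = WAITING.match(line)
        if match:
            waiting = (int(match.group(1)), int(match.group(2)))
    return applied, waiting


sample = """\
Media Recovery Start: Managed Standby Recovery
Media Recovery Log
Media Recovery Waiting for thread 1 seq# 465
Wed Jun 21 10:48:06 2000
Media Recovery Log /u00/app/oracle/admin/db2/arch/arch_1_465.arc
Media Recovery Waiting for thread 1 seq# 466
Wed Jun 21 10:48:22 2000
Media Recovery Log /u00/app/oracle/admin/db2/arch/arch_1_466.arc
Media Recovery Log /u00/app/oracle/admin/db2/arch/arch_1_467.arc
Media Recovery Waiting for thread 1 seq# 468
""".splitlines()

applied, waiting = summarize_recovery(sample)
print(len(applied), "logs applied; waiting for", waiting)
```

If the awaited sequence number stops advancing while the primary keeps switching logs, that is a hint that transfer or application has stalled.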
Checking Application of
Archive Logs (continued)
• If logs are not being applied, be sure that the
expected archive log exists on the standby.
• If there is a “gap,” then the log should be
manually copied to the standby server, and manual
recovery performed.
• Once the gap is “plugged,” then the automatic
recovery can be restarted.
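A quick cross-check for gaps is to compare the sequence numbers of the files in the standby archive directory. The sketch below assumes file names of the form arch_<thread>_<seq>.arc, as in the alert.log excerpt; that format depends on each site's settings, and the function name and sample listing are hypothetical.

```python
import re

# Assumed naming convention: arch_<thread>_<sequence>.arc
ARCH_NAME = re.compile(r"arch_(\d+)_(\d+)\.arc$")


def missing_sequences(file_names, thread=1):
    """Return sorted sequence numbers absent between the lowest and highest seen."""
    seen = set()
    for name in file_names:
        match = ARCH_NAME.search(name)
        if match and int(match.group(1)) == thread:
            seen.add(int(match.group(2)))
    if not seen:
        return []
    return [seq for seq in range(min(seen), max(seen) + 1) if seq not in seen]


# Hypothetical directory listing on the standby server.
names = ["arch_1_465.arc", "arch_1_466.arc", "arch_1_469.arc"]
print(missing_sequences(names))  # [467, 468]
```

The listed sequence numbers are the logs to copy manually from the primary before running manual recovery.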
Mode Change
The standby database mode can be switched back
and forth at will:
Switch to Read-Only Mode
• First, cancel managed recovery:
recover managed standby database cancel;
• Then, set to read-only:
alter database open read only;
Switch back to Managed Recovery
(This restarts the archive log application)
• First, confirm there are no sessions active;
• Then, resume automatic recovery:
recover managed standby database;
When Disaster Strikes:
Activating Standby Database
• Important! Opening standby database will terminate the
standby recovery process.
• Reversal back to recovery processing is NOT possible, as an
implicit resetlogs is performed upon activation.
• This is very similar to what is done in a database “clone”,
running alter database open resetlogs;
• If primary still operational, eke out last archive log using
alter system archive log current;
• Manually transfer archive log if necessary, putting in archive
destination.
• Apply as many logs as are available using manual recovery:
recover standby database;
Activating Standby Database
(continued)
• Activate standby:
alter database activate standby database;
shutdown immediate;
startup mount;
alter database open read write;
• Prepare the new database for the archive mode
(presumably).
• Take physical backup of the newly activated
database.
• Set up new standby database, using the new
physical backup.
Restarting Interrupted Log Transfer
If the standby database is briefly stopped, the archive
log transfer from the primary may be interrupted, and
the transfer error may need to be manually reset.
• Confirm standby database is once again in startup
nomount state.
• On primary, confirm error in transfer status. Note
failing dest_id:
select dest_id, status, target, destination, error from v$archive_dest;
Restarting Interrupted Log Transfer
(continued)
DEST_ID  STATUS  TARGET   DESTINATION  ERROR
-------  ------  -------  -----------  --------
1        VALID   PRIMARY  /db1/arch
2        ERROR   STANDBY  db2          ORA-xxxx
• On primary, reset archiving error (replace 'n' with
number of failing destination).
Note: Even though reopen is specified, log transfer
appears to require resetting the error:
alter system set log_archive_dest_state_n = enable;
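When several destinations are configured, the reset statements can be generated from the query result rather than typed by hand. This is a minimal sketch: the function name `reset_statements` is hypothetical, and the rows are assumed to be (dest_id, status) pairs already fetched from v$archive_dest.

```python
def reset_statements(rows):
    """Build one ALTER SYSTEM statement per destination whose status is ERROR."""
    return [
        f"alter system set log_archive_dest_state_{dest_id} = enable;"
        for dest_id, status in rows
        if status.upper() == "ERROR"
    ]


# Rows mirroring the sample v$archive_dest output.
rows = [(1, "VALID"), (2, "ERROR")]
for statement in reset_statements(rows):
    print(statement)
```

The generated text would still be reviewed and run on the primary by the DBA.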
Restarting Interrupted Log
Transfer (continued)
• Perform log switch on primary and confirm that a
new archive log appears at standby.
• Manually transfer any missing archive logs from
primary to standby. Manually apply these logs:
recover standby database;
• Return to automatic recovery:
recover managed standby database;
Client Setup for Automatic Failover
• In tnsnames.ora, use FAILOVER parameter. When set
to ON, instructs Net8, at connect time, to fail over to a
different address if the first address fails. When set to
OFF, instructs Net8 to try one address.
net_service_name=
(description=
(failover=on)
(address=(protocol=tcp)(host=server1)(port=1521))
(address=(protocol=tcp)(host=server2)(port=1521))
(connect_data=(service_name=db1.acme.com)))
Client Setup for Automatic Failover
(continued)
• Important: Do not set the GLOBAL_DBNAME
parameter in the SID_LIST_listener_name section
of the listener.ora. A statically configured global
database name disables connect-time failover.
• Only multiple addresses (not connect_data) are
specified, thereby requiring that the standby
database(s) has the same SID or service_name.
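The connect-time behavior described for FAILOVER can be pictured as trying each address in order. The sketch below is only a model of that idea, not Net8's actual implementation; `connect_with_failover` and `fake_connect` are hypothetical names, and the fake connector simply pretends that server1 is down.

```python
def connect_with_failover(addresses, connect, failover=True):
    """Try addresses in order; with failover off, only the first is attempted."""
    candidates = addresses if failover else addresses[:1]
    last_error = None
    for host, port in candidates:
        try:
            return connect(host, port)
        except OSError as exc:
            last_error = exc  # remember the failure and try the next address
    raise ConnectionError(f"no address reachable: {last_error}")


def fake_connect(host, port):
    # Hypothetical stand-in for a real network connection: server1 is "down".
    if host == "server1":
        raise OSError("server1 unreachable")
    return f"connected to {host}:{port}"


addrs = [("server1", 1521), ("server2", 1521)]
print(connect_with_failover(addrs, fake_connect))
```

With failover=False the same call raises an error instead of reaching server2, which mirrors the OFF setting.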
Translation Complications
• Remember how the .dbf and log pathnames need
to be translated using two “special” init.ora
parameters.
• The standby database will look in a different
directory using the new parameters as a
“translator.”
Translation Complications
(continued)
Problem:
Files are typically not all in the same file system, but the
“translation” parameter can only translate from one directory to one
directory. How can files in the “other” directories be “fixed?”
[Diagram: with db_file_name_convert = ('/u01','/u02'), the parameter tells the standby to look in /u02 for files that were in /u01 on the primary. Files that were in /u03 on the primary now reside in /u04 on the standby, but the standby will still be looking in /u03 for them.]
Translation Complications
(continued)
Solution:
• On the (mounted) standby database, prior to beginning
recovery, manually correct the file names that are not
covered by the two init.ora parameters.
[Diagram: the parameter corrects the files moved from /u01 (primary) to /u02 (standby). Files moved from /u03 (primary) to /u04 (standby) must be manually renamed to '/u04…'.]
Translation Complications
(continued)
• For .dbf file, simply rename; for redo log, drop the
group, then add group back into desired directory:
.dbf file:
alter database rename file '/u03/user01.dbf'
to '/u04/user01.dbf';
redo log:
alter database drop logfile group 5;
alter database add logfile group 5
'/u05/redo05.log' size 20m;
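With many datafiles, the rename statements for the uncovered file systems can be generated. The sketch below assumes a simple prefix test decides whether db_file_name_convert covers a file; the function name, the `relocations` mapping, and the sample paths are hypothetical.

```python
def rename_statements(primary_paths, covered_prefix, relocations):
    """Build rename statements for files db_file_name_convert does not cover.

    relocations maps a directory prefix on the primary to its standby directory.
    """
    statements = []
    for path in primary_paths:
        if path.startswith(covered_prefix):
            continue  # already handled by db_file_name_convert
        for old_dir, new_dir in relocations.items():
            if path.startswith(old_dir):
                new_path = new_dir + path[len(old_dir):]
                statements.append(
                    f"alter database rename file '{path}' to '{new_path}';"
                )
                break
    return statements


paths = ["/u01/system01.dbf", "/u03/user01.dbf"]
for statement in rename_statements(paths, "/u01", {"/u03": "/u04"}):
    print(statement)
```

The output is meant to be run on the mounted standby before recovery begins; redo logs still need the drop and add treatment shown above.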
Adding Datafiles to Primary Database
• Adding a datafile to the primary database generates
redo that adds the datafile name only to the standby
control file; the datafile must still be explicitly added
to the standby database.
• The solution is simple, but not intuitive; so carefully
review and test these special cases.
• First, add datafile to primary database as usual.
• Then, switch redo logs on the primary database to
initiate redo archival to the standby database.
Adding Datafiles to Primary Database
(continued)
• Recovery on the standby database will stop because the
datafile does not exist. Standby alert log:
WARNING! Recovering datafile 2 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup
command. Successfully added datafile 2 …
• To resolve, create the datafile on the standby database:
alter database
create datafile '/u02/oradata/test.dbf'
as '/u02/oradata/test.dbf';
• Place the standby database in managed recovery mode:
recover managed standby database;
Other Tips & Tricks
• The documented method of connecting to standby appears
to be impossible.
Resolution: Just use the usual way to connect to an
idle instance:
connect internal or connect / as sysdba
• Ensure that the init.ora parameter JOB_QUEUE_PROCESSES is set to 0:
JOB_QUEUE_PROCESSES = 0
(This implies a conflict with the Advanced Replication
Option, which typically sets the parameter to 4. If the parameter is
non-zero, then the standby mode change from read-only back
to recovery will fail.)
Other Tips & Tricks
(continued)
• Finding which archive logs are needed to fill the “gaps”
seems to be unduly complicated. Upon starting
recovery, the standby database will request a particular
log, so why bother figuring it out?
• IPC network connection parameters for tnsnames.ora
file are “pickier” in 8i. Now the “key” value must
match on client and server. [Relevant only to where
primary and standby are on the same server]
Useful References
• Oracle Magazine, May/June 1999, “Implementing
an Automated Standby Database,” by Roby
Sherman.
• Oracle Corporation, Oracle 8i Standby Database
Concepts and Administration Release 2 (8.1.6)
Contact Information
Chris Lawson
clawson@dbspecialists.com
http://www.dbspecialists.com
Database Specialists, Inc.
388 Market Street, Suite 400
San Francisco, CA 94111