Structures Used for Database Recovery

Several structures of an Oracle database safeguard data against possible failures. This
section introduces each of these structures and its role in database recovery.
Database Backups
A database backup consists of backups of the physical files (all datafiles and a control
file) that constitute an Oracle database. To begin media recovery after a media failure,
Oracle uses file backups to restore damaged datafiles or control files. Replacing a current,
possibly damaged, copy of a datafile, tablespace, or database with a backup copy is called
restoring that portion of the database.
Oracle offers several options in performing database backups, including:

- Recovery Manager
- operating system utilities
- Export utility
- Enterprise Backup Utility
Additional Information:
See Oracle8i Backup and Recovery Guide.
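As a sketch of the first option above, a whole-database backup taken with Recovery Manager might look like the following. The channel name, format string, and output path are illustrative assumptions, not prescribed values:

```sql
-- Hypothetical RMAN session: back up all datafiles plus the current
-- control file to disk. Channel name "c1" and the format string
-- are example choices only.
run {
  allocate channel c1 type disk;
  backup database
    format '/backup/db_%U.bkp'
    include current controlfile;
  release channel c1;
}
```

A backup like this gives Recovery Manager the file backups it needs to restore damaged datafiles or control files after a media failure.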
The Redo Log
The redo log, present for every Oracle database, records all changes made in an Oracle
database. The redo log of a database consists of at least two redo log files that are
separate from the datafiles (which actually store a database's data). As part of database
recovery from an instance or media failure, Oracle applies the appropriate changes recorded in
the database's redo log to the datafiles, bringing the database's data up to the instant the
failure occurred.
A database's redo log can consist of two parts: the online redo log and the archived redo
log.
The Online Redo Log
Every Oracle database has an associated online redo log. The Oracle background process
LGWR uses the online redo log to immediately record all changes made through the
associated instance. The online redo log consists of two or more pre-allocated files that
are reused in a circular fashion to record ongoing database changes.
The Archived (Offline) Redo Log
Optionally, you can configure an Oracle database to archive files of the online redo log
once they fill. The online redo log files that are archived are uniquely identified and make
up the archived redo log. By archiving filled online redo log files, older redo log
information is preserved for operations such as media recovery, while the pre-allocated
online redo log files continue to be reused to store the most current database changes.
Datafiles that were restored from backup, or were not closed by a clean database
shutdown, may not be completely up to date. These datafiles must be updated by
applying the changes in the archived and/or online redo logs. This process is called
recovery.
See "Database Archiving Modes" for more information.
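Archiving is enabled by switching a mounted database into ARCHIVELOG mode. A minimal sketch follows; the archive destination path is a hypothetical example:

```sql
-- Sketch: enable archiving of filled online redo log files.
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

-- Automatic archiving by the ARCH process can be requested in the
-- initialization file with parameters such as (values illustrative):
--   LOG_ARCHIVE_START = true
--   LOG_ARCHIVE_DEST  = /oracle/arch
```

With archiving enabled, filled online redo log files are preserved for media recovery rather than being overwritten when the log groups are reused.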
Rollback Segments
Rollback segments are used for a number of functions in the operation of an Oracle
database. In general, the rollback segments of a database store the old values of data
changed by ongoing transactions (that is, uncommitted transactions).
Among other things, the information in a rollback segment is used during database
recovery to "undo" any "uncommitted" changes applied from the redo log to the datafiles.
Therefore, if database recovery is necessary, the data is in a consistent state after the
rollback segments are used to remove all uncommitted data from the datafiles.
Control Files
In general, the control file(s) of a database store the status of the physical structure of the
database. Certain status information in the control file (for example, the current online
redo log file, the names of the datafiles, and so on) guides Oracle during instance or
media recovery.
See "Control Files" for more information.
Rolling Forward and Rolling Back
Database buffers in the buffer cache in the SGA are written to disk only when necessary,
using a least-recently-used algorithm. Because of the way that the DBWn process uses
this algorithm to write database buffers to datafiles, datafiles might contain some data
blocks modified by uncommitted transactions and some data blocks missing changes
from committed transactions.
Two potential problems can result if an instance failure occurs:

- Data blocks modified by a transaction might not be written to the datafiles at
  commit time and might appear only in the redo log. Therefore, the redo log
  contains changes that must be reapplied to the database during recovery.
- After the roll forward phase, the datafiles may contain changes that had not been
  committed at the time of the failure. These uncommitted changes must be rolled
  back to ensure transactional consistency. These changes were either saved to the
  datafiles before the failure, or introduced during the roll forward phase.
To solve this dilemma, Oracle generally uses two separate steps to recover from a system
failure: rolling forward with the redo log (cache recovery) and rolling back with the
rollback segments (transaction recovery).
The Redo Log and Rolling Forward
The redo log is a set of operating system files that record all changes made to any
database buffer, including data, index, and rollback segments, whether the changes are
committed or uncommitted. Each redo entry is a group of change vectors describing a
single atomic change to the database. The redo log protects changes made to database
buffers in memory that have not been written to the datafiles.
The first step of recovery from an instance or disk failure is to roll forward, or reapply all
of the changes recorded in the redo log to the datafiles. Because rollback data is also
recorded in the redo log, rolling forward also regenerates the corresponding rollback
segments. This is called cache recovery.
Rolling forward proceeds through as many redo log files as necessary to bring the
database forward in time. Rolling forward usually includes online redo log files and may
include archived redo log files.
After roll forward, the data blocks contain all committed changes. They may also contain
uncommitted changes that were either saved to the datafiles before the failure, or were
recorded in the redo log and introduced during roll forward.
Rollback Segments and Rolling Back
Rollback segments record database actions that should be undone during certain database
operations. In database recovery, rollback segments undo the effects of uncommitted
transactions previously applied by the rolling forward phase.
After the roll forward, any changes that were not committed must be undone. After redo
log files have reapplied all changes made to the database, then the corresponding rollback
segments are used. Rollback segments are used to identify and undo transactions that
were never committed, yet were either saved to the datafiles before the failure, or were
applied to the database during the roll forward. This process is called rolling back or
transaction recovery.
Figure 32-1 illustrates rolling forward and rolling back, the two steps necessary to
recover from any type of system failure.
Figure 32-1 Basic Recovery Steps: Rolling Forward and Rolling Back
Oracle can roll back multiple transactions simultaneously as needed. All transactions
system-wide that were active at the time of failure are marked as DEAD. Instead of
waiting for SMON to roll back dead transactions, new transactions can recover blocking
transactions themselves to get the row locks they need.
Improving Recovery Performance
When a database failure occurs, rapid recovery is very important in most situations.
Oracle provides a number of methods to make recovery as quick as possible, including:

- Parallel Recovery
- Fast-Start Recovery
- Transparent Application Failover
Performing Recovery in Parallel
Recovery reapplies the changes generated by several concurrent processes, and therefore
instance or media recovery can take longer than the time it took to initially generate the
changes to a database. With serial recovery, a single process applies the changes in the
redo log files sequentially. Using parallel recovery, several processes simultaneously
apply changes from redo log files.
Attention:
Oracle8i provides limited parallelism with Recovery Manager; the
Oracle8i Enterprise Edition allows unlimited parallelism. See Getting
to Know Oracle8i for more information about the features available in
Oracle8i and Oracle8i Enterprise Edition.
Parallel recovery can be performed using several methods:

- Parallel recovery can be performed manually by spawning several Oracle
  Enterprise Manager sessions and issuing the RECOVER DATAFILE command
  on a different set of datafiles in each session. However, this method causes each
  Oracle Enterprise Manager session to read the entire redo log file.
- You can use Recovery Manager's RESTORE and RECOVER commands to
  automatically parallelize all stages of recovery. Oracle uses one process to read
  the log files sequentially and dispatch redo information to several recovery
  processes, which apply the changes from the log files to the datafiles. The
  recovery processes are started automatically by Oracle, so there is no need to use
  more than one session to perform recovery. There are also some initialization
  parameters to set for automatic parallel recovery. Refer to the Oracle8i Parallel
  Server Setup and Configuration Guide for details.
- You can use the SQL*Plus RECOVER command to perform parallel recovery.
  Refer to the SQL*Plus User's Guide and Reference for details.
- You can use the SQL command ALTER DATABASE RECOVER to perform
  parallel recovery, but this is not recommended.
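For instance, the SQL*Plus RECOVER command accepts a parallel clause; the following sketch uses an illustrative degree of parallelism, not a recommendation:

```sql
-- Sketch: media recovery of the whole database with four parallel
-- recovery processes. The degree of 4 is an example value.
RECOVER DATABASE PARALLEL (DEGREE 4);

-- Crash and instance recovery parallelism can also be influenced via
-- an initialization parameter, for example:
--   RECOVERY_PARALLELISM = 4
```

The appropriate degree depends mainly on how many disk drives hold the datafiles being recovered, as discussed in the next section.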
Situations That Benefit from Parallel Recovery
In general, parallel recovery is most effective at reducing recovery time when several
datafiles on several different disks are being recovered concurrently. Crash recovery
(recovery after instance failure) and media recovery of many datafiles on many different
disk drives are good candidates for parallel recovery.
The performance improvement from parallel recovery is also dependent upon whether the
operating system supports asynchronous I/O. If asynchronous I/O is not supported,
parallel recovery can dramatically reduce recovery time. If asynchronous I/O is
supported, the recovery time may only be slightly reduced by using parallel recovery.
Additional Information:
See your operating system documentation to determine whether the
system supports asynchronous I/O.
Recovery Processes
In a typical parallel recovery situation, one process is responsible for reading and
dispatching redo entries from the redo log files. This is the dedicated server process that
begins the recovery session. The server process reading the redo log files enlists two or
more recovery processes to apply the changes from the redo entries to the datafiles.
Figure 32-2 illustrates a typical parallel recovery session.
Figure 32-2 Typical Parallel Recovery Session
In most situations, one recovery session and one or two recovery processes per disk drive
containing datafiles needing recovery is sufficient. Recovery is a disk-intensive activity
as opposed to a CPU-intensive activity, and therefore the number of recovery processes
needed is dependent entirely upon how many disk drives are involved in recovery. In
general, a minimum of eight recovery processes is needed before parallel recovery can
show improvement over a serial recovery.
Fast-Start Recovery
Fast-Start Recovery is an architecture that reduces the time required for rolling forward
and makes the recovery bounded and predictable. It also eliminates rollback time from
recovery for transactions aborted due to system faults. Fast-Start Recovery includes:

- Fast-Start Checkpointing
- Fast-Start On-Demand Rollback
- Fast-Start Parallel Rollback
Fast-Start Checkpointing
Fast-Start Checkpointing records the position in the redo thread (log) from which crash or
instance recovery would need to begin. This position is determined by the oldest dirty
buffer in the buffer cache. Each DBWn process continually writes buffers to disk to
advance the checkpoint position, with minimal or no overhead during normal processing.
Fast-Start Checkpointing improves the performance of crash and instance recovery, but
not media recovery.
You can influence recovery performance for situations where there are stringent
limitations on the duration of crash or instance recovery. The time required for crash or
instance recovery is roughly proportional to the number of data blocks that need to be
read or written during the roll forward phase. You can specify a limit, or bound, on the
number of data blocks that will need to be processed during roll forward. The Oracle
server automatically adjusts the checkpoint write rate to meet the specified roll-forward
bound while issuing the minimum number of writes.
You can set the dynamic initialization parameter FAST_START_IO_TARGET to limit
the number of blocks that need to be read for crash or instance recovery. Smaller values
of this parameter impose higher overhead during normal processing because more buffers
have to be written. On the other hand, the smaller the value of this parameter, the better
the recovery performance, since fewer blocks need to be recovered. The dynamic
initialization parameters LOG_CHECKPOINT_INTERVAL and
LOG_CHECKPOINT_TIMEOUT also influence Fast-Start Checkpointing.
Additional Information:
See Oracle8i Tuning for information about how to set the value of
FAST_START_IO_TARGET, and see Oracle8i Backup and Recovery
Guide for a detailed description of checkpoints.
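An initialization-file sketch combining the parameters named above might look as follows; the specific values are illustrative only, not tuning advice:

```text
# init.ora sketch: bound the roll-forward work for crash or instance
# recovery. All values below are illustrative examples.
FAST_START_IO_TARGET    = 10000    # limit on blocks to process at recovery
LOG_CHECKPOINT_INTERVAL = 100000   # redo blocks between checkpoints
LOG_CHECKPOINT_TIMEOUT  = 1800     # seconds since the last checkpoint
```

Because FAST_START_IO_TARGET is dynamic, it can also be adjusted on a running instance; smaller values trade higher checkpoint-write overhead during normal processing for faster recovery.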
Fast-Start On-Demand Rollback
When a dead transaction holds a row lock on a row that another transaction needs,
Fast-Start On-Demand Rollback immediately recovers only the data block under
consideration, leaving the rest of the dead transaction to be recovered in the background.
This improves the availability of the database for users accessing data that is locked by
large dead transactions. If Fast-Start Rollback is not enabled, the user would have to wait
until the entire dead transaction was recovered before obtaining the row lock.
Fast-Start Parallel Rollback
Fast-Start Parallel Rollback allows a set of transactions to be recovered in parallel using a
group of server processes. This technique is used when SMON determines that the time it
would take to recover the transactions in parallel is less than the time it would take to
recover them serially.
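The degree of parallel rollback SMON may use is governed by the FAST_START_PARALLEL_ROLLBACK initialization parameter; the sketch below assumes the parameter accepts the values FALSE, LOW, and HIGH, so verify against your release:

```sql
-- Sketch: allow SMON to use more parallel rollback server processes.
-- HIGH is shown purely as an example setting.
ALTER SYSTEM SET FAST_START_PARALLEL_ROLLBACK = HIGH;
```

Setting the parameter to FALSE would disable parallel rollback and force serial transaction recovery.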
Masking Failures with Transparent Application Failover
Rapid recovery minimizes the time data is unavailable to users, but it does not address the
disruption caused when user sessions fail. Users need to re-establish connections to the
database, and work in progress may be lost. Oracle8i Transparent Application Failover
(TAF) can mask many failures from users, preserving the state of their applications and
resuming queries that had been in progress at the time of the failure. Developers can
further extend these capabilities by building applications that leverage TAF and make all
failures, including those affecting transactions, transparent to users.
Additional Information:
See Oracle8i Tuning for more information about Transparent
Application Failover.
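On the client side, TAF is typically configured through the FAILOVER_MODE clause of a Net8 connect descriptor. In this sketch the host, port, and service names are hypothetical:

```text
# tnsnames.ora sketch: TYPE=SELECT lets in-progress queries resume
# after failover; METHOD=BASIC connects to the backup at failover time.
SALES.example.com =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = db1.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = sales.example.com)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC))
    )
  )
```

With a descriptor like this, a session whose instance fails can be reconnected and its open queries resumed without application-level intervention.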
Recovery Manager
Recovery Manager is a utility that manages the processes of creating backups of all
database files (datafiles, control files, and archived redo log files) and restoring or
recovering files from backups.
Additional Information:
See the Oracle8i Backup and Recovery Guide for a full description of
Recovery Manager.
Recovery Catalog
Recovery Manager maintains a repository called the recovery catalog, which contains
information about backup files and archived log files. Recovery Manager uses the
recovery catalog to automate both restore operations and media recovery.
The recovery catalog contains:

- information about backups of datafiles and archive logs
- information about datafile copies
- information about archived redo logs and copies of them
- information about the physical schema of the target database
- named sequences of commands called stored scripts
The recovery catalog is maintained solely by Recovery Manager. The database server of
the backed-up database never accesses the recovery catalog directly. Recovery Manager
propagates information about backup datafile sets, archived redo logs, backup control
files, and datafile copies into the recovery catalog for long-term retention.
When doing a restore, Recovery Manager extracts the appropriate information from the
recovery catalog and passes it to the database server. The server performs various
integrity checks on the input files specified for a restore. Incorrect behavior by Recovery
Manager cannot corrupt the database.
The Recovery Catalog Database
The recovery catalog is stored in an Oracle database. It is the database administrator's
responsibility to make such a database available to Recovery Manager. Taking backups
of the recovery catalog is also the database administrator's responsibility. Since the
recovery catalog is stored in an Oracle database, you can use Recovery Manager to back
it up.
If the recovery catalog is destroyed and no backups are available, then it can be partially
reconstructed from the current control file or control file backups.
Operation Without a Recovery Catalog
Use of a recovery catalog is not required, but is recommended. Since most information in
the recovery catalog is also available from the control file, Recovery Manager supports
an operational mode where it uses only the control file. This operational mode is
appropriate for small databases where installation and administration of another database
to serve as the recovery catalog would be burdensome.
Some Recovery Manager features are only available when a recovery catalog is used.
Additional Information:
See the Oracle8i Backup and Recovery Guide for information about
creating the recovery catalog, and about which Recovery Manager
features require use of a recovery catalog.
Parallelization
Recovery Manager can parallelize its operations, establishing multiple logon sessions and
conducting multiple operations in parallel by using non-blocking UPI. Concurrent
operations must operate on disjoint sets of datafiles.
Attention:
The Oracle8i Enterprise Edition allows unlimited parallelism. Oracle8i
can only allocate one Recovery Manager channel at a time, thus
limiting the parallelism to one stream. See Getting to Know Oracle8i
for more information about the features available with Oracle8i and
Oracle8i Enterprise Edition.
Parallelization of the backup, copy, and restore commands is handled internally by the
Recovery Manager. You only need to specify:

- a list of one or more sequential I/O devices
- the objects to be backed up, copied, or restored
Recovery Manager executes commands serially, that is, it completes the previous
command before starting the next command. Parallelism is exploited only within the
context of a single command. Thus, if 10 datafile copies are desired, it is better to issue a
single copy command that specifies all 10 copies rather than 10 separate copy
commands.
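For example, allocating several channels and naming all the files in a single copy command lets Recovery Manager spread the work across the channels. Channel names, file numbers, and paths below are hypothetical:

```sql
-- Sketch: one copy command over multiple allocated channels, so the
-- datafile copies can proceed in parallel on disjoint datafiles.
run {
  allocate channel c1 type disk;
  allocate channel c2 type disk;
  copy datafile 1 to '/copies/df1.cpy',
       datafile 2 to '/copies/df2.cpy',
       datafile 3 to '/copies/df3.cpy';
}
```

Issuing three separate copy commands instead would serialize the copies, since parallelism is exploited only within a single command.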
Report Generation
The report and list commands provide information about backups and image copies. The
output from these commands is written to the message log file.
The report command produces reports that can answer questions such as:

- what files need a backup?
- what files haven't had a backup in a while?
- what backup files can be deleted?
You can use the report need backup and report unrecoverable commands on a regular
basis to ensure that the necessary backups are available to perform recovery, and that the
recovery can be performed within a reasonable length of time. The report deletable
command lists backup sets and datafile copies that can be deleted either because they are
redundant or because they could never be used by a recover command.
A datafile is considered unrecoverable if an unlogged operation has been performed
against a schema object residing in the datafile.
(A datafile that does not have a backup is not considered unrecoverable. Such datafiles
can be recovered through the use of the create datafile command, provided that logs
starting from when the file was created still exist.)
The list command queries the recovery catalog and produces a listing of its contents. You
can use it to find out what backups or copies are available:

- backups or copies of a specified list of datafiles
- backups or copies of any datafile that is a member of a specified list of tablespaces
- backups or copies of any archive logs with a specified name and/or within a specified range
- incarnations of a specified database
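Illustrative invocations of these commands follow; the 7-day window is an assumption, and the exact option syntax may vary slightly between Recovery Manager releases:

```sql
-- Sketches of Recovery Manager reporting and listing commands.
report need backup days 7 database;   -- files lacking a recent backup
report unrecoverable database;        -- files hit by unlogged operations
report deletable;                     -- backups and copies safe to delete
list backup of database;              -- catalog listing of database backups
```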