Leveraging EMC CLARiiON Storage Replication to Offload Oracle Recovery Manager (RMAN) Backup
Applied Technology
Abstract
Oracle Recovery Manager (RMAN) incremental backup allows very large databases to be backed up online
very efficiently. However, running RMAN alongside production activities can impact production service
levels. One method to offload RMAN backups, including fast incremental backups with Block Change
Tracking (BCT), is to leverage a storage system’s rapid point-in-time replication technology. This white
paper covers the procedure tested by EMC® CLARiiON® engineering to achieve this.
August 2008
Copyright © 2008 EMC Corporation. All rights reserved.
EMC believes the information in this publication is accurate as of its publication date. The information is
subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE
INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable
software license.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com
All other trademarks used herein are the property of their respective owners.
Part Number H5681
Leveraging EMC CLARiiON Storage Replication to Offload Oracle RMAN Backup
Applied Technology
Table of Contents
Executive summary
Introduction
  Audience
  Terminology
Overview
The RMAN offload process
Offload procedure testing
  Production host environment
    Logical Oracle data storage layout on the production host
  RMAN backup offload server host environment
    Logical Oracle data storage layout on the backup host
  Testing focus
  Test workload
  Test procedure
    Initial production database setup
    Initial backup database setup
    Enable BCT tracking
  Test Phase 1: Establish baseline
  Test Phase 2: Exercise process to perform offloaded RMAN incremental backup
    Begin hot backup
    Perform a clone split for database files
    End hot backup
    Switch logs
    Execute Block Change Tracking Switch
    Create control file copies
    Perform a clone split for the redo and archive clone group
    Resynchronize the RMAN catalog
    Leverage split clones to perform offloaded incremental backup
    Start the ASM instance
    Mount the database instance
    Back up the database instance
  Test Phase 3: Validating correct restore on the production host from offloaded incremental backups
    Restore procedure
    Verify correct restore/recovery
  Test Phase 4: Analysis of offloading process effectiveness
Testing observations/findings
  Verifying that BCT driven incremental backup is offloaded
  Verification procedure
Conclusion
References
Executive summary
Oracle Recovery Manager (RMAN) is a comprehensive and powerful server-managed tool for backing up
Oracle databases online, and managing database recovery when needed. As databases continue to grow
larger in size, incremental backups, where only database changes since the last successfully completed
backup are actually captured, are becoming increasingly popular. The ability to perform incremental
backups has been steadily enhanced beginning with Oracle 9i through Oracle 11g. However, even with
incremental backups, a backup operation can still be quite time consuming, with production performance
impacted for the duration that RMAN backup is running. Hence, it would still be advantageous to be able
to offload the RMAN backup overhead from the production database host. The offloading of the backup
process can be achieved by leveraging a split mirror of the database, created by the underlying
storage-based replication technology of EMC® CLARiiON® systems. The offloaded backups can be cataloged and
maintained by RMAN to achieve the most effective database recovery.
This white paper covers the general approach of how to utilize EMC CLARiiON replication technology to
offload the RMAN backup process, and the associated benefits that can be expected. In particular, the
paper covers how the process can be extended to incorporate the use of the Block Change Tracking feature
introduced in Oracle Database 10g to significantly enhance the speed for performing RMAN incremental
backups.
Introduction
Since its introduction in Oracle 8, the Recovery Manager (RMAN) utility has been continually enhanced
through Oracle Database 9i, 10g, and 11g.
Generally, RMAN runs alongside the production database service, competing with other workloads on the
same database. This can both prolong the time required to complete the backup task and degrade
performance for other database users.
By utilizing CLARiiON storage-based replication technology with RMAN online backups, a point-in-time
snapshot of the production database content can be quickly captured and mounted to a different server,
leveraging a secondary Oracle database instance where the actual RMAN backup operations can be
performed.
The process of offloading the RMAN backup operation to a split mirror created by storage-based
replication has been covered in past joint EMC/Oracle white papers. Those papers are listed in the
“References” section.
This paper specifically focuses on offloading incremental backups, which is a refinement of the general
offloading process. In Oracle Database 10g, a new Block Change Tracking (BCT) mechanism was
added as an administrator-enabled database operational mode.
With BCT enabled, all database blocks changed since the last RMAN backup will be tracked. When a new
incremental backup is performed, RMAN will be able to selectively back up only those database blocks
marked as changed since the last backup. This can significantly reduce the time needed to scan a large
database looking for changed blocks.
While the focus is on offloading the incremental RMAN backup in conjunction with the ability to leverage
BCT, the general methodology of offloading is generically applicable for all types of RMAN backups,
including the subsequent registering of the backup file in the RMAN catalog, to be used on the production
instance should an RMAN managed database restore/recovery become necessary.
Audience
This white paper covers the procedural steps that can be followed to implement the operational process as
stated. EMC field personnel supporting customers with Oracle deployments, database administrators, and
storage administrators responsible for supporting their own operations involving Oracle database
application deployments can evaluate if the procedure as discussed in this paper would be suitable as an
extension or enhancement to their current operational practice.
Terminology
ASM — Automatic Storage Management, a logical volume management mechanism for holding and
managing Oracle data.
ASM diskgroup — A logical volume group consisting of OS-defined disk devices. This diskgroup is
used to house database files.
BCT — Block Change Tracking file, an enhancement first introduced in Oracle Database 10g that can be
optionally enabled to improve efficiency for performing incremental backups using RMAN.
Clones — See SV clone below.
RMAN — Recovery Manager, Oracle’s primary facility for managing database backups and recoveries.
SV — SnapView™, CLARiiON array application software that allows users to create/manage/manipulate
bitwise replicas of LUNs managed by the storage system quickly, irrespective of the actual size of the
source LUN data involved.
SV clone — A LUN maintained by the SnapView software application running inside the CLARiiON
system to form a bitwise content reflection (mirror) of another LUN, known as the source LUN, that may
be used and changed actively by applications from servers attached to and using that particular source
LUN.
Overview
This white paper assumes that the readers are already familiar with the use of Oracle Recovery Manager
(RMAN), including the concept of enabling and using Block Change Tracking (BCT) to improve the
efficiency of performing incremental backup with RMAN. Readers needing more details on the specifics
of RMAN should refer to the Oracle Recovery Manager Concept and Recovery Manager Administration
Guide, as well as other related material that can be found at:
http://www.oracle.com/technology/documentation/index.html.
For an RMAN backup, the database itself does not have to be explicitly opened. It can be in a mounted state.
RMAN backups can also be performed directly against an opened database being actively accessed and
modified by concurrent application usage.
However, there is a potential performance impact when running RMAN against a database with heavy
concurrent user activity. RMAN competes for server CPU, I/O, and memory resources against
the foreground user application processes. This both lengthens the RMAN task, so that the backup takes
longer to complete, and extends the period during which the normal production service level is
compromised.
By leveraging storage managed “fast” replication techniques, such as CLARiiON SnapView support, it
becomes viable to leverage a point-in-time storage replica of the database, redirecting it to a separate server
(and database instance), where the RMAN backup can be done as an isolated activity. This avoids the
production user transaction service level impact for the purpose of conducting the RMAN backup activity,
as well as expediting the actual time to perform the backup, by offloading the backup process to take
advantage of additional server and storage resources.
As database sizes continue to grow, disk-based backups, and more frequent use of incremental backups as
opposed to full backups, become key trends.
Block Change Tracking (BCT), which improves the efficiency of RMAN incremental backup, was first
introduced in Oracle Database 10g. With the growing acceptance for disk-based backup through the Oracle
managed Flash Recovery Area, the use of the BCT feature to facilitate incremental backup (especially to
the disk-based backup areas) is also becoming increasingly popular.
If RMAN is run directly against the production database, at the point the RMAN activity is invoked, the
database engine will freeze the current BCT map, and switch to a new BCT map to track further changes
that may be occurring while the current RMAN incremental backup is being done. The switched off map is
then used to drive RMAN to pick up the correct set of database pages for the purpose of creating the
incremental backup set with pages changed since the last completed RMAN backup of the database.
To leverage storage-based replication support to offload the RMAN process, it is necessary, as part of the
offloading procedure, to ensure that the BCTs are switched appropriately in the production instance before
we leverage the storage replica for offloading, since we will not be running RMAN backup directly against
the production instance. Oracle provides an explicit administration function that can be called to achieve
this. The offload procedure has to include this BCT map switch call against the production system prior to
the storage split for offloading purpose. By explicitly switching the tracking map prior to splitting the
storage images and offloading the RMAN backup at midnight on Monday, for example, all new changes
during Monday against the production will be reflected on the new map. So, when we get to midnight on
Tuesday, and are ready to repeat the offload process, the split-off storage replica will correctly include the
tracking map segment reflecting the block changes between Monday and Tuesday midnight.
This paper outlines the procedural steps that have been tested and verified as functioning correctly in the
CLARiiON engineering labs. Specifically, the process as reported in the following sections includes the
steps to perform the proper BCT map switching, and the use of the BCT map to drive the offloaded RMAN
process to perform the efficient incremental backup.
As the main purpose of running Oracle Managed Backup using RMAN is the relative ease and efficiency in
restoring and recovering the production database in the event of a failure, the testing conducted to support
this paper included the verification of database restore/recovery from the backups generated from the
offload process.
This paper also covers the relative effectiveness comparison of having the RMAN task offloaded as
opposed to running the same backup directly against the production database while it is actively supporting
database user activity. A relative comparison of the offloaded incremental backup generation with and
without relying on the BCT option is also included.
The RMAN offload process
Figure 1 is a logical flow diagram of the procedural steps involved in the offload process. The process flow
assumes that the production database is never shut down and therefore remains available throughout the
process.
The procedural steps are exercised in the order as numbered. The actual details of DBA SQL and OS or
storage commands used are described in the following sections.
[Figure 1 (diagram, steps numbered 1 through 11). Labels on the production Oracle database instance side: begin DB hot backup; CLARiiON DB clone split (production database to database clone); end DB hot backup; archive current log; switch BCT; two copies of backup controlfiles; split FRA clone (Flash Recovery Area to FRA cloned area). Labels on the offload backup Oracle database side: ASM instance start, mount DB clone, FRA, ARCHIVE; mount cloned database with FRA, cloned archived logs; run RMAN incremental backup to BACKUP diskgroup (BACKUP file-set diskgroup); catalog backup in RMAN catalog.]
Figure 1. Logical flow of RMAN incremental backup offload process
Offload procedure testing
The following sections detail the environment under which the engineering testing of the procedure was
conducted, and the specific samples of Oracle database administration commands (SQL commands),
CLARiiON storage management CLI commands, RMAN commands, ASM commands, and other OS
commands used to implement and exercise the process. Most of the commands are generic across OS platforms.
The OS platform-specific commands can be adapted for the actual deployment platform as appropriate.
Production host environment
• Dell 2650 servers with:
  - 2 x 3.2 GHz Xeon CPUs with hyper-threading
  - 1 GB L1 cache
  - 8 GB RAM
• RHEL 4.0 Update 4 (2.6.9-42) kernel for IA32 architecture
• Dual port QLA2462 4 Gb FC HBA (both ports used for HA support)
• Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
• EMC PowerPath® 5.0 b157 for RHEL 4.0 Linux for IA32 systems
• EMC CLARiiON CX3-80 with:
  - FLARE® pre-release builds of 26, build bundle 03.26.080.5.005
  - 146 GB FC drives at 15k rpm
Logical Oracle data storage layout on the production host
ASM diskgroup name   CX3 LUN name   RAID type and size   Attributes
RMDATA               RMDATA1        4+1R5, 50 GB         DB data source
RMDATA               RMDATA2        4+1R5, 50 GB         DB data source
RMDATA               RMDATA3        4+1R5, 50 GB         DB data source
RMBACKUP             RMBACK1        4+1R5, 150 GB        Backup files store
RMFLASH              RMFLASH1       4+1R5, 50 GB         Flash_recovery_area
RMREDO               RMREDO1        4+1R5, 10 GB         Redo log area
RMREDO               RMREDO2        4+1R5, 10 GB         Redo log area
RMCAT                RMCAT1         4+1R5, 150 GB        RMAN catalog database
Note that a separate disk group, RMBACKUP, is dedicated to holding the backup files, as opposed to
having the backup file sets created by default in the Flash Recovery Area. This is necessary because
the Oracle instance on the backup server that performs the offloaded RMAN backup task is a distinct
instance, not part of a clustered service with the production instance. ASM only allows a disk group
to be mounted concurrently by database instances that are cooperating as part of a cluster. When the
two servers and their ASM instances are not configured as part of a cluster, the ASM instances on the
different servers prevent simultaneous mounting of the same set of storage as an ASM disk group.
So, if the production instance is using the RMFLASH group for its activities, such as creating archived
logs, flashback logs, and so on, it would not be possible for the backup instance to try to mount up the same
set of storage as the Flash_Recovery_Area disk group (RMFLASH), and to write backup file sets into it as
part of the RMAN activity.
By isolating the backup file sets in a separate ASM disk group, this group can be mounted alternately
on either the backup server (when the RMAN backup offloading is performed) or the
production server (when it becomes necessary to perform a restore/recovery action there).
The backup file sets are stored within this disk group, and the file path names are
recorded accordingly in the RMAN catalog.
RMAN backup offload server host environment
• Dell 2650 servers with:
  - 2 x 3.2 GHz Xeon CPUs with hyper-threading
  - 1 GB L1 cache
  - 8 GB RAM
• RHEL 4.0 Update 4 (2.6.9-42) kernel for IA32 architecture
• Dual port QLA2462 4 Gb FC HBA (both ports used for HA support)
• Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
• EMC PowerPath 5.0 b157 for RHEL 4.0 Linux for IA32 systems
• Shared (with production) EMC CLARiiON CX3-80 with:
  - FLARE pre-release builds of 26, build bundle 03.26.080.5.005
  - 146 GB FC drives at 15k rpm
Logical Oracle data storage layout on the backup host
ASM diskgroup name   CX3 LUN name      RAID type and size   Attributes
RMDATA               RMDATA1_clone     4+1R5, 50 GB         DB data clone
RMDATA               RMDATA2_clone     4+1R5, 50 GB         DB data clone
RMDATA               RMDATA3_clone     4+1R5, 50 GB         DB data clone
RMBACKUP             (RMBACK1)         (4+1R5, 150 GB)      Backup file set store
RMFLASH              RMFLASH1_clone    4+1R5, 50 GB         Flash Recovery Area clone
RMREDO               RMREDO1_clone     4+1R5, 10 GB         Redo log area clone
RMREDO               RMREDO2_clone     4+1R5, 10 GB         Redo log area clone
Note that the backup instance uses the same ASM diskgroup RMBACKUP from the same physical storage
as the production instance. Theoretically, an alternative approach is to create a separate storage clone for
the RMBACKUP group, mount the clone to the backup instance, let the RMAN backup write the backup
file set into the storage clone, and then leverage storage clone reverse synchronization to update the
RMBACKUP production disk group content. But to do this reliably and safely, the production source for
RMBACKUP and the storage clone on the backup server still would likely need to be closed and
unmounted from both servers. In that case, it is simpler just to close and unmount the RMBACKUP group
from the host (production or backup) that is not currently using it, and mount it on the server that
actually needs to read from or write into the RMBACKUP group when the need arises.
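The hand-off of RMBACKUP between the two hosts can be sketched as follows. This is a minimal illustration under stated assumptions, not the script used in the engineering tests: asm_diskgroup_sql is a hypothetical helper that only prints the ASM statements, which would then be run in the ASM instance on the relevant host (assuming ORACLE_SID points at the ASM instance and OS authentication as SYSDBA is available).

```shell
#!/bin/sh
# Hypothetical helper (not part of Oracle or EMC tooling): print the ASM SQL
# to mount or dismount a disk group, followed by a state check.
asm_diskgroup_sql() {
    action=$1
    group=$2
    case "$action" in
        mount)    verb=MOUNT ;;
        dismount) verb=DISMOUNT ;;
        *) echo "usage: asm_diskgroup_sql mount|dismount GROUP" >&2
           return 1 ;;
    esac
    [ -n "$group" ] || { echo "missing disk group name" >&2; return 1; }
    printf 'ALTER DISKGROUP %s %s;\n' "$group" "$verb"
    # V$ASM_DISKGROUP reports MOUNTED or DISMOUNTED in its STATE column.
    printf '%s\n' 'SELECT name, state FROM v$asm_diskgroup;'
}

# Assumed usage, run against the ASM instance on each host in turn:
#   asm_diskgroup_sql dismount RMBACKUP | sqlplus -S "/ as sysdba"   # host releasing the group
#   asm_diskgroup_sql mount RMBACKUP    | sqlplus -S "/ as sysdba"   # host taking it over
```

Printing the statements rather than executing them keeps the sketch free of side effects; the dismount on one host has to complete before the mount is attempted on the other.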
Testing focus
The primary testing focus was to validate the correctness of the offload procedure. In particular, a key
goal was to confirm that BCT can still be leveraged for the most effective incremental backup when the
RMAN task is run against the offloaded storage replica of the database.
Correctness was deemed satisfied when a subsequent RMAN database recovery task could be run directly
against the production database, using the backup file sets created through the offload process, and
the restored and recovered database was verified to contain the expected data
content. The views were also examined to ensure that incremental backups were actually occurring.
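The text above does not name the views that were examined. As one possibility, V$BACKUP_DATAFILE records, for each datafile backup, the incremental level, whether change tracking was used, and how many blocks were read. The sketch below prints such a query; bct_usage_sql is a hypothetical helper name, and the choice of columns is an assumption about what is useful to inspect, not the authors' verification script.

```shell
#!/bin/sh
# Hypothetical helper: print a query showing, per datafile backup, whether the
# incremental backup used change tracking and what share of blocks was read.
bct_usage_sql() {
    cat <<'EOF'
SELECT file#,
       incremental_level,
       used_change_tracking,
       blocks_read,
       datafile_blocks,
       ROUND(blocks_read / datafile_blocks * 100, 2) AS pct_read
FROM   v$backup_datafile
WHERE  incremental_level > 0
AND    file# > 0
AND    datafile_blocks > 0
ORDER  BY completion_time;
EOF
}

# Assumed usage on the instance that ran the backup:
#   bct_usage_sql | sqlplus -S "/ as sysdba"
```

A value of YES in USED_CHANGE_TRACKING together with a low pct_read would suggest that the incremental backup was driven by the BCT map rather than by a full scan of the datafile.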
Since the main purpose of going through the offloading process is to minimize the impact on ongoing
production service while performing the necessary operational backups, an auxiliary goal was to compare
the performance impact on the production test workload of running the RMAN backup
task directly against the production database with that of the offloading process.
Also, the times taken to execute the same types of RMAN backups, including BCT-enabled incremental
backups, directly on the production system and on the offloaded system were
collected and compared.
Test workload
A TPC-C like OLTP workload was used as the test workload to drive continual database changes into the
“production” database in conjunction with the attempt to run the RMAN backup, either directly from the
production host, or with the RMAN task offloaded to the backup host.
Test procedure
The following are the step-by-step details of the tests conducted to validate the offloading process.
Initial production database setup
Five distinct ASM diskgroups were created from the LUNs provisioned from the CX3-80 system.
In this test scenario, the RMDATA group holds the production database with about 25 GB of operational
data.
The RMBACKUP group is configured to be used to keep all the database RMAN backup sets created.
Normally, this group is not mounted on the production instance. It is only mounted manually when there is
a need to access any backup file sets from this disk group in order to perform restore/recovery of lost
database files.
The RMREDO group holds the redo logs.
The RMFLASH group is used as the FLASH_RECOVERY_AREA. Archived logs are kept there, in addition to
control file backups and the BCT files.
The RMCAT group is used to hold the RMAN catalog, which records what has been backed up.
After the database was populated with the test workload data, an initial full database backup was captured
by running an RMAN full database backup. The full backup was cataloged in the RMAN catalog.
Clones for the LUNs making up the RMDATA, RMREDO, and RMFLASH groups were initially synchronized in the
CX3-80 in preparation for conducting the rest of the test steps.
Initial backup database setup
Three clones of the LUNs that form the RMDATA production data group were exposed to the server used
to perform the backup offloading. The clones of the RMREDO group, the RMFLASH group, and the
source LUN forming the RMBACKUP group from the production database were also visible on the backup
server. This set of clones provides a point-in-time storage image of the production LUNs that will be used
to support the offloading of the backup functions.
After the clones had been fully content synchronized with the production source LUN content, the clones
were storage split from their source LUNs. Once clones were split off from their production source LUNs,
they became storage LUNs on the backup server.
When the ASM instance on the backup server was started up, ASM correctly identified and mounted up the
following diskgroups: RMDATA (from the data clones), RMREDO (from the redo LUN clone), and
RMFLASH (clones of the production Flash_Recovery_Area, which also holds the archived logs and other
files required, such as copies of backup controlfiles and the BCT file itself). In our ASM instance
configuration, the ASM instance will try to automatically mount up the RMDATA, RMREDO, and
RMFLASH groups upon startup. However, RMBACKUP is excluded from the list of ASM groups to be
automatically mounted, as we have to manually coordinate the use of this group between the production
and backup servers, each needing to mount the group from the same set of storage exclusively when the use
of the group is required. The backup server uses the RMAN catalog directly from the production server;
this is done by adding the catalog address to the backup server's tnsnames.ora file:
CATDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ProductionhostAddress)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = catdb.ProductionhostAddress)
    )
  )
The clone of the production RMFLASH is leveraged to provide access to the needed backup control file
copies, the BCT map files, and the archived logs.
Flashback logging may have been enabled on production, in which case flashback logs will be present in
the RMFLASH clones. The backup instance is activated mainly to support the running of the
RMAN task, so flashback logging is neither enabled nor used there. Flashback Database should
not be performed on the backup instance after the database has been mounted on it.
When necessary (such as when we have to perform a direct restore on the production server), the
RMBACKUP group would be manually remounted on the production server (after unmounting it from the
backup server). Since the ASM path and file name are correctly cataloged by RMAN from the backup, the
needed files can therefore be correctly restored directly from the RMBACKUP group once it is remounted
on the production server, allowing the restore function to be performed correctly.
Enable BCT tracking
Enable BCT for the production database by executing the following commands:
# export ORACLE_SID=oastoltp
# sqlplus /nolog
SQL> connect / as sysdba
SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE
  '+RMFLASH/rman_change_track.f' REUSE;
The REUSE option tells Oracle to overwrite any existing file with the specified name. To determine whether
change tracking is enabled, you can query V$BLOCK_CHANGE_TRACKING.STATUS.
As the above command shows, the tracking file is created in the RMFLASH disk group.
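The status check mentioned above can be sketched as a small shell helper that prints the query for sqlplus. The helper name bct_status_sql is hypothetical, and the usage line assumes OS authentication as SYSDBA on the production host.

```shell
#!/bin/sh
# Hypothetical helper: print the SQL that reports whether Block Change
# Tracking is enabled, and which tracking file is in use.
bct_status_sql() {
    cat <<'EOF'
SELECT status, filename, bytes FROM v$block_change_tracking;
EXIT
EOF
}

# Assumed usage on the production host:
#   bct_status_sql | sqlplus -S "/ as sysdba"
```

With tracking enabled as shown earlier, the STATUS column should read ENABLED and FILENAME should point into the RMFLASH disk group.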
With BCT tracking enabled, RMAN incremental backup leverages the map to optimize the work required,
accessing and backing up only the database pages that have been changed by committed transactions since
the last successful database backup point.
Note that every time Block Change Tracking is disabled, and then re-enabled, a completely new set of
maps will be established. To be able to reap the full benefit of being able to minimize the number of pages
being scanned to support ongoing RMAN incremental backup runs, BCT should be left enabled for the
production database unless the added overhead so adversely impacts the production service level that it is
unacceptable to do so.
Test Phase 1: Establish baseline
The test phases began with establishing a baseline for performing a direct incremental backup against the
test database that was being updated at a steady rate by an OLTP workload.
The key metrics collected in this phase included:
• The observed workload transaction throughput while the incremental backup was executing
• The time taken to complete that particular incremental backup
For our testing, an OLTP test workload developed by the Oracle internal stress testing QA group called the
Oracle Automated Stress Test (OAST) was used.
An OAST test session was started up to execute for 60 minutes, with the database already running with
BCT enabled. After 15 minutes into the execution of OAST, a direct RMAN incremental backup on the
production instance was performed. The test steps executed were as follows:
Go to the $OAST_HOME (/opt/app/oracle/db_1/oast/home/) location.
$ ./nrunoastoltp50.sh -n OLTP -u 150 -t 3600
where the options are:
-n = name of the directory in which to save results
-u = number of users
-t = duration of the run, in seconds
After 15 minutes, start RMAN backup using the following commands:
$ rman TARGET / CATALOG rman/rman@catdb
Recovery Manager: Release 11.1.0.6.0 - Production on Tue Sep 18 15:38:12 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
connected to target database: TPCC (DBID=3136004487)
connected to recovery catalog database
RMAN> RUN {
  RECOVER COPY OF DATABASE WITH TAG 'incr_update';
  BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG 'incr_update'
    DATABASE;
}
Record the elapsed time required to complete the incremental backup.
Capture the OLTP transactions-per-minute data for the full 60-minute run, noting specifically the period
when the RMAN backup was running. Mark the transaction throughput degradation observed while the
RMAN backup was executing. This establishes the operational effectiveness baseline against which the
offloading process will eventually be compared.
Shut down the production database. Perform a database RESTORE to tag 'incr_update'. This causes the
database to be restored first to the level 0 base copy of the database. The incremental level 1 backup is
then automatically applied during the RMAN restore and recovery process.
Verify that the database content is in fact restored as expected to the point where the incremental backup
was generated. The SCN to which the restored and recovered database is consistent should match the last
SCN of the level 1 backup captured and registered in the RMAN catalog.
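The throughput comparison described in this phase can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name backup_window_degradation, the per-minute sample list, and the sample values are hypothetical and are not taken from the test results.

```python
from statistics import mean


def backup_window_degradation(tpm_by_minute, backup_start, backup_end):
    """Return the percent drop in mean TPM inside the backup window.

    tpm_by_minute: transactions-per-minute samples, index 0 = minute 1.
    backup_start, backup_end: 1-based minutes (inclusive) when RMAN was running.
    """
    inside = tpm_by_minute[backup_start - 1:backup_end]
    outside = tpm_by_minute[:backup_start - 1] + tpm_by_minute[backup_end:]
    if not inside or not outside:
        raise ValueError("need samples both inside and outside the backup window")
    baseline = mean(outside)
    return 100.0 * (baseline - mean(inside)) / baseline


# Hypothetical samples: a steady rate that dips while the backup runs (minutes 5-7).
samples = [25000] * 4 + [20000] * 3 + [25000] * 3
print(round(backup_window_degradation(samples, 5, 7), 1))
```

Comparing the mean inside the backup window with the mean outside it keeps the measure independent of how long the backup ran, which helps when comparing a long direct backup with a short hot backup window.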
Test Phase 2: Exercise process to perform offloaded RMAN
incremental backup
The offloading procedure, with the backup itself run from the backup server to avoid impact on the
production instance, was exercised as follows.
Shut down the production database. Delete the RMAN incremental backup set from RMBACKUP and
remove its record from the RMAN catalog. Restore the database to the initial setup state with the level 0
full backup.
Restart the database. Make sure that BCT is still enabled. (If not, re-enable BCT using the procedure as
described in the previous section “Enable BCT tracking.”)
Run the same OLTP workload against the restored database for 60 minutes again. Then, about 15 minutes
into the run, invoke the following sequence of operational steps on the production server:
Begin hot backup
SQL> alter database begin backup;
Perform a clone split for database files
# naviseccli -h array1 \
-user {admin_user} -password {admin_password} snapview \
-consistentfractureclones -CloneGroupNameCloneId \
CL_RMDATA1 0000000001 CL_RMDATA2 0000000001 CL_RMDATA3 0000000001
where CL_RMDATA1, CL_RMDATA2, and CL_RMDATA3 are the clone groups associating RMDATA1 with
RMDATA1_clone, RMDATA2 with RMDATA2_clone, and RMDATA3 with RMDATA3_clone.
This naviseccli command ensures that the synchronized clones for each of the ASM disks making up the
DATA ASM group are fractured from their respective production source LUNs at exactly the same point in
time, so that the clones represent a dependent-write-order-consistent state of the three ASM disks.
End hot backup
SQL> alter database end backup;
Switch logs
SQL> alter system archive log current;
Execute Block Change Tracking Switch
SQL> execute dbms_backup_restore.bctSwitch ();
This will switch the BCT map and begin a new BCT map.
Create control file copies
Create two copies of the control file. One copy (control_start) will be used to mount the database on the
backup host. The second copy (control_backup) will be used as a part of the incremental backup set used
by RMAN.
$ rman TARGET / CATALOG rman/rman@catdb
Recovery Manager: Release 11.1.0.6.0 - Production on Tue Sep 18 15:45:30 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database: TPCC (DBID=3136004487, not open)
connected to recovery catalog database
RMAN> run {
copy current controlfile to '+RMFLASH/control_start';
copy current controlfile to '+RMFLASH/control_backup';
}
Perform a clone split for the redo and archive clone group
# naviseccli -h array1 \
-user {admin_user} -password {admin_password} snapview \
-consistentfractureclones -CloneGroupNameCloneId \
CL_RMREDO1 0000000001 CL_RMREDO2 0000000001 CL_RMFLASH 0000000001
where CL_RMREDO1, CL_RMREDO2, and CL_RMFLASH are the clone groups associating RMREDO1,
RMREDO2, and RMFLASH with their corresponding clones.
Resynchronize the RMAN catalog
RMAN> resync catalog;
Record the elapsed time required. This should include:
• Time to get the running database into hot backup mode
• Time to execute the storage clone split
• Time to take the database out of hot backup state
Also record the transactional performance impact to production for the duration when the database has to
be placed into hot backup state.
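One way to record the per-step elapsed times listed above is a small timing helper wrapped around each step. The sketch below is an assumption-laden illustration: the StepTimer class and the step names are hypothetical, and the step bodies are placeholders rather than real calls to SQL*Plus or naviseccli.

```python
import time
from contextlib import contextmanager


class StepTimer:
    """Collect elapsed seconds for named procedure steps."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock   # injectable so the timer can be tested
        self.elapsed = {}

    @contextmanager
    def step(self, name):
        start = self._clock()
        try:
            yield
        finally:
            # Record the duration even if the step raised an error.
            self.elapsed[name] = self._clock() - start

    def total(self):
        """Total time across all recorded steps."""
        return sum(self.elapsed.values())


timer = StepTimer()
with timer.step("begin hot backup"):
    pass  # placeholder: issue ALTER DATABASE BEGIN BACKUP here
with timer.step("clone split"):
    pass  # placeholder: run the consistent clone fracture here
with timer.step("end hot backup"):
    pass  # placeholder: issue ALTER DATABASE END BACKUP here
print(sorted(timer.elapsed))
```

The total across the three steps approximates the hot backup window, which is the period during which production is exposed to the added overhead.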
Leverage split clones to perform offloaded incremental backup
Enable host access to the clones of RMDATA and RMARCH LUNs for the backup instance. Start up ASM
in exclusive mode on the backup server. Mount the RMDATA, RMARCH groups from the storage clones
activated for access.
Start the ASM instance
# export ORACLE_SID=+ASM
# sqlplus /nolog
SQL> connect / as sysdba
SQL> startup mount
Mount RMBACKUP ASM group to the backup server. If RMBACKUP is still mounted on production, it
will be necessary to go back to the production instance and unmount the group first.
The remounted ASM groups (based on the clones of RMDATA1, RMDATA2, and RMDATA3, as well as
RMFLASH, plus the RMBACKUP group), now allow the backup database instance to be started back up
for performing the RMAN incremental backup task.
It is extremely important to perform the RMAN backup without opening the database under the backup
instance. Otherwise, the backup instance would attempt to perform log and crash recovery on the database
as it is opened. Backup file sets created after that would not have SCN sequencing consistent with the
production database. They could not be correctly registered in the RMAN catalog, and would therefore be
unusable for subsequent production database recovery.
Mount the database instance
Before the database is mounted, change the CONTROL_FILES parameter in the backup database instance
init.ora file to point to the copied control file. For example:
Set the parameter control_files = '+RMFLASH/control_start' in the p_run.ora configuration file of the
database instance on the backup server.
After changing the parameter, mount the database:
# export ORACLE_SID=oastoltp
# sqlplus /nolog
SQL> connect / as sysdba
SQL> startup mount
The ORACLE instance is then started with the particular control file.
Back up the database instance
Next, perform the incremental backup on the backup host, including the “control_backup” control file copy
in the backup set because it is SCN consistent with the production database. This second copy is needed
because once the database is mounted with control_start, that control file is updated and no longer
reflects its initial state:
$ rman TARGET / CATALOG rman/rman@catdb
RMAN> run {
allocate channel dev1 type disk;
allocate channel dev2 type disk;
backup format '+RMFLASH/ctl%d%s%p%t' controlfilecopy '+RMFLASH/control_backup';
recover copy of database with tag 'incr_update';
backup incremental level 1 for recover of copy with tag 'incr_update' database;
release channel dev1;
release channel dev2;
}
It is worthwhile going through the details of the RMAN script for clarification. This backup script takes
an incremental backup at level 1 and includes the copied backup control file. The backup incremental
level 1 for recover of copy ... database command does not always create an incremental backup. If no
level 0 backup is available, the command instead creates an image copy backup of the database with the
specified tag. This is what happens the first time the script runs, since no level 0 backup has been
created yet.
The recover copy of database with tag ... command directs RMAN to apply any available incremental
level 1 backup to the set of datafile copies with the specified tag. The command has no effect on the first
and second runs. On the first run, there is neither an incremental level 1 backup nor a datafile copy. On
the second run, there is a datafile copy (created by the first run) but no incremental level 1 backup yet.
On the third and all subsequent runs, there is both a datafile copy and an incremental level 1 backup from
the previous run. The level 1 incremental backup is applied to the existing datafile copy, bringing the
datafile copy up to the checkpoint SCN of the level 1 incremental.
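The run-by-run behavior of the two commands can be illustrated with a small state model. This is a simplified sketch, not RMAN itself: the function run_script and the two state flags are hypothetical, and real RMAN behavior also depends on tags, SCNs, and catalog contents. The model starts from a state with no image copy at all.

```python
def run_script(state):
    """Simulate one run of RECOVER COPY followed by BACKUP ... FOR RECOVER OF COPY.

    state: dict with 'copy' (an image copy exists) and
    'pending_level1' (a level 1 backup exists that is not yet applied to the copy).
    Returns a (recover_action, backup_action) tuple and updates state in place.
    """
    # Step 1: RECOVER COPY OF DATABASE WITH TAG ...
    if state["copy"] and state["pending_level1"]:
        recover_action = "apply level 1 to copy"
        state["pending_level1"] = False
    else:
        recover_action = "no effect"

    # Step 2: BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG ... DATABASE
    if not state["copy"]:
        backup_action = "create level 0 image copy"
        state["copy"] = True
    else:
        backup_action = "create level 1 incremental"
        state["pending_level1"] = True

    return recover_action, backup_action


state = {"copy": False, "pending_level1": False}
for run_number in range(1, 4):
    print(run_number, run_script(state))
```

In this model the image copy always trails the most recent level 1 backup by one run, which is why the recover step first has an effect on the third run.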
Record the time taken to perform the RMAN incremental backup task above.
The test steps, starting from BEGIN BACKUP on the production instance through the RMAN incremental
backup execution on the backup server, can be repeated as part of a recurring backup offloading process. If
BCT is ever turned off on the production server, the map is reset, and a subsequent incremental backup
may not be able to leverage the full performance advantage of BCT. However, the procedure will still
function correctly.
Test Phase 3: Validating correct restore on the production host
from offloaded incremental backups
This test phase covers the process of restoring and recovering the database on the production server using
incremental backups created from the backup node, as well as validating that the correct database content is
in fact restored and recovered.
Shut down the backup database instance and unmount the RMFLASH ASM group.
Go back to the production server.
Before shutting down the database instance on the production server, remove some data from the table as
follows:
$ sqlplus /nolog
SQL*Plus: Release 11.1.0.6.0 - Production on Wed Dec 5 13:47:45 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
SQL> connect system/manager as sysdba
Connected.
SQL> select count(*) from oastoltp.cust;
COUNT(*)
1500000
SQL> delete from oastoltp.cust where C_ID < 100;
49500 rows deleted.
SQL> select count(*) from oastoltp.cust;
COUNT(*)
1450500
Shut down the running database instance.
Perform a direct recovery of the production database using the production instance. RMAN first restores
the level 0 full backup, and then applies the incremental backup that was generated on the backup server
through the offloading process.
Restore procedure
$ rman CATALOG rman/rman@catdb TARGET system/manager
Recovery Manager: Release 11.1.0.6.0 - Production on Wed Dec 5 14:02:42 2007
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database: OASTDB (DBID=4002616050, not open)
connected to recovery catalog database
RMAN> resync catalog;
starting full resync of recovery catalog
full resync complete
RMAN> run {
2> restore database;
3> recover database;
4> alter database open;
5> }
The incremental restore should work against the level 0 backup just as if the incremental backup had been
created directly by running RMAN against the production instance, as in Test Phase 1. Upon completion
of the database restore and recovery process using the level 0 and level 1 backups, key tables were
examined to ensure that the content was in fact correctly restored to the database state at the time the
level 1 incremental backup was taken. The same check can be made with a simple row count, as follows.
Verify correct restore/recovery
Verify the correctness by counting total number of records in the oastoltp.cust table again.
SQL> select count(*) from oastoltp.cust;
COUNT(*)
1500000
This confirms successful execution of the database restore and recovery process.
Test Phase 4: Analysis of offloading process effectiveness
Compare the duration of production service impact in the two cases: performing an RMAN backup directly
against the production database, versus only putting the database into hot backup mode, creating a storage
replica, and taking the database back out of hot backup mode so that the RMAN backup can be offloaded.
Also, compare the relative transactional impact to production by estimating the total number of
transactions “pre-empted” from production to accommodate the operational backup required by the
business.
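The estimate of pre-empted transactions can be sketched as the per-minute shortfall against a baseline run, summed over the run. The function name preempted_transactions and all of the sample series below are hypothetical and chosen only to show the arithmetic; they are not measurements from these tests.

```python
def preempted_transactions(baseline_tpm, observed_tpm):
    """Sum the per-minute shortfall of observed TPM relative to the baseline run."""
    if len(baseline_tpm) != len(observed_tpm):
        raise ValueError("both series must cover the same minutes")
    # Minutes where the observed run beat the baseline count as zero shortfall.
    return sum(max(b - o, 0) for b, o in zip(baseline_tpm, observed_tpm))


# Hypothetical six-minute excerpts of three runs.
baseline = [25000] * 6
direct_rman = [25000, 19000, 19000, 19000, 19000, 19000]   # backup runs on production
offloaded = [25000, 21000, 21000, 21000, 25000, 25000]     # short hot backup window only

print("direct RMAN:", preempted_transactions(baseline, direct_rman))
print("offloaded:  ", preempted_transactions(baseline, offloaded))
```

Summing the shortfall captures both how deep the throughput dip is and how long it lasts, so a brief hot backup window and a long direct backup can be compared on one number.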
The testing findings are summarized as follows.
Testing observations/findings
Test case 1: Only a database OLTP workload running on the production host
The observation graph in Figure 2 shows the performance impact when only the database was running on
the production server.
[Figure 2 plot: TPM series; y-axis "Transactions" from 0 to 60,000; x-axis "MINUTES" from 0 to 62]
Figure 2. TPM performance graph while only the OLTP workload was running on the
production host
The transactions per minute (TPM) rate ranged between 19,850 and 26,570. Throughput was lower for the
first 10 minutes, while the workload drivers loaded and the measurement interval started. After this
ramp-up, the TPM rate rose and its variation narrowed, staying between 22,174 and 26,570.
Test Case 2: RMAN incremental backup using BCT running in parallel with a database
workload on the production host
The observation graph in Figure 3 shows the performance impact on the transaction execution rate when
the RMAN backup operation runs in parallel with the database workload on the production host.
[Figure 3 plot: TPM series; y-axis "Transactions" from 0 to 60,000; x-axis "MINUTES" from 0 to 62]
Figure 3. TPM performance graph while the backup and OLTP workload run in parallel
The collected test performance data indicated that, for the test workload, OLTP throughput dropped by
23 percent while the RMAN backup ran in parallel.
The RMAN task itself also took longer to complete, and was still executing when the OLTP test workload
was shut down after running for about an hour.
Test Case 3: Offload data for incremental backup using the block change tracking file
The observation graph in Figure 4 shows little performance impact on the production host during the hot
backup period.
[Figure 4 plot: TPM series with the "HOT BACKUP PERIOD" marked; y-axis "Transactions" from 0 to 60,000; x-axis "MINUTES" from 0 to 62]
Figure 4. TPM performance graph when offloading data with hot backup
In Figure 4, the OLTP workload took a performance hit for about 3 minutes, while the database was in hot
backup mode as part of the offloading process. Once the database exited the hot backup state and all the
offloading steps were completed on the production server, the production service level returned to normal
(comparable to that shown in Figure 2).
Combined graph
[Figure 5 plot: three series (OLTP-TPM, RMAN-OLTP-TPM, BCT-TPM); y-axis from 0 to 60,000; x-axis "MINUTES" from 0 to 62]
Figure 5. Performance impact difference
The graph in Figure 5 supports the recommendation to offload the actual RMAN backup task by leveraging
storage-based replication techniques. When the RMAN task was run in the midst of active production
work (illustrated by the yellow graph line), the OLTP workload throughput rate dropped. At the same
time, the RMAN incremental backup task itself was prolonged, extending beyond the time when the OLTP
test workload was terminated.
With Oracle hot backup and storage-based point-in-time replication, foreground user transaction
processing was impacted only momentarily. The performance impact was limited to the time that the
database had to be in the hot backup state while the required procedural steps were carried out.
The actual storage-based point-in-time replication took, for all practical purposes, zero time within the
database hot backup window. Once the database exited the hot backup state, user processing returned to
its normal performance level. The offloaded RMAN backup task can then be scheduled with more latitude,
in a convenient time window.
Because the offloaded RMAN backup ran against a database state that was no longer subject to a high rate
of transactional change, the time taken to run the RMAN backup task on the backup server was also
shortened. The RMAN task was no longer contending with other activity against the database content
being backed up.
Verifying that BCT driven incremental backup is offloaded
The obvious reason to leverage BCT is to optimize the amount of work, and therefore the time required, to
perform an incremental backup.
Without the BCT maps, RMAN performs an incremental backup by scanning through all the database files
involved, looking for data pages that have been changed since the last successfully completed backup. For
large database files, this can turn out to be a very time consuming process.
Oracle maintains information within the BCT map, tying the map to a particular checkpoint number. As
RMAN reads the tracked database blocks, if, for whatever reason, a database page is inconsistent with
the checkpoint established according to the map, RMAN will forgo the use of the map and revert to
scanning the database blocks to ensure that the backup is created correctly.
As part of the offload testing, we specifically verified that the BCT was correctly used to drive the
offloaded incremental backup efficiently, without requiring full scans on the database files.
Verification procedure
Before taking an incremental backup on the backup host, execute the following SQL query:
SQL> select checkpoint_time,
checkpoint_change#,blocks_read,datafile_blocks,used_change_tracking,file# from v$backup_datafile
order by file# asc;
CHECKPOIN CHECKPOINT_CHANGE# BLOCKS_READ DATAFILE_BLOCKS USE    FILE#
--------- ------------------ ----------- --------------- --- --------
4-Dec-07              781054         690             690 NO         0
The USE column (USED_CHANGE_TRACKING, truncated by SQL*Plus) shows NO, which means that no
datafiles have yet been incrementally backed up using the BCT map. After taking the incremental backup,
execute the same SQL query to determine how many datafiles were backed up using the BCT map:
CHECKPOIN CHECKPOINT_CHANGE# BLOCKS_READ DATAFILE_BLOCKS USE    FILE#
--------- ------------------ ----------- --------------- --- --------
4-Dec-07              781054         690             690 NO         0
4-Dec-07            10008076         690             690 NO         0
4-Dec-07             9314394         471          204800 YES        1
4-Dec-07             9314394        3995           38400 YES        2
4-Dec-07             9314394           1          102400 YES        3
4-Dec-07             9314394          75            3328 YES        4
4-Dec-07             9314394           1            3584 YES        5
4-Dec-07             9314394        1051            6144 YES        6
4-Dec-07             9314394       13367           16128 YES        7
4-Dec-07             9314394       12175           22528 YES        8
4-Dec-07             9314394           1           39808 YES        9
4-Dec-07             9314394       41967          115328 YES       10
4-Dec-07             9314394       50031          128256 YES       11
4-Dec-07             9314394         287            3584 YES       12
4-Dec-07             9314394           1            3328 YES       13
4-Dec-07             9314394           1            3328 YES       14
4-Dec-07             9314394           1           15360 YES       15
4-Dec-07             9314394       14899           27136 YES       16
4-Dec-07             9314394           1           30208 YES       17
4-Dec-07             9314394       12819           41728 YES       18
4-Dec-07             9314394           1            3328 YES       19
4-Dec-07             9314394       28675          115456 YES       20
4-Dec-07             9314394      392832          392832 YES       21
4-Dec-07             9314394       42047          113408 YES       22
4-Dec-07             9314394       42151          112768 YES       23
4-Dec-07             9314394       40467          114048 YES       24
4-Dec-07             9314394       51323          126976 YES       25
4-Dec-07             9314394       40907          114688 YES       26
4-Dec-07             9314394       52043          126336 YES       27
4-Dec-07             9314394       51343          127616 YES       28
4-Dec-07             9314394       28679          113536 YES       29
4-Dec-07             9314394       32247          114176 YES       30
4-Dec-07             9314394       31739          114816 YES       31
4-Dec-07             9314394        3115           25600 YES       32
Thirty-four rows were selected.
The V$BACKUP_DATAFILE view reflects the different database files involved from different past
backup actions. The USE column indicates whether the backup for that particular file was in fact done
leveraging the BCT tracking information. With BCT enabled and correctly leveraged by RMAN, the
offloaded RMAN incremental backup took about 1.5 minutes in our test.
BCT was then disabled with the following command:
SQL> ALTER DATABASE DISABLE BLOCK CHANGE TRACKING;
When the same incremental backup was re-executed, it took close to 5 minutes to complete. Host-level OS
I/O monitoring also confirmed that significantly more database file pages were read when the incremental
backup was performed without BCT. In our testing, the BCT-enabled incremental backup took about
30 percent of the time that would otherwise have been needed if RMAN had to scan through all the files
to find the pages changed since the last backup.
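The figures above can be checked with a short calculation. The sketch below uses three rows copied from the V$BACKUP_DATAFILE output (files 1, 10, and 21) and the two approximate elapsed times reported in the text; the function name read_fraction is hypothetical.

```python
def read_fraction(blocks_read, datafile_blocks):
    """Fraction of a datafile's blocks that RMAN read for the incremental backup."""
    return blocks_read / datafile_blocks


# (file#, blocks_read, datafile_blocks) for three rows of the query output.
rows = [(1, 471, 204800), (10, 41967, 115328), (21, 392832, 392832)]
for file_no, blocks_read, datafile_blocks in rows:
    pct = 100 * read_fraction(blocks_read, datafile_blocks)
    print(f"file {file_no}: {pct:.1f}% of blocks read")

# Approximate elapsed times from the text: about 1.5 minutes with BCT, close to 5 without.
with_bct_minutes = 1.5
without_bct_minutes = 5.0
print(f"time ratio: {with_bct_minutes / without_bct_minutes:.0%}")
```

The per-file fractions vary widely: a file with few changed blocks is read only in small part, while file 21 shows every block read even though the USE column reports YES.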
Conclusion
Our testing and observations confirmed that by properly combining Oracle backup technologies and tools
with the underlying EMC CLARiiON storage replication capabilities to offload the actual backup task, the
Oracle database backup process can be made significantly more effective:
• The impact of the RMAN backup on ongoing database service is minimized.
• The time to complete the RMAN backup is shorter and more predictable (running the RMAN backup in
the midst of heavy foreground production work imposes more work, and more time variability, on the
RMAN task).
• The more efficient incremental backup capability enabled by the BCT feature in Oracle Database 11g
is not affected by using storage replication to offload the RMAN backup task.