EMC® RecoverPoint
Release 3.3
Administrator’s Guide
P/N 300-010-641
REV A02
EMC Corporation
Corporate Headquarters:
Hopkinton, MA 01748-9103
1-508-435-1000
www.EMC.com
Copyright © 2006 - 2010 EMC Corporation. All rights reserved.
Published March, 2010
EMC believes the information in this publication is accurate as of its publication date. The information is
subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO
REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS
PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable
software license.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.
All other trademarks used herein are the property of their respective owners.
Contents
Preface............................................................................................................................ 13
Introduction ....................................................................................... 14
Audience ...................................................................................... 14
Related documentation................................................................ 14
Conventions used in this documentation .................................... 15
Where to get help.......................................................................... 15
Online help ................................................................................... 16
Chapter 1
Concepts
RecoverPoint product family ........................................................... 20
RecoverPoint ................................................................................ 20
RecoverPoint/SE .......................................................................... 20
RecoverPoint configurations ............................................................ 22
CDP configurations ...................................................................... 22
CRR configurations ...................................................................... 23
CLR configurations....................................................................... 23
RecoverPoint hardware and software.............................................. 24
RPAs.............................................................................................. 24
Splitters......................................................................................... 27
RecoverPoint Management Applications ................................... 28
RecoverPoint logical entities ............................................................ 30
Consistency groups ...................................................................... 30
Copies ........................................................................................... 32
Replication sets ............................................................................ 33
Journals ........................................................................................ 33
Volumes........................................................................................ 36
Snapshots ..................................................................................... 38
Links ............................................................................................. 47
Contents
RecoverPoint performance .............................................................. 49
Application regulation.............................................................. 49
Replication modes ..................................................................... 50
RPO control ................................................................................ 53
RTO control ................................................................................ 56
Distributed consistency groups............................................... 58
Load balancing........................................................................... 62
RecoverPoint data recovery procedures........................................ 73
Image access ............................................................................... 73
Failover ....................................................................................... 77
RecoverPoint synchronization processes ...................................... 78
Initialization ............................................................................... 78
Full sweeps ................................................................................. 80
Volume sweeps .......................................................................... 83
Long initializations.................................................................... 85
Short initializations ................................................................... 85
First-time initializations............................................................ 86
Fast first-time initializations .................................................... 87
RecoverPoint data flow.................................................................... 89
RecoverPoint replication phases ............................................. 89
The write phase.......................................................................... 89
The transfer phase ..................................................................... 90
The distribution phase .............................................................. 93
RecoverPoint workflows ............................................................... 106
Configuring replication .......................................................... 106
Monitoring and managing RecoverPoint ............................ 106
Moving operations to another site ........................................ 106
Event notification .................................................................... 106
Chapter 2
Getting Started
Licensing overview......................................................................... 108
The Getting Started Wizard........................................................... 110
Welcome screen........................................................................ 110
Account Settings screen .......................................................... 110
System Report Settings screen ................................................ 111
Managing RecoverPoint licences.................................................. 113
Prerequisites ............................................................................. 113
Defining your license key in RecoverPoint ......................... 113
Requesting an activation code ............................................... 114
Defining your activation code in RecoverPoint .................. 115
Upgrading your license .......................................................... 116
Re-activating your license ...................................................... 117
Viewing your license information ......................................... 118
Access control .................................................................................. 119
User authentication.................................................................. 119
User authorization ................................................................... 124
Chapter 3
Starting Replication
Adding splitters............................................................................... 128
How to add splitters to the RecoverPoint system ............... 129
Creating new consistency groups ................................................. 132
The New Consistency Group Wizard ................................... 132
Configuring replication policies.................................................... 143
Configuring consistency group policies ............................... 143
Configuring copy policies....................................................... 152
Modifying existing settings and policies ..................................... 158
How to modify an existing consistency group .................... 158
How to modify an existing copy ........................................... 163
How to modify an existing replication set ........................... 166
How to modify an existing journal ....................................... 170
Manually attaching volumes to splitters ..................................... 173
Chapter 4
Managing and Monitoring
RecoverPoint Management Application...................................... 178
The System Pane ...................................................................... 179
The Traffic Pane ........................................................................ 180
The Navigation Pane ............................................................... 181
The Component Pane .............................................................. 185
Monitoring and analyzing system performance ........................ 213
Monitoring and analyzing system performance ................. 213
Chapter 5
Testing, Failover, and Migration
Use cases........................................................................................... 224
First-time initialization............................................................ 224
First-time initialization from backup .................................... 225
First-time failover..................................................................... 227
Testing a replica........................................................................ 228
Offloading a task ...................................................................... 229
Recovering from a disaster ..................................................... 229
Recovering the production source......................................... 230
Failing over to a replica temporarily..................................... 231
Routine maintenance on production system ....................... 233
Migration ...................................................................................... 233
Bookmarking ...................................................................................... 235
Creating a bookmark.................................................................... 235
Applying bookmarks to multiple groups simultaneously ........... 236
Automatic periodic bookmarking................................................ 237
Applying bookmarks using KVSS ............................................... 239
Accessing a replica............................................................................. 242
Enabling image access ................................................................. 242
Direct Image Access ..................................................................... 246
Image Access Enabled mode ....................................................... 247
Failover commands ........................................................................... 249
Chapter 6
Notification of Events
Configuring event notification ......................................................... 252
E-mail notification ............................................................................. 253
SNMP notification ............................................................................. 255
SNMP trap configuration ............................................................ 256
OMSA support............................................................................. 257
Syslog notification ............................................................................. 259
System reports.................................................................................... 260
Before you begin .......................................................................... 261
System report operations............................................................. 262
Best practice.................................................................................. 265
System alerts....................................................................................... 266
Before you begin .......................................................................... 266
System alert operations ............................................................... 267
Collecting system information.......................................................... 268
Process alternatives ..................................................................... 268
Process errors ............................................................................... 269
Splitter credentials........................................................................ 269
How to collect system information.............................................. 269
Chapter 7
Host Cluster Support
Configuring RecoverPoint cluster support................................. 276
Appendix A
Events
Introduction..................................................................................... 278
Normal events ................................................................................. 279
Detailed events................................................................................ 301
Appendix B
Kutils Reference
Introduction ..................................................................................... 312
Usage.......................................................................................... 312
Path designations ..................................................................... 312
Commands ....................................................................................... 314
flushFS ....................................................................................... 315
manage_auto_host_info_collection ....................................... 316
mount......................................................................................... 317
showFS....................................................................................... 318
show_vol_info .......................................................................... 319
show_vols.................................................................................. 320
sqlRestore .................................................................................. 321
sqlSnap....................................................................................... 323
start............................................................................................. 326
stop............................................................................................. 327
umount ...................................................................................... 328
Appendix C
Troubleshooting
My host applications are hanging................................................. 330
When does application regulation happen? ........................ 330
How does application regulation work? .............................. 330
How do I know application regulation is happening?....... 330
What can I do to stop my group from being regulated? .... 331
My copy is being regulated ........................................................... 332
When does control action regulation happen? .................... 332
How do I know control action regulation is happening? .. 332
How does control action regulation work?.......................... 332
How do I release a copy from control action regulation? .. 333
How do I verify that regulation is over? .............................. 333
My copy has entered a high load state......................................... 334
How do I know a copy is experiencing a high load?.......... 334
What is a permanent high load? ............................................ 334
When do permanent high loads occur? ................................ 335
How do permanent high loads work? .................................. 335
How can I tell a copy is under permanent high load?........ 335
What can I do to come out of permanent high load?.......... 336
How do I verify that a permanent high load is over? ........ 336
What is a temporary high load?............................................. 336
When do temporary high loads occur? ................................ 336
How do temporary high loads work?................................... 337
How can I tell a copy is under temporary high load? ........ 337
What should I know about temporary high loads? ............ 337
How do I verify that a temporary high load is over?.................. 337
My RPA keeps rebooting ................................................................... 339
When does reboot regulation happen? ....................................... 339
How does reboot regulation work? ............................................. 339
How do I know reboot regulation is happening?........................ 339
What should I do to stop reboot regulation?............................... 339
Figures
1    Examples of snapshots and bookmarks...................................................... 39
2    Automatic snapshot consolidation .............................................................. 42
3    Snapshot consolidation policy...................................................................... 46
4    Schematic of logged image access................................................................ 75
5    RecoverPoint Management Application................................................... 178
6    Consistency Groups Tab ............................................................................. 187
7    Normal replication to local and remote replica simultaneously........... 242
Tables
1     Journal size with snapshot consolidation equation legend ...................... 35
2     Snapshot consolidation policies .................................................................... 45
3     Image access modes ........................................................................................ 74
4     RecoverPoint license parameters ................................................................ 108
5     Add New User settings ................................................................................ 120
6     Predefined users ............................................................................................ 121
7     LDAP Configuration settings ...................................................................... 122
8     Add New Role settings................................................................................. 125
9     Permissions that may be granted or denied.............................................. 125
10    Consistency Group General Settings.......................................................... 134
11    Copy General Settings .................................................................................. 135
12    Consistency Group General Settings.......................................................... 143
13    Consistency Group Compression Policy Settings .................................... 144
14    Consistency Group Protection Policy Settings.......................................... 144
15    Consistency Group Resource Allocation Policy Settings ........................ 147
16    Consistency Group Stretch Cluster / SRM Support Policy Settings ..... 148
17    Consistency Group Advanced Policy Settings ......................................... 149
18    Copy Protection Policy Settings .................................................................. 153
19    Copy Journal Policy Settings ....................................................................... 155
20    Copy Advanced Policy Settings .................................................................. 156
21    General commands ....................................................................................... 185
22    Multiple consistency group commands..................................................... 186
23    Specific consistency group commands ...................................................... 188
24    Status Tab ....................................................................................................... 190
25    Copy commands............................................................................................ 193
26    Journal Tab: Image information .................................................................. 197
27    Journal Tab: Journal information ................................................................ 197
28    Journal Tab: Sample images information .................................................. 198
29    Journal Tab: Snapshot Consolidation Progress information .................. 199
30    Splitter commands ........................................................................................ 199
31    Volume commands ....................................................................................... 203
32    vCenter Server commands........................................................................... 205
33    vCenter Server detail commands................................................................ 205
34    Add vCenter Server Settings ....................................................................... 206
35    Edit vCenter Server Settings........................................................................ 207
36    Log commands .............................................................................................. 210
37    Log filtering settings ..................................................................................... 211
38    Components monitored from the Management Application................. 213
39    Bottlenecks ..................................................................................................... 214
40    Consolidated statistics output..................................................................... 217
41    Consolidation policies .................................................................................. 236
42    Image access modes ...................................................................................... 244
43    Image access enabled mode......................................................................... 247
44    Failover commands....................................................................................... 249
45    New Alert Rule settings ............................................................................... 253
46    SNMP general settings ................................................................................. 255
47    RecoverPoint SNMP trap variables ............................................................ 256
48    Syslog settings ............................................................................................... 259
49    Collect system information settings ........................................................... 270
50    Listing of normal events and their descriptions....................................... 279
51    Listing of detailed events and their descriptions ..................................... 301
Preface
As part of an effort to improve and enhance the performance and capabilities
of its product lines, EMC periodically releases revisions of its hardware and
software. Therefore, some functions described here may not be supported by
all versions of the software or hardware currently in use. For the most
up-to-date information on product features, refer to your product release
notes.
If a product does not function properly or does not function as described in
the following sections, please contact your EMC representative.
Introduction
This help file is part of the EMC RecoverPoint documentation, and is
intended for use by those who are responsible for administering the
EMC RecoverPoint system.
Audience
Readers of this help file are expected to be familiar with the following
topics:
◆ operating systems
◆ network topologies
◆ storage technologies
◆ enterprise-level applications

Documentation relevance per RecoverPoint product
Excluding the limitations described in the EMC RecoverPoint and
RecoverPoint/SE Release Notes, the procedures in this documentation are
correct for both RecoverPoint and RecoverPoint/SE; see
“RecoverPoint product family” on page 20.
However, any procedure steps or guidelines that are only available as
part of the new features of RecoverPoint/SE version 3.2 are included
in, or excluded from, the general procedures with In RecoverPoint/SE
only or In RecoverPoint only labels.
Note: These additional steps and guidelines are only applicable to
procedures when the conditions in “New features of RecoverPoint/SE
version 3.2” on page 20 are met. When these conditions are not met, all of the
new features are unavailable, and therefore, the general procedure steps and
descriptions apply to both RecoverPoint and RecoverPoint/SE.
Related documentation
Related documents include:
◆
EMC RecoverPoint Deployment Manager Product Guide
◆
EMC RecoverPoint CLI Reference Guide
◆
EMC RecoverPoint Deploying RecoverPoint with SANTap and
SAN-OS Technical Notes
◆
EMC RecoverPoint Deploying RecoverPoint with SANTap and NX-OS
Technical Notes
◆
EMC RecoverPoint Deploying RecoverPoint with Connectrix
AP-7600B and PB-48K-AP4-18 Technical Notes
Conventions used in this documentation
EMC uses the following conventions for special notices:
Note: A note presents information that is important, but not hazard-related.
EMC uses the following type style conventions in this help file:
Normal
• Names of interface elements (such as names of windows, dialog boxes, buttons, fields, and menus)
• Names of resources, attributes, pools, Boolean expressions, buttons, DQL statements, keywords, clauses, environment variables, functions, utilities
• URLs, pathnames, filenames, directory names, computer names, links, groups, service keys, file systems, notifications

Bold
• Names of commands, daemons, options, programs, processes, services, applications, utilities, kernels, notifications, system calls, man pages
• Names of interface elements (such as names of windows, dialog boxes, buttons, fields, and menus)
• What the user specifically selects, clicks, presses, or types

Italic
• Full titles of publications referenced in text
• Emphasis (for example, a new term)
• Variables
• Values of parameters

Courier
• System input and output (a command prompt indicates to input everything after the command prompt)

<>    Angle brackets enclose parameter or variable values supplied by the user
[]    Square brackets enclose optional values
|     Vertical bar indicates alternate selections - the bar means “or”
{}    Braces indicate content that you must specify (that is, x, or y, or z)
...   Ellipses indicate nonessential information omitted from the example

Where to get help
EMC support, product, and licensing information can be obtained as follows.
Product information: For documentation, release notes, software
updates, or for information about EMC products, licensing, and
service, go to the Powerlink Web site (registration required) at:
http://powerlink.emc.com
Technical support: For technical support, go to EMC Customer
Service on Powerlink. To open a service request through Powerlink,
you must have a valid support agreement.
Please contact your EMC representative for details about obtaining a
valid support agreement or to answer any questions about your
account.
Your suggestions: Your suggestions will help us continue to
improve the accuracy, organization, and overall quality of the user
publications. Please send your opinion of this guide to:
SSG_Documentation@EMC.com
Online help
To search, bookmark, or print sections of the documentation released
with your RecoverPoint product version, select Help > Help
Contents from the main menu of the EMC RecoverPoint
Management Application GUI (see “Monitoring and analyzing
system performance” on page 213). The RecoverPoint Help dialog
box is displayed.
The following sections deal with the topics:
◆ “Searching help topics”
◆ “Viewing search results”
◆ “Printing help topics”
◆ “Bookmarking help topics”
◆ “Viewing bookmarked help topics”
The RecoverPoint Help dialog box contains the contents of the EMC
RecoverPoint Administrator’s Guide.
The RecoverPoint Help landing page contains links to the complete
and most current RecoverPoint documentation on Powerlink.
Searching help topics
To search the RecoverPoint Help files for a specific term, type the
term into the Search field at the top-left corner of the RecoverPoint
Help dialog box, and click the Go button. Click one of the search
results to display the corresponding content.
Viewing search results
To view the results of your last search, click the Search Results Tab in
the bottom-left corner of the RecoverPoint Help dialog box.
Printing help topics
To print a topic, or a topic and all subtopics contained in the topic,
click the Print button in the Contents Pane of the RecoverPoint Help
dialog box.
Bookmarking help topics
To bookmark help topics for later viewing, click the Bookmark button in the top-right corner of the RecoverPoint Help dialog box.

Viewing bookmarked help topics
To view bookmarked help topics, click the Bookmarks Tab in the bottom-left corner of the RecoverPoint Help dialog box.
1
Concepts
This section explains the basic concepts of RecoverPoint, replicating
with RecoverPoint and the workflows for configuring and managing
replication volumes.
The topics in this section are:
◆ RecoverPoint product family ........................................................... 20
◆ RecoverPoint configurations ............................................................ 22
◆ RecoverPoint hardware and software............................................. 24
◆ RecoverPoint logical entities ............................................................ 30
◆ RecoverPoint performance ............................................................... 49
◆ RecoverPoint data recovery procedures ......................................... 73
◆ RecoverPoint synchronization processes ....................................... 78
◆ RecoverPoint data flow ..................................................................... 89
◆ RecoverPoint workflows................................................................. 106
RecoverPoint product family
The RecoverPoint product family consists of:
◆
“RecoverPoint” - provides cost-effective, continuous data
protection and continuous remote replication, enabling
on-demand protection and recovery at any point in time.
◆
“RecoverPoint/SE” - ensures continuous data protection and
continuous remote data replication for your EMC CLARiiON
networked storage.
Note: For the complete list of RecoverPoint and RecoverPoint/SE limitations,
see the EMC RecoverPoint and RecoverPoint/SE Release Notes.
RecoverPoint
EMC RecoverPoint brings you continuous data protection and
continuous remote replication for on-demand protection and
recovery at any point in time. The advanced capabilities of
RecoverPoint include policy-based management, application
integration, and WAN acceleration.
With RecoverPoint you'll implement a single, unified solution to
protect and/or replicate data across heterogeneous storage. You'll
simplify management and reduce costs, recover data at a local or
remote site at any point in time, and ensure continuous replication to
a remote site without impacting performance.
RecoverPoint/SE
EMC RecoverPoint/SE brings continuous data protection and
continuous remote replication to your EMC CLARiiON networked
storage. RecoverPoint/SE gives you on-demand protection and
recovery at any point in time and advanced capabilities such as
policy-based management and bandwidth optimization.
With RecoverPoint/SE you'll implement a single unified solution for
data protection, simplify management, reduce costs, and avoid data
loss due to server failures or data corruption.
New features of RecoverPoint/SE version 3.2
New features were introduced in RecoverPoint/SE version 3.2 that
are only available when the following conditions are met:
◆
A new installation is performed using the RecoverPoint/SE
version 3.2 Installer Wizard.
◆
Only CLARiiON splitters (one per site) are required for
replication, and are installed at each customer site.
Note: See “Documentation relevance per RecoverPoint product” on page 14.
RecoverPoint configurations
EMC RecoverPoint replicates data over any distance:
◆ within the same site or to a local bunker site some distance away, see “CDP configurations” on page 22.
◆ to a remote site, see “CRR configurations” on page 23.
◆ to both a local and a remote site simultaneously, see “CLR configurations” on page 23.
The following sections deal with the topics:
◆ “CDP configurations”
◆ “CRR configurations”
◆ “CLR configurations”

CDP configurations
There are two types of CDP configurations:
◆
(Standard) CDP - in which all components (splitters, storage,
RPAs, and hosts) exist at the same site.
◆
Stretch CDP - in which the production host exists at the local site,
splitters and storage exist at both the bunker site and the local
site, and the RPAs only exist at the bunker site. In this
configuration, the repository volume and both the production
and local journals exist at the bunker site.
RecoverPoint CDP can instantly recover data to any PIT by
leveraging bookmarks from the replica journal.
In CDP configurations, data can be replicated locally at a distance
that does not exceed the limitation specified in the EMC RecoverPoint
and RecoverPoint/SE Release Notes, and the data is transferred by Fibre
Channel. By definition, writes from the splitter to the RPA are written
synchronously, and snapshot granularity is set to per second, so the
exact data size and contents are ultimately dependent upon the
number of writes made by the host application per second.
Users can change the snapshot granularity to per write, if necessary.
Users can also set the replication mode to synchronous, when an RPO
time of zero is required.
CRR configurations
In CRR configurations, data is transferred between two sites over
Fibre Channel or a WAN. In this configuration, the RPAs, storage and
splitters exist at both the local and the remote site.
By default, the replication mode is set to asynchronous, and snapshot
granularity is set to dynamic, so the exact data size and contents are
ultimately dependent upon the policies set by the user and system
performance. This provides protection to application-consistent and
other specific points in time.
Note: Synchronous replication is only supported when the local and remote
sites are connected via Fibre Channel, see EMC RecoverPoint Deployment
Manager Product Guide for limitations.
CLR configurations
Both RecoverPoint CDP and CRR feature bidirectional replication
and a point-in-time recovery mechanism that enables replica
volumes to be rolled back to a previous point in time and used for
read/write operations without affecting the ongoing replication
process or data protection.
RecoverPoint hardware and software
The following replication hardware and software is used in the
RecoverPoint solution:
◆ “RPAs”
◆ “Splitters”
◆ “RecoverPoint Management Applications”

RPAs
The RPA is RecoverPoint's intelligent data protection appliance. In
RecoverPoint, RPAs manage all aspects of reliable data replication at
all sites. During replication for a given consistency group, an RPA at
the production site makes intelligent decisions regarding when and
what data to transfer to the replica site. It bases these decisions on its
continuous analysis of application load and resource availability,
balanced against the need to prevent degradation of host application
performance and to deliver maximum adherence to the specified
replication policy. The RPAs at the replica site distribute the data to
the replica storage.
In the event of failover, these roles can be reversed. Moreover,
RecoverPoint supports simultaneous bidirectional replication, where
the same RPA can serve as the production RPA for one consistency
group and the replica RPA for another.
Each RPA has the following dedicated interfaces:
◆
Fibre Channel. Used for data exchange with local host
applications and storage subsystems. The RPA supports a
dual-port configuration to the Fibre Channel, thereby providing
redundant connectivity to the SAN-attached storage and the
hosts.
◆
WAN. Used to transfer data to other sites (Ethernet).
◆
Management. Used to manage the RecoverPoint system
(Ethernet).
You can access each RPA directly through an SSH connection to the
RPA's dedicated box-management IP address. You can also access all
RPAs in the RecoverPoint configuration through the virtual
site-management IP address of each site in the RecoverPoint
configuration. In other words, once RecoverPoint is configured, you
can manage the entire installation from a single location.
RPA terminology
In RecoverPoint, the following terminology is used when referencing
RPAs:
◆
Preferred RPA
Each consistency group must be assigned one or more preferred
RPAs. A non-distributed consistency group will have one
preferred RPA, called the Primary RPA. Distributed consistency
groups have multiple preferred RPAs, a minimum of one primary
RPA and one secondary RPA, and they can be assigned a
maximum of three secondary RPAs, see “Distributed consistency
groups” on page 58.
◆
RecoverPoint cluster
A group of inter-linked RPAs, working together closely, to
provide replication services (so closely, that in many respects,
they form a single computer). The RPAs in a RecoverPoint cluster
are called nodes. The nodes of the RecoverPoint cluster are
connected to each other through the local area network, a wide
area network, or by Fibre Channel. A RecoverPoint cluster can be
deployed within a single site for CDP (or stretched CDP)
configurations, or deployed in two sites for CRR and CLR
configurations.
To scale-up and support a higher throughput rate, more RPA
nodes can be added to the RecoverPoint cluster. RecoverPoint can
be deployed with two to eight nodes per site, as set during
RecoverPoint system installation. The cluster size must be the
same at both sites in a RecoverPoint installation (see EMC
RecoverPoint Deployment Manager Product Guide).
The RPA nodes are used to attain availability. If a node fails, the
consistency groups using that node (i.e. the consistency groups
whose Primary or Secondary RPA settings are set to the RPA that
failed), will flip over to a different node in the RecoverPoint
cluster.
The RecoverPoint cluster at each site is managed by a process
called the site control. The RPA node that can be used to host the
site control is selected using cluster leader arbitration (LEP) and
can only be RPA1 or RPA2.
Physically, the RPA cluster can be located in the same facility as
the host and storage subsystems. Alternatively, because RPAs
have their own independent computing and storage resources,
they can be located at a separate facility some distance away from
the host or storage subsystems. This provides greater data
protection in the event of a localized disaster. See EMC
RecoverPoint and RecoverPoint/SE Release Notes for limitations.
During normal operation, all RPAs in a cluster are active all of the
time. Consequently, if one of the RPAs in a cluster goes down, the
RecoverPoint system supports immediate switchover of the
functions of that box to another RPA in the cluster.
◆
Primary RPA
The RPA that, whenever possible, handles replication for a
consistency group. If an error occurs in the primary RPA,
replication can, in most cases, be switched over to another RPA at
the same site. The Primary RPA is set through the consistency
group Policy tab (see “Copy General Settings” on page 135).
◆
RPA1
The RPA node that was designated as RPA1 of a RecoverPoint
cluster, during a RecoverPoint installation.
◆
RPA2
The RPA node that was designated as RPA 2 of a RecoverPoint
cluster, during a RecoverPoint installation.
Note: Only the first two RPAs (RPA 1 and RPA 2) in a RecoverPoint
cluster can host the Site control services.
◆
Site management (also known as: site control) - The process that
manages the RecoverPoint cluster at each site. In CDP
configurations, there is only one site control. In CRR and CLR
configurations, there are separate site controls for each site in the
configuration. The active instance of the site control is run only by
RPA 1 or RPA 2.
The user accesses the site control to manage and monitor
RecoverPoint, using a Site Management IP address. To run the
EMC RecoverPoint Management Application GUI, the user
connects to a RecoverPoint cluster by opening a browser window
and typing the site management IP of the RecoverPoint cluster
into the browser address bar.
To run the EMC RecoverPoint Command Line Interface, the user
would connect to a RecoverPoint cluster by opening an SSH
connection with the site management IP of the RecoverPoint
cluster.
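The same site management IP can also be used for scripted access to the CLI. The following is a minimal sketch only, assuming the third-party paramiko SSH library; the IP address, credentials, and the example command are placeholders, so refer to the EMC RecoverPoint CLI Reference Guide for the supported commands and to your own site configuration for the actual values.

```python
# Minimal sketch: open an SSH session to the RecoverPoint site management IP
# and run a CLI command. The IP address, credentials, and the command string
# below are placeholders, not values from this guide.
import paramiko

SITE_MGMT_IP = "10.0.0.100"      # virtual site-management IP (placeholder)
CLI_USER = "admin"               # CLI user name (placeholder)
CLI_PASSWORD = "admin-password"  # CLI password (placeholder)

def run_cli_command(command):
    """Run a single RecoverPoint CLI command over SSH and return its output."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(SITE_MGMT_IP, username=CLI_USER, password=CLI_PASSWORD)
    try:
        stdin, stdout, stderr = client.exec_command(command)
        return stdout.read().decode()
    finally:
        client.close()

if __name__ == "__main__":
    # Illustrative command only; substitute a command documented in the
    # EMC RecoverPoint CLI Reference Guide.
    print(run_cli_command("get_system_status"))
```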
To identify the site control, log into the RecoverPoint
Management Application as a user with SE privileges, and click
on the RPAs section of the Navigation Pane. The ID column of the
RPAs table displays an asterisk for each RPA acting as the Site
Control.
Note: The RPA node that can be used to host the site control is selected
using cluster leader arbitration (LEP) and can only be RPA1 or RPA2.
◆
Site management IP
A virtual, floating IP address assigned to the RPA that is currently
active (runs the Site control). In the event of a failure by this RPA,
this floating IP address dynamically switches to the RPA that
assumes operation (which will either be RPA1 or RPA2).
Although using the site management IP is best practice, all
management activities can also be performed on a specific RPA,
by entering its dedicated IP address.
Splitters
A splitter is proprietary software that is installed on either host
operating systems, storage subsystems, or intelligent fibre switches.
Splitters access replica volumes; i.e., volumes that contain data to be
replicated. The primary function of a splitter is to “split” application
writes so that they are sent to their normally designated storage
volumes and the RPA simultaneously. The splitter carries out this
activity efficiently, with little perceptible impact on host performance,
since all CPU-intensive processing necessary for replication is
performed by the RPA.
The splitter function can be host-based (see “Host-based splitters” on
page 28), intelligent fabric-based (SANTap, Brocade), or array-based
(“Array-based splitters” on page 28).
This help file is written for RecoverPoint systems using the following
host-based and array-based splitters: Windows, Solaris, and
CLARiiON.
◆
For SANTap splitter procedures, see EMC RecoverPoint Deploying
RecoverPoint with SANTap and NX-OS Technical Notes and EMC
RecoverPoint Deploying RecoverPoint with SANTap and SAN-OS
Technical Notes
◆
For Brocade splitter procedures, see EMC RecoverPoint Deploying
RecoverPoint with Connectrix AP-7600B and PB-48K-AP4-18
Technical Notes.
◆ For VMWare splitter procedures, see EMC RecoverPoint Deploying RecoverPoint with VMware Technical Notes.
◆ For AIX splitter procedures, see EMC RecoverPoint Deploying RecoverPoint with AIX Technical Notes.

Host-based splitters
A host-based splitter is a splitter that is installed on the host, and
performs the data splitting function.
To date, this feature is available for the following operating systems:
◆
Windows
◆
Solaris
◆
AIX
The RecoverPoint host-based splitter consists of two main
components: a user space application, which is frequently referred to
as the KDriver, and a kernel driver, which is frequently referred to as
the splitter. In general, the KDriver is responsible for the control path
and the splitter is responsible for the I/O path.
Note: Refer to the EMC support matrix on Powerlink for exact support
statements including OS versions, and other caveats in supported and
unsupported configurations.
Array-based splitters
An array-based splitter is a splitter that is used with EMC CLARiiON
storage arrays, to perform the data splitting function. To date, this
feature is only available for EMC CLARiiON storage, and therefore,
the RecoverPoint array-based splitter is sometimes referred to as
CLARiiON splitter.
The array-based splitter can either be bundled with the CLARiiON's
FLARE, or added through a splitter enabling package that can be
installed non-disruptively.
Note: Refer to the EMC support matrix on Powerlink for exact support
statements including OS versions, and other caveats in supported and
unsupported configurations.
RecoverPoint Management Applications
The RecoverPoint Management Applications allow you to manage
the RecoverPoint system. Management activities are the primary
subject of this Administrator's Guide.
Site management provides access to all boxes in the local RPA cluster,
as well as to the RPA cluster at the other site (i.e., to which the local
cluster is joined).
Command line interface (CLI)
Management activities can be carried out interactively or by scripts using the command-line interface. For information about the command-line interface, refer to the EMC RecoverPoint CLI Reference Guide.

RecoverPoint Management Application
Management activities can also be carried out using any standard
web browser to access the RecoverPoint Management Application.
The rest of this help file describes how to use the features of the
RecoverPoint Management Application to manage and monitor
RecoverPoint replication.
See “Monitoring and analyzing system performance” on page 213.
RecoverPoint logical entities
The following logical entities constitute your replication
environment:
◆ “Consistency groups”
◆ “Non-distributed (regular) consistency groups”
◆ “Distributed consistency groups”
◆ “Copies”
◆ “Replication sets”
◆ “Journals”
◆ “Volumes”
◆ “Snapshots”
◆ “Links”

Consistency groups
A consistency group consists of one or more replication sets (see
“Replication sets” on page 33). Each replication set consists of a
production volume and the replica volumes to which it is replicating.
The consistency group ensures that updates to the replicas are always
consistent and in correct write order; that is, the replicas can always
be used to continue working or to restore the production source, in
case it is damaged.
The consistency group monitors all the volumes added to it to ensure
consistency and write-order fidelity. If two data sets are dependent
on one another (for instance, a database and a database log), they
must be in the same consistency group.
Imagine a motion picture film. The video frames are saved on one
volume, the audio on another. But neither will make sense without
the other. The saves must be coordinated so that they will always be
consistent with one another. In other words, the volumes must be
replicated together in one consistency group. That will guarantee that
at any point, the saved data will represent a true state of the film.
A consistency group consists of:
◆
Settings and policies: Settings, such as consistency group name,
primary RPA, reservation support; policies, such as compression,
bandwidth limits, and maximum lag, that govern the replication
process, see “Configuring replication policies” on page 143.
◆
Replication sets: A production source volume and the replica
volumes to which it replicates, see “Replication sets” on page 33
and “The replica volumes” on page 36.
◆
Journals: Receive changes to data. Each copy has a single journal.
Changes are distributed from the replica journal to storage. The
replica journals also retain rollback data for their replica, see
“Journals” on page 33, “The replica journal volumes” on page 37
and “The production journal volume” on page 36.
The production journal does not contain rollback information.
The system marking information is contained in the production
journal.
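The relationships just described (a consistency group that owns replication sets, copies, and one journal per copy) can be summarized schematically. The following Python sketch is purely illustrative of those relationships; it is not RecoverPoint's data model or API, and all names in it are hypothetical.

```python
# Illustrative model of the RecoverPoint logical entities described above.
# Conceptual sketch only, not RecoverPoint's internal data model.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Journal:
    volumes: List[str]            # one or more dedicated journal volumes

@dataclass
class Copy:
    name: str                     # production, local, or remote
    volumes: List[str]            # production or replica volumes at that location
    journal: Journal              # each copy has a single journal

@dataclass
class ReplicationSet:
    production_volume: str
    replica_volumes: List[str]    # one corresponding volume per replica copy

@dataclass
class ConsistencyGroup:
    name: str
    primary_rpa: int              # preferred RPA that handles transfer
    replication_sets: List[ReplicationSet] = field(default_factory=list)
    copies: List[Copy] = field(default_factory=list)

# Example: a database and its log must share one consistency group so that
# their replicas stay mutually consistent and write-order faithful.
cg = ConsistencyGroup(
    name="finance_db",
    primary_rpa=1,
    replication_sets=[
        ReplicationSet("prod_db_lun", ["remote_db_lun"]),
        ReplicationSet("prod_log_lun", ["remote_log_lun"]),
    ],
    copies=[
        Copy("production", ["prod_db_lun", "prod_log_lun"], Journal(["prod_jrnl"])),
        Copy("remote", ["remote_db_lun", "remote_log_lun"], Journal(["remote_jrnl"])),
    ],
)
```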
Consistency group types
In RecoverPoint, consistency groups can be of the following types:
◆
“Non-distributed (regular) consistency groups”
◆
“Distributed consistency groups”
Note: Throughout the RecoverPoint documentation, the term “consistency
group” is used to refer to groups when no differentiation is required between
distributed and non-distributed groups.
How can I tell if a group is distributed or non-distributed?
You can tell that a group is distributed by clicking on a consistency
group name in the Navigation Pane and looking at the top of the
consistency group’s Status Pane. If the group is distributed, the text
(primary) followed by a comma-separated list of numbers indicating
the designated secondary RPAs is displayed.
If the group is non-distributed, only one RPA is specified in this area.
You can tell which of all of your groups are distributed by selecting
Consistency Groups in the Navigation Pane. In the Active RPA
column of the consistency group list, distributed consistency groups
are displayed with the text (primary) followed by a comma-separated
list of numbers indicating the designated secondary RPAs.
For groups that are non-distributed, only one RPA is specified in this
column.
Non-distributed (regular) consistency groups
New consistency groups are by default defined as non-distributed.
Non-distributed consistency groups transfer data through one
primary RPA that is designated by the user during group creation
and can be modified at any time through the group policy settings.
A maximum of 128 consistency groups can be defined in
RecoverPoint, and a single RPA cannot be configured to have more
than 64 consistency groups. In the event of RPA failure, groups that
transfer data through one RPA will move to other RPAs in the cluster.
In such a case, an RPA can temporarily hold up to 128 groups, and the
data of all groups will continue to be transferred between sites. This
state is temporary, however, as an RPA with more than 64 groups
may run into high loads, and if this state is prolonged, group policies
could be affected.
Each RPA has a maximum throughput rate (see EMC RecoverPoint and
RecoverPoint/SE Release Notes for this limit), which together with the
host write-rate and available network resources, limits the maximum
size of the consistency group. For a higher throughput rate, and to
balance the load of extra-large consistency groups, define the
consistency group as distributed, through the group policy settings
(see “Distributed consistency groups” on page 58).
For the complete set of limitations associated with consistency
groups, see the EMC RecoverPoint and RecoverPoint/SE Release Notes.
Copies
A logical RecoverPoint entity that constitutes all of the volumes
defined for replication at a given location (production, local, or
remote). Each copy also has its own settings, including a journal size
limit setting that defines RTO, journal compression policies, and
protection policies that define snapshot consolidation and the
required protection window.
In CDP and CRR configurations, there is one production copy and
one replica. In CLR configurations, there is one production copy and
two replicas (one local copy at the production site and one remote
copy at the disaster recovery site).
Note: The term ‘replica’ is used to differentiate between production and
non-production copies, whenever necessary. In CLR configurations there are
two replicas, or non-production copies (Also known as: Targets).
The production copy consists of production volumes and the
production journal, which may consist of one or more journal
volumes. The non-production copies (i.e. replica copies) each consist
of replica volumes and a replica journal, which may consist of one or
more journal volumes.
In the RecoverPoint Management Application (GUI), new copies are
defined using the New Consistency Group Wizard. In the
RecoverPoint Command Line Interface (CLI), new copies are defined
using the create_copy command.
Replication sets
Every SAN-attached storage volume in the production storage must
have a corresponding volume at each copy. A production volume and
its associated replica volumes is called a replication set. Each
consistency group contains as many replication sets as there are
volumes in the production storage to replicate. Data consistency and
write-order fidelity are maintained across all volumes assigned to a
consistency group, including volumes on different storage systems.
At least one volume must be added to the journal of each copy in a
replication set, see “Journals” on page 33, “The replica journal
volumes” on page 37 and “The production journal volume” on
page 36.
Journals
One or more volumes are dedicated on the storage at each replica site
for the purpose of holding images that are either waiting to be
distributed, or that have already been distributed, to the replica
storage. See “The production journal volume” on page 36 and “The
replica journal volumes” on page 37.
Each journal holds as many images as its capacity allows.
Subsequently, the oldest image (provided that it has already been
distributed to the copy storage) is removed to make room for the
newest one, in a first-in, first-out manner. The actual number of
images in the journal varies, depending on the size of the images and
the capacity of the storage dedicated to this purpose. See “Journal
size without snapshot consolidation” on page 34.
You can address individual images in a replica journal. Hence, if
required due to a disaster, you can roll back the stored data image to
an earlier image that was unaffected by the disaster. Frequent
small-aperture snapshots provide high granularity for achieving
maximum data recovery in the event of such a rollback.
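The first-in, first-out retention described above can be illustrated with a short sketch. This is a conceptual illustration only, a bounded queue of snapshots, and not RecoverPoint's journal implementation; for simplicity the capacity is expressed as a count of images rather than as storage capacity.

```python
# Conceptual sketch of a replica journal's first-in, first-out image retention.
# Not RecoverPoint's implementation; capacity here is a count of images for
# simplicity, whereas a real journal is bounded by its storage capacity.
from collections import deque

class JournalSketch:
    def __init__(self, capacity_images):
        self.images = deque()
        self.capacity = capacity_images

    def add_image(self, image):
        """Add the newest image; drop the oldest already-distributed image if full."""
        while len(self.images) >= self.capacity:
            oldest = self.images[0]
            if not oldest["distributed"]:
                raise RuntimeError("journal full of undistributed images")
            self.images.popleft()          # oldest image is removed first
        self.images.append(image)

journal = JournalSketch(capacity_images=3)
for t in range(5):
    journal.add_image({"timestamp": t, "distributed": True})
print([img["timestamp"] for img in journal.images])   # [2, 3, 4]
```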
Journal size without snapshot consolidation
The recommended journal size when not utilizing snapshot consolidation is:
1.05 * [(Δ data per second)*(required rollback time in seconds)/(1 – image access log size)] + (reserved for marking)
For example, if:
Δ data per second = 5 Mb/s
required rollback time = 24 hr = 86 400 s
image access log = 0.20
reserved for marking = 1.5 GB
then the required journal size is:
1.05 * 5 Mb/s * 86 400 s / (1 - 0.20) + 1.5 GB = 579 000 Mb
579 000 Mb = 579 000/8 MB = 72 375 MB = 72.4 GB
You can use iostat (UNIX) or perfmon (Windows) to determine the
value for Δ data per second.
The default image access log size is 20%, see “Copy Journal Policy
Settings” on page 155.
See EMC RecoverPoint and RecoverPoint/SE Release Notes for the
minimum and maximum journal size limitations.
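The sizing arithmetic above is straightforward to script. The following sketch restates the formula and the worked example in Python; the input values are the example figures from this section, not sizing recommendations.

```python
# Journal sizing without snapshot consolidation, following the formula above:
#   1.05 * [(delta data per second) * (required rollback time in seconds)
#           / (1 - image access log size)] + (reserved for marking)
def journal_size_gb(delta_mb_per_s, rollback_seconds,
                    image_access_log=0.20, reserved_for_marking_gb=1.5):
    """Recommended journal size in GB; change rate is in megabits per second."""
    size_mb = 1.05 * delta_mb_per_s * rollback_seconds / (1.0 - image_access_log)
    return size_mb / 8 / 1000 + reserved_for_marking_gb   # Mb -> MB -> GB

# Worked example from this section: 5 Mb/s change rate, 24-hour rollback window.
print(round(journal_size_gb(5, 24 * 3600), 1))   # ~72.4 GB, matching the example
```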
Journal size with snapshot consolidation
Journal sizing with snapshot consolidation must take into account the
incremental change of data over the period of consolidation, see
“Snapshot consolidation” on page 39. Therefore, when snapshot
consolidation is enabled, the formula for estimating the journal size
is:
[(Change rate * (Journal for period of continuous data protection)) + (Journal for daily backups)(Change rate * 24hrs)(1 - Locality of reference per day) + (Journal for weekly backups)(Change rate * 7 days)(1 - Locality of reference per week)] / [1 - Percentage of journal used for the image access log] / Compression
or:
J = [(C*(PW+24hrs)) + (BD+6)(C*24hrs)(1-LD) + (BW-1)(C*7days)(1-LW)] / (1 - IA) / COMP
Where:
Table 1    Journal size with snapshot consolidation equation legend

J (Journal size): The required journal size.
C (Change rate): The average change rate, based on iostat (a).
LD (Locality of reference per day): The percentage of data that is written repeatedly to the same location in the snapshots of a day, where the putative daily incremental backup size is C*24hrs(1-LD).
LW (Locality of reference per week): The percentage of data that is written repeatedly to the same location in the snapshots of a week, where the putative weekly incremental backup size is C*7days(1-LW).
PW (Protection window): The continuous protection period specified for the Do not consolidate snapshots for at least setting, see "Copy Protection Policy Settings" on page 153.
BD (Number of daily backups): The value specified for the one snapshot per day for x days setting, see "Copy Protection Policy Settings" on page 153.
BW (Number of weekly backups): The value specified for the one snapshot per week for y weeks setting, see "Copy Protection Policy Settings" on page 153.
IA (Image access log percentage): The value specified for the proportion of journal allocated for image access log setting (default = 20%), see "Copy Journal Policy Settings" on page 155.
COMP (Compression): The value specified for the enable compression setting, see "Consistency Group Compression Policy Settings" on page 144, where Comp = 1 if compression is disabled and Comp = 2 if compression is enabled.

a. You can use iostat (UNIX) or perfmon (Windows) to determine the value for the change rate.
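The formula and the legend in Table 1 translate directly into a small calculation. The sketch below simply evaluates J as defined above; the function name and the sample input values are illustrative assumptions, not figures or defaults taken from this guide (except the 20% image access log and the compression factor of 1 or 2).

# Sketch of the journal size formula when snapshot consolidation is enabled.
# Symbols follow Table 1; sizes are expressed as data amounts (GB here).

def journal_size_with_consolidation(C, PW_hours, BD, BW, LD, LW,
                                    IA=0.20, compression_enabled=False):
    """Evaluate J = [[C*(PW+24h) + (BD+6)*C*24h*(1-LD)
                      + (BW-1)*C*7d*(1-LW)] / (1-IA)] / COMP

    C is the change rate in GB per hour here (any consistent unit works),
    PW_hours the continuous-protection window in hours, BD/BW the number
    of daily/weekly backups, LD/LW the locality of reference per day/week.
    """
    COMP = 2 if compression_enabled else 1
    continuous = C * (PW_hours + 24)
    daily = (BD + 6) * (C * 24) * (1 - LD)
    weekly = (BW - 1) * (C * 7 * 24) * (1 - LW)
    return (continuous + daily + weekly) / (1 - IA) / COMP

# Illustrative values only: 2 GB/hr change rate, 48-hour protection window,
# 5 daily and 4 weekly backups, 50%/70% locality, compression disabled.
print(round(journal_size_with_consolidation(2, 48, 5, 4, 0.5, 0.7), 1))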
Note: For consolidation to occur, 25% of the total journal space allotted to
snapshots (~75%, see “Journal size without snapshot consolidation” on
page 34) must be free (must not contain snapshots). Otherwise, snapshot
consolidation will not run and you will receive an appropriate event
message. In other words, 25% of 75% of the total journal size must not
contain snapshots, for consolidation to occur.
See EMC RecoverPoint and RecoverPoint/SE Release Notes for the
minimum and maximum journal size limitations, when snapshot
consolidation is enabled.
See “Journal size without snapshot consolidation” on page 34.
Volumes
In the EMC RecoverPoint Management Application, LUNs are represented as volumes. Therefore, this guide refers to LUNs when referencing the storage entity, and volumes when referencing the RecoverPoint entity.
The following types of volumes exist in all RecoverPoint configurations:
◆ "The production volumes"
◆ "The replica volumes"
◆ "The production journal volume"
◆ "The replica journal volumes"
◆ "The repository volume"

The production volumes
The production volumes are the volumes that are written to by the host applications at the production site, see "Replication sets" on page 33.

The replica volumes
The replica volumes are the volumes that the production volumes are replicated to, see "Replication sets" on page 33.

The production journal volume
The production journal volume stores information about the replication process (called marking information) that is used to make synchronization of the replication volumes at the two sites, when required, much more efficient.
In RecoverPoint/SE only: The production and local replica journals
and repository volume must all reside on the same CLARiiON array.
Note: Since the production journal contains the system marking information,
the removal of a journal volume from the production site will cause a
full-sweep synchronization.
The replica journal volumes
Each journal (see “Journals” on page 33) can consist of one or more
journal volumes.
Note: In RecoverPoint/SE only: The production and local replica journals
and repository volume must all reside on the same CLARiiON array.
If more than one volume at a time is added to the journal, it is
recommended that all added volumes be the same capacity for best
performance and efficiency. If the added volumes are the same or
nearly the same capacity (at least 85% of the largest volume), data is
striped across those journal volumes, improving performance. When
striped, the capacity used in each journal volume is equal to the
capacity of the smallest journal volume in the group of added
volumes; remaining capacity in those volumes is not used.
Volumes of very different capacities will be concatenated and not striped. In most cases, this will reduce performance, but all capacity will be used.
If two groups of volumes of two different capacities are added, they
are striped in two groups. If additional volumes are added
afterwards, the new volumes will be considered as a group by
themselves according to the criteria above. Existing volumes and
newly added volumes will not be striped together.
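To illustrate the capacity rules just described, the following sketch groups a batch of journal volumes added together using the 85%-of-largest criterion and reports the usable striped capacity of each group (group size times its smallest member). It is only an illustration of the stated rules, not RecoverPoint's internal logic.

# Illustration of the journal-volume capacity rules described above:
# volumes added together that are within 85% of the largest are striped,
# and each striped volume contributes only the capacity of the smallest
# member of its group.  Purely illustrative, not RecoverPoint internals.

def usable_striped_capacity(volume_sizes_gb):
    remaining = sorted(volume_sizes_gb, reverse=True)
    groups = []
    while remaining:
        largest = remaining[0]
        # Members within 85% of the largest volume form one striped group.
        group = [v for v in remaining if v >= 0.85 * largest]
        remaining = [v for v in remaining if v < 0.85 * largest]
        groups.append(group)
    # Striped capacity of a group = number of members * smallest member.
    return [(group, len(group) * min(group)) for group in groups]

# Two groups of very different capacities are striped separately:
for group, capacity in usable_striped_capacity([500, 500, 100, 90]):
    print(group, "->", capacity, "GB usable when striped")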
If the combined physical size of all journal volumes at a given copy is larger than the combined physical size of the journal volumes at the other copy, the protection window at the first copy will be larger than the protection window at the other copy.
Note: Journal volumes must not reside on LUNs that are virtually
provisioned (thin LUNs).
The repository volume
A special volume, called the repository volume, must be dedicated on
the SAN-attached storage at each site, for each RPA cluster. The
repository volume serves all RPAs of the particular cluster and
splitters associated with that cluster. It stores configuration
information about the RPAs and consistency groups, which enables a
properly functioning RPA to seamlessly assume the replication
activities of a failing RPA from the same cluster.
Note: In RecoverPoint/SE only: The production and local replica journals
and repository volume must all reside on the same CLARiiON array.
Snapshots
A snapshot is a point in time marked by the system for recovery
purposes. A snapshot includes only that data that has changed from
the previous image. Upon being distributed, it creates a new current
image on the remote storage.
A snapshot is the difference between one consistent image of stored
data and the next. Snapshots are taken seconds apart. The application
writes to storage; at the same time, the splitter provides a second
copy of the writes to the RecoverPoint appliance. In asynchronous
replication, the appliance gathers several writes into a single
snapshot. The exact time for closing the snapshot is determined
dynamically depending on replication policies and the journal of the
consistency group. In synchronous replication, each write is a
snapshot. When the snapshot is distributed to a replica, it is stored in
the journal volume, making it possible to revert to previous images
by selecting the stored snapshots.
The snapshots at a copy are displayed in the copy Journal Tab, see
“The Journal Tab” on page 196.
For each consistency group, a Snapshot Granularity policy can be configured to regulate data transfer, and the following granularities can be defined:
◆ Fixed (per write): To create a snapshot from every write operation.
◆ Fixed (per second): To create one snapshot per second.
◆ Dynamic: To have the system determine the snapshot granularity according to available resources.

Bookmarks
A bookmark is a named snapshot. The bookmark uniquely identifies
an image. Bookmarks can be set and named manually; they can also
be created automatically by the system either at regular intervals or in
response to a system event. Bookmarked images are listed by name.
Figure 1    Examples of snapshots and bookmarks
You can bookmark a snapshot at any time. Bookmarks are useful to
mark particular points in time, such as an event in an application, or a
point in time you wish to fail over to. The procedures for
bookmarking are described in “Managing and Monitoring” on
page 177.
The bookmarked snapshots at a copy are displayed in the copy
Journal Tab, see “The Journal Tab” on page 196.
Snapshot dilution
Only 1000 snapshots (a subset of all available snapshots) are displayed in the copy Journal tab. You can access images that do not appear in the Journal tab image list by specifying a point in time, by selecting the previous or next image, or by searching for an image in the search engine. The actual image that is accessed in this way is subject to the criteria set in the image search utility.
To accommodate the snapshot display limit of the image list, the system begins diluting the list of images as the number of images it is holding approaches the limit. For example, it may begin by removing every second snapshot from the list. Ultimately, the objective of the dilution process is to maintain a list in which the maximum number of images is displayed, distributed more or less evenly (for example, by time or by number of snapshots) across the journal.
The system does not dilute bookmarked images from the list, unless
the number of bookmarked snapshots exceeds the snapshot
maximum. In that case, the system drops the oldest bookmarked
snapshots from the list of addressable snapshots.
The list of snapshots at a copy is displayed in the copy Journal Tab, see "The Journal Tab" on page 196.
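As a rough illustration of the dilution behavior described above (this is not the actual RecoverPoint algorithm), the sketch below thins a snapshot list toward the display limit by removing every second unbookmarked snapshot, and drops the oldest bookmarks only if bookmarks alone exceed the limit.

# Rough illustration of snapshot dilution as described above; this is not
# the actual RecoverPoint algorithm, only a sketch of the stated behavior.

def dilute(snapshots, limit=1000):
    """snapshots: list of (timestamp, bookmarked) tuples, oldest first."""
    snaps = list(snapshots)
    while len(snaps) > limit:
        unbookmarked = [s for s in snaps if not s[1]]
        if unbookmarked:
            # Drop every second unbookmarked snapshot so the remaining list
            # stays spread more or less evenly across the journal.
            drop = set(unbookmarked[::2])
            snaps = [s for s in snaps if s not in drop]
        else:
            # Only bookmarks remain: drop the oldest ones.
            snaps = snaps[len(snaps) - limit:]
    return snaps

snapshots = [(t, t % 500 == 0) for t in range(5000)]   # bookmark every 500th
print(len(dilute(snapshots)))                           # at or below 1000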
Snapshot consolidation
RecoverPoint captures every write, enabling you to recover data from
any point in time. Keeping days to months of every point in time,
however, requires very large journals. Over time, re-writes to a single
disk location consume a large amount of journal space. Additionally,
the need to recover back to an exact point in time decreases as the
data ages. The granularity of snapshots becomes less important over
time. Snapshot consolidation enables longer-term point-in-time
recovery using the same storage consumption. Snapshot
consolidation discards re-writes to the same disk location to save on
journal space, which allows a longer history to be retained in the
journal.
Using snapshot consolidation, you decide how long to retain every write captured, and at what point to consolidate these writes to a daily, weekly, or monthly snapshot. Snapshot consolidation allows you to retain the crucial per-write or per-second data of write transactions for a specified period of time (for example, the last 12 hours, days, weeks, or months of transactions) and only then to start gradually coarsening the granularity of older snapshots at preset intervals (for example, to create daily, then weekly, and then monthly snapshots).
With snapshot consolidation:
◆ All writes to a single disk location are saved in their end-state at one specific point-in-time. Therefore, all writes to the same disk location between the start-time and end-time of the consolidation are deleted.
◆ The amount of journal space saved depends on the I/O pattern, how many blocks get re-written, and other factors. The formula for journal size when snapshot consolidation is enabled is described in "Journal size with snapshot consolidation" on page 34.
◆ The time needed to consolidate snapshots depends on the number and size of the snapshots being consolidated.
◆ A minimum of 1GB of changes (writes) must have occurred and been transferred to the replica journal for the snapshot consolidation process to start.
◆ By default, 20% of the journal's capacity is dedicated to the image access log used during Logged access (although this default value can be modified by the user), another 5% of the journal's capacity is dedicated to the calculation of indexes that are used for Virtual access, and an additional ~1GB is dedicated to handling bursts during distribution. This means that only ~75% of the journal is available for the storage of snapshots. When snapshot
consolidation is enabled, the consolidation process uses 25% of
the remaining ~75%. Therefore, for consolidation to occur, 25% of
75% of the journal must be free (must not contain snapshots).
Otherwise, snapshot consolidation will not run and you will get
an appropriate event message (see “Journal size without snapshot
consolidation” on page 34 and “Journal size with snapshot
consolidation” on page 34).
◆ Snapshots can be consolidated:
• automatically through the RecoverPoint Management Application Copy Policy settings. See "Configuring copy policies" on page 152 for instructions on configuring automatic snapshot consolidation.
• manually through the RecoverPoint Command Line Interface consolidate_snapshots command. See "Manual snapshot consolidation" on page 44 for more information.
◆ You can set a snapshot consolidation policy for an individual snapshot. This policy determines how the snapshot is treated during both automatic and manual snapshot consolidation. This allows you to retain a specific snapshot in its original form for a specific duration. See "Automatic snapshot consolidation" on page 41 for more information.
This section deals with the following topics:
◆ "Automatic snapshot consolidation"
◆ "Enabling automatic consolidation"
◆ "Manual snapshot consolidation"
◆ "Snapshot consolidation policy"
◆ "Viewing consolidation results"
Automatic snapshot consolidation
Using automatic snapshot consolidation, you can define how
snapshots are consolidated during four intervals. You can define the
interval during which:
• snapshots are not consolidated. During this interval,
RecoverPoint maintains continuous data (that is, all writes),
allowing you to recover to any point-in-time. By default,
automatic snapshot consolidation maintains all writes for 2
days.
• snapshots are consolidated into a daily snapshot. Once a day
during this interval, snapshots that are older than the interval
defined for maintaining continuous data are consolidated into
a single daily snapshot. By default, automatic snapshot
consolidation maintains 5 daily consolidations.
You can also specify that the daily consolidation process run
indefinitely. This setting disables weekly and monthly
consolidations.
• snapshots are consolidated into a weekly snapshot. Once a week
during this interval, snapshots that are older than the total
interval defined for continuous data and daily consolidations
are consolidated into a single snapshot. By default, automatic
snapshot consolidation maintains 4 weekly consolidations.
• snapshots are consolidated into a monthly snapshot. Once a month,
snapshots that are older than the total interval defined for
continuous data, the daily period, and the weekly period are
collapsed into a monthly consolidation. RecoverPoint
maintains as many monthly consolidations as the journal can
hold.
You can also specify that the weekly consolidation process run
indefinitely. This setting disables monthly consolidations.
Figure 2    Automatic snapshot consolidation
The settings in Figure 2 maintain two days of continuous data, three
daily consolidations, two weekly consolidations, and monthly
consolidations up to the capacity of the journal.
The snapshot in the example uses the default consolidation policy of
‘Always consolidate’. Refer to “Snapshot consolidation policy” on
page 44 for more information.
Suppose that a snapshot is taken at 10 A.M. on 3/15. Here's what happens (a short sketch of this timeline follows the list):
◆ The original version of the snapshot remains in the journal for at least the next 48 hours.
◆ After 48 hours have elapsed, the original snapshot becomes a candidate for daily consolidation. It will be consolidated by the daily consolidation process. The consolidated daily snapshot remains in the journal for at least 3 days.
◆ After 3 days, the daily snapshot becomes a candidate for weekly consolidation. It will be consolidated by the weekly consolidation process. That consolidated weekly snapshot remains in the journal for at least 2 weeks.
◆ After 2 weeks, the weekly snapshot becomes a candidate for monthly consolidation. It will be consolidated by the monthly consolidation process. The consolidated monthly snapshot remains in the journal as long as space is available.
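To make the timeline above concrete, the following sketch reports which consolidation tier a snapshot of a given age has reached under the Figure 2 settings (2 days of continuous data, 3 daily consolidations, 2 weekly consolidations, then monthly). The function is illustrative and only restates the schedule described above.

# Sketch of the consolidation timeline described above, using the Figure 2
# settings: 2 days continuous, 3 daily consolidations, 2 weekly
# consolidations, then monthly consolidations.  Illustrative only.

def consolidation_tier(age_days, continuous_days=2, daily_count=3, weekly_count=2):
    if age_days < continuous_days:
        return "original snapshot (continuous data)"
    if age_days < continuous_days + daily_count:
        return "daily consolidation"
    if age_days < continuous_days + daily_count + 7 * weekly_count:
        return "weekly consolidation"
    return "monthly consolidation"

for age in (1, 3, 10, 30):
    print(age, "days old ->", consolidation_tier(age))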
Enabling automatic consolidation
When a consistency group is created, automatic snapshot
consolidation is disabled by default. Snapshots of this group are not
included in the automatic snapshot consolidation process.
To enable automatic snapshot consolidation:
1. In the RecoverPoint Management Application, select a copy in the
Navigation Pane. In the Component Pane, click the Policy Tab,
and then the Protection Tab to display the automatic snapshot
consolidation settings.
2. Select the Enable RecoverPoint snapshot consolidation
checkbox to begin consolidating the snapshots in your copy
journal according to the default settings. If required, adjust these
settings to your specific requirements according to the
instructions in Table 18 on page 153.
Once enabled, automatic snapshot consolidation begins according
to the values set on the Policy tab, provided the system is in the
distribution phase and initialization is over.
Note: You cannot enable automatic consolidation on a consistency group
that belongs to a group set.
For automatic consolidation to take place, the following conditions must be met:
◆ The total size of all snapshots between the specified start and end times must be at least 1 GB.
◆ Snapshots that account for 90% of the consolidation period must be available in the journal. For example, for daily consolidation to take place, the starting and ending snapshots must be at least 22 hours apart. Likewise, automatic consolidation will not take place if the snapshots in the journal exceed 110% of the consolidation period. For example, daily consolidation will not take place if the starting and ending snapshots are more than 26 hours apart. (A short sketch of this check follows the list.)
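The 90% to 110% window amounts to a simple range check on the span between the starting and ending snapshots, together with the 1 GB minimum. A minimal sketch, using the daily (24-hour) period as the example:

# Sketch of the eligibility window described above: consolidation runs only
# when the span between the starting and ending snapshots is between 90%
# and 110% of the consolidation period, and at least 1 GB has changed.

def consolidation_can_run(span_hours, changed_gb, period_hours=24):
    return changed_gb >= 1 and 0.9 * period_hours <= span_hours <= 1.1 * period_hours

print(consolidation_can_run(23, 5))    # True  (within roughly 22 to 26 hours)
print(consolidation_can_run(27, 5))    # False (more than 110% of a day)
print(consolidation_can_run(23, 0.5))  # False (less than 1 GB changed)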
Manual snapshot consolidation
Manual snapshot consolidation allows you to select specific points in time to use as the starting and closing snapshots in a snapshot consolidation. For example, a script can be run once a day at a specific time to create a bookmark and then use that bookmark as the closing snapshot for a manual consolidation.
Use the consolidate_snapshots command in the RecoverPoint
command line interface to manually consolidate snapshots. See the
EMC RecoverPoint CLI Reference Guide for more information.
Snapshot consolidation policy
You can set a consolidation policy for an individual snapshot. The
consolidation policy determines how the snapshot is treated during
both automatic and manual snapshot consolidation. You can set the
consolidation policy for a snapshot when it is first created, and you
can set or change the consolidation policy for a snapshot (both
regular and consolidated) at any time after it is first created.
By applying a consolidation policy to a snapshot, you can determine
if and when a snapshot is consolidated. Table 2 summarizes the
effects of snapshot consolidation policy.
Table 2    Snapshot consolidation policies

Snapshots with        Consolidated in daily  Consolidated in weekly  Consolidated in monthly  Consolidated in manual
this policy ...       consolidations?        consolidations?         consolidations?          consolidations?

Always consolidate    Yes                    Yes                     Yes                      Yes
Survive daily         No                     Yes                     Yes                      Yes
Survive weekly        No                     No                      Yes                      Yes
Survive monthly       No                     No                      No                       Yes
Never consolidate     No                     No                      No                       No
Take note of the following best practices:
◆ Apply a specific consolidation policy to a single snapshot within the period defined by the policy. That is, in any one month (28-day span), apply the 'Survive monthly' consolidation policy to a single snapshot. In any one week, apply the 'Survive weekly' policy to a single snapshot. In any one day, apply the 'Survive daily' policy to a single snapshot.
◆ Do not apply the same (or longer) consolidation policy to two snapshots within the same time period. If you do, automatic consolidation will not take place between those snapshots. For example, if you apply the 'Survive weekly' policy to two snapshots in the same week, no automatic consolidations will take place between the snapshots.
◆ Do not apply consolidation policies in overlapping periods. For example, do not apply the 'Survive weekly' policy in the same 24-hour period that you apply the 'Survive daily' policy. If you apply overlapping policies, the policy with the longer time span takes precedence.
Figure 3 shows a bookmark named Server restored created on January
1 with a consolidation policy of ‘Survive weekly’.
Figure 3    Snapshot consolidation policy
Using the automatic consolidation settings defined in Figure 2 on
page 42, the bookmark would be treated as follows during automatic
snapshot consolidation:
◆
The bookmark is not included in any automatic consolidation job
that runs within 48 hours of the bookmark being created.
◆
After 48 hours elapse, daily consolidation jobs begin to run. A
daily consolidation job runs once a day for the next three days.
But because the bookmark has a policy of ‘Survive weekly’, the
bookmark is not consolidated in any of these daily jobs. Instead,
the bookmark remains untouched in the journal in its original
form.
◆
After 3 days (5 days since the bookmark was originally created),
daily snapshots begin to be consolidated into weekly snapshots.
Still, this bookmark, since it has a policy of ‘Survive weekly’, is
not consolidated into a weekly consolidation. Rather, it remains
available in the journal in its original form.
◆
After 2 weeks and 3 days, monthly consolidation jobs begin. At
this point, the bookmark is consolidated into a monthly
consolidation and is no longer available in the journal (unless it is
selected by the consolidation process to be the monthly snapshot).
Manual consolidation overrides any consolidation policy applied to a specific snapshot, with the exception of 'Never consolidate'. Since the Server restored bookmark has a consolidation policy of 'Survive weekly', it will be included in a manual consolidation if it falls within the time or image range specified by the consolidation.
Viewing consolidation results
Consolidated snapshots are represented by an icon in the Journal Tab
of the copy. The icon indicates the type of consolidation (daily,
weekly, monthly, or manual). Additionally, a tooltip indicates the
amount of space saved by the consolidation.
The following image shows an example of a manual consolidation
with a closing time of 20:09:16 that saved 6.84GB of space.
See “The Journal Tab” on page 196 for more information.
You can also use the get_images command to view the results of a
consolidation. See the EMC RecoverPoint CLI Reference Guide for more
information.
Links
A link is the communication pipe between a production copy and a replica copy, through which data is transferred. In RecoverPoint, data transfer for each link can be over WAN or Fibre Channel.
There can be one or two such links per consistency group, depending on the number of copies in the RecoverPoint installation:
◆ The remote link: In CRR or CLR configurations; the link between the production volumes and their corresponding remote replica volumes.
◆ The local link: In CDP or CLR configurations; the link between the production volumes and their corresponding local replica volumes.
This section answers the questions:
◆ "What are the main properties of a link?"
◆ "What can trigger a link to close?"
◆ "How can I tell whether a link is open or closed?"

What are the main properties of a link?
The link can be open, meaning that transfer is possible. The link can be closed, meaning that transfer is not possible to the replica.

What can trigger a link to close?
A link closes during a user-triggered or system-triggered pause in transfer.
User-triggered pauses in transfer happen when the user manually:
◆ Runs the pause transfer command.
◆ Changes the Primary RPA of the consistency group.
◆ Selects to enable image access in direct access mode.
◆ Changes the direction of replication, failing over to the replica (see "Failover" on page 77).
System-triggered pauses in transfer happen when the system encounters one of the following states:
◆ High loads, both temporary and permanent (see "My copy has entered a high load state" on page 334).
◆ Initialization after a disaster (see "Initialization" on page 78).
◆ WAN unavailable.
◆ No communication with RPA at remote site (for remote replica).
◆ No communication with RPA at local site (for local replica).
◆ System failure (a failed process, where the system does not recognize the reason for failure).

How can I tell whether a link is open or closed?
You can verify whether a link is open or closed in the RecoverPoint Management Application, in the main Consistency Groups table or in the Status pane of a particular consistency group. When the state of Transfer for a copy is anything other than N/A, Paused, or Paused by system (in other words, when the state of Transfer is Active, Initializing, and so on), the link is open.
Note: This indication is relevant only for replica copies, not for the production copy.
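The rule above reduces to a simple check on the Transfer state shown in the Management Application. A minimal sketch (the state strings are the ones named above):

# Sketch of the rule described above: for a replica copy, the link is open
# whenever the Transfer state is anything other than N/A, Paused, or
# Paused by system.

CLOSED_STATES = {"N/A", "Paused", "Paused by system"}

def link_is_open(transfer_state):
    return transfer_state not in CLOSED_STATES

for state in ("Active", "Initializing", "Paused by system"):
    print(state, "->", "open" if link_is_open(state) else "closed")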
RecoverPoint performance
Naturally, replication performance is subject to:
◆ Replication requirements
◆ WAN/FC link conditions
◆ Storage performance
◆ Host performance
◆ Change-rate of replicated data
◆ Number of RPAs installed per site
The following settings are available to users who want to control and monitor the performance of their replication environments, within the aforementioned constraints:
◆ "Application regulation"
◆ "Replication modes"
◆ "RPO control"
◆ "RTO control"
◆ "Distributed consistency groups"
◆ "Load balancing"

Application regulation
Allow Regulation is a consistency group policy that enables
RecoverPoint to control the acknowledgement of writes back to the
host in the case of bottlenecks or insufficient resources that would
otherwise prevent RecoverPoint from replicating the data.
When the Allow regulation setting is disabled, the selected RPO (lag) is not guaranteed, and the system will try its best to replicate within the RPO setting, without affecting host performance.
In asynchronous replication mode (see “Asynchronous replication
mode” on page 51), when enabled, the Allow Regulation setting
slows host applications when approaching the lag policy limit (see
“RPO control” on page 53) or a high load state (see “My copy has
entered a high load state” on page 334).
In synchronous replication mode (see “Synchronous replication
mode” on page 51), this setting is automatically enabled, and cannot
be modified.
For more information, see “My host applications are hanging” on
page 330 and “Allow Regulation” on page 146.
Replication modes
Regardless of the replication mode, RecoverPoint is unique in its
ability to guarantee a consistent replica at the target side under all
circumstances, and in its ability to retain write order fidelity in
multi-host heterogeneous SAN environments.
RecoverPoint replicates data in one of two replication modes:
◆ Asynchronous replication mode - the host application initiates a write, and does not wait for an acknowledgment from the remote RPA before initiating the next write. The data of each write is stored in the local RPA, and acknowledged at the local site. The RPA decides, based on the lag policy and system loads/available resources, when to transfer the writes in the RPA to the replica storage. This is the default replication mode.
In asynchronous replication mode, a Snapshot Granularity policy can be configured to regulate data transfer (see Snapshot Granularity in "Consistency Group Advanced Policy Settings" on page 149). The following granularities can be defined:
• Fixed (per write): To create a snapshot from every write
operation.
• Fixed (per second): To create one snapshot per second.
• Dynamic: To have the system determine the snapshot
granularity according to available resources.
See “Asynchronous replication mode” on page 51 for a more
detailed description.
◆ Synchronous replication mode - the host application initiates a write, and then waits for an acknowledgment from the remote RPA before initiating the next write. This is not the default replication mode, and must be specified by the user.
Replication in synchronous mode produces a replica that is always one hundred percent up-to-date with its production source. The trade-off is that, to ensure that no subsequent writes are made until an acknowledgement is received from the remote RPA, host applications can be regulated by RecoverPoint, and this
configure RecoverPoint to dynamically alternate between
synchronous and asynchronous replication modes, according to
predefined lag and/or throughput conditions. To do so, see
“Dynamic sync mode” on page 52.
Synchronous replication mode is only supported for replication:
• to a local replica.
• to a remote replica over Fibre Channel.
Note: Synchronous replication mode is not supported for replication
over the WAN.
See “Synchronous replication mode” on page 51 for a more
detailed description.
Asynchronous replication mode
The host application initiates a write, and does not wait for an
acknowledgment from the remote RPA before initiating the next write. The
data of each write is stored in the local RPA, and acknowledged at the local
site. The RPA decides based on the lag policy and system loads/available
resources when to transfer the writes in the RPA to the replica storage. This
is the default replication mode.
The primary advantage of asynchronous replication is its ability to
provide synchronous-like replication without degrading the
performance of host applications.
Asynchronous replication, however, is not the primary mode of
replication in all situations. Asynchronous replication does not
conserve bandwidth. Furthermore, and particularly as volumes
increase, more data can be lost, as larger chunks of data that have
been acknowledged at the local site may not be delivered to the target
side in the case of a disaster.
RecoverPoint replicates asynchronously only in situations in which
doing so enables superior host performance without resulting in an
unacceptable level of potential data loss.
Synchronous replication mode
The host application initiates a write, and then waits for an acknowledgment
from the remote RPA before initiating the next write. This is not the default
replication mode, and must be specified by the user.
To configure RecoverPoint to replicate synchronously, see
“Consistency Group Protection Policy Settings” on page 144.
In order to ensure that no subsequent writes are made until an
acknowledgement is received from the remote RPA, host applications
are regulated by RecoverPoint. If your applications cannot be
regulated for any reason, choose asynchronous replication mode.
Replication in synchronous mode produces a replica that is always
one hundred percent up-to-date with its production source.
Synchronous replication mode is efficient both for replication within the local SAN environment (as in CDP configurations, referred to as local replication) and for replication over Fibre Channel (as in CRR configurations, referred to as remote replication). However, when replicating synchronously, the longer the distance between the production source and the replica copy, the greater the latency.
In order to replicate data synchronously, your current RecoverPoint license must support synchronous replication. To verify that your current RecoverPoint license supports synchronous replication, from the main system menu of the RecoverPoint Management Application, select System > System Settings. Click on Account Settings in the Navigation Pane, and verify that the word Supported is displayed next to Synchronous Replication in the License Usage section of this dialog box. If you wish to replicate synchronously and this feature is not supported by your current RecoverPoint license, contact EMC Customer Service.
By default, new consistency groups are created with asynchronous
mode enabled, and must be set to replicate synchronously through
the RecoverPoint Management Application, see “Consistency Group
Protection Policy Settings” on page 144.
Note: New consistency groups are created with the Measure lag when writes
reach the target RPA (as opposed to the journal) setting enabled. When
replicating synchronously, performance is substantially higher when this
setting is enabled, see “Measure lag when writes reach the target RPA (as
opposed to the journal)” on page 150.
To verify that a consistency group copy is replicating synchronously,
check its transfer status in the Consistency Groups Tab, see Figure 6
on page 187 or the group’s Status Tab, see “The Status Tab” on
page 190.
Dynamic sync mode
When replicating synchronously over longer distances, users can set
RecoverPoint to replicate in dynamic sync mode, a submode of
synchronous replication mode. In this mode, users can define group
protection policies that will enable the group to automatically begin
replicating asynchronously whenever the group’s data throughput or
latency reaches a maximum threshold, and then automatically revert
to synchronous replication mode when the throughput or latency
falls below a minimum threshold.
You can also switch manually between replication modes, using the
RecoverPoint CLI. This is useful, for example, if you generally require
synchronous replication, but wish to use CLI scripts and your system
scheduler to manually switch between replication modes during
different times in the day, like during your nightly backups.
When the replication policy is controlled dynamically by both throughput and latency (both Dynamic by latency and Dynamic by throughput are enabled), it is enough for one of the two Start async replication above values to be met for RecoverPoint to automatically start replicating asynchronously to a replica. However, both Resume sync replication below settings must be met before RecoverPoint will automatically revert to synchronous replication mode.
To prevent jittering, the values specified for Resume sync replication
below must be lower than the values specified for Start async
replication above, or the system will issue an error.
To check whether a replica is being replicated to synchronously, see
“Multiple consistency group management” on page 186.
Note: Groups undergo a short initialization every time the replication mode
changes (for example, from synchronous to asynchronous and vice-versa).
During initialization, data is not transferred synchronously.
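The thresholds described above behave as a hysteresis rule: replication switches to asynchronous when either Start async value is exceeded, and reverts to synchronous only when both measurements fall below their Resume sync values. A minimal sketch of that logic, with illustrative threshold values (they are not defaults from this guide):

# Sketch of the dynamic sync mode thresholds described above: switch to
# asynchronous replication when either latency or throughput crosses its
# "Start async replication above" value, and revert to synchronous only
# when both fall below their "Resume sync replication below" values.

START_ASYNC = {"latency_ms": 10, "throughput_mbps": 800}   # illustrative values
RESUME_SYNC = {"latency_ms": 5, "throughput_mbps": 400}    # must be lower than START_ASYNC

def next_mode(current_mode, latency_ms, throughput_mbps):
    if current_mode == "sync":
        # Either threshold being exceeded is enough to start async replication.
        if latency_ms > START_ASYNC["latency_ms"] or throughput_mbps > START_ASYNC["throughput_mbps"]:
            return "async"
        return "sync"
    # Currently async: both values must fall below the resume thresholds.
    if latency_ms < RESUME_SYNC["latency_ms"] and throughput_mbps < RESUME_SYNC["throughput_mbps"]:
        return "sync"
    return "async"

print(next_mode("sync", latency_ms=12, throughput_mbps=300))   # async
print(next_mode("async", latency_ms=4, throughput_mbps=300))   # sync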
RPO control
The Recovery Point Objective (or RPO) is the point in time to which you are required to recover data, for a specific application, as defined by your organization. This is generally a definition of what an organization determines is an acceptable loss in a disaster situation.
For example, if a company's data must be restored to within 3 hours of the disaster event and the time it takes to get the recovered data back into production is 6 hours:
◆ The RPO is 3 hours
◆ The RTO is 6 hours
There is a trade-off between the amount of data that a customer is willing to lose and its cost. If the customer must have an RPO of zero, this means that replication must be synchronous (see "Synchronous replication mode" on page 51), or in other words, that each write must be replicated to the DR site before another write is made. This usually introduces additional cost in terms of the resources that are
required for this to occur effectively (such as storage performance
and replication bandwidth).
Each RecoverPoint configuration provides a different level of
protection (in terms of RPO) in the case of logical, storage, local and
regional disasters.
How do I control the RPO?
In asynchronous replication mode (“Asynchronous replication
mode” on page 51), RecoverPoint allows you to control the RPO for
each consistency group through the Lag and Allow regulation
settings in the consistency group protection policy section (see
“Consistency Group Protection Policy Settings” on page 144).
Using the Lag setting, RPO can be expressed in terms of time,
quantity of data, or number of writes. To guarantee the RPO setting,
host applications can be throttled upon approaching the defined Lag
setting, using the Allow regulation setting (see “Allow Regulation”
on page 146). In synchronous replication mode (see “Synchronous
replication mode” on page 51), this setting is automatically enabled,
and cannot be modified.
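Because the Lag setting can be expressed in time, quantity of data, or number of writes, it can help to convert an RPO target between the three using the application's measured write rate. A minimal sketch with illustrative numbers (not values from this guide):

# Sketch converting an RPO (lag) target between time, data, and writes,
# using a measured average write rate.  Numbers are illustrative only.

def lag_equivalents(lag_seconds, write_rate_mbps, writes_per_second):
    data_mb = lag_seconds * write_rate_mbps / 8      # megabits/s -> MB
    writes = lag_seconds * writes_per_second
    return {"seconds": lag_seconds, "data_MB": data_mb, "writes": writes}

# A 30-second lag target for an application writing 40 Mb/s at 500 IOPS:
print(lag_equivalents(30, 40, 500))   # {'seconds': 30, 'data_MB': 150.0, 'writes': 15000}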
How do I monitor the RPO?
To monitor the RPO, in the EMC RecoverPoint Management
Application:
◆ Click on a consistency group name in the Navigation Pane, click the Statistics Tab in the components area, and click the Replication Performance Tab at the bottom of the Components area. In the Replication Performance tab, the consistency group's RPO can be monitored in terms of time, writes, or quantity of data.
◆ You can also click on System Monitoring in the Navigation Pane, and select the Groups tab to monitor the lag (RPO) of all consistency groups in RecoverPoint.
In the RecoverPoint Command Line Interface, run the
get_group_statistics command, and identify the Lag output in the
Link stats area.
RTO control
The Recovery Time Objective (or RTO) is the duration of time and a
service level within which a business process must be restored after a
disaster (or disruption) in order to avoid unacceptable consequences
associated with a break in business continuity.
RTO includes the time spent trying to fix the problem without a recovery, the recovery itself (RecoverPoint's role in an organization's RTO, controlled by the Maximum journal lag setting), testing, and communication to the users. Decision time for users is not included.
RecoverPoint's role in an organization's RTO can be defined as the
amount of time it would take to enable physical access to the latest
application-consistent image at a replica, or in other words, the
amount of time that it would take to apply all of the snapshots in a
replica journal to the replica storage and bring the replica storage
up-to-date with the latest application-consistent image at production.
In RecoverPoint, RTO is measured by data size.
Note: Since there is a trade-off between the amount of data that can be stored
in a replica journal (the amount of available data recovery points) and the
amount of time to which the replica can be up-to-date with production (the
RTO setting), if there is no specific reason to modify the RTO setting, it is
recommended to leave the default setting.
How do I control the RTO?
RecoverPoint allows you to control the data access aspects of RTO for
each copy through the Maximum journal lag setting in the copy
journal policy (see “Copy Journal Policy Settings” on page 155).
When the system approaches the limit set in the Maximum journal
lag policy, it moves to three-phase distribution mode (see
“Three-phase distribution” on page 97).
How do I monitor the RTO?
In the RecoverPoint Management Application GUI, click on a copy
name in the Navigation Pane and select the copy Journal Tab to
display the current journal lag. The value displayed in the Journal Lag
field indicates the RTO in terms of data size.
In the EMC RecoverPoint Command Line Interface, run the
get_group_statistics command to display the journal lag.
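Because RecoverPoint measures RTO in terms of data size (the journal lag), a rough time estimate can be derived by dividing that lag by an assumed distribution rate. The sketch below makes that assumption explicit; the distribution rate is illustrative, not a figure from this guide.

# Rough sketch: estimate the time component of RTO by dividing the journal
# lag (data still to be distributed to the replica storage) by an assumed
# distribution rate.  The distribution rate is an assumption, not a figure
# from this guide.

def estimated_rto_minutes(journal_lag_gb, distribution_rate_mb_per_s):
    return journal_lag_gb * 1024 / distribution_rate_mb_per_s / 60

print(round(estimated_rto_minutes(journal_lag_gb=12, distribution_rate_mb_per_s=80), 1))
# approximately 2.6 minutes under these assumed values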
Distributed consistency groups
Distributed consistency groups allow users to create and use
consistency groups that require a total throughput and IOPS rate that
exceeds the supported throughput and IOPS rate of a single
RecoverPoint appliance, and prevent users from having to split data
that requires strict write-order fidelity into multiple consistency
groups. They do so by dividing the consistency group into four
segments, and running these segments on one primary RPA and one
to three additional secondary RPAs, as defined by the user.
Distributed groups can handle a much higher throughput and IOPS
rate (see the EMC RecoverPoint and RecoverPoint/SE Release Notes for
this limit) regardless of the amount of data being replicated.
Up to eight distributed consistency groups can be defined in
RecoverPoint, and the total number of distributed and
non-distributed consistency groups is 128.
Distributed consistency group support is a function of your
RecoverPoint license. See “How do I verify that distributed groups
are supported by my RecoverPoint license?” on page 61 for more
information.
To better understand the differences between distributed and
non-distributed consistency groups, read “Consistency groups” on
page 30.
For the complete set of limitations associated with distributed
consistency groups, see the EMC RecoverPoint and RecoverPoint/SE
Release Notes.
This section answers the questions:
◆ "When should I set a consistency group as distributed?"
◆ "How do distributed consistency groups work?"
◆ "What should I know before setting a group as distributed?"
◆ "How do I verify that distributed groups are supported by my RecoverPoint license?"
◆ "How do I set a consistency group as distributed?"
◆ "How do I monitor each group segment's performance?"
When should I set a consistency group as distributed?
You should consider setting a consistency group as distributed when:
◆ The maximum throughput rate of a single RPA cannot sustain the write-rate or write peaks of the consistency group (see the EMC RecoverPoint and RecoverPoint/SE Release Notes for the maximum throughput rate of a single RPA).
◆ Your consistency group is experiencing high loads often.
◆ You expect a consistency group will, in the future, require a higher throughput rate than that of a single RPA. In this case, it is preferable to initially create the consistency group as distributed, rather than modifying an existing consistency group after creation. The reason for this is explained in the journal configuration instructions in "What should I know before setting a group as distributed?" on page 59.
How do distributed consistency groups work?
Distributed consistency groups are divided into four segments, and these segments are transferred through one primary RPA and up to three secondary RPAs, as designated by the user.
RecoverPoint data recovery processes are affected in the following way:
◆ The primary RPAs at both sites (if two sites exist) are responsible for the receipt and handling of all system process requests.
◆ All of the marking information is handled by the primary RPA at the source side.
The data flow for distributed consistency groups, in synchronous and
asynchronous replication modes, is described in “The transfer phase”
and “The distribution phase”.
What should I know before setting a group as distributed?
When setting a group as distributed, the following limitations apply:
◆ The snapshot granularity of all links in the consistency group can be no finer than one second. See "Consistency Group Advanced Policy Settings" on page 149 for more information on snapshot granularity.
◆ Journal loss will occur when modifying a group's topology (setting a non-distributed group as distributed, or setting a distributed group as non-distributed).
◆ When configuring journals for a distributed consistency group, keep the following in mind:
• All copies of distributed consistency groups must have a
journal that is at least 20 GB in size. See “Journals” on page 33
for more information.
• The recommended journal size for distributed groups with
snapshot consolidation enabled is at least 40GB. See “Snapshot
consolidation” on page 39 for more information.
• If the capacity of an existing copy journal is less than the minimum journal size required for distributed consistency groups (see the EMC RecoverPoint and RecoverPoint/SE Release Notes for this limit), the consistency group will need to be disabled and then enabled again after adding journal volumes, and this will cause a full sweep. See "How to resize an existing journal volume" on page 171 for more information.
◆
Distributed consistency groups are only supported if there is a
Fibre Channel connection between all RPAs in a RecoverPoint
cluster (per site).
Therefore:
• In Fibre Channel environments, make sure all of the RPAs at
each site are connected to the SAN through a Fibre Channel
switch, and zoned together so that they see each other in the
SAN.
• In iSCSI environments, make sure all RPAs are physically
connected to each other through their HBA Fibre Channel
ports.
Note: In iSCSI configurations, there can be a maximum of two RPAs
per RecoverPoint cluster (i.e. per site) because two of the four existing
Fibre Channel ports in the RPA’s HBA are already connected directly
to the storage. If more than 2 RPAs per RecoverPoint cluster are
required, connect all of the RPAs in the cluster through Fibre Channel
switches (two should be used for high availability) and zone them
together.
◆
If any of the primary or secondary RPAs associated with a
consistency group becomes unavailable, there will be a brief
pause in transfer on all of the group’s primary and secondary
RPAs, and all of the group segments will undergo a short
initialization.
◆
Under certain circumstances (for example, if one of the primary
or secondary RPAs becomes unavailable) two consistency group
segments could be handled by the same RPA.
◆
In general, distributed consistency groups offer better
performance than non-distributed (regular) consistency groups,
as distributed groups run on a minimum of two RPAs (one
primary RPA and one secondary RPA). There is only a small
improvement in performance when a group is run on three RPAs.
However, there is a steep improvement in performance when a
group is run on four RPAs.
How do I verify that distributed groups are supported by my RecoverPoint license?
To verify that this feature is supported, select System > System
Settings from the main RecoverPoint menu and click on the Account
Settings tab. In the License Usage section of the accounts settings
screen, you should see the text Distributed Groups: Supported. If
this text does not appear, contact EMC Customer Support.
How do I set a consistency group as distributed?
Before you set a consistency group as distributed, read “What should
I know before setting a group as distributed?” on page 59, and review
the relevant limitations and performance statistics in the EMC
RecoverPoint and RecoverPoint/SE Release Notes.
To set a non-distributed group as distributed, or set a distributed
group as non-distributed:
1. Display the consistency group’s Advanced policy section.
2. Check or uncheck the Distribute group checkbox.
3. The Secondary RPA checkboxes are enabled or disabled
accordingly.
4. If you are enabling this feature, select one to three secondary
RPAs, by checking the relevant checkboxes in the Secondary RPA
section.
Note: When modifying a group’s topology, journal loss will occur.
How do I monitor each group segment's performance?
To monitor each group segment's performance separately, and in relation to the group as a whole, run the detect_bottlenecks command in the EMC RecoverPoint Command Line Interface.

Load balancing
Load balancing is the process of assigning preferred RPAs to
consistency groups. Assigning a preferred RPA to a consistency
group dictates the RPA to “prefer” when transferring data for a
specific group.
During a group's creation, one or more preferred RPAs (see "Distributed consistency groups" on page 58) must manually be assigned to each new group. Each group will always run on its preferred RPA(s), unless an RPA disaster occurs. If an RPA disaster occurs, the group will flip over to another (non-preferred) RPA, and then back as soon as possible. Flipover (also known as switchover) can cause high loads on RPAs when the loads of all consistency groups defined in the system are not evenly distributed between RPAs, so these loads must be balanced. In other words, groups must be moved from RPA to RPA periodically to rearrange the load. On the other hand, to ensure consistency, consistency groups are initialized when moving from RPA to RPA. During switchover, all groups running on the preferred RPA are initialized once, when they move to the non-preferred RPA, and then another time, when they switch back to their preferred RPAs. Therefore, any re-assignment of RPAs during replication should be carefully planned out, so as not to affect the performance of the production host applications.
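To illustrate what balancing the load means in practice, the sketch below assigns each consistency group, heaviest first, to the RPA with the least load assigned so far. This is only an illustration of the idea; it is not the algorithm used by RecoverPoint's load balancing tool, and the group names and throughput figures are made up.

# Illustration of load balancing: assign each consistency group, heaviest
# first, to the RPA with the least load assigned so far.  This is only an
# illustration of the idea, not RecoverPoint's load balancing algorithm.

def assign_preferred_rpas(group_throughput_mbps, rpas):
    load = {rpa: 0.0 for rpa in rpas}
    assignment = {}
    for group, throughput in sorted(group_throughput_mbps.items(),
                                    key=lambda item: item[1], reverse=True):
        rpa = min(load, key=load.get)          # least-loaded RPA so far
        assignment[group] = rpa
        load[rpa] += throughput
    return assignment, load

groups = {"cg_sales": 120, "cg_erp": 90, "cg_mail": 60, "cg_web": 30}
assignment, load = assign_preferred_rpas(groups, ["RPA1", "RPA2"])
print(assignment)   # heaviest groups are spread across RPA1 and RPA2
print(load)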
This section answers the questions:
◆ "When should I perform load balancing?"
◆ "How do I perform load balancing?"
◆ "How do I check that load balancing improved my group's performance?"

When should I perform load balancing?
You should perform load balancing:
◆ When a new consistency group is added to the replication environment. After a new consistency group is added to the system (see "Creating new consistency groups" on page 132), wait a week (7 full days) for a long enough traffic history to accumulate before you perform load balancing.
◆ When a new RPA is added to a RecoverPoint cluster. Perform load balancing immediately after the RPA is added.
◆ If your system enters high load frequently. When load balancing is required, the high load event logs will display a message indicating so. When you see this message, perform load balancing.
◆ If the bottleneck detection tool recommends it. When load balancing is required, the bottleneck detection tool will display a message indicating so. When you see this message, perform load balancing.
◆ Periodically. To ensure that your system is always distributing loads evenly, a script can be created to periodically perform load balancing.
How do I perform load balancing?
There are two load balancing methods in RecoverPoint:
◆ "Manual load balancing", which is performed by setting or changing a consistency group's primary RPA setting (and secondary, if relevant).
◆ "Using RecoverPoint's load balancing tool", which analyzes the performance of multiple groups over time, recommends a load balancing strategy based on this analysis, and optionally, automatically assigns preferred RPAs to specified consistency groups based on the recommendation.

See also:
◆ When should I perform load balancing?: page 1-63
◆ Manual load balancing: page 1-64
◆ Using RecoverPoint's load balancing tool: page 1-65
◆ How do I check that load balancing improved my group's performance?: page 1-72
Manual load balancing
To manually perform load balancing:
1. Perform a system analysis to identify consistency groups with a
high change-rate or run the balance_load CLI command to create
a load balancing recommendation, see “How do I use the load
balancing tool?” on page 67.
2. Based on your system analysis or the load balancing
recommendation output (see “How do I check that load
balancing improved my group’s performance?” on page 72) from
the Navigation Area, select the name of the group whose
preferred RPAs you want to change, and click on the group’s
Policy Tab.
3. Change the preferred RPAs of the consistency group:
• For regular (non-distributed) consistency groups, change the
Primary RPA setting and click the Apply button, see
“Non-distributed (regular) consistency groups” on page 32 for
more information.
• For distributed consistency groups, you have the following
options:
– Change the Primary RPA setting, and then click the Apply button.
– Change any or all secondary RPAs by clicking on the
Advanced policy section, and then checking or unchecking
the relevant secondary RPAs in the General section of the
Advanced tab.
Note: See “Distributed consistency groups” on page 58 for more
information.
4. Check the load balancing results, to verify improved
performance, see “How do I check that load balancing improved
my group’s performance?” on page 72.
Note: Each time a consistency group’s preferred RPA (primary or
secondary) is modified, the group undergoes an initialization to ensure
consistency.
Using RecoverPoint’s load balancing tool
RecoverPoint's automated load balancing tool enables:
◆ more efficient preliminary group configuration, by automating the process of assigning preferred primary RPAs (and preferred secondary RPAs, see "Distributed consistency groups" on page 58), while still providing a recommendation so that users who wish to balance RPA load manually can do so, based on the recommendation.
◆ more efficient ongoing group administration, by automating the task of balancing the loads of groups between RPAs, as workloads and circumstances change after the preliminary group configuration.
Since the modification of preferred RPA assignments causes flipovers that cause initializations, if the load balancing analysis finds that no significant changes in workload or number of consistency groups have occurred, the mechanism will not recommend preferred RPA re-assignment. Also, the load balancing mechanism smooths out I/O statistics, so that any sudden peaks or lapses in traffic do not cause it to recommend unnecessary changes.
This section answers the questions:
◆ "What is the load balancing tool good for?"
◆ "What should I know before I begin using the tool?"
◆ "How do I use the load balancing tool?"
◆ "How do I retrieve the load balancing results?"
What is the load balancing tool good for?
RecoverPoint's load balancing tool automates the following tasks:
◆ Analyzing the performance of a specified set of consistency groups, or of all consistency groups in the system, over a substantial period of time, and displaying a recommended load balancing strategy based on the historical performance statistics.
◆
Saving the recommended load balancing strategy (containing the
current traffic performance statistics) of any or all of the
consistency groups in the system, at a particular point-in-time:
• for later reference and analysis, see “How do I retrieve the
load balancing results?” on page 71 and “How do I check that
load balancing improved my group’s performance?” on
page 72.
• as reference material, from which to perform manual load
balancing, see “Manual load balancing”.
◆
Automatically re-assigning preferred RPAs to all, or a subset of
all, consistency groups in the system, based on the recommended
load balancing strategy.
What should I know before I begin using the tool?
Before you begin using the load balancing tool, note the following:
◆
The balance_load command is only available on the site control
RPA, or through the floating Site Management IP. This command
cannot be run if site control is down. If site control goes down,
and then back up again, five minutes must pass before this
command can be run again.
◆
The load balancing analysis is performed on all RPAs in the
environment. If there are three RPAs in total, and one of them
goes down for a week, and it is the primary RPA of a group
included in the analysis, the load balancing recommendation
calculations are still based on the total number of RPAs, as if they
were all working. In other words, the load balancing mechanism
accumulates group statistics for its calculations, as if groups were
always running on their preferred RPAs, even if flipovers
happened during the week.
◆
The output of the load balancing analysis and recommendation
are saved in a text file. By default, only the most recent ten files
are saved, see “How do I retrieve the load balancing results?” on
page 71 for more information.
◆
For best results, groups should be configured and replicating for
seven days before running the load balancing tool, so that a long
enough traffic history is available for the load balancing analysis.
If seven days of traffic data are not available, any existing time
period of data is used, and the load balancing recommendation is
accompanied by a note indicating the period of time upon which
the recommendation results are based, and noting that seven
days are preferable.
◆
The load balancing analysis is performed on all RPAs at all sites.
◆
Groups can be excluded from the load balancing
recommendation, however, even when excluded from the
recommendation, all groups are included in the analysis.
◆
The load balancing tool is a smart mechanism that can identify
cases in which distributed group segments do not actually require
four separate RPAs to run on, and recommends RPA assignments
accordingly. Distributed group segments can be run on different
RPAs, or the same RPA, and each group segment is treated as a
separate group in the load balancing analysis and
recommendation. If you are not well acquainted with the
distributed consistency group feature, it is recommended that
you become acquainted with this feature before using the load
balancing tool. See “Distributed consistency groups” on page 58
for more information.
How do I use the load balancing tool?
If you want to use the load balancing tool:
◆ To analyze system performance and display a load balancing
recommendation containing the performance results on the
screen, perform steps 1-5.
◆ To analyze system performance, display a load balancing
recommendation containing the performance results on the
screen, and retrieve the results from the webdownload directory
for later reference or to perform manual load balancing (see
“Manual load balancing” on page 64), perform steps 1-6, but
answer NO when asked to apply the recommendation.
◆ To analyze system performance, display a load balancing
recommendation containing the performance results on the
screen, and automatically re-assign RPAs to consistency groups
based on this recommendation, perform steps 1-6, and answer
YES when asked to apply the recommendation.
To use the load balancing tool:
1. Make sure you are well acquainted with the information in “What
should I know before I begin using the tool?” on page 66.
2. Open a direct SSH session with the RPA running the site control,
or connect to the RPA running the site control through the site
management IP.
3. Run the balance_load command. (An illustrative session is shown after this procedure.)
4. Enter the consistency groups to leave out of the load balancing recommendation (enter multiple consistency groups as a comma-separated list), and press ENTER.
Note: The groups that you specify here will be preceded by an asterisk in
the load balancing recommendation output, and their traffic statistics
will be included in the analysis upon which the recommendation is
based.
5. The system lets you know that the analysis and recommendation process can take several minutes, and prompts you whether or not to start the process now. Type YES, and press ENTER, to run the analysis and recommendation process now.
[Wait for the process to complete]
At the end of the process:
a. If the load balancing recommendation is based on fewer than 7 days' worth of data, a note is displayed indicating so.
b. One or more of the following load balancing
recommendations are displayed:
– No action is necessary. The environment is stable, and groups
are evenly distributed across all RPAs.
– Action is necessary. The environment is not stable because
groups are not evenly distributed across all RPAs. To
correct this, apply the load balancing recommendation or
manually modify the Preferred RPA of each group.
– Action may be necessary. The environment is stable, although
groups are not evenly distributed across all RPAs, and this
may affect future performance. To distribute groups evenly
across all RPAs, apply the load balancing recommendation
or manually modify the Preferred RPA of each group.
c. The current preferred and recommended RPA assignments are
displayed per consistency group, along with each RPA's
average throughput and incoming write-rate, in a
Recommended Load Distribution table.
Note: For distributed groups, each consistency group segment is
treated as a separate group, and the performance statistics are
displayed per group segment.
d. Two tables, Traffic per RPA before Application of Recommendation and Traffic per RPA after Application of Recommendation, are displayed to help you understand the performance implications for all RPAs if you choose to apply the recommendation. Each RPA's average throughput and IOPS are displayed, and the RPA with the least amount of traffic is highlighted in each of the tables.
e. If any non-distributed (regular) consistency groups have
experienced a minute of throughput that exceeded the
maximum throughput rate of a single RPA (as described in
EMC RecoverPoint and RecoverPoint/SE Release Notes) at least 50
times during the week, the load balancing recommendation
indicates that these groups should be set as distributed. If any
distributed consistency groups have experienced less than 40
minutes of throughput at a rate of over 60 MB/sec during the
week, the load balancing recommendation indicates that these
groups should be set as non-distributed.
f. If the total group throughput is exceeded, a message is
displayed indicating that you should add additional RPAs to
your RecoverPoint cluster, and run the balance_load
command again after 7 days.
Note: See the EMC RecoverPoint and RecoverPoint/SE Release Notes for
the total throughput per distributed consistency group and total
throughput per RPA limitations.
6. Decide whether or not to apply the recommendation:
◆ If the load balancing recommendation was No action is
necessary, a message is displayed indicating the location of the
output file containing the analysis results and recommendations,
and the process ends. See “How do I retrieve the load balancing
results?” on page 71 for instructions on how to retrieve the
results.
You can save the results file, and use it later to:
• analyze system performance over an extended period of time.
• manually perform load balancing, see “Manual load
balancing” on page 64.
Note: Only the past ten load balancing results are saved. The oldest
results are replaced with the newest results.
◆ If the load balancing recommendation was Action Necessary or
Action may be Necessary, RecoverPoint asks whether you want
to apply the recommended load balancing.
• Type NO, and press ENTER, to indicate that you do not wish to
apply the recommended load balancing. In this case, a
message is displayed indicating the location of the output file
containing the analysis results and recommendations, and the
process ends. See “How do I retrieve the load balancing
results?” on page 71 for instructions on how to retrieve the
results.
• Type YES, and press ENTER, to indicate that you do wish to
apply the recommended load balancing. In this case:
– All relevant consistency group preferred RPAs are
re-assigned according to the load balancing
recommendation.
– An initialization process starts on each consistency group
whose preferred primary or secondary RPAs are modified.
Note: For distributed consistency groups, if one group segment is
initialized, all group segments are initialized.
– A normal scope event is logged indicating that the load
balancing process has ended. The event contains a copy of
the load balancing recommendation output.
– A message is displayed indicating the location of the
output file containing the analysis results and
recommendations, and the process ends. See “How do I
retrieve the load balancing results?” on page 71 for
instructions on how to retrieve the results.
7. Optionally, you can now check the implications on performance,
see “How do I check that load balancing improved my group’s
performance?” on page 72.
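The following sketch illustrates such a session, assuming an SSH connection from a host. The IP address, user name, and group names are hypothetical, and the prompts are paraphrased; the exact wording and output layout vary by environment and release. Only the balance_load command and the YES/NO answers are taken from the procedure above.

   # Connect to the RPA running site control (address and user are examples):
   ssh admin@10.10.10.10

   # At the RecoverPoint CLI, start the load balancing tool:
   balance_load

   # When prompted for groups to exclude from the recommendation,
   # enter a comma-separated list (or leave empty to exclude none):
   GroupA,GroupB

   # Confirm that the analysis and recommendation process should start now:
   YES

   # Review the Recommended Load Distribution and Traffic per RPA tables,
   # then answer YES to apply the recommendation or NO to decline it:
   NO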
How do I retrieve the load balancing results?
After using the load balancing tool, to retrieve the results:
1. Open a browser window and type the following path into your
browser’s address bar:
<SiteMgmtIP/SiteControlIP>/info/load_balancing/
Note: The balance_load command is only available on the site control
RPA, or through the floating Site Management IP.
2. Download the relevant text file(s) and store them for later reference (see the example below). The load balancing output is saved to text files with the name: balance_load_yyyy-mm-dd_hh-mm.txt.
Note: Ten load balancing output files are stored in this location at a time. When ten files have already been stored, the oldest files are removed to make way for newer files. The date and time of each output file are indicated in the filename. The file names are suffixed with the text "applied" whenever the load balancing recommendation was actually applied by the user, through the load balancing tool.
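For example, assuming a hypothetical site management IP of 10.10.10.10 and an HTTP URL scheme, the output files could be retrieved either in a browser or from a host shell with a generic HTTP client such as wget (not part of RecoverPoint). The file name below is illustrative and follows the naming pattern described above.

   # Address to open in a browser:
   http://10.10.10.10/info/load_balancing/

   # Or download a specific output file from a host shell:
   wget http://10.10.10.10/info/load_balancing/balance_load_2010-03-01_10-00.txt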
How do I check that load balancing improved my group's performance?
After load balancing manually or applying the recommendation
produced by the load balancing tool, check the implications on
performance:
1. Wait 7 days, for a new traffic history to be available.
2. Re-run the balance_load CLI command, see “How do I use the
load balancing tool?” on page 67.
3. Compare the old load balancing results and performance statistics to the new load balancing results and performance statistics; see "How do I retrieve the load balancing results?" on page 71. (An example comparison is shown below.)
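As a sketch, two saved output files (the file names below are hypothetical, following the naming pattern described in "How do I retrieve the load balancing results?" on page 71) can be compared with any text-comparison tool on a host shell, for example:

   # Compare last week's results with this week's results:
   diff balance_load_2010-03-01_10-00.txt balance_load_2010-03-08_10-00.txt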
RecoverPoint data recovery procedures
The following procedures are central to RecoverPoint:
◆ "Image access"
◆ "Failover"

Image access
When replicating normally, writes to the production source are also
written to the journal of the replica or replicas. The state of the
storage at the copy site is No access; that is, a host cannot access the
storage.
To test a replica to verify that it is a reliable and consistent replica of
the production storage image, it is necessary to access the image of
the replica. Image access is also required in order to restore the
production storage from the replica, to roll back to a previous state of
the data, to recover files, and to fail over to the replica.
Such access can be enabled only by suspending distribution of data at
the copy, from the journal to the replica volume/s. Access to the
remote image can be either logged or virtual.
To specify the image for which access is to be enabled, the user either
selects an image from a list of images, or specifies a specific
point-in-time. When specifying a point-in-time, a set of specific
search parameters can also be defined.
This section deals with the following topics:
◆
◆
◆
◆
◆
“Image access modes”
“Logged access”
“Virtual access”
“Direct access”
“Disabling image access”
Image access modes

When image access is enabled, host applications at the copy site can access the replica. The access modes listed in the following table are available.

Table 3  Image access modes

Logged access (physical)
In Logged access, the system rolls backwards (or forwards) to
the snapshot (point in time) you select to access. There will be a
delay while the successive snapshots are applied to the replica
image to create the image you selected. The length of delay
depends on how far the selected snapshot is from the snapshot
currently being distributed to storage.
Once the access is enabled, hosts will have direct access to the
replica volumes, and the RPA will not have access; that is,
distribution of snapshots from the journal to storage will be
paused.
When you disable image access, the writes to the volume while
image access was enabled will be rolled back (undone). Then
distribution to storage will continue from the accessed snapshot
forward.
Virtual access (instant)
In Virtual access, the system creates the image you wish to
access in a separate virtual LUN (or in memory). Access is very
fast, as the system does not actually roll to the image in storage,
and the virtual volume and the physical volume have the same
SCSI ID; therefore the switch from one to the other will be
transparent to servers and applications.
You can use Virtual access in the same way as logged access;
however, it is not suitable if you need to run many commands or
if you need data from large areas of the replica.
When you disable image access, the virtual LUN and all writes
to it are discarded.
Virtual access (instant) with Roll image in background
In Virtual access with Roll image in background, the system
creates the image you wish to access in a virtual volume, which
is very fast, as in Virtual access. But, simultaneously in
background, the system rolls to the physical image. Once the
system has rolled to the image, the virtual volume is discarded,
and the physical volume takes its place. At this point, the system
continues as for Logged access. The virtual volume and the
physical volume have the same SCSI ID; therefore the switch
from one to the other will be transparent to servers and
applications.
When you disable image access, the writes to the volume while
image access was enabled will be rolled back (undone). Then
distribution to storage will continue from the accessed image
forward.
Direct access
Direct access allows host application to write directly to the
replica storage. These changes cannot be automatically
undone; that is, the journal is deleted. To restore consistency
between the replica and the production storage, it will be
necessary to perform a full sweep synchronization.
Direct access generally gives better performance than other
types of image access, especially when the replica is written to
many times. However, the journal is deleted, you cannot undo
the writes, and if a disaster occurs and you lose your other
replicas, you will not be able to remove the writes from this
replica (as you can no longer synchronize to another replica).
Logged access

The journal consists of snapshots (represented by small vertical lines
in the displayed image), which are collections of one or more writes
to storage. Since the snapshots are always in strict write-order,
writing the next snapshot to storage will create a valid image of the
data at the next point in time. By the same token, rolling back to the
previous snapshot (undoing the last write to storage) will create a
valid image of the storage data as it was at the previous point in time.
Figure 4  Schematic of logged image access
Image access allows you to pick the exact point in time at which you
wish to see the storage data. The selected snapshot marker in Figure 3 marks the snapshot you wish to access. In logged access,
RecoverPoint will roll back (or forward) the snapshots, applying
them to the image in storage until the selected image (point in time of
the data) is reached. Once this image is reached (it may take time if
there are many snapshots to undo or to distribute), you can access it
and even modify it (for instance, for testing or data analysis). Any
changes made to the image during image access will be recorded in
the image access log, so that the image can be restored to its exact
state before image access. When you disable image access, all writes
made during image access will be undone (according to the undo
information in the image access log), and the image will roll forward
to distribute writes in the journal to the replica.
Virtual access
Virtual access works on the same principle, except that the image is
not created by rolling back snapshots and applying them to the stored
image. Instead, the image is created on a virtual disk by pulling the
needed data either from the snapshots or from the storage. The
virtual disk is deleted when image access is disabled.
Virtual access is quicker than logged access, but is not suitable for
extensive processing or when large areas of the image must be
accessed.
Direct access
Direct access is an image access mode that does not impose a limit on the amount of data that you can write to a replica's storage. This type of image access provides better system performance when accessing the replica, because no rollback information is written to the image access log in parallel with the ongoing disk I/Os.
During direct access, host applications write directly to the replica
storage. These changes cannot be automatically undone, because
when an image is directly accessed, the journal at the replica is
deleted. To restore consistency between the replica and the
production storage, a full-sweep synchronization is required.
However, all of the writes made while direct access is enabled are
stored as marking information, and the synchronization process after
direct image access is, therefore, much more efficient.
Direct access generally gives better performance than other types of
image access, especially when the replica is written to many times.
However, the journal is deleted, you cannot undo the writes, and if a
disaster occurs and you lose your other replicas, you will not be able
to remove the writes from this replica (as you can no longer
synchronize to another replica).
If you wish to preserve a particular image of the replica, to give
yourself added protection, you can back up or clone the image before
beginning your offline processing.
Disabling image access
Before disabling image access, shut down host applications accessing
the replica and unmount replica volumes from the replica host.
Disabling image access restores the storage state to No access.
Changes to the replica recorded in the image access log are
automatically undone, so that the replica is restored to the state it was
in before it was accessed.
Failover
Failing over a consistency group to a local or remote replica allows
system operations to continue as usual from the replica. Hosts
attached to the replica continue operations by running applications. If
the former production site is still functional, it now serves as a replica
of the (former) replica. Snapshots are now transferred from the
(former) replica to the (former) production journal and from the
production journal to the production storage. The journal of the
replica becomes invalid (since the replica is now the source).
The same failover procedure can be used for planned maintenance at
the production site while the replica site takes over normal
operations.
When the production storage has been restored or the planned maintenance is complete, system operations can be resumed at the
original production source by failing back.
Note the special case of Recover Production, in which the production
source is restored (resynchronized) from the selected replica.
Restoration starts from the snapshot (point of time) selected by the
user. From that point of time forward, the production source is
restored from the replica; that is, in the production source, data that is
newer than the selected point in time will be rolled back and
rewritten from the version in the replica. In this case, the replica’s
journal is preserved and remains valid.
The specific failover use cases and procedures for carrying out
different types of failovers are discussed in “Testing, Failover, and
Migration” on page 223.
RecoverPoint synchronization processes
The following sections describe the data flow and logic of the
RecoverPoint synchronization processes that are responsible for data
consistency.
The following sections deal with the topics:
◆ "Initialization"
◆ "Full sweeps"
◆ "Volume sweeps"
◆ "Long initializations"
◆ "Short initializations"
◆ "First-time initializations"
◆ "Fast first-time initializations"

Initialization
Initialization is the RecoverPoint process used to synchronize the
data of the replica volumes with their corresponding production
volumes, and ensure consistency.
Generally, all synchronization processes in RecoverPoint are called Initialization; for specific scenarios, see "What types of initialization exist in RecoverPoint?" on page 78.
This section answers the questions:
◆ "What types of initialization exist in RecoverPoint?"
◆ "When does initialization occur?"
◆ "How does initialization work?"
◆ "How can I tell a consistency group is being initialized?"
◆ "What should I know about initialization?"

What types of initialization exist in RecoverPoint?
In specific cases, the initialization process can vary to suit its particular purpose, in an efficient manner, while promoting high performance.
In these cases initialization can be called:
◆ "Full sweeps"
◆ "Volume sweeps"
◆ "Long initializations"
◆ "Short initializations"
◆ "First-time initializations"
◆ "Fast first-time initializations"
When does initialization occur?
Generally, consistency groups are initialized whenever:
◆ A user runs the Start Transfer command after transfer is paused.
◆ A user changes the Primary RPA setting, causing a flipover.
◆ RecoverPoint starts transfer after transfer was paused (for example, after a WAN outage).
◆ RecoverPoint encounters OCWs or ICWs (over-complete writes and incomplete writes).
The initialization triggers vary per initialization type (see “What
types of initialization exist in RecoverPoint?” on page 78). See each
initialization type for a list of specific triggers.
How does initialization work?
The initialization mechanism works as follows:
1. The production and replica volumes are each divided into an equal number of data segments.
2. RecoverPoint checks the delta marking information and backlog
information to see which segments of the consistency group
replica volumes are dirty.
3. A small digital signature (hash) is created of all segments of both
the replica and production volumes.
4. The production and replica hashes are compared.
5. RecoverPoint checks these signatures (to see which of the dirty
segments are actually different).
6. RecoverPoint ONLY synchronizes (transfers the data of) the
segments of the replica volumes that are dirty, and whose values
are actually different than their corresponding production volume
segments.
How can I tell a consistency group is being initialized?
When initialization occurs, RecoverPoint lets you know it is
happening by:
◆ Logging the following events upon start and finish of initialization:
• Synchronization started
• Synchronization completed
◆ Displaying the process state and progress. You can see this information:
• In the RecoverPoint Command Line Interface, by running the get_group_states command and verifying that the state of transfer is Initializing (see the example following this list).
• In the RecoverPoint Management Application, by clicking a consistency group name in the Navigation Pane and verifying that the Transfer state of one or all copies in the consistency group Status Tab is displayed as Init, followed by the progress of the initialization process, in percent.
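For example, from a CLI session on the RPA (the prompt and output layout are not reproduced here because they vary by release; only the command name is taken from this guide):

   # List the current group states and look for a state of transfer of
   # Initializing, followed by the initialization progress, in percent:
   get_group_states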
How long does initialization take?
The time that it takes to transfer the data will vary, depending on the size of the volumes being replicated, network resources, storage performance, how consistent the replica volume data is with the production volume source at the point-in-time that the process is triggered, whether compression is enabled, and the compressibility of the production data.
During normal replication some consistency group segments differ,
and a partial synchronization is required. During first-time
initialization all of the consistency group segments differ, and a full
synchronization is required.
What should I know about initialization?
During initialization:
◆ The host applications can be either running or not. Initialization of one consistency group does not interfere with the operation of other consistency groups.
◆ All of the production host writes are stored in the replica journal's image access log. These writes are applied to storage by the system automatically after initialization is complete.
◆ No snapshots are created, and therefore, no consistent PITs are available for recovery. In most cases, the initialization process is short, so this is not an issue.
◆ There must be one complete image at the remote side, in order to be able to fail over to the replica. This means that until the initialization process is complete, you will not be able to fail over in the case of a disaster at the production site.
Full sweeps

A full sweep is an initialization process (see "Initialization" on
page 78), which is performed on all of the volumes in a consistency
group, when the RecoverPoint system cannot identify which blocks
are identical between the production and replica volumes, and must
therefore mark all blocks for all volumes in the consistency group as
dirty.
This section answers the questions:
◆ "When do full sweeps occur?"
◆ "How do full sweeps work?"
◆ "How long do full sweeps take?"
◆ "How can I tell that a full sweep is in progress?"
◆ "What should I know about full sweeps?"

When do full sweeps occur?
During normal operation of the RecoverPoint system, full sweep
initializations should only occur the first time a consistency group is
created. However, to guarantee consistency between a production
source and its replica, there may be other cases in which a full sweep
may be required.
Full sweeps are required when:
◆ A user enables a disabled consistency group.
◆ A user accesses the replica in Direct access mode.
◆ A user removes a journal from a consistency group.
◆ A user changes the Proportion of journal allocated for image access log (20-80%) setting.
◆ There is no marking information due to:
• A splitter malfunction (in which case the consistency group is
still enabled but unavailable until the splitter functions again).
• An I/O load that exceeds the system limit (see the EMC
RecoverPoint and RecoverPoint/SE Release Notes for this limit).
• An RPA being unable to perform marking (for example, the
production journal is inaccessible).
• A double (hardware) disaster (for example, a concurrent
failure of a splitter and an RPA).
• A production journal loss or malfunction.
• A user runs the Set Markers command.
Full sweeps also occur when:
◆ A user replaces a Brocade switch (whether there is one path or
multi path to storage).
◆ A user swaps LUN numbers on the storage array, when the LUNs have already been exposed to RPAs (if not done according to procedure).
◆ In Brocade, a user binds and performs LUN discovery on the storage, causing a new ITL to be discovered to a specific volume.

How do full sweeps work?
Same as the initialization process, except that in a full sweep, ALL of the volume segments in the consistency group are marked as dirty (see "How does initialization work?" on page 79).

How long do full sweeps take?
See "How long does initialization take?" on page 80 for a detailed description.

How can I tell that a full sweep is in progress?
When a full sweep occurs, RecoverPoint lets you know it is
happening by:
◆ Logging the event Next synchronization will be a full sweep before the following initialization events:
• Synchronization started
• Synchronization completed
◆ Displaying the process state and progress. You can see this information:
• In the RecoverPoint Management Application, by clicking a consistency group name in the Navigation Pane and verifying that the state of Transfer in the consistency group Status Tab is Init, followed by the progress of the initialization process, in percent.
• In the RecoverPoint Command Line Interface, by running the
get_group_states command, and verifying that the state of
transfer is Initializing.
Avoiding full sweep when swapping LUN numbers
Swapping LUN numbers for LUNs that have already been exposed to an RPA cluster should be avoided when possible. When it cannot be avoided, it should be done according to the following procedure, to avoid a full sweep of the entire consistency group. Loss of the journal cannot be avoided.
1. Remove both LUNs from their respective replication sets. Do not
disable the consistency group.
2. Swap the LUNs (on the storage array).
3. Refresh the SAN view. Use one of the following procedures:
a. If using SANTap services, refresh the SAN view of the SANTap switch, using the RecoverPoint CLI command refresh_santap_view.
b. If using the Brocade splitter agent, wait 5 minutes to allow the switch time to update its SAN view.
c. If using a host-based splitter (see the example after this procedure):
AIX hosts: Use the AIX shell command cfgmgr to rescan the SAN. Then, from the AIX command line, run rc.kdrv refresh_view.
Solaris hosts: Use the Solaris shell command drvconfig to rescan the SAN. Then, from the Solaris command line, run rc.kdrv refresh_view.
Windows hosts: Click Start > Settings > Control Panel > System. On the Hardware tab, double-click Device Manager. Right-click Disk drives > Scan for hardware changes.
4. Add LUNs to their replication sets. They will automatically be
attached to their splitters.
A volume sweep of those LUNs occurs.
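For host-based splitters, the host-side portion of step 3 can be sketched as follows for AIX and Solaris hosts. The commands and their order are those named in the procedure above; root privileges are assumed, and exact invocation details may differ per host.

   # AIX host: rescan the SAN, then refresh the splitter's view
   cfgmgr
   rc.kdrv refresh_view

   # Solaris host: rescan the SAN, then refresh the splitter's view
   drvconfig
   rc.kdrv refresh_view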
What should I know about full sweeps?
See "What should I know about initialization?" on page 80 for a detailed description.

Volume sweeps
A volume sweep is an initialization process (see “Initialization” on
page 78), which is performed on a specific volume in a consistency
group, whose algorithm is the same as that of a full sweep, except
that in a volume sweep, only the segments of specific volume/s are
marked as dirty.
Note: A volume sweep on all volumes in a consistency group is called a full
sweep.
This section answers the questions:
◆ "What are volume sweeps used for?"
◆ "When do volume sweeps occur?"
◆ "How do volume sweeps work?"
◆ "How long do volume sweeps take?"
◆ "How do I know that a volume sweep is in progress?"
◆ "What should I know about volume sweeps?"
What are volume sweeps used for?
Volume sweeps are used when the RecoverPoint system cannot
identify which blocks are identical between the production and
replica volume, and must therefore mark all blocks in the volume as
dirty.
When do volume sweeps occur?
Volume sweeps occur whenever a user:
◆ Defines a new volume in RecoverPoint.
◆ Defines a new splitter in RecoverPoint.
◆ Enables a volume that has been disabled.
◆ Enables a splitter that has been disabled.
◆ Adds a new Replication Set to an enabled consistency group.
◆ In SANTap or Brocade, attaches a new host.
◆ Manually attaches a volume to a splitter.
◆ Adds a new replication set to an enabled consistency group that does not contain any additional replication sets.

How do volume sweeps work?
Same as the initialization process, except that in a volume sweep, ALL of the segments of the specific volume/s are marked as dirty (see "How does initialization work?" on page 79).
How long do volume sweeps take?
See “How long does initialization take?” on page 80 for a detailed
description.
How do I know that a volume sweep is in progress?
When a volume sweep occurs, RecoverPoint lets you know it is
happening by:
◆ Logging the event Next synchronization will be a volume sweep before the following initialization events:
• Synchronization started
• Synchronization completed
◆ Displaying the process state and progress. You can see this information:
• In the RecoverPoint Management Application, by clicking a consistency group name in the Navigation Pane and verifying that the state of Transfer in the consistency group Status Tab is Init, followed by the progress of the initialization process, in percent.
• In the RecoverPoint Command Line Interface, by running the
get_group_states command, and verifying that the state of
transfer is Initializing.
What should I know about volume sweeps?
See "What should I know about full sweeps?" on page 83 for a detailed description.
Also, when a volume sweep is triggered, all of the volumes of the consistency group undergo a short initialization (see "Short initializations" on page 85) in parallel with the volume sweep.

Long initializations
See "One-phase distribution" on page 102 to answer the questions:
◆ "When is one-phase distribution triggered?"
◆ "How do I know one-phase distribution is happening?"
◆ "What are the work-arounds to one-phase distribution?"
Also known as: Long initialization is also known as One-phase distribution, Long init, Long re-sync, Non-consistent init, Init nc, and Init non-consistent.

Short initializations
A short initialization is an initialization process (see “Initialization”
on page 78) that uses marking information to re-synchronize a copy's
replica volumes with their production sources. Because this
initialization process uses delta markers to synchronize the replica
with production, the initialization process is much faster and more
efficient.
This section answers the questions:
◆ "When do short initializations occur?"
◆ "How do short initializations work?"
◆ "How long do short initializations take?"
◆ "How do I know that a short initialization is in progress?"
◆ "What should I know about short initializations?"
Also known as: Short initialization is also known as Short init, Short resync, Short resynchronization, and Resynchronization.

When do short initializations occur?
Short initializations generally occur when restarting transfer for a consistency group after a pause in transfer.
How do short initializations work?
See "How does initialization work?" on page 79 for a detailed description.

How long do short initializations take?
See "How long does initialization take?" on page 80 for a detailed description.

How do I know that a short initialization is in progress?
See "How can I tell a consistency group is being initialized?" on page 79 for a detailed description.

What should I know about short initializations?
See "What should I know about initialization?" on page 80 for a detailed description.
First-time initializations
A first-time initialization is a full sweep (see “Full sweeps” on
page 80) that happens when a consistency group is enabled.
First-time initializations generally happen when the system is new,
and all consistency group volumes need to undergo a full sweep
before they can be used for data recovery purposes.
Note: During first-time initialization, the journal is unnecessary, since the
replica does not contain previous data that can be used to construct a
complete image, and a complete image must be transferred before failover is
possible.
The following considerations apply to first-time initializations only:
◆ You can use the initialization from backup procedure to initially synchronize the replica with production, saving the time it would otherwise take to synchronize the data over a WAN or FC connection (see "First-time initialization from backup" on page 225).
◆ By default, RecoverPoint writes the first snapshot directly to the replica storage, bypassing the journal. You can override the default setting (using the Perform fast first-time initialization setting, see "Fast first-time initializations" on page 87) to write the initialization snapshot first to the journal (which is more time-consuming but provides greater data protection).
◆ To enable failover during initialization, it is recommended to disable the Allow distribution of snapshots that are larger than capacity of journal volumes setting.
See “Initialization” on page 78 for a more detailed description of
initialization.
Fast first-time initializations
A fast first-time initialization is a first-time initialization (a full sweep; see "Full sweeps" on page 80) that happens when a consistency group is enabled while the Perform fast first-time initialization setting is enabled. The initialization snapshot is written directly to the replica storage (see "First-time initializations" on page 86), bypassing the journal.
Note: During first-time initializations, the journal is unnecessary, since the
replica does not contain previous data that can be used to construct a
complete image, and a complete image must be transferred before failover is
possible.
See “Initialization” on page 78 for a more detailed description of
initialization.
This section answers the questions:
◆ "When do fast first-time initializations occur?"
◆ "How do fast first-time initializations work?"
◆ "How long do fast first-time initializations take?"
◆ "How do I know a fast first-time initialization is occurring?"
◆ "What do I need to know about fast first-time initializations?"
When do fast first-time initializations occur?
Whenever a consistency group is enabled for the first time, while the
Perform fast first-time initialization setting is enabled. By default,
the Perform fast first-time initialization setting is enabled.
How do fast first-time initializations work?
The fast first-time initialization process is the same as the initialization process (see "How does initialization work?" on page 79). However, in fast first-time initialization processes, RecoverPoint writes the initialization snapshot directly to the storage, bypassing the journal.
How long do fast first-time initializations take?
This process is substantially faster than the initialization process performed through the journal (when the Perform fast first-time initialization setting is disabled); see "How long does initialization take?" on page 80.

How do I know a fast first-time initialization is occurring?
When a fast first-time initialization occurs, RecoverPoint lets you know it is happening by changing the state of Transfer to Init and the Image state to Long resync.
To display these indicators:
◆ In the RecoverPoint Command Line Interface, run the get_group_states command to view the image and transfer states.
◆ In the RecoverPoint Management Application, click a consistency group name in the Navigation Pane to display the consistency group Status Tab and view the Image and Transfer states.

What do I need to know about fast first-time initializations?
During fast first-time initializations, the distribution process is much faster, but no history is saved in the journal. Also, from the start of the process until the end of the process, the replica is not consistent with its production source. Therefore, if a disaster were to occur during this process, you would not be able to fail over to the replica, and a full sweep would be required.
RecoverPoint data flow
The following sections describe the data flow and logic of the
RecoverPoint phases that are responsible for replication.
RecoverPoint replication phases
RecoverPoint performs three major phases to guarantee data consistency and availability (RTO and RPO) during replication.
There are three RecoverPoint replication phases:
◆ "The write phase" (the splitting phase)
◆ "The transfer phase"
◆ "The distribution phase"
Each of these phases is a process performed by each consistency group's Primary RPA, and is controlled by the policies and settings set by the user through the Hardware Management Wizard, the RecoverPoint Management Application, and the RecoverPoint Command Line Interface.
The write phase
The write phase is the RecoverPoint replication phase in which host
writes are intercepted by the splitter and received by the local RPA,
prior to transfer (see “The transfer phase” on page 90).
Generally, the flow of data for write transactions is as follows:
1. The production host writes data to the production volumes; the write is intercepted by the splitter. The splitter sends the write data to the RPA.
2. Immediately upon receipt of the write data, the local RPA returns
an ACK to the splitter.
3. The splitter then writes the data to the production storage
volume.
4. The storage system returns an ACK to the splitter upon
successfully writing the data to storage.
5. The splitter sends an ACK to the host that the write has been
completed successfully.
The sequence of events 1-5 can be repeated multiple times, and in
parallel, for multiple writes.
Note: This is the flow of data for host-based splitters. For intelligent fabric
and array splitters, the flow of data varies, per splitter.
The transfer phase
The transfer phase is the RecoverPoint replication phase in which
host writes are sent from a source RPA to a target RPA, after “The
write phase” and before “The distribution phase”.
The transfer phase differs for:
◆ "Non-distributed (regular) groups"
◆ "Asynchronous distributed groups"
◆ "Synchronous distributed groups"
For distributed and non-distributed (regular) consistency groups, the
transfer phase is over when an ACK is received by the source RPA.
Note: The following examples are of CRR configurations. In CDP
configurations (see “RecoverPoint configurations” on page 22), all of the
following steps are performed within the local RPA using inter-process
communications (IPC).
See “Consistency groups” on page 30 for more information on
distributed and non-distributed consistency groups.
Non-distributed (regular) groups
For non-distributed consistency groups, the flow of data during the
transfer phase is as follows:
1. After processing the data (for example, applying the various
compression techniques), the source RPA sends the data to the
target RPA.
2. The target RPA writes the data to the journal.
3. Upon the successful writing of the data to the RPA or the journal (depending on the value of the Measure lag when writes reach the target RPA (as opposed to the journal) setting), the target RPA returns an ACK to the source RPA; see "Measure lag when writes reach the target RPA (as opposed to the journal)" on page 150.
Asynchronous distributed groups
For distributed consistency groups, in asynchronous replication
mode, the flow of data during the transfer phase is as follows:
1. The source Primary RPA divides the consistency group into four
segments.
2. The source Primary RPA routes the relevant consistency group
segments to the appropriate secondary RPAs at the source side.
3. After processing the data (for example, applying the various
compression techniques), the primary and secondary RPAs at the
source side send their consistency group segments to their
corresponding target RPAs.
4. Each target RPA writes the data of its consistency group segment
to a different stream in the replica journal.
Note: Upon the successful writing of the data to the target RPA or the
replica journal (depending on the value of the Measure lag when writes
reach the target RPA setting), each target RPA returns an ACK to its
corresponding source RPA, see “Measure lag when writes reach the
target RPA (as opposed to the journal)” on page 150.
5. All secondary RPAs at the source side send their ACKs back to
the Primary RPA.
Synchronous distributed groups
For distributed consistency groups, in synchronous replication mode,
the flow of data during the transfer phase is as follows:
1. The source Primary RPA sends the consistency group data to the
Primary RPA at the target side.
2. The Primary RPA at the target side divides the consistency group
data into four segments, and routes the relevant consistency
group segments to the appropriate secondary RPAs.
3. Each target RPA writes the data of its consistency group segment
to a different stream in the target journal.
Note: Upon the successful writing of the data to the target RPA or the
replica journal (depending on the value of the Measure lag when writes
reach the target RPA setting), each secondary RPA at the target side
returns an ACK to the primary RPA at the target side, see “Measure lag
when writes reach the target RPA (as opposed to the journal)” on
page 150.
4. The primary RPA at the target side returns an ACK to the primary
RPA at the source side.
Note: During initialization, the data flow of synchronous distributed groups
is identical to that of asynchronous distributed groups, see “Asynchronous
distributed groups” on page 91.
The distribution phase
The distribution phase is the RecoverPoint replication phase
responsible for the writing of the production image to the target
replica storage, which is performed by the target RPA, after “The
transfer phase”. Since the replica storage is being written to during
this process, during distribution, the state of the replica Storage is No
Access.
By default, the system distributes in five-phase distribution mode
(see “Five-phase distribution” on page 94). In rare cases the system
switches to three-phase distribution mode (also called
fast-forwarding, see “Three-phase distribution” on page 97), and in
some initialization scenarios, the system switches to one-phase
distribution mode (see “One-phase distribution” on page 102).
The replica journal history consists of snapshots that have already
been distributed to the replica storage and snapshots that are still
waiting for distribution in the queue of snapshots waiting for
distribution. When data is received by the RPA faster than it can be
distributed to the replica storage volumes, it accumulates in the
queue of snapshots waiting for distribution of the replica journal.
The Maximum Journal Lag setting dictates the maximum amount of
snapshot data (in MB or GB) that is permissible to retain in the replica
journal before distribution to the replica storage. In other words, the
amount of data that would have to be distributed to the replica
storage before failover to the latest image could take place, or (in
terms of RecoverPoint’s role in the RTO) the maximum time that
would be required in order to bring the replica up-to-date with
production.
Note: For distributed consistency groups, regardless of the distribution
mode, each target RPA is responsible for the distribution of its own
consistency group segments to the replica storage. See “Consistency groups”
on page 30 for more information on distributed and non-distributed
consistency groups.
The following sections deal with the topics:
◆ "Five-phase distribution"
◆ "Three-phase distribution"
◆ "One-phase distribution"
◆ "How do I monitor distribution performance in RecoverPoint?"

Five-phase distribution
Five-phase distribution is the default distribution mode in
RecoverPoint.
This section answers the questions:
◆ "How does the default (five-phase) distribution process work?"
◆ "How do I know five-phase distribution is happening?"
◆ "How do I monitor the performance of five-phase distribution?"
How does the default (five-phase) distribution process work?
The five-phase distribution process works in the following way:
1. The target RPA writes the newest data (the most current writes
made by the host applications) to the beginning of the queue of
snapshots waiting for distribution in the replica journal.
2. The target RPA reads the oldest data at the end of the queue of
snapshots waiting for distribution to the replica storage.
3. The target RPA reads the current data of the replica volume.
4. The target RPA writes the current data of the replica volume to
the top of the distributed snapshot list, so that the replica volume
can be rolled back.
5. The target RPA writes the newest data to the replica storage.
How do I know five-phase distribution is happening?
The following indicators are displayed in the RecoverPoint
Management Application (GUI) during five-phase distribution:
◆ In the consistency group Status Tab, the Image value is either Distributing pre-replication image or Distributed followed by the date and timestamp of the snapshot that is currently being distributed.
◆ In the copy Journal Tab, the Current parameter value is either Distributing pre-replication image or Distributed followed by the date and timestamp of the snapshot that is currently being distributed.
The following indicator is displayed in the RecoverPoint Command Line Interface (CLI) during five-phase distribution: when the get_group_state command is run, the value of the Journal parameter is DISTRIBUTING IMAGES TO STORAGE.
How do I monitor the performance of five-phase distribution?
To monitor the performance (and duration) of the distribution
process in five-phase distribution mode, see “How do I monitor
distribution performance in RecoverPoint?” on page 104.
Three-phase distribution
Three-phase distribution is a distribution process that skips the steps
necessary for the storing of data in the distributed snapshot list of the
replica journal, in order to overcome journal or storage performance
issues, or enforce RTO policies set by a user.
The system only switches to three-phase distribution when the
write-rate of the production host is greater than the distribution-rate
of five-phase distribution. This, over time, can cause the queue of
snapshots waiting for distribution to reach the maximum journal
capacity (or the value of the maximum journal lag setting), and leave
no space for the distributed snapshot list to be saved. In three-phase
distribution, there are only three I/O operations as opposed to the
five of five-phase distribution, which decreases the minimum
throughput requirement needed to distribute write information to the
replica storage, and significantly speeds up the distribution process.
Typically, the system will do its best to stay in five-phase distribution
mode, and will go back to five-phase distribution as soon as possible
after entering three-phase distribution mode.
Note: Since the three-phase distribution process does not write the data to the
distributed snapshot list of the replica journal, no PITs can be recovered prior
to the time that the process is triggered. However, this is not cause for
data-loss concern as during this process, all write-data is saved in the queue
of snapshots waiting for distribution in the replica journal.
This section answers the questions:
◆ "When is three-phase distribution triggered?"
◆ "When does three-phase distribution end?"
◆ "What should I be aware of in three-phase distribution mode?"
◆ "How does three-phase distribution work?"
◆ "How do I know three-phase distribution is happening?"
Also known as: Three-phase distribution is also known as fast-forwarding.
When is three-phase distribution triggered?
The system switches to three-phase distribution mode (also called
fast-forwarding):
◆ When there are performance issues with either the replica or journal storage.
◆ When the journal lag exceeds the Maximum Journal Lag setting (that defines the RTO policy) in the Journal Copy Policy, see "Copy Journal Policy Settings" on page 155.
When does three-phase distribution end?
If the system entered three-phase distribution mode because of
performance issues with either the replica or journal storage, the
system resumes five-phase distribution after a short period of time.
If the system entered three-phase distribution mode because the journal lag exceeded the Maximum Journal Lag setting, the system
resumes five-phase distribution as soon as the actual journal lag falls
below the value of the Maximum Journal Lag setting.
What should I be aware of in three-phase distribution mode?
During three-phase distribution, no undo information is saved in the
replica journal.
Also, when a three-phase distribution process is triggered, some of
the journal history is lost.
The exact amount of journal history that is lost, and why, can be
generally explained as follows.
The replica journal history consists of snapshots that have already
been distributed to the replica storage and snapshots that are still
waiting for distribution in the queue of snapshots waiting for
distribution. When data is received by the RPA faster than it can be
distributed to the replica storage volumes, it accumulates in the
queue of snapshots waiting for distribution of the replica journal.
The Maximum Journal Lag is the maximum amount of snapshot data
(in MB, or GB) that is permissible to retain in the replica journal
before distribution to the replica storage. In other words, the amount
of data that would have to be distributed to the replica storage before
failover to the latest image could take place, or (in terms of RTO) the
maximum time that would be required in order to bring the replica
up-to-date with production.
When the Maximum Journal Lag value is exceeded, in order to
accelerate the distribution process, the system starts clearing out all
snapshots irrelevant to the data that would have to be distributed to
the replica storage before failover to the latest image could take place,
to ensure the RTO policy.
For example:
Let's say production has made writes up to the present point-in-time (Now). All of the snapshots in the replica journal between point-in-time T1 and Now are still waiting for distribution, and all snapshots between point-in-time T0 and T1 have already been distributed to the replica storage; that is, point-in-time T1 represents the current state of the data at the replica storage. The total journal lag (the maximum time that would be required in order to bring the replica up-to-date with production) is therefore the interval between T1 and Now (or Now > T2 > T1, where T2 is the point in time at which the journal lag equals the Maximum Journal Lag setting).
Now let’s say that a Maximum journal lag policy of 1GB has been set
by the user (see “Copy Journal Policy Settings” on page 155) and the
journal lag exceeds the specified value.
In this case:
1. RecoverPoint discards all of the snapshots that have already been distributed to the replica storage (those between T0 and T1), since this data has already been applied to the replica.
2. To ensure the RTO policy is met, RecoverPoint starts distributing
in three-phase distribution mode until the journal lag falls below
the maximum journal lag policy (T2).
3. When the journal lag falls below the maximum journal lag policy
(T2), RecoverPoint resumes five-phase distribution.
Note: All of the snapshots between T2 and Now (the maximum journal lag), and all following snapshots, are retained in the replica journal and are available for data recovery.
How does three-phase distribution work?
The three-phase distribution process works in the following way:
1. The target RPA writes the newest data (the most current writes
being made by the host applications) to the beginning of the
queue of snapshots waiting for distribution in the replica journal.
2. The target RPA reads the oldest data at the end of the queue of
snapshots waiting for distribution to the replica storage.
3. The target RPA writes the data to the replica volume.
How do I know three-phase distribution is happening?
The following indicators are displayed in the RecoverPoint
Management Application (GUI) during three-phase distribution:
◆ In the consistency group Status Tab, the text (fast forward) is displayed next to the current image time stamp.
◆ In the copy Journal Tab, (fast forward) is displayed next to the current image time stamp.
The following indicator is displayed in the RecoverPoint Command Line Interface (CLI) during three-phase distribution: when the get_group_statistics command is run, the value of the Mode parameter in the copy Journal section is Fast Forward.
How do I monitor the performance of three-phase distribution?
To monitor the performance (and duration) of the distribution
process in three-phase distribution mode, see “How do I monitor
distribution performance in RecoverPoint?” on page 104.
One-phase distribution
One-phase distribution is a distribution process in which the target
RPA writes the initialization data directly to the replica volume
(bypassing the journal). This process is used to save on initialization
time, in times in which the saving of a journal history is not critical
(for example, in first-time initialization, when the first snapshot being
transferred contains the whole image). When the initialization
snapshot is too large for the capacity of the journal dedicated on
storage, and the saving of a journal history is not critical, enabling
this distribution mode saves the cost of adding additional journal
volumes for the sole purpose of storing the initialization snapshot.
Note: During one-phase distribution, the distribution process is much faster,
but no history is saved in the journal. Also, from the start of the process, and
until the end of the process, the replica is not consistent with its production
source. Therefore, if a disaster were to occur during this process, you would
not be able to fail over to the replica until a full sweep was performed.
This section answers the questions:
◆ "When is one-phase distribution triggered?"
◆ "How do I know one-phase distribution is happening?"
◆ "What are the work-arounds to one-phase distribution?"
Also known as: One-phase distribution is also known as Long initialization, Long init, Long re-sync, Non-consistent init, Init nc, and Init non-consistent.
When is one-phase distribution triggered?
The system switches to one-phase distribution mode only during
initialization, and only in the following cases:
◆ During first-time initialization: when a group is first enabled (because the Perform fast first-time initialization setting is enabled by default).
◆ During a long resync (also known as non-consistent initialization): when the Allow distribution of snapshots that are larger than the capacity of journal volumes setting is enabled (this option is enabled by default, but can be changed by the user).
How do I know one-phase distribution is happening?
During one-phase distribution, for the replica being initialized:
◆ A warning event with the text initializing in long resync mode is logged.
◆ The image state is displayed as Long resync.
You can display the image state:
• By running the get_group_states command in the RecoverPoint Command Line Interface (CLI).
• In the consistency group Status Tab of the RecoverPoint Management Application (GUI).
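For example, the image state can be checked from an SSH session to an RPA. This is an illustrative sketch only: the management IP address is a placeholder, and depending on the CLI version the command may prompt for the consistency group name rather than accept it as an argument.

> ssh admin@<RPA-management-IP>
> get_group_states
(When prompted, enter the name of the consistency group. An image state of Long resync for the replica being initialized indicates that one-phase distribution is in progress.)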
What are the work-arounds to one-phase distribution?
To avoid long initializations, perform one of the following procedures in the RecoverPoint Administrator's Guide:
◆ Use the initialization from backup procedure.
◆ Enable the Allow distribution of snapshots that are larger than the capacity of the journal setting.
◆ Add volumes to the replica journal. In this case, the additional journal volumes (space) must be permanent, or a full sweep will occur when the volumes are removed (even though the extra space is only needed for the duration of the initialization).

How do I monitor distribution performance in RecoverPoint?
To monitor distribution process performance:
1. Run the detect_bottlenecks command in the RecoverPoint
Command Line Interface (CLI).
2. Select the 4) General detection including initialization and high
load periods with peak writing analysis option.
3. Answer yes to both Do you want an advanced overview? and Do
you want a detailed overview?
4. Specify the other required information.
5. Use your spacebar to scroll down until you reach a section that
starts with the text: System overview of the copy:<copyname>
6. See “Five-phase distribution”. Note the process steps labeled as
Distributor phase 1 and Distributor phase 2, and note their
performance statistics.
7. See “Three-phase distribution”. Note the process labeled Fast
forward distribution duration, and note the statistics.
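For example, a monitoring session might look like the following sketch. The management IP address is a placeholder, and the exact menu wording may differ slightly between RecoverPoint versions.

> ssh admin@<RPA-management-IP>
> detect_bottlenecks
(Select option 4, answer yes to both the advanced overview and detailed overview prompts, supply the other requested information, and then page through the output with the spacebar to the System overview of the copy: section.)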
RecoverPoint workflows
The following sections deal with the topics:
◆ "Configuring replication"
◆ "Monitoring and managing RecoverPoint"
◆ "Moving operations to another site"
◆ "Event notification"
Configuring replication
After completing the installation process (see EMC RecoverPoint
Deployment Manager Product Guide), configure the RecoverPoint
system to start replication. The entire process is described
step-by-step in “Starting Replication” on page 127.
Monitoring and managing RecoverPoint
Once replication has been configured and started, the user normally does not need more than minimal involvement with the system. Monitoring replication is simple and is described in "Managing and Monitoring" on page 177.
RecoverPoint management activities provide access to the settings affecting RecoverPoint operation. The settings, how they affect operation, and how to access them are described in "Managing and Monitoring" on page 177.
Moving operations to another site
In case of disaster, or simply to perform routine maintenance on your
production site, you may wish to fail over operations to another site.
Failover use cases and the procedures for carrying out different types
of failovers are discussed in “Testing, Failover, and Migration” on
page 223.
Event notification
RecoverPoint supports the following types of event notification:
◆ "E-mail notification"
◆ "SNMP notification"
◆ "Syslog notification"
◆ "System reports"
◆ "System alerts"
◆ "Collecting system information"
Getting Started
This section describes how to obtain and enter RecoverPoint license
and activation codes and how to administer user settings.
The topics in this section are:
◆ Licensing overview .......................................................................... 108
◆ The Getting Started Wizard ............................................................ 110
◆ Managing RecoverPoint licences .................................................... 113
◆ Access control ................................................................................... 119
Licensing overview
The temporary license key that ships with RecoverPoint appliances is
valid for seven days from initial installation. After that, you must
have an activated license to replicate with RecoverPoint appliances.
◆ To enable RecoverPoint for seven days, see "Defining your license key in RecoverPoint" on page 113.
◆ To enable RecoverPoint permanently, see "Requesting an activation code" on page 114 and "Defining your activation code in RecoverPoint" on page 115.
◆ To upgrade RecoverPoint licenses, see "Upgrading your license" on page 116.
◆ To reactivate RecoverPoint licenses, see "Re-activating your license" on page 117.
◆ To display your RecoverPoint license information, see "Viewing your license information" on page 118.
The license controls the following system capabilities:

Table 4  RecoverPoint license parameters
◆ Expiration: when the license expires. Values: number of days until expiration, or permanent license.
◆ Storage Type Supported: types of storage arrays supported. Values: Unlimited or Homogeneous.
◆ Storage Arrays: number of storage arrays supported at a single site. Values: none.
◆ Cluster Size: maximum number of RPAs installed at each site. Values: 2–8.
◆ Remote Replication: maximum capacity of the remote replica. Values: none.
◆ Local Replication: maximum capacity of the local replica. Values: none.
◆ Compression: whether compression of data transferred over the WAN is supported. Values: supported/not supported.
◆ Journal compression: whether journal compression is enabled. Values: supported/not supported.
The Getting Started Wizard
The first time the Management Application GUI is run after a
RecoverPoint installation, the Getting Started Wizard is displayed to
guide users through the steps necessary to configure the basic
RecoverPoint settings needed to deploy their applications properly.
This wizard is only displayed once, and is no longer accessible after
these initial settings have been defined.
Although the wizard is no longer accessible, all of the settings
defined through the wizard are. These settings and how they can be
accessed are discussed in detail in each relevant section.
There are three screens in the Getting Started Wizard:
1. Welcome screen - provides users with a brief description of the
wizard and its functionality, see “Welcome screen” on page 110.
2. Account Settings screen - allows users to enter and display their
licensing information and activate the product, see “Account
Settings screen” on page 110.
3. System Report Settings screen - allows users to configure the
RecoverPoint system report mechanism, see “System Report
Settings screen” on page 111.
Welcome screen
The first time you access the RecoverPoint Management Application,
the Getting Started Wizard is displayed. The Welcome Screen is the
first screen of the Getting Started Wizard. It provides users with a
brief description of the wizard and its functionality.
Click the Next button in the welcome screen to configure your
RecoverPoint license, see “Account Settings screen” on page 110.
Account Settings screen
The Account Settings screen is displayed via the:
◆ Getting Started Wizard, the first time you open the Management Application GUI after RecoverPoint is installed.
◆ System > System Settings > Account Settings Tab of the Management Application GUI, each subsequent time the Management Application GUI is displayed.
In the Account Settings screen:
◆ To enable RecoverPoint for seven days, see "Defining your license key in RecoverPoint" on page 113.
◆ To enable RecoverPoint permanently, see "Requesting an activation code" on page 114 and "Defining your activation code in RecoverPoint" on page 115.
◆ To upgrade RecoverPoint licenses, see "Upgrading your license" on page 116.
◆ To reactivate RecoverPoint licenses, see "Re-activating your license" on page 117.
◆ To display your RecoverPoint license information, see "Viewing your license information" on page 118.
Click the Next button in the Account Settings screen to configure the
system alerts and reports mechanisms, see “System Report Settings
screen” on page 111.
System Report Settings screen
The System Report Settings are displayed via the:
◆ Getting Started Wizard, the first time you open the Management Application GUI after RecoverPoint is installed.
◆ System > System Settings > System Report Settings Tab of the Management Application GUI, each subsequent time the Management Application GUI is displayed.
In the System Report Settings screen:
◆ Specify the Transfer Method through which you want to send the system report. Server addresses may be entered either in IP or DNS format. If entered in IP format, both IPv4 and IPv6 addresses are valid.
◆ To enable the automatic sending of weekly system reports, check the System Reports checkbox.
◆ To include system alerts in the system report, check the System Alerts checkbox.
◆ To encrypt the output with RSA encryption using a 256-bit key before sending, check the Encrypt checkbox.
◆ To compress the output before sending, check the Compress checkbox.
Note: See “System reports” on page 260 and “System alerts” on page 266 for
more information about RecoverPoint reports and alerts.
Click the Finish button to close the Getting Started Wizard and
apply your changes.
Managing RecoverPoint licences
The following sections guide you through the processes of:
◆ "Defining your license key in RecoverPoint"
◆ "Requesting an activation code"
◆ "Defining your activation code in RecoverPoint"
◆ "Upgrading your license"
◆ "Re-activating your license"
◆ "Viewing your license information"

Prerequisites
Before performing the following tasks:
◆ Make sure you have the email sent to you by your EMC account executive containing your RecoverPoint license information. You will need to have your Account ID, Software Serial IDs, Company Name, Contact Info (e-mail address) and RecoverPoint License Key readily available.
◆ Make sure you are logged into the RecoverPoint Management Application as admin.

Defining your license key in RecoverPoint
To start using RecoverPoint, you must define a valid license key in the RecoverPoint system.
Within seven calendar days of defining a valid license key in the RecoverPoint system, you must also define a valid activation code. If a valid activation code is not defined within seven calendar days, dialog boxes will be blank and RecoverPoint will not work. See "Requesting an activation code" on page 114.
Before you begin:
◆ Read "Licensing overview" on page 108.
◆ Perform the "Prerequisites" on page 113.

How to define your license key in RecoverPoint
To define your license key in the RecoverPoint system:
1. If you are installing the RecoverPoint license from the Getting
Started Wizard (see “The Getting Started Wizard” on page 110),
skip this step.
From the main menu of the RecoverPoint Management
Application GUI, select System > System Settings and click the
Account Settings link in the navigation area.
The Account Settings screen is displayed.
2. Copy the Account ID, Software Serial IDs, Company Name, and
Contact Info (e-mail address) from the email containing your
license information to the appropriate fields.
3. Click the Update button.
The Updating license key dialog box is displayed.
a. Copy the license key from the email containing your license information to the License Key field.
b. Click the OK button to exit the dialog box.
4. Click the Apply button.
Your license key should now be displayed in the License Keys section of the dialog box. RecoverPoint is now enabled for use for seven calendar days.
To enable RecoverPoint permanently, go on to perform “Requesting
an activation code” on page 114.
Requesting an activation code
To enable RecoverPoint permanently, you must define a valid license key and activation code in the RecoverPoint system.
Before you begin:
◆ Perform the "Prerequisites" on page 113.
◆ Perform the steps in "Defining your license key in RecoverPoint" on page 113.

How to request an activation code
To request an activation code:
1. If you are installing the RecoverPoint license from the Getting
Started Wizard (see “The Getting Started Wizard” on page 110),
skip this step.
From the main menu of the RecoverPoint Management
Application GUI, select System > System Settings and click the
Account Settings link in the navigation area.
The Account Settings screen is displayed.
2. Click the Obtain Activation Code link.
The RecoverPoint Licensing Server Login page is displayed in
your default browser window.
3. Open the email containing your license information. Copy your
license key and account ID, and paste them into the appropriate
fields on the Web page.
4. Click the Login button.
The RecoverPoint Licensing Server License Details page is
displayed.
5. Click the Obtain activation code button.
The RecoverPoint Licensing Server Obtain Activation Code
page is displayed.
6. Copy the required information from the email containing your
license information, into the appropriate fields.
7. Click the Obtain button.
Your request is processed and your activation code is
immediately sent to the specified email address.
To enable RecoverPoint permanently, proceed to "Defining your activation code in RecoverPoint" on page 115.
Defining your activation code in RecoverPoint
To enable RecoverPoint permanently, you must define a valid license key and activation code in the RecoverPoint system.
Before you begin:
◆ Perform the "Prerequisites" on page 113.
◆ Perform "Defining your license key in RecoverPoint" on page 113 and "Requesting an activation code" on page 114.

How to define your activation code in RecoverPoint
To define your activation code in the RecoverPoint system:
1. If you are activating RecoverPoint from the Getting Started
Wizard (see “The Getting Started Wizard” on page 110), skip this
step.
From the main menu of the RecoverPoint Management
Application GUI, select System > System Settings and click the
Account Settings link in the navigation area.
The Account Settings screen is displayed.
2. Click the Update button.
The Updating license key dialog box is displayed.
a. Enter the license key and the activation code you received via email into the appropriate fields.
b. Click the OK button to exit the Updating license key
dialog box.
3. Click the Apply button.
Your activation code is displayed in the License Keys section at the bottom of the screen. Your RecoverPoint product is now permanently enabled.
Upgrading your license
To upgrade your RecoverPoint license, you will have to request the upgrade, and then re-define the license key and activation code in the RecoverPoint system.
Before you begin:
◆ Read "Licensing overview" on page 108.
◆ Perform the "Prerequisites" on page 113.

How to upgrade your RecoverPoint license
To upgrade your RecoverPoint license:
1. From the main menu of the RecoverPoint Management
Application GUI, select System > System Settings and click the
Account Settings link in the navigation area.
The Account Settings screen is displayed.
2. Click the Obtain Activation Code link.
The RecoverPoint Licensing Server Login page is displayed in
your default browser window.
3. Open the email containing your license information. Copy your
license key and account ID, and paste them into the appropriate
fields on the Web page.
4. Click the Login button.
The RecoverPoint Licensing Server License Details page is
displayed.
5. Click the Request to upgrade version button.
The RecoverPoint Licensing Server Request to Upgrade Version
page is displayed.
6. Copy the required information from the email containing your
license information, into the appropriate fields.
7. Click the Send button.
Your request is processed and a new activation code is sent to the
specified email address within 48 hours.
8. When the new activation code arrives, perform the process
described in “Defining your activation code in RecoverPoint” on
page 115.
Your new activation code is displayed in the License Keys section at the bottom of the screen, and your new RecoverPoint license is now installed.
Re-activating your license
If you format your repository volume, you may need to request a new activation code from the RecoverPoint licensing server and then re-activate your RecoverPoint license.
Before you begin:
◆ Read "Licensing overview" on page 108.
◆ Perform the "Prerequisites" on page 113.

How to re-activate your RecoverPoint license
To re-activate your RecoverPoint license:
1. From the main menu of the RecoverPoint Management
Application GUI, select System > System Settings and click the
Account Settings link in the navigation area.
The Account Settings screen is displayed.
2. Click the Obtain Activation Code link.
The RecoverPoint Licensing Server Login page is displayed in
your default browser window.
3. Open the email containing your license information. Copy your
license key and account ID, and paste them into the appropriate
fields on the Web page.
4. Click the Login button.
The RecoverPoint Licensing Server License Details page is
displayed.
5. Click the Request to reactivate license button.
The RecoverPoint Licensing Server Request to Reactivate
License page is displayed.
6. Copy the required information from the email containing your
license information, into the appropriate fields.
7. Click the Send button.
Your request is processed and a new activation code is sent to the
specified email address within 48 hours.
8. When the new activation code arrives, perform the process
described in “Defining your activation code in RecoverPoint” on
page 115.
Your new activation code is displayed in the License Keys section at the bottom of the screen, and your new RecoverPoint license is now installed.
Viewing your license information
See Table 4 on page 108 for a detailed explanation of the system
capabilities controlled by your RecoverPoint license.
To display your RecoverPoint license information:
1. If you are in the Getting Started Wizard (see “The Getting Started
Wizard” on page 110), skip this step.
From the main menu of the RecoverPoint Management
Application GUI, select System > System Settings and click the
Account Settings link in the navigation area.
The Account Settings screen is displayed.
2. Your RecoverPoint license information, and the RecoverPoint
capabilities defined by your license, are displayed in the License
Usage section of the Account Settings screen, after the following
procedures are performed:
• “Defining your license key in RecoverPoint” on page 113
• “Requesting an activation code” on page 114
• “Defining your activation code in RecoverPoint” on page 115
Access control
This section discusses authentication and authorization of users in
the RecoverPoint system.
The following sections deal with the topics:
◆ "User authentication"
◆ "User authorization"

User authentication
RecoverPoint provides two independent mechanisms for authenticating users: appliance-based authentication and authentication via the organization's LDAP (Lightweight Directory Access Protocol) server. The two authentication mechanisms can be used simultaneously, LDAP may be used exclusively, or appliance-based authentication can be used exclusively.

Password security
The command-line interface (CLI) command set_security_level sets the restrictions on passwords.
The possible settings are as follows:
◆ High: User passwords to access the RPA must have a minimum of fourteen characters; at least two characters must be lower case, at least two must be upper case, and at least two must be non-alphabetical (either digits or special characters). Passwords can be reset only once every 24 hours, all user passwords expire in 60 days, and the same password cannot be reused until at least ten other passwords have been used.
◆ Low: User passwords to access the RPA must have a minimum of five characters, and they expire after 60 days.
Regardless of the security level, any user who tries unsuccessfully
three times to log on will be locked out. To unlock the user, use the
CLI command unlock_user. Only users with Security permission can
unlock a user.
!
IMPORTANT
The default security level is set to Low. It is recommended that RecoverPoint administrators set the level to High to meet relevant security standards, such as those of the US Department of Defense Security Technical Implementation Guides (DoD STIG).
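For example, a user with Security permission (such as the predefined security-admin user) can raise the security level and release a locked-out account from an SSH session to an RPA. This is an illustrative sketch only: the management IP address is a placeholder, and depending on the CLI version the commands may prompt interactively for their parameters rather than accept them as arguments.

> ssh security-admin@<RPA-management-IP>
> set_security_level
(When prompted, select High.)
> unlock_user
(When prompted, enter the name of the locked-out user.)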
Configuring authentication
To configure RecoverPoint users, from the System menu, select
System Settings. Select User Settings, and click the Users tab. The
same commands are used to configure users, whether they are
authenticated by the RecoverPoint appliance or by an LDAP server.
To add, edit, or delete a RecoverPoint user:
1. From the System menu, select System Settings. In the System
Settings Navigation Pane, select User Settings.
2. To add a user, click Add. Select Local User or LDAP User/Group.
To configure a Local User:
Refer to Table 5 on page 120. Provide a Username and a
Password, according to “Password security” on page 119.
Select the Role. To limit access to specific consistency groups,
click Limited to consistency groups and select the consistency
groups to which you are granting this user access.
To configure an LDAP User or Group:
Refer to Table 5 on page 120. Select either a user name or a
group from the list. Select the Role. To limit access to specific
consistency groups, click Limited to consistency groups and
select the consistency groups to which you are granting this
user access.
To edit a user, click Edit and modify the password or permissions
as needed. To remove a user, select the user and click Remove.
Table 5  Add New User settings

Local User
The following settings define a new local user on the RecoverPoint appliance.
◆ User Name: name of the new user to add for appliance-based authentication. A user name must start with a lower-case alphabetic character. All subsequent characters must be lower-case alphabetic, numeric, or hyphen (-). No other characters are legal in user names (a sample check appears after this table).
◆ Password: password for the new user. All printing ASCII characters are legal in passwords.
◆ Confirm Password: re-enter the new user's password.

LDAP User/Group
Add a user or users that already exist in the Active Directory on the LDAP server, grant them access to the RecoverPoint appliance, and define their role (permissions). To be able to access RecoverPoint, every user in the Active Directory must be added to the RecoverPoint user list and assigned a role, or be a member of a group that has access.
◆ User Name/s: name of the user to add to the RecoverPoint user list.
◆ Groups: name of a group or groups (in Active Directory) to add to the authorized users of RecoverPoint.

User Settings
◆ Role: permissions for RecoverPoint are granted on the basis of roles. Set the role for the new local user, LDAP user, or LDAP group.
◆ Limited to consistency groups: when checked, limits the access of this local user, LDAP user, or LDAP group to the specified consistency groups.
◆ Consistency Group Name: check the consistency groups that the specified user or group may access.
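As an illustration of the user-name rule in Table 5 (this is a generic shell check, not a RecoverPoint command), a candidate name can be validated with a regular expression that requires a lower-case first letter followed only by lower-case letters, digits, or hyphens:

> echo "backup-operator1" | grep -Eq '^[a-z][a-z0-9-]*$' && echo "valid user name"
valid user name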
Predefined users
The RecoverPoint appliance is shipped with the following local users already defined:

Table 6  Predefined users

User             Role          Initial Password   Permissions
security-admin   security      security-admin     Security: changing users and roles, security levels, LDAP configuration
admin            admin         admin              All, except security and webdownload
boxmgmt          boxmgmt       boxmgmt            Upgrade
monitor          monitor       monitor            None (i.e., read only)
webdownload      webdownload   webdownload        Web download
You cannot remove the preconfigured users, and you cannot change their permissions. It is recommended that you change their passwords. If you wish to implement a purely LDAP-based authentication system, you need not assign (that is, give out) the passwords of any of the predefined users.
Only users with security permission can add users, and can remove
and edit permissions for users that have previously been added.
Configuring LDAP-based authentication
To configure RecoverPoint to use the organization's LDAP server for authentication, go to the RecoverPoint Management Application. From the System menu, select System Settings. Select User Settings, and click the LDAP Configuration tab. Enter settings in the LDAP Configuration dialog box. Refer to Table 7 on page 122.

Table 7  LDAP Configuration settings
LDAP configurations:
◆ Enable Active Directory Support: check to activate RecoverPoint authentication and authorization using an LDAP server.
◆ Primary LDAP server: IP address of the primary LDAP server.
◆ Secondary LDAP server: (optional) IP address of the secondary LDAP server.
◆ Base Distinguished Name: node in the LDAP tree from which to start a search for users, for example: dc=Klaba,dc=COM
◆ Search Base Distinguished Name: root of the LDAP user search tree. The suffix of the Search Base Distinguished Name must be the Base Distinguished Name. The format will be similar to the following: cn=Users,dc=Klaba,dc=COM

Binding Type: specifies the type of binding (authentication against the Active Directory).
◆ Use Anonymous: select only if Active Directory is configured to permit anonymous binding to query the LDAP server.
◆ Use the following user: if Active Directory is configured to allow binding only by a specific user, use this option for binding to query the LDAP server.
◆ Bind Distinguished Name: distinguished name to use for initial binding when querying the LDAP server. The format of the Bind Distinguished Name will be similar to the following: cn=Administrator,cn=Users,dc=Klaba,dc=COM
The bind distinguished name can be any user on the LDAP server who has read permission for the directory in the defined search base.
◆ Password: password of the bind distinguished name to use for initial binding when querying the LDAP server.

Directory Access Protocol:
◆ LDAP: to send the LDAP query over a non-secure connection.
◆ LDAP over SSL: to send the LDAP query over a secure connection.
◆ User certificate from file: path to the Active Directory certificate to use for secure communication with the LDAP server. RecoverPoint only accepts LDAP certificates in PEM format. To format and install the certificate created on the LDAP server in PEM format, use the following procedure (a quick check of the converted certificate appears after this table):
1. On the LDAP server, export a copy of the server certificate from the Active Directory server. Use the Certification Authority application's Copy File to … option to export the certificate in Base-64 Encoded X.509 (.cer) format.
2. Copy the server certificate to a system with OpenSSL Certificate Authority software installed. You can use any Linux or Windows system.
3. Log into the system where you copied the certificate, and run the following command:
On Linux:
> /opt/symas/bin/openssl x509 -in AD_certificate_name -out OpenLDAP_certificate_name
On Windows:
> openssl x509 -in drive:/path/AD_file.cer -inform d -out drive:/path/OpenLDAP_file.pem
This step creates the PEM certificate.
4. Install the certificate on each RecoverPoint appliance. At the RecoverPoint Management Application, from the menu select System > System Settings > User Settings > LDAP Configuration tab. Enter the settings, including User certificate from file (the path to the certificate in PEM format).

Advanced settings: the advanced settings are optional and can be left at their default values.
◆ Search scope:
Base: limits a search to the search base distinguished name. Use this option to shorten search times when you are certain that all users are at the search base level.
One level: limits a search to the search base distinguished name and the level immediately below it. Use this option to shorten search times when you are certain that all users are within one level of the search base level.
Subtree: searches the entire subtree from the search base distinguished name down.
◆ Search time limit: default 30 sec.
◆ Username attributes: name of the attribute that contains the username in the user node of the Active Directory tree. Example: sAMAccountName
◆ LDAP group attributes: name of the attribute that contains the group name in the user node of the Active Directory tree. Example: memberOf
◆ User object class: object class of users in the Active Directory tree. Example: user
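Before uploading the converted certificate, it can be sanity-checked with OpenSSL on the system used for the conversion. This is an illustrative sketch only; the file name is taken from the Windows example in the procedure above:

> openssl x509 -in drive:/path/OpenLDAP_file.pem -noout -subject -dates
(Displays the certificate subject and validity dates; an error here usually means the file is not in PEM format.)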
When you have completed entering data in the LDAP Configuration screen, click the Test configuration button to verify that the configuration is correct.
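The same settings can also be verified from outside RecoverPoint with a generic LDAP client. This is an illustrative sketch only: ldapsearch is not a RecoverPoint tool, the server address and user name are placeholders, and the bind and search base values are the example values from Table 7.

> ldapsearch -x -H ldap://<primary-LDAP-server> -D "cn=Administrator,cn=Users,dc=Klaba,dc=COM" -W -b "cn=Users,dc=Klaba,dc=COM" -s sub "(sAMAccountName=<username>)" memberOf
(A successful bind and a result containing the memberOf attribute indicate that the base distinguished name, bind user, and attribute names are configured consistently with the values entered in RecoverPoint.)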
User authorization
User authorization grants or denies users access to resources
managed by RecoverPoint. User authorization is identical, regardless
of whether the user was authenticated by RecoverPoint or by an
LDAP server. User authorization can be limited to specific
consistency groups. For details, refer to User Settings in Table 5 on
page 120.
Each RecoverPoint user is defined by a user name, a password, and a
role. A role is a named set of access permissions. By assigning a role to
users, the users receive all the access permissions defined by the role.
Table 9 on page 125 lists the permissions that may be granted or
denied to a role.
Configuring roles
To configure roles, from the System menu, select System Settings. Select User Settings, and click the Roles Tab. To delete a role, select a role and click Delete. To change permissions of a role, click Edit and select or deselect permissions as needed. To add a role, click Add and enter settings in the Add New Role dialog box. Refer to Table 8 on page 125 and Table 9 on page 125.

Table 8  Add New Role settings
◆ Role name: name of the new RecoverPoint role to add.
◆ Permissions: select the access permissions to be granted to all persons who are assigned to this role.

Table 9  Permissions that may be granted or denied
◆ Splitter Configuration: add or remove splitters, and attach or detach splitters to volumes.
◆ Group Configuration: create and remove consistency groups; modify all group settings except those included in the data transfer, target image, and failover permissions; bookmark images; and resolve settings conflicts.
◆ Data Transfer: enable and disable access to an image, and undo writes to the image access log.
◆ Target Image: enable and disable access to an image, resume distribution, and undo writes to the image access log.
◆ Failover: modify the replication direction (use temporary and permanent failover), initiate failover, and verify failover.
◆ System Configuration: configure and manage e-mail alerts, SNMP, System Reports, rules, licenses, serial ID, account ID, syslog, and other system configuration settings.
◆ Security: all commands dealing with roles, users, LDAP configuration, and security level.
◆ Boxmgmt: install RecoverPoint appliances.
◆ Upgrade: RecoverPoint appliance maintenance, including upgrading to a minor RecoverPoint release, upgrading to a major RecoverPoint release, replacing an RPA, and adding new RPAs.
◆ Web download: download RecoverPoint installation packages from the EMC web site.
3
Starting Replication
Once RPAs and (host-based, fabric-based, or storage-based) splitters
are installed and configured, you define what you want to replicate
and how.
The topics in this section are:
◆ Adding splitters ................................................................................ 128
◆ Creating new consistency groups .................................................. 132
◆ Configuring replication policies ..................................................... 143
◆ Modifying existing settings and policies ....................................... 158
◆ Manually attaching volumes to splitters ....................................... 173
Adding splitters
Before you begin, make sure you are well acquainted with the concepts: "Splitters" on page 27 and "Documentation relevance per RecoverPoint product" on page 14.
It is assumed that host-based, fabric-based, and array-based splitters
have been installed as needed. For details, see splitter installation in
the EMC RecoverPoint Deployment Manager Product Guide and
technical notes for specific splitters. Before you begin, add splitters to
the RecoverPoint system. Later on, when volumes are added, they
will be automatically attached to all of the splitters that have access to
that volume.
Note: In RecoverPoint/SE only: Each installed CLARiiON splitter is
automatically added to your configuration, and attached to all available
volumes. If you are using RecoverPoint/SE, there is no need to perform the
steps outlined in this section, and you can skip to “Creating new consistency
groups” on page 132.
For boot-from-SAN groups:
When a consistency group is configured to boot from SAN, special considerations and procedures are necessary; please contact EMC Customer Service for more information.
For Brocade splitters:
If you added splitters based on a Connectrix AP-7600B or
PB-48K-AP4-18 switch, make sure you follow the instructions in
“Configure RecoverPoint for replication over the Connectrix device” in
EMC RecoverPoint Deploying RecoverPoint with Connectrix AP-7600B
and PB-48K-AP4-18 Technical Notes.
For SANTap splitters:
For SANTap splitters, switch login credentials must be defined for
each splitter added to the RecoverPoint system, see “Splitter
credentials” on page 269.
For CLARiiON splitters:
Although the two storage processors of a CLARiiON splitter are
listed as separate entities (CLARiiON Splitter 1-A and CLARiiON
Splitter 1-B), they are managed as a single entity. If you add or
remove a splitter, the second storage processor instance is
automatically added or removed. If you attach or detach a volume at
one instance, the same volume will automatically be attached or
detached at the other storage processor instance.
For CLARiiON splitters, Navisphere login credentials must be
defined for each splitter added to the RecoverPoint system, see
“Splitter credentials” on page 269.
Note: If a production storage volume is rolled back by a CLARiiON
SnapView session, the CLARiiON splitter will automatically initialize a full
synchronization (full sweep) of the production storage volume.
A single CLARiiON splitter can be shared by up to four RPA clusters. While attaching a CLARiiON splitter to a fifth RPA cluster
appears to succeed, the splitter is in an error state for the newly
attached RPA cluster. All splitter operations for this RPA cluster fail
and return the following error:
Maximum RPA clusters per splitter exceeded.
Use the Remove Splitter command to remove the CLARiiON splitter
from the fifth RPA cluster.
How to add splitters to the RecoverPoint system
Before you begin, make sure you are well acquainted with the
concept of “Adding splitters” on page 128.
To add splitters to your RecoverPoint system:
1. In the Navigation Pane, select Splitters.
Notice the splitters, if any, listed in the Component Pane; these
are the splitters that have already been added to the RecoverPoint
system.
2. Right-click on Splitters and select Add New Splitter.
Alternatively, click the Add New Splitter button.
The Add Splitter Wizard is displayed.
Note: If this consistency group contains a boot-from-SAN volume, skip
to Step f.
3. In the first screen of the Add Splitter Wizard:
a. Click the Rescan Splitters button to refresh the list of available splitters.
b. Select the splitter to add. You can also select multiple splitters, or all splitters.
To select or deselect specific splitters, hold down the Ctrl key on your keyboard and select the first splitter. While still holding down the Ctrl key, select the second, third, and subsequent splitters. Clicking a selected splitter deselects it.
To select a range of splitters, hold down the Shift key and click the first splitter. Scroll down to the last splitter you want to select with the Shift key still down, and select it.
To select all splitters, click on any splitter and press Ctrl+A.
c. Add a splitter for every host that writes to any volume in a
consistency group.
Best practice is to select and add all available splitters.
d. If you did not add any CLARiiON or SANTap splitters, skip
this step.
If you added a CLARiiON or SANTap splitter, the Enter login
credentials screen is displayed, prompting you to enter login
credentials for each CLARiiON or SANTap splitter added to
the system. For support purposes, it is recommended that you
enter credentials as soon as possible.
e. If you did not add splitters based on a Connectrix® AP-7600B
or PB-48K-AP4-18 switch, skip this step.
If you added splitters based on a Connectrix AP-7600B or
PB-48K-AP4-18 switch, make sure you follow the instructions
in “Deploy RecoverPoint for this switch” in EMC RecoverPoint
Deploying RecoverPoint with Connectrix AP-7600B and
PB-48K-AP4-18 Technical Notes.
f. If this group does not contain a boot-from-SAN volume, skip
this step.
If this splitter is the remote-site splitter for a boot-from-SAN
volume, check the Show boot-from-san peer for other site’s
host checkbox.
When you enable this setting, the list of splitters changes from
those at the remote site to those at the production site. This
happens because the boot-from-SAN volume at the remote
site does not exist yet. It is created subsequently by replicating
the production boot-from-SAN volume. Rather than
configuring the remote-site boot-from-SAN volume, you
specify the splitter on the production site and replicate the
entire boot-from-SAN volume with the splitter to the remote
site. You specify the remote-site splitter here for the benefit of
the RecoverPoint system.
Special considerations arise when attaching to a boot volume; please contact EMC Customer Service for more information.
g. Click the Next button to view a summary.
h. Click the Finish button to exit the Add Splitter Wizard.
If an available splitter is not added, a warning is displayed in
the status line of the RecoverPoint Management Application.
4. Click on the warning link for more information.
If more splitters are available for addition, the Current System Warnings dialog box is displayed, listing the relevant warnings.
5. Click on the warning to display a list of splitters that have not
been added, and are still available for addition.
6. To add additional splitters to the RecoverPoint System, repeat this
process.
Creating new consistency groups
Before you begin, make sure you are well acquainted with the
concept of “Consistency groups” on page 30.
Note: In RecoverPoint only, you must define your splitters in the system
before creating a consistency group. If you have not added splitters to your
system, see “Adding splitters” on page 128.
The New Consistency Group Wizard
The New Consistency Group Wizard helps you to create consistency groups, and guides you through the following tasks:
◆ Configuring protection, resource allocation, stretch clusters, advanced settings and policies, and compression, see "How to configure a new consistency group" on page 133.
◆ Configuring copies: production source and local replica or remote replica (or both), see "How to configure the production copy" on page 134 and "How to configure the replica copies" on page 135.
◆ Specifying replication sets and adding volumes to each copy, see "How to add replication sets" on page 136.
◆ Adding journal volumes to each copy, see "How to configure journals" on page 138.
To display the New Consistency Group Wizard:
◆ Select Consistency Groups in the Navigation Pane in the main window of the RecoverPoint Management Application, right-click, and select the Add Group option.
-or-
◆ Click the Add Group button in the toolbar above the Component Pane of the RecoverPoint Management Application.
How to configure a new consistency group
This section describes how to create a new consistency group using
the New Consistency Group Wizard.
Before you begin, make sure you are well acquainted with the concepts: "Consistency groups" on page 30, and "Documentation relevance per RecoverPoint product" on page 14.
Note: In RecoverPoint/SE only: RecoverPoint/SE can automatically
provision journal volumes. To do so, it allocates dedicated RAID groups. Up
to six RAID groups can be configured per site. If you are planning to use the
automatic journal provisioning feature, make sure you have allocated five
free hard disks per RAID group before beginning the following procedure.
To create a new enabled consistency group:
Note: To create a disabled consistency group (and manually add splitters,
configure replication sets and volumes, attach volumes to the splitters,
and start replication later), click the Finish button after Step 2 and skip
Step 3.
1. Right-click on Consistency Groups in the Navigation Pane in the
main window of the RecoverPoint Management Application and
select the Add Group option.
The Define the Consistency Group and its Settings screen of the
New Consistency Group Wizard is displayed.
In RecoverPoint only:
• Creating a consistency group that contains a boot-from-SAN volume involves many special considerations; please contact EMC Customer Service for more information.
• Volumes attached to a CLARiiON splitter cannot be in the
same consistency group with volumes attached to a
host-based splitter. They can be in the same RPA cluster in
different consistency groups. If, however, volumes in one
consistency group at a site are attached to a CLARiiON
splitter, all consistency groups on that site must reside on a
CLARiiON array.
• The same consistency group can use a CLARiiON splitter at
one site and a different splitter at the other site.
2. Enter the following information into the Define the Consistency
Group and its Settings screen of the New Consistency Group
Wizard.
Table 10  Consistency Group General Settings
◆ Name: enter a descriptive name for the consistency group.
◆ Primary RPA: select the RPA you prefer to replicate the consistency group. When the primary RPA is not available, the consistency group will switch to another RPA in the RPA cluster. Whether data will transfer when replication is switched to another RPA depends on the Allow data transfer even when group is handled by non-preferred RPA policy (Table 17 on page 149).
Note: Best practice is to ensure that synchronous groups are set to use different RPAs than asynchronous groups. Mixing the two may result in low I/O rates for the synchronous groups. It is also recommended that dynamic sync and purely synchronous consistency groups reside on different RPAs, whenever possible. See "Replication modes" on page 50 for more information on synchronous, asynchronous, and dynamic sync groups.
The policy settings in the other sections of this screen are
optional. The default values provide a practical configuration. It
is recommended to accept the default settings unless there is a
specific business need to set other policies. These settings can be
changed at any later time by selecting the consistency group in
the Navigation Pane and clicking on its Policy Tab. See
“Configuring consistency group policies” on page 143.
Note: To create a disabled consistency group (and manually add splitters,
configure replication sets and volumes, attach volumes to the splitters,
and start replication later), click the Finish button now and skip the next
step.
3. Click the Next > button.
The Define the Production Copy and its Settings screen is
displayed. See “How to configure the production copy” on
page 134.
How to configure the production copy
Before you begin, make sure you are well acquainted with the
concept of “Copies” on page 32.
To configure the production copy:
1. Specify the production site in the Production Site field.
2. Enter the following information into the General settings section:
Table 11  Copy General Settings
◆ Name: enter a descriptive name for the copy in RecoverPoint.
◆ Journal Compression (not available in RecoverPoint/SE; default = none): the journal of the production storage is used only after failing over to another replica, which becomes the production source, so that the previous production source becomes a replica. It is recommended to compress the journal when forcing asynchronous replication. If the RPA is also the production source for transmission across the WAN (to a remote replica) for some other consistency group, compressing journals will affect the transfer rate over the WAN and is not recommended.
Note: The following applies to journal compression at replicas, but not to the production source:
• To change the value of journal compression, the consistency group must be in one of the following states: disabled, direct image access, or distributing.
• If the consistency group is writing to this journal volume while you change the compression level, the existing journal will be lost.
The other policy settings are optional. The default values provide
a practical configuration. It is recommended to accept the default
settings unless there is a specific business need to set other
policies. To change these settings at a later time, click on the copy
in the Navigation Pane, and select its Policy Tab in the
Component Pane. See “Configuring copy policies” on page 152.
3. Apply your settings by clicking the Next > button.
The Define any Replica Copies and their Settings screen is
displayed, see “How to configure the replica copies” on page 135.
How to configure the replica copies
Before you begin, make sure you are well acquainted with the
concept of “Copies” on page 32.
Note: In RecoverPoint/SE only: Because there is a limitation of one splitter
per site, in a three-copy configuration, make sure the local copy is stored on
the same CLARiiON array as the production copy.
To configure replica copies:
1. To create a local replica, select the Create Local Copy at
<SiteName> checkbox, and enter a name for the replica in the
Name field.
2. To create a remote replica, select the Create Remote Copy at
<SiteName> checkbox, and enter a name for the replica in the
Name field.
The other policy settings are optional. The default values provide
a practical configuration. It is recommended to accept the default
settings unless there is a specific business need to set other
policies. If you do want to specify values for the available
settings, note that they are identical to the settings for the
production copy. Configure these settings, as required, according
to the instructions in “Configuring copy policies” on page 152.
3. Click the Next > button.
The Select the Production Volumes for which to Create
Replication Sets screen is displayed. See “How to add replication
sets” on page 136.
How to add replication sets
Before you begin, make sure you are well acquainted with the concepts: "Replication sets" on page 33, "The production volumes" on page 36, "The replica volumes" on page 36, and "Documentation relevance per RecoverPoint product" on page 14.
To add a new replication set to a consistency group:
1. Click the Rescan button to update the list of available volumes at
the production site.
2. Select one or more production volumes to replicate.
Note: In RecoverPoint only: Only the masked volumes at the specified
site are displayed in the available volumes list. Therefore, in the Volume
Details area, ensure the selected volumes are seen by all RPAs. If they are
not, mask the unseen LUNs to RecoverPoint WWNs, click the Rescan
button to update the list of available volumes, and redo this step.
3. Click the Next > button.
The first of the Add Volume from <SiteName> to <RSetNum>
screens is displayed.
4. Click the Rescan button to update the list of available volumes at
the replica site.
5. For each specified production volume, at each copy:
a. Note the production volume specified in the Production
Volume of <RSetNum> area at the top of the screen.
b. From the Volumes at <SiteName> that can be Added to
<RSetNum> list, select the volume that you want to replicate
the specified production volume to.
Note: The volume list only displays volumes at the site that are equal in size to, or larger than, the specified production volume.
Use the Filter volumes by: fields to filter the volumes in the
list by Product, Vendor, Name, UID or LUN.
For best performance during failover, select a volume that is
the same size as the one specified in the Production volume of
<RSetNum> section. If a volume of the same size is not
available, select a volume that is as similar in size as possible.
Note: For CLARiiON splitter environments, you must select a
volume that is exactly the same size as the one specified in the
Production volume of <RSetNum> section.
Note: In RecoverPoint only: Only the masked volumes at the
specified site are displayed in the available volumes list. Therefore, in
the Volume Details area, ensure the selected volumes are seen by all
RPAs. If they are not, mask the unseen LUNs to RecoverPoint
WWNs, click the Rescan button to update the list of available
volumes, and redo this step.
c. Click the Next > button.
When all of the specified production volumes are assigned
volumes at each copy, the Review Replication Set Configuration
screen is displayed.
6. For each replication set, you can click the RSet<num> text in the
Name column of the Replication Sets table and enter a
descriptive name.
Note: In RecoverPoint only, clicking on a cell in any of the copy columns
(Production, Local, or Remote) displays the relevant volume’s
information in the Volume Details section at the bottom of the screen.
7. Apply your settings by clicking the Next > button.
• If you are using RecoverPoint, the Select Journal Volumes
screen is displayed, see “How to configure journals” on
page 138.
• If you are using RecoverPoint/SE, the Select Journal
Provisioning Method screen is displayed, see “How to
configure journals in RecoverPoint/SE” on page 139.
How to replicate Oracle
For instructions on how to replicate an Oracle database, including
using Oracle hot backup procedures with RecoverPoint bookmarks
for point-in-time snapshots and quick testing and disaster recovery,
refer to Replicating Oracle with EMC® RecoverPoint Technical Notes.
How to configure journals
Before you begin, make sure you are well acquainted with the concepts: "Journals" on page 33, "The production journal volume" on page 36, "The replica journal volumes" on page 37, and "Documentation relevance per RecoverPoint product" on page 14.
To configure journals:
1. For each copy:
a. Click the Rescan button to update the list of available
volumes.
b. Select the volumes that you want to add to the journal at the
copy site. Multiple volumes can be selected.
Use the Filter volumes by: fields to filter the volumes in the
list by Product, Vendor, Name, UID or LUN.
For best performance, select volumes that are identical in size.
If identically sized volumes are not available, select volumes
that are similar in size.
Note: In RecoverPoint only: Only the masked volumes at the
specified site are displayed in the available volumes list. Therefore, in
the Volume Details area, ensure the selected volumes are seen by all
RPAs. If they are not, mask the unseen LUNs to RecoverPoint
WWNs, click the Rescan button to update the list of available
volumes, and redo this step.
c. Click the Next > button.
The Create Consistency Group screen is displayed.
2. Review the settings in the Create Consistency Group screen, and
verify that they are correct.
3. Make sure you have read “Starting replication” on page 141.
Note: If you do not wish to start data transfer immediately, uncheck the
Start data transfer immediately checkbox. Before you start transfer to
any replica, make certain that the replica volumes are unmounted from
any hosts and any volume groups are deported from the logical volume
manager (AIX, HP-UX, Windows, and Solaris have volume managers
built into the operating system; Veritas Volume Manager can be used
with any of these operating systems).
4. To create the consistency group and apply all of the specified
settings, click the Finish button.
Swapping LUN numbers of journal volumes
Swapping LUN numbers for LUNs that have already been exposed to
an RPA cluster should be avoided when possible. When it cannot be
avoided, it should be done according to the following procedure.
Loss of the journal cannot be avoided.
1. Disable the consistency group.
Disabling the consistency group causes a full sweep of the
consistency group when it is enabled. This procedure will also
cause journal loss.
2. Remove journals from the consistency group.
3. Swap the LUNs (on the storage array).
4. Add LUNs as journals.
5. Enable the consistency group.
A full sweep of the consistency group occurs.
How to configure journals in RecoverPoint/SE
Before you begin, make sure you are well acquainted with the concepts: "Journals" on page 33, "The production journal volume" on page 36, "The replica journal volumes" on page 37, and "Documentation relevance per RecoverPoint product" on page 14.
In RecoverPoint/SE you can:
◆ manually configure journal volumes by following the steps in "To manually configure journal volumes" on page 141.
◆ allow RecoverPoint/SE to automatically provision and configure them for you by following the steps in "To automatically provision journal volumes" on page 140.
To automatically provision journal volumes
To allow RecoverPoint/SE to automatically provision and configure
the journal volumes for you (default option):
Note: Make sure you have allocated five free hard disks per RAID group, per
site, before beginning the following procedure.
1. If this is the first consistency group you are creating, specify the
number of RAID groups to create in the Select number of raid
groups to create field. Otherwise, skip this step.
Note: Once these RAID settings are defined, they cannot be modified or
undone, as this option will not be displayed again.
2. Decide whether you wish to provide a pre-defined value for the
size of each copy journal, or have RecoverPoint/SE calculate the
required journal size based on bandwidth and a required
protection window.
• To pre-define the journal size at each copy, click the
Predefined Journal Size radio button and enter a value for the
journal size (in GB).
Note: The specified journal size will be applied to every journal of
every copy.
• To have RecoverPoint/SE calculate the required journal size
based on bandwidth and a required protection window, click
the Bandwidth radio button, and enter both values (see
“Required protection window” on page 153). The required
journal size is displayed under the required protection
window as the Calculated journal size.
Note: The specified calculated journal size will be applied to every
journal at every copy.
3. Click the Next > button.
The specified RAID groups are created, and cannot be modified.
The Create Consistency Group screen is displayed.
4. Review the settings in the Create Consistency Group screen, and
verify that they are correct.
Note: Make sure you have read “Starting replication” on page 141. If you
do not wish to start data transfer immediately, uncheck the Start data
transfer immediately checkbox. Before you start transfer to any replica,
make certain that the replica volumes are unmounted from any hosts and
any volume groups are deported from the logical volume manager (AIX,
HP-UX, Windows, and Solaris have volume managers built into the
operating system; Veritas Volume Manager can be used with any of these
operating systems).
5. To create the consistency group and apply all of the specified
settings, click the Finish button.
CAUTION
If you selected to start data transfer immediately, a full sweep
synchronization process begins on all volumes in the consistency
group.
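The bandwidth-based sizing option in step 2 is calculated by RecoverPoint/SE itself; the exact formula is not documented here. The following Python sketch only illustrates the general relationship, under the assumption that the journal must hold all writes made during the required protection window plus the image access log reserve (20% by default, per the copy journal policy settings). The function name and safety factor are hypothetical; the value shown in the Calculated journal size field is authoritative.

def estimate_journal_size_gb(change_rate_mbps, protection_window_hours,
                             image_access_log_pct=20, safety_factor=1.05):
    """Rough journal-size estimate in GB -- NOT the RecoverPoint/SE formula.

    Assumes the journal must retain every write made during the required
    protection window, plus the image access log reserve (20% by default).
    change_rate_mbps is the sustained write rate into the group, in Mbit/s.
    """
    window_seconds = protection_window_hours * 3600
    data_gb = change_rate_mbps / 8 / 1024 * window_seconds   # Mbit/s -> GB over the window
    usable_fraction = 1 - image_access_log_pct / 100          # share of journal kept for history
    return data_gb / usable_fraction * safety_factor

# Example: 40 Mbit/s of sustained writes with a 24-hour protection window
print(round(estimate_journal_size_gb(40, 24), 1))   # ~553.7 GB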
To manually configure journal volumes
To manually configure journal volumes:
1. Select the Manually Select Journal Volumes radio button.
2. Click the Next > button.
The Select Journal Volumes screen is displayed.
3. Follow the instructions in “How to configure journals” on
page 138.
Starting replication
When a consistency group, copy, or volume is defined in
RecoverPoint for the first time, its volumes are initialized, see
“First-time initialization” on page 224.
By default, RecoverPoint writes the first snapshot directly to the
replica, without first writing it to the journal. You can override the
default (refer to Perform fast first-time initialization in Table 17 on
page 149) to write the initialization snapshot first to the journal. This
option is more time-consuming but provides greater data protection.
Once first-time initialization is completed, each consistency group
will be in one of the following states:
◆ Replicating: The consistency group is enabled, the splitter is replicating to the RPAs, the RPAs are transferring to the replica journal or journals. The snapshots from the journal are then distributed to the replica storage.
◆ Marking: The consistency group is enabled, the splitter is replicating to the RPAs, but the RPAs are unable to transfer to the replica journal. The location of the changes is stored in RPA1, RPA2, as well as on the production journal volume. When contact with the remote site is restored, the remote replica is synchronized, but only at those locations that were marked as having changed. Then transfer and replication can resume.
The following can cause the RPA to go to marking mode:
• WAN unavailable
• RPAs at remote site not available (for instance, loss of power)
• Transfer disabled manually
• High load (temporary bottleneck in replication environment)
◆ No marking/no replication: the splitter does not write to the RPAs. This can be caused by a manually disabled consistency group or by a disaster at the production site (no RPAs available).
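As a rough illustration of how these states follow from the splitter and RPA conditions described above, the following hypothetical Python sketch maps those conditions to a state name; it is not RecoverPoint logic.

def group_state(group_enabled, splitter_writes_to_rpas, rpas_transfer_to_journal):
    """Hypothetical sketch of the replication states described above."""
    if not group_enabled or not splitter_writes_to_rpas:
        return "No marking/no replication"      # disabled group or production-site disaster
    if not rpas_transfer_to_journal:
        return "Marking"                        # changes are tracked; transfer is paused
    return "Replicating"

print(group_state(True, True, True))    # Replicating
print(group_state(True, True, False))   # Marking (e.g., WAN down, high load)
print(group_state(False, False, False)) # No marking/no replication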
Configuring replication policies
Replication with the RecoverPoint system is policy-driven. A
replication policy, based on the particular business needs of your
company, is uniquely specified for each consistency group, and each
copy. The policy comprises a set of settings that collectively governs
the way in which replication is carried out. Replication behavior
changes dynamically during system operation in light of the policy,
the level of system activity, and the availability of network resources.
The replication policy settings are presented in “Configuring
consistency group policies” on page 143 and “Configuring copy
policies” on page 152.
Configuring consistency group policies

The tables in this section describe the available policy settings for consistency groups. They include:
◆ “Consistency Group General Settings”
◆ “Consistency Group Compression Policy Settings”
◆ “Consistency Group Protection Policy Settings”
◆ “Consistency Group Resource Allocation Policy Settings”
◆ “Consistency Group Stretch Cluster / SRM Support Policy Settings”
◆ “Consistency Group Advanced Policy Settings”

Table 12  Consistency Group General Settings

Setting
Values and description
Name
The name of the consistency group.
Primary RPA
The RPA that you prefer to use to replicate the consistency group. When the
primary RPA is not available, the consistency group will switch to
another RPA in the RPA cluster. Whether data will transfer when
replication is switched to another RPA depends on the Allow data
transfer even when group is handled by non-preferred RPA policy
(Table 17 on page 149).
Note: Best practice is to ensure that synchronous groups are set to
use different RPAs than asynchronous groups. Mixing between the
two may result in low I/O rates for the synchronous groups. It is also
recommended that dynamic sync and purely synchronous
consistency groups reside on different RPAs, whenever possible. See
“Replication modes” on page 50 for more information on
synchronous, asynchronous, and dynamic sync groups.
Table 13  Consistency Group Compression Policy Settings
Setting
Values and description
Enable Compression
Default=enabled
To compress data before transferring over the WAN. Can reduce
transfer time significantly.
Only available if license supports compression. Not relevant for CDP
(single-site) configurations.
Compression Level
Default = 10
Compression decreases transfer time, but increases computational
effort.
1: Highest level of compression; requires more RPA resources
10: Fastest compression
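RecoverPoint's compression scale is inverted relative to common libraries such as zlib (where 9 is the strongest). The following Python sketch is only an analogy for the CPU-versus-size trade-off described above; the mapping and the helper are hypothetical, not what the RPA actually runs.

import zlib

def compress_payload(data: bytes, rp_level: int) -> bytes:
    """Analogy for the Compression Level scale -- a hypothetical helper only.

    RecoverPoint's scale runs from 1 (highest compression, most RPA CPU) to
    10 (fastest); zlib's runs the other way, so the mapping below inverts it.
    """
    zlib_level = max(1, min(9, 10 - rp_level))   # rp 1 -> zlib 9, rp 10 -> zlib 1
    return zlib.compress(data, zlib_level)

sample = b"journal data " * 1000
print(len(compress_payload(sample, 1)), len(compress_payload(sample, 10)))
# the level-1 output is typically smaller (or equal), at the cost of more CPU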
Table 14  Consistency Group Protection Policy Settings
Setting
Values and description
Asynchronous
Default=enabled
When enabled, RecoverPoint replicates consistency group data
asynchronously, see “Asynchronous replication mode” on page 51.
Synchronous
Default=disabled
When enabled, RecoverPoint replicates consistency group data
synchronously, see “Synchronous replication mode” on page 51.
Dynamic by latency
Default=disabled
Only relevant for synchronous replication mode.
When enabled, RecoverPoint alternates between synchronous and
asynchronous replication modes, as necessary, according to latency
conditions (the number of milliseconds or microseconds between the
time the data is written to the local RPA and the time that it is written
to the RPA or journal at the remote site), see “Dynamic sync mode”
on page 52.
Start async replication above: When the specified limit is reached,
RecoverPoint automatically starts replicating asynchronously, see
“Asynchronous replication mode” on page 51.
Resume sync replication below: When the specified limit is
reached, RecoverPoint goes back to replicating synchronously, see
“Synchronous replication mode” on page 51.
Dynamic by throughput
Default=disabled
Only relevant for synchronous replication mode.
When enabled, RecoverPoint alternates between synchronous and
asynchronous replication modes, as necessary, according to
throughput conditions (the total writes that reach the local RPA, per
copy, in kb/s), see “Dynamic sync mode” on page 52.
Start async replication above: When the specified limit is reached,
RecoverPoint automatically starts replicating asynchronously, see
“Asynchronous replication mode” on page 51.
Resume sync replication below: When the specified limit is
reached, RecoverPoint goes back to replicating synchronously, see
“Synchronous replication mode” on page 51.
System Optimized Lag
Default = enabled
This setting defines the RPO of the consistency group.
To have RecoverPoint determine the best lag for an efficient and
practical solution. If any other solution is needed, please contact EMC
Customer Service.
Lag
Default = disabled
This setting defines the RPO of the consistency group, and is set
manually, in MB, GB, writes, seconds, minutes, or hours.
In RecoverPoint, lag starts being measured when a write made by the
production host reaches the local RPA, and stops being measured
when the write reaches either the target RPA, or the target journal,
depending on the Measure lag when writes reach the target RPA
(opposed to journal) setting.
Note: When the Allow regulation setting is disabled, the selected RPO is not guaranteed, but the system will try its best to replicate within the RPO setting, without affecting host performance.
Allow Regulation
Default = disabled
Allows RecoverPoint to control the acknowledgement of writes back
to the host in the case of bottlenecks or insufficient resources that
would otherwise prevent RecoverPoint from replicating data.
When enabled, slows host applications when approaching the lag
policy limit. When the system cannot replicate the current incoming
write-rate while guaranteeing the lag setting, the system delays
acknowledgements to guarantee that RPO is always enforced.
Additionally, if there is a bottleneck in the system, the system will
regulate host applications instead of entering a high load state.
Note: For BFS groups (consistency groups configured to boot from
the SAN) in Windows KDriver environments, enabling this policy is
discouraged.
When disabled, the system will use policies and limits as guidelines
and make an effort to meet them.
In synchronous replication mode (see “Synchronous replication
mode” on page 51), although the Allow Regulation checkbox is
disabled, this policy is automatically enabled, and cannot be modified.
In dynamic sync mode (see “Dynamic sync mode” on page 52), the
Allow Regulation checkbox is enabled, but the user setting only
applies when the group is replicating asynchronously. During
synchronous replication, host applications are always regulated.
Note: Since host applications are always regulated in sync replication
mode, replicating synchronously with BFS groups (consistency
groups configured to boot from the SAN) in Windows KDriver
environments, is discouraged.
Minimize
Default = minimize lag
Only relevant for remote replication over the WAN or Fibre Channel.
Lag: To keep the lag (difference) between sites to a minimum; system
will use as much bandwidth as needed. Lag is the maximum offset
between writing data to the local RPA and writing it to the RPA or
journal at the remote site. Intervals between snapshots will depend on
available bandwidth and I/O load.
Bandwidth: To use as little bandwidth as possible while making sure
that maximum lag policy is not exceeded (by keeping the lag in the
RPA memory for as long as possible before reaching the specified lag
setting).
NOTE: If Minimize = Bandwidth, in CRR specifications, lag must be
set to System Optimized Lag; otherwise the system will issue an
error.
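The two dynamic sync thresholds (Start async replication above and Resume sync replication below) form a hysteresis band. The Python sketch below illustrates that behavior with assumed latency values in milliseconds; the helper is hypothetical, not RecoverPoint code.

def next_mode(current_mode, latency_ms, start_async_above_ms, resume_sync_below_ms):
    """Hypothetical sketch of the 'Dynamic by latency' thresholds.

    The group replicates synchronously until latency exceeds the upper
    threshold, then stays asynchronous until latency drops below the lower one.
    """
    if current_mode == "sync" and latency_ms > start_async_above_ms:
        return "async"
    if current_mode == "async" and latency_ms < resume_sync_below_ms:
        return "sync"
    return current_mode

mode = "sync"
for latency in [2, 6, 9, 7, 4, 2]:               # assumed latency samples, in ms
    mode = next_mode(mode, latency, start_async_above_ms=8, resume_sync_below_ms=3)
    print(latency, mode)
# switches to async at 9 ms and only returns to sync once latency falls below 3 ms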
Table 15  Consistency Group Resource Allocation Policy Settings
Setting
Values and description
Priority
Default = Normal
Only relevant for remote replication over the WAN or Fibre Channel,
when two or more consistency groups are using the same Primary
RPA.
Select the priority assigned to this consistency group. The priority
determines the amount of bandwidth allocated to this consistency
group in relation to all other consistency groups.
Possible values are: Idle, Low, Normal, High, and Critical
If a consistency group is set to Idle, all other consistency groups that
are set to a greater value will receive a greater share of the RPA
resources in the event of contentions. The consistency group set to
Idle will only receive resources when no other consistency group
requires them.
Note: Consistency groups with Idle settings are still provided some
resources when other groups are replicating on the same RPA, even if
the consistency group’s Primary RPA is heavily loaded. However,
unless the consistency group’s Primary RPA is very heavily loaded,
the effects of an Idle setting may not be noticeable.
Bandwidth Limitation
Default = unlimited
Used to limit the bandwidth that is made available to the consistency
group. Only relevant for remote replication over the WAN.
Note: Bandwidth limitation is not supported for CDP configurations,
or remote replication over Fibre Channel.
Unlimited: This consistency group may use as much available
bandwidth as needed to meet policies.
Limited: This consistency group may use up to the specified amount
of bandwidth.
The feature works as follows:
• System sums bandwidth limitations of all consistency groups on a
single RPA, and limits the outgoing throughput from the RPA to
that sum.
• If consistency groups with settings of ‘limited’ and ‘unlimited’ run
on the same RPA, the effect is that there is no limit to the outgoing
throughput from the RPA.
Note: When limiting the bandwidth, ensure that groups with limited
bandwidth are not configured to run on an RPA with other consistency
groups whose bandwidth is unlimited or this feature will not work. See
“Consistency Group General Settings” on page 143 for more
information on setting the primary RPA.
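The interaction between limited and unlimited groups on one RPA can be summarized in a few lines. The following Python sketch is a hypothetical illustration of the summing behavior described for Bandwidth Limitation; it is not RecoverPoint code.

def rpa_outgoing_limit(group_limits_mbps):
    """Hypothetical illustration of the bandwidth-limitation summing behavior.

    Each entry is a group's Bandwidth Limitation in Mb/s; None means unlimited.
    """
    if any(limit is None for limit in group_limits_mbps):
        return None                      # one unlimited group removes the effective cap
    return sum(group_limits_mbps)

print(rpa_outgoing_limit([10, 20, 5]))   # 35  -> RPA outgoing throughput capped at 35 Mb/s
print(rpa_outgoing_limit([10, None, 5])) # None -> no effective limit on this RPA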
Table 16  Consistency Group Stretch Cluster / SRM Support Policy Settings
Setting
Values and description
Desired cluster mode
Default = none
Use RecoverPoint/CE
Default = disabled
Check this option to enable stretch cluster support.
Group is managed by RecoverPoint CE. RecoverPoint can only monitor.
Only relevant if Use RecoverPoint/CE is enabled.
Check this option to activate stretch cluster support. When activated,
hosts in a Microsoft Cluster can automatically fail over from one site to
the other. RecoverPoint assures that the application data is in the
identical state at the original site and the failover site, so that the
failover is transparent to the application.
When activated, all RecoverPoint user-initiated capabilities are
disabled. The user cannot access images, change policies, or change
volumes. Bookmarks cannot be created in the RecoverPoint
Management Application, but they can be created using the
RecoverPoint command-line interface bookmark commands.
Group is in maintenance mode. It is managed by RecoverPoint, CE can only monitor.
Only relevant if Use RecoverPoint/CE is enabled.
Check this option for planned or unplanned maintenance of the
RecoverPoint system. When activated, stretch cluster support is
disabled and user-initiated RecoverPoint capabilities are enabled.
When activated, hosts are not able to fail over to the other site,
although they can still fail over to another host within the same site.
When activated, RecoverPoint user-initiated capabilities, such as
image access, image testing, changing policies and creating
bookmarks are available.
Use SRM
Default = disabled
Check this option to enable VMware SRM support. This option is valid
when a RecoverPoint Storage Replication Adapter for VMware Site
Recovery Manager is installed on the vCenter Servers.
For more information about the RecoverPoint Adapter for VMware
SRM, refer to the EMC RecoverPoint Adapter for VMware Site
Recovery Manager Release Notes, available on Powerlink.
Group is managed by SRM. RecoverPoint can only monitor.
Only relevant if Use SRM is enabled.
Check this option to activate VMware SRM support. When activated,
VMware Site Recovery Manager manages the group and can perform
failover and test failover from one site to the other.
When activated, all RecoverPoint user-initiated capabilities are
disabled. The user cannot access images, change policies, or change
volumes. Bookmarks cannot be created in the RecoverPoint
Management Application, but they can be created using the
RecoverPoint command-line interface bookmark commands.
Group is in maintenance mode. It is managed by RecoverPoint, SRM can only monitor.
Only relevant if Use SRM is enabled.
Check this option for planned or unplanned maintenance of the
RecoverPoint system. When activated, VMware SRM support is
disabled and user-initiated RecoverPoint capabilities are enabled.
When activated, all RecoverPoint user-initiated capabilities, such as
image access, image testing, changing policies, and creating
bookmarks are available.
Table 17  Consistency Group Advanced Policy Settings
Setting
Values and description
Reservations Support
Default = enabled
Enable only if hosts are clustered or if one of the hosts runs AIX
without reservations disabled.
For Reservations Support settings for AIX hosts with host-based
splitters, refer to EMC® RecoverPoint Deploying RecoverPoint with
AIX hosts.
Allow data transfer even when group is handled by non-preferred RPA
Default = enabled
Each RPA cannot transfer the data of more than a specific number of
consistency groups (see the Consistency groups in cluster setting
in the General configuration limits table of the EMC RecoverPoint
and RecoverPoint/SE Release Notes of each RecoverPoint version
for this limit). If a consistency group contains both a local and remote
replica, it counts as two consistency groups toward this limit. To
make sure that if an RPA fails, it can always switch over to another
RPA, the system will not allow more than the maximum number of
consistency groups to be configured in the entire system when Allow
data transfer even when group is handled by non-preferred RPA
= enabled.
When the primary RPA is unavailable for any reason, the consistency
group is switched over to a non-preferred RPA.
Enable to have non-preferred RPA perform all aspects of replication,
including transferring data to the replica or replicas.
Disable to prevent the non-preferred RPA from transferring data.
Marking will continue as usual. When operation is restored to the
primary RPA, data transfer will resume.
Measure lag when writes reach the target RPA (as opposed to the journal)
Default=enabled
Only relevant for CRR configurations, and synchronous replication to
a CDP copy. Not relevant for asynchronous replication to a CDP
copy.
Note: It is recommended to leave this setting as is.
By default, your RecoverPoint system is configured to measure lag
and generate ACKs when writes reach the remote RPA.
Disable this setting to instruct RecoverPoint to measure lag and
generate ACKs when writes reach the remote journal, instead of the
remote RPA.
When enabled, this policy provides faster performance in both
synchronous and asynchronous replication modes, by reducing both
latency and lag. When Allow Regulation is enabled (see “Allow Regulation” on page 146), reducing lag also reduces the potential requirement to regulate the host applications.
In synchronous replication mode (see “Synchronous replication
mode” on page 51) write performance is substantially higher when
this policy is enabled. However, with this policy enabled,
RecoverPoint does provide a slightly lower level of data security in
the rare case of a simultaneous local and remote RPA disaster.
You can check the difference in write performance by selecting a
group in the Navigation Pane and clicking the Statistics Tab in the
Component Pane. Note the number of Incoming Writes in the
bottom right section of the Statistics Tab during normal replication,
and then compare this to the number of Incoming Writes in the same
section, when Measure lag when writes reach the target RPA (as
opposed to the journal) is disabled.
Distribute group
Default=disabled
Note: Both enabling and disabling this setting causes the journal of
all copies in the consistency group to be lost.
Allows group writes to be distributed across multiple RPAs,
significantly heightening the maximum available RPA throughput,
and therefore, allowing for a significantly larger group. For throughput
and IOPS performance statistics (during synchronous and
asynchronous replication) and feature limitations, see the EMC
RecoverPoint and RecoverPoint/SE Release Notes. For more
information on distributed consistency groups, see “Distributed
consistency groups” on page 58.
Enable to specify secondary RPAs.
When enabled, a minimum of one, and a maximum of three
secondary RPAs can be selected.
There is only a small improvement in performance when a group is
run on three RPAs. However, there is a steep improvement in
performance when a group is run on four RPAs.
Note: Before changing this setting, make sure all preferred RPAs
(both primary and secondary) are connected by Fibre Channel and
can see each other in the SAN. For more information, see “What
should I know before setting a group as distributed?” on page 59.
Snapshot Granularity
Default = fixed (per second)
Fixed (per write): To create a snapshot for every write operation,
over a specific (local or remote) link.
Fixed (per second): To create one snapshot per second, over a
specific (local or remote) link.
Dynamic: To have the system determine the snapshot granularity of
a specific (local or remote) link, according to available resources.
Note: For distributed consistency groups, the snapshot granularity of
all links in the consistency group can be no finer than one second.
See “Distributed consistency groups” on page 58 for more
information.
Perform fast first-time initialization
Default = enabled
Only relevant for initializations that occur for the first time.
Note: This setting is set per (local or remote) link.
During normal replication, RecoverPoint transfers data from
production to the replica journal, and then from the replica journal to
the replica storage. During first-time initialization, sending data
through the journal is unnecessary, since the replica does not
contain previous data that can be used to construct a complete
image, and a complete image must be transferred before failover is
possible.
When this policy is enabled, RecoverPoint transfers data directly to
the replica storage. The data is not stored in the journal first, and
consequently, the initialization process is substantially shorter. In this
case, the replica is not consistent with production until the transfer of
the whole image to the replica storage is complete. Therefore, if a
disaster were to strike at the production site before the transfer of the
image was complete, you would not be able to fail over to the replica.
When this policy is disabled, RecoverPoint transfers data to the
replica journal, and only then from the replica journal to the replica
storage. Disabling this policy is useful, for example, when disabling
and enabling an existing consistency group, causing the group to be
initialized. In this case, RecoverPoint may be able to use the existing
data at the replica site (journal and storage) to construct a complete
image, which is required for failover purposes. To enable failover
during initialization, it is recommended to also disable the Allow
distribution of snapshots that are larger than capacity of journal
volumes policy (see “Allow distribution of snapshots that are larger
than capacity of journal volumes” on page 157).
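To make the Snapshot Granularity options in the table above concrete, the hypothetical Python sketch below groups an assumed stream of timestamped writes into snapshots per write and per second; dynamic granularity is decided by the system and is not modeled here.

from itertools import groupby

def group_writes_into_snapshots(writes, granularity):
    """Hypothetical sketch of the Snapshot Granularity policy -- not RecoverPoint code.

    'writes' is a list of (timestamp_in_seconds, payload) tuples in write order.
    """
    if granularity == "per write":
        return [[w] for w in writes]                                  # one snapshot per write
    if granularity == "per second":
        return [list(g) for _, g in groupby(writes, key=lambda w: int(w[0]))]
    raise ValueError("dynamic granularity is chosen by the system at run time")

writes = [(0.1, "a"), (0.4, "b"), (1.2, "c"), (1.9, "d"), (2.0, "e")]
print(len(group_writes_into_snapshots(writes, "per write")))    # 5 snapshots
print(len(group_writes_into_snapshots(writes, "per second")))   # 3 snapshots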
Configuring copy policies

The tables in this section describe the available policy settings for copies. They include:
◆ “Copy General Settings”
◆ “Copy Protection Policy Settings”
◆ “Copy Journal Policy Settings”
◆ “Copy Advanced Policy Settings”
Table 18  Copy Protection Policy Settings
Setting
Values and description
Required protection window
Default = disabled
The protection window indicates how far in time the replica image can
be rolled back. Click the checkbox to define a required protection
window. Specify the length of the required protection window.
When the required protection window is defined, the status of the
Current and the Predicted Protection Windows are displayed in the
Journal Tab of the copy (see “The Journal Tab” on page 196). In
addition, the system will raise an event in any of the following cases:
• Current Protection Window becomes insufficient.
• Current Protection Window becomes sufficient.
• Predicted Protection Window becomes insufficient.
• Predicted Protection Window was insufficient and becomes
sufficient.
Enable RecoverPoint snapshot consolidation
Default = disabled
Snapshots are consolidated to allow for the storage of a longer history
in the copy Journal (see “Automatic snapshot consolidation” on
page 41).
The system will raise an event in any of the following cases:
• The specified group is not distributing.
• The specified snapshots have not reached the copy storage.
• The consolidation times are not far enough apart for significant changes to have occurred. There must be at least 1 GB of space between the snapshots being consolidated.
• Another consolidation is in progress on the same RPA.
NOTE: Snapshot consolidation cannot be enabled for a group that is
part of a group set, see “Automatic periodic bookmarking” on
page 237. When RecoverPoint snapshot consolidation is enabled,
the Predicted Protection Window is not calculated. For snapshot
consolidation, the minimum journal size is 30GB, see “Journal size
with snapshot consolidation” on page 34.
Do not consolidate any snapshots for at least
Default = 2 days
Minimum = 12 hours. The period during which snapshot data is not to
be consolidated. The period’s start time is always today, and the
period’s end time is n hours / days / weeks / months ago.
The following conditions apply:
• Must be a minimum of 12 hrs.
• If no daily or weekly consolidations are specified, the remaining
snapshots are consolidated monthly.
Consolidate snapshots that are older than x to one snapshot per day for y days
Default = 5 days
Snapshots are consolidated every ~24 hours.
Select the Indefinitely checkbox to consolidate all subsequent
snapshots in ~24 hour intervals.
The following conditions apply:
• If the Indefinitely checkbox is not selected, and no weekly
consolidations are specified, the remaining snapshots are
consolidated monthly.
• If the Indefinitely checkbox is selected, weekly and monthly
consolidations are disabled, and the remaining snapshots are
consolidated daily.
Consolidate snapshots that are older than x to one snapshot per week for y weeks
Default = 4 weeks
Snapshots are consolidated every ~168 hours.
Select the Indefinitely checkbox to consolidate all subsequent
snapshots in ~168 hour intervals.
The following conditions apply:
• If the Indefinitely checkbox is not selected, the remaining
snapshots are consolidated monthly.
• If the Indefinitely checkbox is selected, monthly consolidations are
disabled, and the remaining snapshots are consolidated weekly.
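Taken together, the default consolidation settings tier snapshots by age. The Python sketch below illustrates the resulting tiers under the default values (2 days unconsolidated, then daily for 5 days, weekly for 4 weeks, and monthly after that); the boundaries are an assumption for illustration and the helper is not RecoverPoint code.

def consolidation_tier(age_days, protect_days=2, daily_days=5, weekly_weeks=4):
    """Hypothetical sketch of how the default settings tier snapshots by age."""
    if age_days < protect_days:
        return "keep every snapshot"
    if age_days < protect_days + daily_days:
        return "one snapshot per day"
    if age_days < protect_days + daily_days + weekly_weeks * 7:
        return "one snapshot per week"
    return "one snapshot per month"

for age in (1, 4, 20, 60):                       # snapshot age in days
    print(age, "->", consolidation_tier(age))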
Table 19  Copy Journal Policy Settings
Setting
Values and description
Maximum Journal Lag
Default = unlimited
Defines the data access aspects of RTO for each copy.
Note: RTO includes the time spent trying to fix the problem without a recovery, the recovery itself (RecoverPoint’s role in an organization’s RTO, which is controlled by this setting), testing, and communication to the users. Decision time for users is not included.
When data is received by the RPA faster than it can be distributed to
storage volumes, it accumulates in the journal. The Maximum
Journal Lag is the maximum amount of snapshot data (in bytes, KB,
MB, or GB) that is permissible to hold in the replica journal before
distribution to the replica storage. In other words, the amount of data
that would have to be distributed to the replica storage before failover
to the latest image could take place. In terms of RTO this is the
maximum time that would be required in order to bring the replica
up-to-date with production.
When the Maximum Journal Lag value is reached, the system
switches to fast-forward (three-phase) distribution mode, and no
longer retains rollback information. As soon as the lag is within the
allowed limits, rollback data is retained again.
See “Three-phase distribution” on page 97 for more information.
Proportion of journal allocated for image access log (%)
Default = 20
A host may access its journal for testing. To test, it may be necessary
for the host to write to the journal. These writes are written to the
image access log and rolled back as soon as testing is completed.
Proportion of journal allocated for image access log determines
how much may be written to the journal.
Journal size limit (GB)
Default = 1200
For RecoverPoint to function correctly, the journal size limit must be
set for each journal. If you need to increase any journal size beyond
this limit, modify the size limit accordingly. Maximum size of any
volume of the journal = 2 TB; maximum total journal size = 10 TB.
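The journal policy settings above can be read as simple arithmetic: the time needed to distribute the pending journal lag approximates RecoverPoint's contribution to RTO, and the image access log is a fixed share of the journal. The Python sketch below is a back-of-the-envelope illustration with assumed rates; it is not a RecoverPoint formula.

def journal_metrics(journal_size_gb, journal_lag_gb, distribution_rate_mb_per_s,
                    image_access_log_pct=20):
    """Back-of-the-envelope reading of the copy journal settings -- illustration only."""
    catch_up_minutes = journal_lag_gb * 1024 / distribution_rate_mb_per_s / 60
    image_access_log_gb = journal_size_gb * image_access_log_pct / 100
    return catch_up_minutes, image_access_log_gb

# Assumed values: 300 GB journal, 50 GB of undistributed snapshot data,
# 80 MB/s distribution rate to the replica storage
minutes, log_gb = journal_metrics(300, 50, 80)
print(round(minutes, 1), "min to catch up;", log_gb, "GB image access log")  # 10.7 min; 60.0 GB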
Table 20  Copy Advanced Policy Settings
Setting
Values and description
Host OS
Default = Other/Mixed
Select the operating system of the host writing to the volumes in the
consistency group. If the hosts are not all running the same operating
system, select Other/Mixed. If one or more of the volumes in the
consistency group resides on an ESX server, set as follows:
• VMWare ESX Windows - when running a guest Windows OS with
a fabric-based or CLARiiON splitter
• Windows - when running a guest Windows OS with a host-based
splitter
• VMWare ESX - when running a guest OS other than Windows,
with a fabric-based or CLARiiON splitter
Note: RecoverPoint does not support the replication of virtual
machines running a guest OS other than Windows, with a host-based
splitter.
Reservations Policy
Default = auto
The default value should be used except in the following case.
Reservations Policy = SCSI-2 is required if all the following are true:
• Windows host-based splitter and Microsoft Cluster Servers earlier
than Windows 2008 clusters
• DMX storage with microcode before 5772
Other values should not be used unless specifically instructed to do
so by EMC Customer Service.
Failall variant
Default = auto
This value determines the type of error message returned to a host
that is trying to access a volume for which image access is not
enabled.
Do not change this value unless specifically instructed to do so by
EMC Customer Service.
Allow distribution of snapshots that are larger than capacity of journal volumes
Default = enabled
In order to fail over to a replica, there must be one complete and
consistent image at the replica site. This image can consist of multiple
snapshots, contained solely in the replica journal, solely in the replica
storage, or a combination of both.
In certain situations (for example, after a lengthy communication
outage on the WAN, or a first-time initialization) RecoverPoint may
need to transfer a snapshot that is larger than the capacity of the
journal.
When this policy is enabled, RecoverPoint starts writing the data of
the snapshot (that is larger than the replica journal) to the replica
storage while the additional data from the same snapshot is still being
received by the replica journal. In this case, if a disaster were to strike
at the production site before the complete image was transferred to
the replica storage, it would not be possible to fail over to the replica.
If the ability to fail over to the pre-distribution image in the case of a
disaster during the initialization process is a requirement, disable this
policy. When this policy is disabled, the system automatically pauses
transfer when the last complete image is about to be removed from
the replica, providing the opportunity to:
• increase the journal’s capacity (see “How to modify an existing
journal” on page 170). When transfer is re-enabled, RecoverPoint
will synchronize the writes that were made after transfer was
paused.
• prepare a backup. After the backup is prepared, re-enable this
policy, secure in the knowledge that if a disaster should strike the
production site during the initialization process (before the
complete image was transferred to the replica storage), you could
restore the last complete and consistent image from backup.
Modifying existing settings and policies
Before you begin, make sure you are well acquainted with the
concepts of “RecoverPoint Management Application” on page 29,
“Consistency groups” on page 30 and “Documentation relevance per
RecoverPoint product” on page 14. After a consistency group has
been created in the RecoverPoint system, use the following
procedures to modify its settings and policies.
The following sections deal with the topics:
◆ “How to modify an existing consistency group”
◆ “How to modify an existing copy”
◆ “How to modify an existing replication set”
◆ “How to modify an existing journal”

How to modify an existing consistency group

The following sections deal with the topics:
◆ “How to modify the group replication policy”
◆ “How to add a new copy”
◆ “How to remove a copy”
◆ “How to add a new replication set”
◆ “How to remove a replication set”
◆ “How to enable a consistency group and start transfer”
◆ “How to disable a consistency group”

How to modify the group replication policy

To modify the policy settings of an existing consistency group, select the consistency group in the Navigation Pane, and click its Policy Tab in the Component Pane. All of the settings in “Configuring consistency group policies” on page 143 can be modified through the consistency group’s Policy Tab.
How to add a new copy
To add a new copy to an existing consistency group:
1. Either:
• Right-click on the consistency group name in the Navigation
Pane and select the Add copy option.
or
• Select a consistency group name in the Navigation Pane, and
click the Add Copy button above the Component Pane.
2. Enter a name for the new copy in the Name field.
The other policy settings are optional. The default values provide
a practical configuration. It is recommended to accept the default
settings unless there is a specific business need to set other
policies. If you do want to specify values for the available
settings, note that they are identical to the settings for the
production copy. Configure these settings, as required, according
to the instructions in “Configuring copy policies” on page 152.
3. Click the Finish button.
How to remove a copy

To remove a copy, either:
◆ Right-click on the copy name in the Navigation Pane and select the Remove Copy option.
or
◆ Select a copy in the Navigation Pane, and click the Remove Copy button above the Component Pane.
See also: “Copy commands” on page 193
How to add a new replication set
To add a new replication set to an existing consistency group:
1. Select a consistency group name in the Navigation Pane, and
click the Add Replication Set button above the Component
Pane.
The Add Replication Set Wizard is displayed.
2. Follow the instructions in “How to add replication sets” on
page 136.
How to remove a replication set
To remove a replication set from an existing consistency group:
1. Select the consistency group in the Navigation Pane.
2. Click the Replication Sets Tab in the Component Pane.
3. Select a replication set.
4. Click the Remove Replication Sets button above the Component Pane.
How to enable a consistency group and start transfer
To enable a disabled consistency group, and (optionally) start
transfer:
1. Either:
• Right-click on the consistency group name in the Navigation
Pane and select the Enable group option.
or
• Select Consistency Groups in the Navigation Pane. In the
Consistency Group Tab in the Component Pane, select one or
more disabled consistency groups and click the Enable Group
button above the Component Pane.
The Enabling group dialog box is displayed.
Note: Before you start transfer to any replica, make certain that the
replica volumes are unmounted from any hosts and any volume
groups are deported from the logical volume manager (AIX, HP-UX,
Windows, and Solaris have volume managers built into the operating
system; Veritas Volume Manager can be used with any of these
operating systems).
2. Unless you wish to initialize the replica from a backup, or the group configuration is not complete, click Yes. The replica will synchronize with the source (full sweep) and transfer will start.
If you wish to initialize the replica from a backup, clear the Start data transfer immediately checkbox and click Yes. Initialize the replica from the backup, use the Clear Markers command (Table 25 on page 193), unmount the replica volumes from any hosts, deport any volume groups, and then activate transfer with the Start Transfer button.
By default, RecoverPoint writes the first snapshot directly to the
replica, without first writing it to the journal. You can override the
default (refer to Perform fast first-time initialization in Table 17
on page 149) to write the initialization snapshot first to the
journal. This option is more time-consuming but provides greater
data protection.
Once first-time initialization is completed, each consistency
group will be in one of the following replication modes:
• Replicating: The consistency group is enabled, the splitter is
replicating to the RPAs, the RPAs are transferring to the replica
journal or journals. If Image access is disabled (default state),
the snapshots from the journal are also distributed to the
replica storage.
• Marking: The consistency group is enabled, the splitter is
replicating to the RPAs, but the RPAs are unable to transfer to
the replica journal. The location of the changes is stored in
RPA1, RPA2, as well as on the production journal volume.
When contact with the remote site is restored, the remote
replica is synchronized, but only at those locations that were
marked as having changed. Then transfer and replication can
resume.
The following can cause the RPA to go to marking mode:
– WAN unavailable
– RPAs at remote site not available (for instance, loss of
power)
– Transfer disabled manually
– High load (temporary bottleneck in replication
environment)
• No marking/no replication: the splitter does not write to the
RPAs. This can be caused by a manually disabled consistency
group or by a disaster at the production site (no RPAs
available).
How to disable a consistency group

To disable an enabled consistency group, either:
◆ Right-click on the consistency group name in the Navigation Pane and select the Disable group option.
or
◆ Select Consistency Groups in the Navigation Pane. In the Consistency Group Tab in the Component Pane, select one or more enabled consistency groups and click the Disable Group button above the Component Pane.
Note: Disabling a group stops all replication, deletes journals, and causes a full sweep on all copies in the group when the group is re-enabled.
How to modify an existing copy

The following procedures will guide you through the process of modifying an existing copy:
◆ “How to modify the copy replication policy”
◆ “How to remove a copy”
◆ “How to disable a copy”
◆ “How to enable a copy”
◆ “How to modify the journal of a copy”
How to modify the copy replication policy

To modify the policy settings of an existing copy, select the copy in the Navigation Pane, and click its Policy Tab in the Component Pane. All of the settings in “Configuring copy policies” on page 152 can be modified through the copy’s Policy Tab.

How to remove a copy

To remove a copy, either:
◆ Right-click on the copy name in the Navigation Pane and select the Remove Copy option.
or
◆ Select a copy in the Navigation Pane, and click the Remove Copy button above the Component Pane.
See also: “Copy commands” on page 193
How to disable a copy
To disable an enabled copy, either:
◆ Right-click on the copy name in the Navigation Pane and select the Disable Copy option.
or
◆ Select a copy in the Navigation Pane, and click the Disable Copy button above the Component Pane.
Note: The Production copy cannot be disabled. Caution: Disabling a copy stops all replication, deletes journals, and causes a full sweep when the copy is re-enabled.
See also: “Copy commands” on page 193
How to enable a copy
To enable a disabled copy, either:
◆ Right-click on the copy name in the Navigation Pane and select the Enable Copy option.
or
◆ Select a copy in the Navigation Pane, and click the Enable Copy button above the Component Pane.
Note: Enabling a copy causes a full sweep synchronization of the
volumes at the copy. Before you start transfer to any replica, make
certain that the replica volumes are unmounted from any hosts and any
volume groups are deported from the logical volume manager (AIX,
HP-UX, Windows, and Solaris have volume managers built into the
operating system; Veritas Volume Manager can be used with any of these
operating systems).
See also: “Copy commands” on page 193
How to modify the journal of a copy
To change the storage capacity of a journal at a copy by adding or
removing journal volumes:
Note: If you are using this procedure because the capacity of your
group's copy journals is less than the minimum journal size required for
distributed groups (see the EMC RecoverPoint and RecoverPoint/SE Release
Notes for this limit), after finishing the procedure, disable and then
re-enable the consistency group (causing a full sweep). See “Distributed
consistency groups” on page 58 and “Full sweeps” on page 80 for more
information.
1. Select the group that the journal belongs to in the Navigation
Pane and click the Add/Edit Journal Volumes button above the
Component Pane, see “Copy commands” on page 193.
The Journal Volumes Wizard is displayed.
2. For each copy;
a. Click the Rescan button to update the list of available
volumes.
b. Select the volumes that you want to add to the journal at the
copy site. Multiple volumes can be selected.
For best performance, select volumes that are identical in size.
If identically sized volumes are not available, select volumes
that are similar in size.
Use the Filter volumes by: fields to display the relevant
volumes.
Note: In RecoverPoint only: Only the masked volumes at the
specified site are displayed in the available volumes list. Therefore, in
the Volume Details area, ensure the selected volumes are seen by all
RPAs. If they are not, mask the unseen LUNs to RecoverPoint
WWNs, click the Rescan button to update the list of available
volumes, and redo this step.
c. Click the Next > button.
When all of the copies in the consistency group are assigned journals,
the Journal Volumes Summary Page is displayed.
3. Apply your settings by clicking the Finish button.
How to modify an existing replication set
In the RecoverPoint system, data consistency and write-order fidelity
are maintained across all volumes by replication sets (see “Replication
sets” on page 33), and the maximum possible volume capacity is
defined by the physical size of the smallest volume of the replication
set. Therefore, when you wish to add storage capacity to a particular
volume, you must resize all of the volumes in the replication set.
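In other words, the usable capacity of a replication set is bounded by its smallest member, which is why every volume in the set must be resized together. A minimal Python sketch of that constraint, with assumed volume sizes:

def replication_set_capacity_gb(volume_sizes_gb):
    """Sketch: a replication set's usable capacity is bounded by its smallest volume."""
    return min(volume_sizes_gb)

# Assumed sizes: production 500 GB, local replica 500 GB, remote replica 450 GB
print(replication_set_capacity_gb([500, 500, 450]))   # 450 -> only 450 GB is usable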
When changing the storage capacity of volumes in RecoverPoint, the
following considerations apply:
◆ When a LUN’s storage capacity changes on the host storage system, the volume representing the LUN in RecoverPoint must be removed from, and then re-added to, the RecoverPoint system, for the change in LUN size to be reflected in RecoverPoint.
◆ When storage sub-systems do not support the resizing of LUNs on storage without losing the data contained on them, contact EMC Customer Service for instructions on how to resize volumes.
The following procedures will guide you through the process of modifying existing replication sets:
◆ “How to rename or remove a replication set”
◆ “How to enlarge the storage capacity of replica volumes”

How to rename or remove a replication set
To rename or remove a replication set:
1. Select a consistency group in the Navigation Pane.
2. Select the consistency group’s Replication Sets Tab in the
Component Pane.
3. Double-click the replication set whose settings you want to
modify. The Volumes Configuration dialog box is displayed.
In the Volumes Configuration dialog box, you can change the
replication set name or remove the whole replication set and add
another in its place. You can also select one of the copies in the
Navigation area to remove a volume at the copy and replace it
with another.
Note: For CLARiiON splitter environments, select replica volumes that
are exactly the same size as the Production volume.
How to enlarge the storage capacity of replica volumes
Read “How to modify an existing replication set” on page 166 before
starting this procedure.
To enlarge the storage capacity of replica volumes:
1. In RecoverPoint:
Disable the group whose replication set you wish to resize, to
avoid a configuration settings conflict. To do so;
a. Right-click on the group name in the Navigation Pane; and
select the Disable group option.
Note: The replica journal history is erased, and the replica can no
longer be rolled back to a previous point-in-time.
b. Select the group containing the replication set that you wish to
resize, in the Navigation Pane; select the Replication Sets
Tab, and double-click on the replication set.
The Volumes Configuration dialog box is displayed.
c. In the Volumes Configuration dialog box; click the Remove
Replication Set button.
d. Click the Apply button, but do not exit the Volumes
Configuration dialog box.
The replication set and its volumes are removed from the
RecoverPoint system.
2. At the replica’s SAN: Dedicate more storage resources to the
required LUNs, enlarging their storage capacity.
3. In RecoverPoint:
While still in the Volumes Configuration dialog box; Configure
the new replication set. To do so;
a. Click the Add New Replication Set button.
b. For each copy;
a. In the Replication Sets node of the Navigation Pane; select
a copy, and click the Add Volume button.
The Select Volume dialog box is displayed.
b. In the Select Volume dialog box; click the Rescan button.
The RecoverPoint SAN discovery utility automatically
detects the change in the physical size of the volume.
Now add the new replication set, and its new volumes, to
the RecoverPoint system.
Note: For CLARiiON splitter environments, select replica
volumes that are exactly the same size as the Production volume.
c. Double-click on the resized LUN. Click the OK button to
apply the changes and exit the Select Volume dialog box.
The new replication set is created, and the resized LUNs
are added (as volumes) to the replication set. The size of
the replication set is defined by the physical size of the
smallest volume of the replication set.
c. Right-click on the disabled group’s name in the main
Navigation Pane; and select the Enable group option. Make
sure that the Start data transfer immediately checkbox is
checked, and click the OK button.
Note: To ensure consistency between replica volumes and their
production sources, a full sweep synchronization is triggered on all
volumes in the consistency group.
For Windows only:
Perform the following steps only if the volume is on a Windows host.
4. At the production host: Extend the production volumes in Windows. To do so, follow the instructions in “How to use Diskpart.exe to extend a data volume in Windows Server 2003, in Windows XP, and in Windows 2000” at:
http://support.microsoft.com/kb/325590/en-us
5. In RecoverPoint:
Bookmark the point-in-time that you wish to apply to the replica,
give it a name that signifies the end of this process. To do so;
a. In the Navigation Pane; right-click on the consistency group
containing the new, larger volumes and select Bookmark
Image. The Create a bookmark image for group dialog box is
displayed. In the Create a bookmark image for group dialog
box; type a bookmark name into the edit box, and click the OK
button. This will become the first (pre-replication) image of
the new volumes.
b. From the Image Access menu, select Enable Image Access
and select the bookmark that you created in the last step. After
selecting the required image; select Logged Access, and click
the Next button. In the summary screen, click the Finish
button.
6. At the replica host:
Ensure there are no writes to the replica’s storage, then clear
the OS cache, invalidating the Windows partition table cache
of the old replica volumes. To do so;
a. First-time initialization can cause changes in the partition
table of the replica volume (or volumes). In a Windows
environment, you must clear the OS cache before changing the
partition table of a replica volume. To clear the OS cache, shut
down all host applications, disable the LUN (or LUNs) on
which the volume resides, and re-enable it. You can do so
either using the relevant commands in the RecoverPoint kutils
utility (see “Kutils Reference” on page 311), or from the “Disk
drives” interface of the Windows Device Manager.
b. Shut down all host applications, unmount all volumes.
7. In RecoverPoint:
Disable access to the replica image, apply all writes from the
image access log to the replica storage, and start transfer. To do so;
From the Image Access menu, select Disable Image Access. Click
the OK button.
How to modify an existing journal

There are two ways to change the storage capacity of an existing journal:
◆ Adding additional journal volumes to a journal in
RecoverPoint; this process does not trigger a full sweep
synchronization, nor does it erase all history in the existing
journal. After this procedure, the replica can still be rolled back to
a previous point-in-time.
Note: The setting of a regular consistency group as a distributed
consistency group, when the capacity of the group's copy journals is less
than the minimum journal size required for distributed groups (see the
EMC RecoverPoint and RecoverPoint/SE Release Notes for this limit),
necessitates the addition of additional journal volumes, followed by a
disabling and re-enabling of the consistency group, which does cause a
full sweep. See “Distributed consistency groups” on page 58 and “Full
sweeps” on page 80 for more information.
See “How to modify the journal of a copy” on page 165 for
detailed instructions on how to perform this procedure.
◆ Resizing an existing journal volume’s LUN on storage; this
process does trigger a full sweep synchronization, and it does
erase all history in the existing journal. After this procedure, the
replica cannot be rolled back to a previous point-in-time.
See “How to resize an existing journal volume” on page 171 for
detailed instructions on how to perform this procedure.
How to resize an existing journal volume
Read “How to modify an existing journal” on page 170 before
starting this procedure.
To resize journals by resizing an existing journal volume:
1. In RecoverPoint:
a. Enable Image Access in Logged Access mode. Choose the
‘Select the latest image’ option, see “Accessing a replica” on
page 242.
Note: This ensures that all writes up to the current point-in-time are
transferred from the replica journal to the replica storage, and only
the host writes from this point forward, will need to be synchronized
at the end of this procedure.
b. Right-click on the copy name in the Navigation Pane; and
select the Disable copy option.
Note: The replica journal history is erased, and the replica can no
longer be rolled back to a previous point-in-time.
c. Select the group that the journal volume belongs to in the
Navigation Pane, select the Replication Sets Tab, and
double-click on one of the replication sets in the replication set
list.
The Volumes Configuration dialog box is displayed.
d. In the Volumes Configuration dialog box; select the required
journal volume’s VOL ID in the Navigation Pane, and click
the Remove Volume button.
e. Click the Apply button, but do not exit the Volumes
Configuration dialog box.
The journal volume is removed from the RecoverPoint system.
2. At the replica’s SAN: Dedicate more storage resources to the
required LUN.
3. In RecoverPoint:
While still in the Volumes Configuration dialog box;
a. In the Journals node of the Navigation Pane, select the copy
containing the journal volume, and click the Add New Journal
Volume button.
The Select Volume dialog box is displayed.
b. In the Select Volume dialog box; click the Rescan button.
The RecoverPoint SAN discovery utility automatically detects
the change in the physical size of the volume.
c. Double-click on the resized LUN. Click the OK button to
apply the changes and exit the Select Volume dialog box.
The volume is added to the RecoverPoint system.
Then, in the Main Interface Navigation Pane;
d. Right-click on the disabled copy’s name in the main
Navigation Pane; and select the Enable copy option. Make
sure that the Start data transfer immediately checkbox is
checked, and click the OK button.
Note: Data transfer is briefly paused for the group, and a short full
sweep synchronization may occur, but only the writes that occurred
after image access was enabled to the replica, will be synchronized.
Manually attaching volumes to splitters
Before you begin, make sure you are well acquainted with the concepts in “Splitters” on page 27 and “Documentation relevance per RecoverPoint product” on page 14.
Volumes need only be manually attached to splitters that are added
to RecoverPoint after the addition of the volumes to a replication set.
When volumes are added to a replication set, they are automatically
attached to all splitters that see them in the SAN.
For boot-from-SAN groups:
When a consistency group is configured to boot from SAN, special considerations and procedures are necessary; please contact EMC Customer Service for more information.
For Brocade splitters:
If you added splitters based on a Connectrix AP-7600B or
PB-48K-AP4-18 switch, make sure you follow the instructions in
“Configure RecoverPoint for replication over the Connectrix device” in
EMC RecoverPoint Deploying RecoverPoint with Connectrix AP-7600B
and PB-48K-AP4-18 Technical Notes.
For CLARiiON splitters:
For CLARiiON splitters, Navisphere login credentials must be
defined for each splitter added to the RecoverPoint system, see
“Splitter credentials” on page 269.
If you attach a volume to an RPA cluster, and that volume is already
attached to a different RPA cluster that shares the same CLARiiON
splitter, the volume appears to attach successfully to the second RPA
cluster, but then faults to the “Attached to other RPA cluster/s” error
state. The volume cannot be used by the RPA cluster to which it was
just attached. To correct this error, use the Detach command on the
Splitter Properties dialog box to detach the volume from the RPA
cluster.
Note: To avoid this problem, a volume should be masked to a single RPA
cluster. A volume that is masked for one RPA cluster should not be masked
for another RPA cluster.
For SANTap splitters:
For SANTap splitters, switch login credentials must be defined for
each splitter added to the RecoverPoint system, see “Splitter
credentials” on page 269.
If you added fabric splitters with SANTap service, create AVTs. Refer
to the EMC RecoverPoint Deploying RecoverPoint with SANTap Technical
Notes, “Creating AVTs” and “Attaching volumes” sections for details.
To manually attach volumes to splitters:
1. Select Splitters in the Navigation Pane and double-click a splitter
in the Component Pane.
The Splitter Properties dialog box is displayed.
Note: If you have a SANTap or CLARiiON splitter, you can click the
Credentials button in the Splitter Properties dialog box to enter login
credentials at this time, see “Splitter credentials” on page 269.
2. Click the Rescan button.
3. Click the Attach button.
The volumes that can be attached to the splitter are displayed.
Select one or more volumes to attach to the splitter. Click OK to
attach the selected volumes to the splitter.
CAUTION
When you attach a volume to a splitter, RecoverPoint ensures
consistency between the replica and the production source by
performing a full sweep synchronization of the volume.
Checking the Attach as Clean checkbox overrides the default
synchronization process by informing the system that the
replica volume being attached to the splitter is known to be an
exact image of its corresponding production volume. If Attach
as Clean is checked, and the replica volume being attached is
inconsistent with its corresponding production volume, it will
remain inconsistent. To ensure consistency, the best practice is to use
the default RecoverPoint synchronization process.
RecoverPoint automatically detects all paths from the splitter to
the volume. If no path exists between the splitter and the volume,
you cannot attach the volume to that splitter.
4. If there are volumes that have not yet been attached to a splitter,
the following warning is displayed in the status line of the main
RecoverPoint Management Application interface window.
a. Click on the warning for more information. If there are
volumes which are not attached to their splitter, the following
is displayed.
b. Click on this warning to display the list of splitters for which
volumes are still available to be attached. To attach the
volumes, in the Navigation Pane, select Splitters. In the
Component Pane, double-click the splitter name. Click Attach
to view the volumes available to be attached to that splitter.
5. After adding splitters and attaching volumes to them, enable the
consistency group. To do so, follow the instructions in “How to
modify an existing consistency group” on page 158.
You cannot replicate a volume until it is attached to a splitter. You
cannot fail over to a local or a remote replica of a volume until it is
attached to a splitter.
For descriptions of the available splitter commands, see “Splitter
management” on page 199.
Chapter 4
Managing and Monitoring
This section describes how to manage and monitor replication in
RecoverPoint. The topics in this section include:
◆ RecoverPoint Management Application ...................................... 178
◆ Monitoring and analyzing system performance ......................... 213
RecoverPoint Management Application
Almost all of the information necessary for the routine monitoring
and managing of the RecoverPoint system is displayed through the
RecoverPoint Management Application.
The information, commands, and settings used to monitor and
manage RecoverPoint are displayed in numerous panes and tabs.
How to use the panes to monitor and manage replication, disaster
recovery, and related commands is described in the following
sections.
Figure 5 RecoverPoint Management Application
Note: The RecoverPoint system will enter maintenance mode when
undergoing one of the following operations: a minor version upgrade, a major
version upgrade, replacing an RPA in an existing cluster, or adding new RPAs
to existing clusters. When in this mode, RecoverPoint can only monitor the
system; user-initiated capabilities are disabled. The title bar will indicate the
name of the site that is in maintenance mode and the operation that is being
performed. Once the system exits this mode, the RecoverPoint GUI will
return to standard managing and monitoring functions. For more information
on these maintenance operations, refer to the EMC RecoverPoint Deployment
Manager Product Guide.
The System Pane
The System Pane provides an overview of system health at a glance.
The pane shows the status of major components of the RecoverPoint
system environment, including the hosts, switches, storage devices,
RPAs at two sites, and the WAN connection.
When the system detects a problem with one of these major
components, the System Pane displays the following:
◆ An error icon on each component that is not functioning
properly. For a warning, this icon is yellow.
◆ The first line of the warning or error message is displayed when
you place your cursor over the image of the currently faulty
component. Click the component's image to display additional
details about the specific error or warning.
◆ The status of a component, particularly until you have completed
logging in to the system, may also be Unknown. In that case, the
corresponding icon is displayed on the component.
Multipath monitoring
When multipath monitoring is active, the system analyzes
network errors at the level of individual paths, and generates a
warning in the System Pane (“The System Pane” on page 179)
whenever there is not full redundancy between the RPA and splitters
or volumes. Full redundancy is defined as follows:
◆ For RPA-to-volume links, there must be at least two distinct paths
between each RPA and volume; that is, each RPA has access to at
least two storage WWNs (and controllers, where relevant) via
non-overlapping paths.
◆ For RPA-to-splitter links, there must be at least two distinct paths
between each RPA and each splitter, using different RPA ports
and host (or switch) WWNs.
When RPA Multipath Monitoring is enabled, the system issues a
warning upon logging in regarding any existing links without full
redundancy. In addition, warning events are written to the log.
By default, multipath monitoring is active for all replicas, for links
both to storage and to splitters.
To enable or disable RPA Multipath Monitoring:
1. At the RecoverPoint Management Application, from the System
menu, select System Settings > Miscellaneous Settings.
2. Select or clear the checkboxes to select the desired options. Then
click Apply.
Note: In the RecoverPoint Management Application, warnings are
displayed in the System Pane. Path information is displayed in the Volume
Configuration and Splitter Properties dialog boxes.
The Traffic Pane
The Traffic Pane displays the amount of SAN and WAN traffic
passing through the RPAs.
The Navigation Pane
The Navigation Pane allows you to navigate to the different tabs
available in the Component Pane. In the Navigation Pane, click the
component on which you wish to focus. The corresponding
component’s information is displayed in the Component Pane, in
one or more dedicated tabs, see “The Component Pane” on page 185.
The Navigation Pane does not display monitoring information, but it
does provide access to a large number of management commands by
right-clicking on system component names in the navigation tree.
Multiple group management
The Navigation Pane is context sensitive; therefore, most of the
commands used for the management of all groups are displayed
when you right-click on the Consistency Groups node.
Note: Additional management commands can be found in the button area
above the Component Pane, when Consistency Groups is selected, see
“Multiple consistency group commands” on page 186.
Specific group management
Most of the commands used for the management of a specific
consistency group are displayed when you right-click on a specific
consistency group name.
Note: Additional management commands can be found in the button area
above the Component Pane, when a specific consistency group is selected,
see “Specific consistency group commands” on page 188.
Copy management
Most of the commands used for the management of replica copies are
displayed when you right-click on a specific copy name.
Note: Additional management commands can be found in the button area
above the Component Pane, when a copy is selected, see “Copy commands”
on page 192.
Splitter management
Most of the commands used for the management of splitters are
displayed when you right-click on Splitters in the Navigation Pane.
Note: Additional management commands can be found in the button area
above the Component Pane, when the Splitter Tab is selected, see “Splitter
commands” on page 199.
RPA management
The commands used for the management of RPAs are displayed
above the Component Pane when you click on RPAs in the
Navigation Pane, see "RPA commands" on page 202.

Volume management
The commands used for the management of Volumes are displayed
above the Component Pane when you click on Volumes in the
Navigation Pane, see "Volume commands" on page 203.

vCenter Server management
Most of the commands related to vCenter Servers are displayed when
you right-click on vCenter Servers in the Navigation Pane.
Note: Additional management commands can be found in the button area
above the Component Pane, when the vCenter Servers Tab is selected, see
“vCenter Server commands” on page 205.
Log management
Most of the commands used for the management of event logs are
displayed when you right-click on Logs in the Navigation Pane.
Note: Additional management commands can be found in the button area
above the Component Pane, when the Logs Tab is selected, see “Logging
commands” on page 210.
The Component Pane
All RecoverPoint commands can be accessed through the button area
above the Component Pane.
The Component Pane is context-sensitive, and the specific tabs that
are displayed in the Component Pane depend on the entity selected
in the Navigation Pane.
General commands
The following commands are accessed through the context-sensitive
menu that is displayed when you right-click any component in the
Navigation Pane.

Table 21 General commands

Refresh (Displayed: Always)
Refresh refreshes the display. Refresh is useful when changes have been made via the command-line interface. If the display is not updated automatically, Refresh forces an update.

Expand All (Displayed: Always)
Expands and lists names of all replicas under the consistency group name.

Collapse All (Displayed: Always)
Collapses names of all replicas under the consistency group name.
Multiple consistency group management
The following tools are available for the management of multiple
consistency groups.
Multiple consistency group commands
Most of the commands listed here are accessible through the context
sensitive menu that is displayed when you right-click on the
Consistency Groups node in the Navigation Pane.
Note: Additional commands are also displayed through buttons that are
displayed at the top right corner of the Component Pane when the
Consistency Groups node is selected, and the criteria for display are met.
Table 22 Multiple consistency group commands

Add Group (Displayed: Always)
Displays the New Consistency Group Wizard, to guide you through the process of creating a new consistency group, see "How to configure a new consistency group" on page 133.

Group Sets (Displayed: Always)
Allows you to create, edit, and delete group sets, see "Automatic periodic bookmarking" on page 237.

Remove Group (Displayed: Only available from the button area at the top of the Component Pane, and only if one or more specific consistency groups are selected in the Consistency Group Tab of the Component Pane)
Deletes the selected consistency group. Caution: Cannot be undone.

Disable / Enable Group (Displayed: Only available from the button area at the top of the Component Pane if one or more consistency groups are selected in the Consistency Group Tab of the Component Pane. Displayed as Disable Group if selected groups are enabled, or Enable Group if selected groups are disabled)
Disables or enables the selected consistency groups. Caution: Disabling a consistency group stops all replication, deletes journals, and causes a full sweep if enabled again.

Apply Parallel Bookmark (Displayed: Only available from the button area at the top of the Component Pane, and only if one or more specific consistency groups are selected in the Consistency Group Tab of the Component Pane. Note: All selected groups must be enabled and transfer must be active)
Applies the same bookmark with the same name to all selected groups. The bookmarked snapshots will be as close together in time as possible, see "Applying bookmarks to multiple groups simultaneously" on page 236.
The Consistency Groups Tab
When the Consistency Groups node is selected in the Navigation
Pane, the Consistency Group Tab is displayed in the Component
Pane.
Figure 6 Consistency Groups Tab
The Consistency Groups Tab displays the status of every consistency
group. Use this screen for monitoring all consistency groups
simultaneously. When RecoverPoint is configured to transfer data
synchronously, the word (Synchronized) is displayed next to the
transfer state in the Transfer column of this table, see Table 14,
"Consistency Group Protection Policy Settings."
Specific consistency group management
The following tools are available for the management of specific
consistency groups.
Specific consistency group commands
Most of the commands listed here are accessible through the context
sensitive menu that is displayed when you right-click on a specific
consistency group name in the Navigation Pane.
Note: Additional commands are also displayed through buttons that are
displayed at the top right corner of the Component Pane when a specific
consistency group is selected, and the criteria for display are met.
Table 23 Specific consistency group commands

Remove Group (Displayed: Only if a consistency group is selected)
Deletes the consistency group. Caution: Cannot be undone.

Disable / Enable Group (Displayed: Only if a consistency group is selected. Displayed as Disable if the group is enabled, or Enable if the group is disabled)
Disables or enables the selected consistency group. Caution: Disabling a consistency group stops all replication, deletes journals, and causes a full sweep if enabled again.

Pause / Start Transfer (Displayed as Pause if the RPA is transferring writes, or Start if RPA transfer is paused)
Applies to all replicas of the consistency group. Causes transfer of writes from the host to the replica to pause or start. The journal continues to distribute snapshots.
Usage: Best practice is not to use this command. Use consistency group policies to set policies for use of bandwidth. Pause Transfer may be used when WAN bandwidth is very limited and you wish to give the largest bandwidth possible to another consistency group. In that case, you may temporarily pause transfer for lower-priority consistency groups. When Pause Transfer is activated, the Pause Transfer button becomes the Start Transfer button. As soon as possible, use the Start Transfer button to resume normal transfer.

Bookmark Image (Displayed: Only if the group is enabled)
Displays the Create a bookmark image for <Group> dialog box, enabling you to add a named bookmark to the current snapshot. When the snapshot is closed, it will be listed with the bookmark name that you give it.

Clear Markers (Displayed: Only when right-clicking on a group name in the Navigation Pane when transfer to at least one copy in the group is paused. Select for which replica or replicas to clear markers)
For the selected consistency groups, Clear Markers clears all markers of the selected copy from the production journal volume; that is, it treats the selected replica as identical to the production source. Caution: To clear markers, the production source and the selected replica volume must be absolutely identical. If they are not, the inconsistencies will remain.
The Clear Markers command should be used only with extreme caution. It is useful when a production source and a replica have been synchronized manually by initializing from backup and adequate bandwidth is not available to synchronize using the storage network. Otherwise, the best practice is not to use this command.

Set Markers (Displayed: Only when right-clicking on a group name in the Navigation Pane when transfer to at least one copy in the group is paused. Select for which replica or replicas and for which replication sets to set markers)
Set Markers causes the selected replica or replicas to be resynchronized by a full sweep. When replicas are inconsistent, the system will automatically cause synchronization even without invoking the Set Markers command.
The Set Markers command is useful if you have accidentally cleared markers, or if you attached as clean and then realize that the replica may not be clean. Otherwise, best practice is not to use this command.

Add Replication Set (Displayed: Only in the button area above the Component Pane)
Displays the New Replication Set Wizard, to guide you through the process of adding a new replication set to the consistency group, see "How to add a new replication set" on page 160.

Remove Replication Set (Displayed: Only in the button area above the Component Pane, if the Replication Sets Tab, and one of the replication sets in the table, are selected)
Removes the selected replication set, and all of its volumes, from the RecoverPoint configuration.

Add/Edit Journal Volumes (Displayed: Only in the button area above the Component Pane)
Displays the Journal Wizard, to guide you through the process of adding or removing volumes from the journal of a copy, see "How to modify the journal of a copy" on page 165.
The Status Tab
To display a graphical representation of the consistency group
transfer and failover status, select the consistency group in the
Navigation Pane. Then in the Component Pane, click the Status Tab.
The following information is displayed:
Table 24 Status Tab

1. <Group>: Running on <RPA2> - Group name and Primary RPA setting, see "The Policy Tab" on page 192.

2. <Local Copy> at <New York> - Copy name and Site Name, see "The Policy Tab" on page 199.

3. Replication modes and states - A visual representation of the current replication modes and states.
• A dashed green line means that the system is replicating asynchronously.
• A dashed green line on top of a solid line means that the system is replicating synchronously.
• A greyed-out line means that replication has stopped.
Note: The lines move in the direction of replication.

4. Transfer - Data transfer state: initializing (with percent completion), paused (by user or by system), or active. When replicating synchronously, the word synchronized is displayed in active transfer mode.

5. Role - The current role of the copy with regard to failover and regulation.
Before failover: Production source and local/remote replica.
After failover: Replica at production and local/remote source.
During regulation: Regulated.

Storage - The state of image access, see "Image access" on page 73.

Image - The image currently being distributed to storage, see "Image access" on page 73.
The Statistics Tab
To display traffic and replication performance statistics, select the
consistency group in the Navigation Pane. Then in the Component
Pane, click the Statistics Tab.
The following traffic statistics are displayed in the Traffic sub-tab:
◆ Total traffic
◆ Application traffic
◆ Initialization traffic
◆ Incoming writes
If the consistency group contains a remote copy, the following
replication performance statistics are displayed in a separate
Replication Performance sub-tab:
◆ Bandwidth reduction
◆ Time lag
◆ Writes lag
◆ Data lag
The Replication Sets Tab
To understand the concept of replication sets, read “Replication sets”
on page 33. To understand the concept of volumes in RecoverPoint,
read “Volumes” on page 36.
To display the consistency group replication sets and their volumes,
select a consistency group in the Navigation Pane and click the
Replication Sets Tab in the Component Pane.
The volumes of each replication set are displayed. The total
replication set size (the size of the smallest volume in the replication
set) and volume ID of each volume of the replication set, are also
displayed. Double-clicking on a replication set opens the Volumes
Configuration dialog box, described in “How to enlarge the storage
capacity of replica volumes” on page 167 and “How to resize an
existing journal volume” on page 171.
The Journals Tab
To display the configuration information of existing journals in a
consistency group, select a consistency group in the Navigation
Pane. Then, in the Component Pane, click the Journals Tab.
The following information is displayed:
◆ The number of volumes that make up the journal at each copy
◆ The total size of the journal at each copy
Note: To edit a journal, click the Add/Edit Journal Volumes button
at the top of the Component Pane.
The Policy Tab
To display the consistency group policies and settings, select a
consistency group in the Navigation Pane. Then, in the Component
Pane, click the Policy Tab. The policies and settings for the
consistency group are displayed. Policies and Settings are described
in “Creating new consistency groups” on page 132.
Copy management
The following tools are available for the management of copies.
Copy commands
Most of the commands listed here are accessible through the context
sensitive menu that is displayed when you right-click on a specific
copy name in the Navigation Pane.
Note: Additional commands are also displayed through buttons that are
displayed at the top right corner of the Component Pane when a specific
consistency group is selected, and the criteria for display are met.
Table 25 Copy commands

Add Copy (Displayed: Only if a specific group is selected in the Navigation Pane, and the group contains zero or one copies)
Displays the New Copy Wizard, to guide you through the process of adding a new copy to the consistency group, see "How to add a new copy" on page 159.

Remove Copy (Displayed: Always)
Deletes the copy. Caution: Cannot be undone.

Disable / Enable Copy (Displayed: Only if one or more copies are selected. Displayed as Disable if the copy is enabled, or Enable if the copy is disabled)
Disables or enables the selected copy. Caution: Disabling a copy stops all replication, deletes journals, and causes a full sweep when the copy is re-enabled.

Enable Image Access / Access Another Image (Displayed as Enable Image Access if image access is disabled (during replication), or Access Another Image if image access is already enabled)
Enables hosts to access an image of a replica. After image access is enabled, use Access Another Image to select another snapshot to recover. See "Accessing a replica" on page 242.

Disable Image Access (Displayed: Only if image access was enabled to the selected copy)
Stops hosts from being able to access images of a replica, and enables RecoverPoint to resume replication. Unmount the volume from the host at this replica before disabling. When you disable image access:
• If you were in Logged access mode, any writes made directly to the LUN while image access was enabled will be discarded. Distribution from the journal to the storage will continue from the accessed image forward.
• If you were in Virtual access mode, the virtual LUN and any writes to it will be discarded. Distribution will continue from the last snapshot that was distributed before the image access.
• If you were in Virtual access with Roll image in background, the virtual LUN and any changes to it and any writes made directly to storage will be discarded. Distribution will continue from whatever snapshot the system has rolled to.
See "Accessing a replica" on page 242.

Undo Writes (Displayed: Only if image access was enabled to the selected copy)
To undo the writes recorded in the image access log without disabling image access.

Roll to Image (Displayed: Only available if image access was enabled, and Virtual access without Roll image in background was selected)
To roll the stored replica to the selected image.

Enable Direct Access (Displayed: Only if image access was enabled to the selected copy)
To enable hosts to access an image of a replica, without imposing a limit on the amount of data that you can write to storage. In addition, Direct Image Access gives better system performance when accessing the replica, because no rollback information is written to the image access log in parallel with the ongoing disk I/Os. Hence, this option may be preferred if you want to carry out processing that generates a high volume of write transactions at the replica. It can also be used for testing the replicated images of BFS groups. See "Direct Image Access" on page 246.

Move to Previous Point in Time (Displayed: Only if image access was enabled to the selected copy)
Roll the stored image back one snapshot.

Move to Next Point in Time (Displayed: Only if image access was enabled to the selected copy)
Roll the stored image forward one snapshot.

Start / Pause Transfer (Displayed as Pause if the RPA is transferring writes, or Start if RPA transfer is paused)
Causes transfer of writes from the host to the replica to pause or start. The journal continues to distribute snapshots.
Usage: Best practice is not to use this command. Use consistency group policies to set policies for use of bandwidth. Pause Transfer may be used when WAN bandwidth is very limited and you wish to give the largest bandwidth possible to another consistency group. In that case, you may temporarily pause transfer for lower-priority consistency groups. When Pause Transfer is activated, the Pause Transfer button becomes the Start Transfer button. As soon as possible, use the Start Transfer button to resume normal transfer.

Failover (Displayed: Only if image access was enabled to the selected copy)
Caution: This command will erase the journal at this site.
To use the selected (local or remote) replica as the source. Transfer from production will stop.
If the system has only a local or only a remote copy, but not both, this replica will automatically become the production source and the production source will become the local or remote replica.
If the system has three copies (production, local, and remote), transfer to the third copy will not be resumed until production is restored as the source.
In a three-copy configuration, to convert the current source to the production source, select Set local copy as production or Set remote copy as production.

Recover Production (Displayed: Only if image access was enabled to the selected copy)
Repairs the production source using the replica as the source. Recover Production is only available if the replica's journal is still intact; therefore, Recover Production is not available if you used Direct Image Access or after distributing a snapshot that is larger than the capacity of the journal (refer to Table 19 on page 155).
Transfer from the production source will be paused. Transfer to a third copy will not resume until production is restored as the source. Host access to the selected replica will be blocked. You will only be able to restore the production source from the selected replica.
While being restored, the role of the production replica will be "Production (being restored)." When the restore is completed, enable image access at the production source, and select the failover option Resume production. The production journal is discarded.

Set Local/Remote Copy as Production (Displayed: Only after image access was enabled, and a failover was performed on the selected copy, in a three-copy configuration)
Sets the current replica as the production source.
If the local replica is converted to the production source and there is a remote replica, the remote replica will require a full sweep. If the remote replica is converted to the production source and there is also a local replica, you must delete either the original production source or the local replica, before the remote replica can become a production source. In other words, having two remote replicas is not supported.

Resume Production (Displayed: Only after image access was enabled, and after either a failover or a recover production was performed on the selected copy)
Restores the production copy as the data source.

Clear Markers (Displayed: Only available by right-clicking on a copy name in the Navigation Pane when transfer to the copy is paused. Select for which replica or replicas to clear markers)
Clears all markers of the selected copy from the production journal volume; that is, it treats the selected replica as identical to the production source. Caution: To clear markers, the production source and the selected replica volume must be absolutely identical. If they are not, the inconsistencies will remain.
The Clear Markers command should be used only with extreme caution. It is useful when a production source and a replica have been synchronized manually by initializing from backup and adequate bandwidth is not available to synchronize using the storage network. Otherwise, the best practice is not to use this command.

Set Markers (Displayed: Only available by right-clicking on a copy name in the Navigation Pane when transfer to the copy is paused. Select for which replica or replicas and for which replication sets to set markers)
Causes the selected replica or replicas to be resynchronized by a full sweep. When replicas are inconsistent, the system will automatically cause synchronization even without invoking the Set Markers command.
The Set Markers command is useful if you have accidentally cleared markers, or if you attached as clean and then realize that the replica may not be clean. Otherwise, best practice is not to use this command.
The Journal Tab
To display journal information for a replica, select that replica under
its consistency group in the Navigation Pane. In the Component
Pane, click the Journal Tab.
The following tables describe the information displayed in the
Journal Tab for a replica.
Table 26 Journal Tab: Image information

Current - Snapshot currently being distributed to the journal.

Storage - Storage access status. The current condition of a volume at the replica storage can be:
• No access, during normal replication.
• Direct Access, during direct image access.
• Logged Access, during logged access.
Table 27 Journal Tab: Journal information

Journal Lag - Amount of data (represented by snapshots) in the replica journal that has not yet been distributed to the replica storage. The maximum journal lag setting defines the current recovery point objective. In the event of a disaster, this is the maximum amount of data loss that may be incurred.

Compression Level - This feature is relevant for both synchronous and asynchronous replication. Default = none.
When enabled, instructs the target RPA of the consistency group to compress the snapshots in the journal so that more images (that is, a longer protection window) can be saved with the same journal capacity (saving storage cost). Enabling this option is encouraged if you have cost considerations, a low incoming write-rate, and/or limited bandwidth.
Take note of the following:
• Compression is not available in RecoverPoint/SE.
• Compression is not relevant for the production journal (since the production journal does not contain snapshots).
• Enabling journal compression while a consistency group is enabled results in the loss of all snapshots in the journal.
• Compression impacts the CPU resources of the target RPA of the consistency group, and can impact that RPA's ability to sustain its write-load. If the target RPA of the consistency group for which you want to enable this option is also transferring the data of other consistency groups across the WAN, note that enabling this setting will affect the RPA's transfer rate. See the EMC RecoverPoint Release Notes for throughput limitations.

Required Protection Window - Default = disabled. Indicates how far in time the replica image can be rolled back.

Current Protection Window - Indicates how far the replica journal can be rolled back. If the Required Protection Window is defined (Table 18 on page 153), the Current Protection Window will be in one of the following statuses (indicated in parentheses after the Current Protection Window):
• Sufficient: Image can be rolled back far enough to meet the Required Protection Window
• Insufficient: Image cannot be rolled back far enough to meet the Required Protection Window
• Extending: Replication has not been running long enough for the image to be rolled back as far as the Required Protection Window.
If the Required Protection Window is not defined, the status will be N/A.

Predicted Protection Window - System's prediction of the eventual size of the protection window. Note that there is no guarantee on how long it will take to reach the predicted protection window, and no guarantee that it will ever be reached (conditions may change before it is reached).
If the Required Protection Window is defined, the Predicted Protection Window is in one of the following statuses:
• Sufficient: Predicted Protection Window is large enough to meet the Required Protection Window
• Insufficient: Predicted Protection Window is not large enough to meet the Required Protection Window
If the Required Protection Window is not defined or replication has not been running long enough to make predictions, no status will be indicated. It can take 24 hours or longer of journal entries before the system finishes calculating the predicted protection window.
Note: When RecoverPoint snapshot consolidation is enabled, the Predicted Protection Window value is displayed as N/A (see "Snapshot consolidation" on page 39).

Space Saved by Consolidation - Amount of space saved by snapshot consolidation. This value is updated only after a consolidation process completes.
Table 28 Journal Tab: Sample images information

Time - Closing time of the snapshot.

Size - Size of the snapshot.

Bookmark Details - For bookmarks, displays the bookmark icon and bookmark name. For consolidated snapshots, displays the consolidated snapshot icon and consolidation type (manual, daily, weekly, monthly). A tool tip indicates how much space was saved by the consolidation.

Consolidation Policy - Consolidation policy applied to this snapshot (Never Consolidate, Survive Daily, Survive Weekly, Survive Monthly). A blank value indicates the default policy of Always Consolidate.
If a snapshot consolidation job is in process, the following
information is displayed:
Table 29 Journal Tab: Snapshot Consolidation Progress information

Consolidation type - Type of consolidation: Manual, Daily, Weekly, Monthly.

Consolidation range - Start and end times of snapshots being consolidated.

Progress - Completion percentage. A pending status indicates that the consolidation is waiting for additional snapshots in the journal to be distributed to the user volume. The consolidation will begin automatically once the snapshots have been distributed.

Stop - Cancels the consolidation process. The consolidation process stops after it completes processing the PIT it is currently working on. Stopping a consolidation process returns the journal to the same state it was in before the consolidation started.
The Policy Tab
To display policies and settings of a replica, in the Navigation Pane,
under a consistency group, select a replica. In the Component Pane,
click the Policy Tab.
The general and protection settings, and advanced policies for the
selected replica can be modified. Policies and Settings are described
in “How to configure the production copy” on page 134 and “How to
configure the replica copies” on page 135.
Splitter management
The following tools are available for the management of splitters.
Splitter commands
The commands listed here are available through buttons that are
displayed at the top right corner of the Component Pane.
Table 30 Splitter commands

Add Splitter (Displayed: Always)
To add a splitter. For instructions, refer to "Adding splitters" on page 128. After adding the splitter, use the Splitter Properties dialog box to attach the volumes to the splitter.

Remove Splitter (Displayed: Only if a specific splitter is selected)
To remove the selected splitter from the system. Only splitters that are not attached to volumes can be removed.

Show Splitter Properties (Displayed: Only if a specific splitter is selected)
To attach volumes to splitters, see "Manually attaching volumes to splitters" on page 173.

Rescan Splitters (Displayed: Always)
To rescan the SAN for available splitters before attaching a splitter, see "Manually attaching volumes to splitters" on page 173.
The Splitter Tab
To display splitter information, from the Navigation Pane, select
Splitters. All splitters that have been added to RecoverPoint are
displayed in the Splitters Tab of the Component Pane, with the
following information:
◆ Splitter status
◆ Name of splitter
Note: The multi-cluster icon next to a CLARiiON splitter indicates that it is attached to multiple RPA clusters. A tooltip indicates the number of RPA clusters attached to the splitter (maximum of 4).
◆ Site
◆ Splitter Type (host/fabric/storage). Host OS is shown for host splitters, switch type is shown for fabric splitters, and storage type is shown for storage-based splitters.
◆ RPA Link status
To manage a splitter, double-click on its name. The Splitter
Properties dialog box is displayed. From the dialog box, you can
attach volumes to splitters and detach them.
The dialog box displays the following additional information:
◆ Name of consistency group
◆ Status of consistency group
◆ Splitter type
◆ Status of link to RPA
◆ Site
◆ Paths between RPA and splitter
  • RPA#, RPA port, WWN of HBA port on host
◆ Attached volumes
  • Boot volume indicator
  • Consistency group
  • Copy
  • Replication set
  • Path from splitter to storage
  • Storage channel
  • Storage target
  • LUN (logical unit number)
  • User access to storage
Note: If a volume is masked to more than one RPA cluster sharing the
same CLARiiON splitter, it can be attached to more than one RPA cluster.
The volume, however, can only be used by the first RPA cluster to which
it is attached. It is in an error state for all other RPA clusters, indicated by
the "Attached to other RPA cluster/s" state in the Volume Access field. To
correct this error, use the Detach command on the Splitter Properties
dialog box to detach the volume from the RPA cluster. To avoid this
problem, a volume should be masked to a single RPA cluster. A volume
that is masked for one RPA cluster should not be masked for another
RPA cluster.
To manage splitters from the Splitter Properties dialog box, refer to
instructions in “Modifying existing settings and policies” on
page 158.
RPA management
The following tools are available for the management of RPAs.
RPA commands
To display detailed information about a particular RPA, including
performance statistics, double-click on the RPA name or the RPA
Properties icon. The following information and statistics are
displayed:
◆ Version - Indicates the RecoverPoint software release running on the RPA.
◆ Hardware Platform - Indicates the RPA hardware platform.
◆ RPA status - Indicates if the RPA is connected.
◆ Repository Volume - Indicates if the RPA can access the Repository Volume.
◆ Storage Link - Indicates if the RPA can access storage.
◆ LAN Interface - Indicates if the physical LAN port is alive, the interface card is functional, and communication with other local RPAs exists.
◆ Communication with the remote site - Indicates if the RPA has access to the other site.
◆ Data Link - Indicates if data can be replicated to the corresponding RPA on the second site.
◆ WAN Interface - Indicates if the physical WAN port is alive, the interface card is functional, and communication with other local RPAs exists.
The following statistics are displayed:
◆ Total traffic
◆ Application traffic
◆ Initialization traffic
◆ Incoming writes
◆ Bandwidth reduction
◆ CPU usage

In addition, the Fabric Interface settings are displayed for each RPA
HBA port. These settings are needed when replacing a faulty RPA.
◆ Port number
◆ Port WWN
◆ Node WWN
The RPAs Tab
To display RPA information, from the Navigation Pane, select RPAs.
The following information is displayed in the RPAs Tab of the
Component Pane:
◆ Status
  Indicates if the RecoverPoint appliance, LAN interface card, and WAN interface card are alive.
◆ Site (location of the RPA)
◆ RPA ID
◆ WAN IP address of the RPA
◆ Management IP address of the RPA
◆ Connectivity
  Indicates if communication to all RPAs in the cluster is alive and if the Storage Link, Repository Volume, Data Link, and communication with the remote site are all alive; if one RPA is down, connectivity of every RPA in the cluster will report an error.
Volume management
The following tools are available for the management of volumes.
Volume commands
The commands in the following table are available through a button
that is displayed at the top right corner of the Component Pane.
Table 31 Volume commands

Volume Properties (Displayed: Only if a specific volume is selected)
Displays properties of the specific volume. Same as double-clicking on the volume in the Component Pane.
The Volumes Tab
To display volume information, in the Navigation Pane, select
Volumes. All volumes in all consistency groups are displayed.
The following settings are displayed for each volume:
◆ Status of the volume
◆ Site
◆ Consistency group
◆ Copy
◆ Replication set
◆ Volume type (Repository, Replication, or Journal)
◆ Volume size
To obtain more detailed information about a specific volume,
double-click on its row. The Volume Properties dialog box is
displayed with the following additional information:
◆ For each path between the volume and the RPA:
  • RPA number
  • RPA port number
  • RPA port WWN
  • Storage controller
  • Serial number
  • LUN
◆ Storage vendor
◆ Storage system
◆ Volume ID
◆ UIDs
◆ Size of the volume

vCenter Server management
To display data from the VMware vCenter Server in the RecoverPoint
Management Application, go to the Navigation Pane and select
vCenter Servers. In addition to displaying ESX servers and all their
virtual machines, datastores, and RDM drives, the RecoverPoint
vCenter Server view also displays the replication status of each
volume. The RecoverPoint vCenter Servers view is for monitoring
only (read-only).
The Navigation Pane displays the vCenter Servers object, and under
it, all VMware vCenter Servers registered with RecoverPoint. The
view displays data extracted from the VMware vCenter Server
together with RecoverPoint replication data. For detailed information
about a particular vCenter Server, in the Navigation Pane, click the
vCenter Server's IP address.
vCenter Server commands
When vCenter Servers is selected in the Navigation Pane, the
commands in the following tables are available through the buttons
that are displayed at the top right corner of the Component Pane.
Table 32 vCenter Server commands

Add vCenter Server (Displayed: Always)
To create a connection between RecoverPoint and a VMware vCenter Server, which allows RecoverPoint to display the VMware view of volumes configured for replication. To add a vCenter Server, in the dialog box that is displayed, enter the information shown in Table 34 on page 206.

Edit vCenter Server (Displayed: When a specific vCenter Server is selected)
To modify the credentials of an existing connection between RecoverPoint and a VMware vCenter Server. Use when one or more credentials of the vCenter Server have changed. In the dialog box that is displayed, the information in Table 35 on page 207 can be edited.

Remove vCenter Server (Displayed: When a specific vCenter Server is selected)
To remove a connection between RecoverPoint and a VMware vCenter Server.

Rescan (Displayed: Always)
To obtain the latest information from the VMware vCenter Servers registered with RecoverPoint, and update the RecoverPoint Management Application.
When a specific vCenter Server is selected in the Navigation Pane,
the commands in the following table are available through the
buttons that are displayed at the top right corner of the Component
Pane.
Table 33 vCenter Server detail commands

Edit vCenter Server (Displayed: When a specific vCenter Server is selected)
To modify the credentials of an existing connection between RecoverPoint and a VMware vCenter Server. Use when one or more credentials of the vCenter Server have changed. In the dialog box that is displayed, the information in Table 35 can be edited.

Remove vCenter Server (Displayed: When a specific vCenter Server is selected)
To remove a connection between RecoverPoint and a VMware vCenter Server.

Expand (Displayed: Always)
To expand the display of virtual machines running on each ESX server and the storage devices the virtual machines are accessing.

Collapse (Displayed: Always)
To collapse the display of virtual machine and storage device details under each ESX server.
The following commands are available in the RecoverPoint
Management Application menu bar under the vCenter Server menu
item:
◆ "Add"
◆ "Edit"
◆ "Remove"
◆ "Rescan"
Add
To create a connection between RecoverPoint and a VMware vCenter
Server, which allows RecoverPoint to display the VMware view of
volumes configured for replication. To add a vCenter Server, in the
dialog box that is displayed, enter the following information:
Table 34 Add vCenter Server Settings

Site - RecoverPoint site where the vCenter Server is located.

IP - IP address of the vCenter Server. This is also the display name of the vCenter Server in the Navigation Pane.

Port - TCP port number of the vCenter Server.

Username - vCenter Server username.

Password - vCenter Server password.

Certificate - The best practice is to configure the vCenter Server to require the use of a certificate. If you wish to specify a certificate, browse to and select the certificate file. A generic way to inspect the certificate that a vCenter Server presents is shown after this table.
Default certificate locations:
• Windows 2003 Server: C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\SSL\rui.crt
• Windows 2008 Server: C:\Users\All Users\Application Data\VMware\VMware VirtualCenter\SSL\rui.crt
Once RecoverPoint has read the certificate, it does not need further access to the location. For more information about the location of the vCenter's security certificate, refer to Replacing vCenter Server Certificates VMware vSphere 4.0, available at www.vmware.com.
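If you are unsure which certificate a vCenter Server presents, one generic way to inspect it from any host with OpenSSL installed is shown below. This is only an illustrative sketch; the vCenter address is a placeholder, and the default HTTPS port 443 is assumed:

    openssl s_client -connect <vcenter-address>:443 -showcerts

The certificate printed by this command should correspond to the rui.crt file that you supply to RecoverPoint.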
Edit
To modify the credentials of an existing connection between
RecoverPoint and a VMware vCenter Server. Use when one or more
credentials of the vCenter Server have changed. In the dialog box that
is displayed, the following information can be edited.
Table 35 Edit vCenter Server Settings

IP - IP address of the vCenter Server.

Port - TCP port number of the vCenter Server.

Username - vCenter Server username.

Password - vCenter Server password.

Certificate - The best practice is to configure the vCenter Server to require the use of a certificate. If you wish to specify a certificate, browse to and select the certificate file.
Default certificate locations:
• Windows 2003 Server: C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\SSL\rui.crt
• Windows 2008 Server: C:\Users\All Users\Application Data\VMware\VMware VirtualCenter\SSL\rui.crt
Once RecoverPoint has read the certificate, it does not need further access to the location. For more information about the location of the vCenter's security certificate, refer to Replacing vCenter Server Certificates VMware vSphere 4.0, available at www.vmware.com.
Remove
To remove a connection between RecoverPoint and a VMware
vCenter Server.
Rescan
To rescan VMware vCenter Servers registered with RecoverPoint.
The vCenter Servers Tab
When vCenter Servers is selected in the Navigation Pane, the
Component Pane displays all vCenter Servers registered with
RecoverPoint, with their site name and username. When, in the
Navigation Pane, the IP address of an individual vCenter Server is
selected, detailed information is displayed about the virtual machines
running on that ESX server, and their volumes and raw device
mappings.
The following information is displayed for an individual vCenter
Server:
◆ Each ESX server in the vCenter Server and its IP address
◆ Each virtual machine running in the ESX server, with the following details:
  • Replication status: fully configured for replication, partially configured for replication, or not configured for replication (each indicated by its icon).
  • IP address
◆ Every LUN and raw device mapping accessed by each virtual machine, with the following details:
  • Replication status: fully configured for replication or not configured for replication (each indicated by its icon).
  • For LUNs or devices configured for replication by RecoverPoint, the following are displayed: consistency group, copy (Production, Local, Remote), replication set, and which datastore for each LUN or raw device mapping is configured for replication.
System Monitoring management
RecoverPoint monitors selected setting values to let the user know
how close they are to their limits. The limits may be determined by
the system, policies, licensing, or limitations of external technologies.
To view monitored settings and their limits, in the Navigation Pane,
select System Monitoring. Monitored settings are displayed in the
Component Pane. Displayed settings are divided into three
categories: System, Group, and Splitters. Select the tab of the category
of settings you wish to view.
You can sort settings by any displayed column by clicking in the
column header. You can filter monitored settings by using the Filter by:
field: select a category for filtering, then enter text to filter by in
the text box provided.
The System Tab
The following information is shown:
◆ Severity: OK, minor, major, critical. Indicates how close a monitored setting is to its limit.
◆ Description: name of the setting value
◆ Site: replica site of the displayed setting
◆ RPA: RPA of the displayed setting
◆ Status: value of the setting and its limit
◆ Status bar: graphic display of the setting value and its limit.
The Group Tab
The following information is shown:
◆ Severity: OK, minor, major, critical. Indicates how close a monitored setting is to its limit.
◆ Description: name of the setting value
◆ Group name: name of the consistency group involved
◆ Status: value of the setting and its limit
◆ Status bar: graphic display of the setting value and its limit.
The Splitters Tab
The following information is shown:
◆ Severity: OK, minor, major, critical. Indicates how close a monitored setting is to its limit.
◆ Description: name of the setting value, including:
  • Number of RPA clusters attached to a splitter
  • Total number of volumes attached to a splitter
◆ Site: replica site of the displayed setting
◆ Host Name: name of the host involved
◆ Status: value of the setting and its limit
◆ Status bar: graphic display of the setting value and its limit.
◆ Context: additional limitations specific to a particular type of splitter (for example, Number of Brocade ITLs per DPC)

Event log management
RecoverPoint logs events that occur in the RecoverPoint system. The
event log can be viewed. In addition, RecoverPoint offers several
options (email, SNMP, and syslog) for configuring event notification
(“Notification of Events” on page 251). The system events are listed
and described in “Events” on page 277.
Logging commands
The following commands are available through buttons that are
displayed at the top right corner of the Component Pane.
Table 36 Log commands

Log Event Properties (Displayed: Only if a specific event log is selected)
Opens the Event Details dialog box, which displays additional information about an individual log.

Log Filter (Displayed: Always)
Filters the logs displayed according to the criteria you specify. Filters are saved per user. Filtering criteria include:
• From (time)
• To (time)
• Topics to include
• Scope
• Level
• Text filter
Table 37 Log filtering settings

Time - Select the time range of events you wish to display.

Topics - Select which events to include in the display, according to the following topics:
• Site
• RPA
• Consistency group
• Splitter
• Management

Scope
Normal: To report only the root cause for an entire set of detailed and advanced events. In most cases, these events are sufficient for effective monitoring of system behavior.
Detailed: This category includes all events, with respect to all components, that are generated for use by users.
Advanced: In specific cases, for instance, for troubleshooting a problem, EMC Customer Service may ask you to retrieve information from the advanced log events. These events contain information that is intended primarily for the technical support engineers.

Level
Info: Messages are informative in nature, usually referring to changes in the configuration or normal system state.
Warning: Message indicates a warning, usually referring to a transient state or an abnormal condition that does not degrade system performance.
Error: Message indicates an important event that is likely to disrupt normal system behavior and/or performance.

Description Text - Enter words in the description text by which to filter displayed events. Match all and Match any options are available.
The Logs Tab
To view RecoverPoint system events, in the Navigation Pane, select
Logs. The event log is displayed in the Component Pane, with most
recent events first. You can sort events by any displayed column by clicking the column header. For additional information about a single event, double-click it. When viewing a single event, use the Prev and Next buttons to browse events.
Events can be filtered by time (From and To can be specified), by
topic, by scope, by level, and by descriptive text. Note that by default,
only events with Normal scope are displayed. To view all events that
may be relevant to users, use the Filter Log command to change
Scope to Detailed.
Monitoring and analyzing system performance
The following table lists the components of the RecoverPoint system
that can be monitored from the Management Application. From the
Management Application monitoring panes, no changes can be made
to monitored settings.
Table 38  Components monitored from the Management Application

System SAN and WAN traffic overview
Displayed: always. See “The Traffic Pane” on page 180.

System health overview
Displayed: always. See “The System Pane” on page 179.

vCenter Servers
Displayed: when vCenter Servers is selected in the Navigation Pane. See “The vCenter Servers Tab” on page 208.

System monitoring, system limits
Displayed: when System Monitoring is selected in the Navigation Pane. See “The System Tab” on page 209, “The Group Tab” on page 210, and “The Splitters Tab” on page 210.

Consistency group event logs
Displayed: when Logs is selected in the Navigation Pane. See “The Logs Tab” on page 212.

System event logs
Displayed: when System > Collect System Information is selected from the main system menu. See “Collecting system information” on page 268.

Consistency group transfer and failover status
Displayed: when a consistency group is selected in the Navigation Pane, on the Status Tab. See “The Status Tab” on page 190.

Consistency group performance statistics
Displayed: when a consistency group is selected in the Navigation Pane, on the Statistics Tab. See “The Statistics Tab” on page 191.
The RecoverPoint system produces extensive statistics that can be used to analyze system performance. You can use the results of these analyses to ensure that system capacity is sufficient, to adjust system settings for optimal system performance, and to plan future expansions. The related topics include:
◆ “Detecting bottlenecks”
◆ “Exporting statistics”
◆ “Exporting consolidated statistics”
◆ “Statistics analysis tool”
◆ “Throttling I/O”
Detecting bottlenecks
This feature returns statistics about RecoverPoint system performance, by group, RPA (“box”), and site. The quantity and type of statistics depend on the filters specified in the CLI detect_bottlenecks command.
The standard output includes the following set of statistics:
◆ WAN (or Fibre Channel) throughput
◆ Incoming data rate
◆ Output data rate, during (and not during) initialization (synchronization)
◆ Compression CPU utilization
◆ Percentage of time in transfer (group only)
◆ Percentage of time in initialization (group only)
◆ Lag (RPO) during transfer (group only)
◆ RPA utilization (RPA only)
◆ Link utilization (group only)
◆ Line latency between sites (site only)
◆ Packet loss (site only)
More detailed statistics are used primarily by EMC Customer Service,
for analysis of system performance and problems. For more
information on the types of statistics and filters available, refer to the
EMC RecoverPoint CLI Reference Guide.
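For example, a bottleneck analysis for a single consistency group over a defined time period might be requested along the following lines. The parameter names shown here are illustrative assumptions only (modeled on the filters offered by the export_statistics command described later in this section); the authoritative syntax is given in the EMC RecoverPoint CLI Reference Guide:

detect_bottlenecks group=<group_name> from=<start_date_and_time> to=<end_date_and_time>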
More important for the normal user, this feature also analyzes the system data to detect the existence of any of a set of predefined problem types, called “bottlenecks”. The types of bottlenecks are presented in the following table.
Table 39  Bottlenecks

For boxes and sites:

Box balance: “Boxes are not balanced.”, with data on the load handled by each box at the site.
NOTE: Box balance is checked only if the time period defined is greater than 30 minutes.

Compression: “Compression level is too high. The box resources cannot handle the current level.”

SAN target: “Box may be regulating the application. Consider reducing box load”, with data on the total amount of incoming data, the number of writes, and the amount of incoming data per write.

Box utilization: “Box utilization reached 80%.”

For consistency groups and links:

Slow production journal: “Writing to the local journal volume was slow during this period.”, with data on the delay factor.

Journal phase 1: “Journal is unable to handle the incoming data rate.”, with the required I/O rates for the journal and the replication volumes at local or remote copies, for both normal and fast-forward distribution modes.

Journal phase 2: “Journal and replication volumes are unable to handle the incoming data rate.”, with data on the required I/O rates for the journal and the replication volumes at local or remote copies, for both normal and fast-forward distribution modes.

Journal regulation: “Remote storage is too slow to handle incoming data rate and regulate the distribution process.”, with data on the required I/O rates for the journal and the replication volumes at local or remote copies, for both normal and fast-forward distribution modes.

Unknown distribution problem: “Target site cannot handle the incoming data rate.”, with the required I/O rates for the journal and the replication volumes at the remote site, for both normal and fast-forward distribution modes.

Slow WAN: “WAN is too slow.”, with data on total throughput for the site, the identity of the boxes at which the problem appeared, and the throughput of that box (or boxes).
NOTE: A slow WAN bottleneck is detected by group, but generates data by site and box.

Slow read source: “Reading rate from the source replication volume/s during synchronization is too slow.”, with the reading rate.

Slow read target: “Reading rate from the target replication volume/s during synchronization is too slow.”, with the reading rate.

Link utilization: “Link utilization reached 80%.”
In addition to the analysis of overall system behavior, the same type of analysis can be performed on specialized system behavior, such as periods of initialization (synchronization), high load periods, and peak periods of incoming data (writes).
In some cases, an action to correct the problem is explicitly
recommended as part of the bottleneck report. In all cases, the
detection of a bottleneck is intended to lead to a correction of the
problem and an improvement in system performance.
Exporting statistics
This feature is used primarily to write unprocessed system statistics
to a CSV file, according to the specified filters.
The standard output includes the following set of statistics:
◆ WAN throughput
◆ Total incoming data rate
◆ Initialization (synchronization) output rate
◆ CPU utilization
◆ Compression ratio
◆ Percentage of time in transfer (group only)
◆ Percentage of time in initialization (group only)
The feature can be activated by the export_statistics command in the CLI, which offers the following filtering settings: from, to, include_global_statistics, site, RPA, group, categories, frequency, and file. The standard output (as listed above) is produced when the categories setting is set to the value OVERVIEW.
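For illustration only, an export of the overview statistics for a single consistency group to a CSV file might be requested along the following lines; the parameter names are those listed above, but the exact value formats (particularly for dates and times) are an assumption here, so confirm them in the EMC RecoverPoint CLI Reference Guide:

export_statistics from=<start_date_and_time> to=<end_date_and_time> group=<group_name> categories=OVERVIEW file=<output_file_name>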
More detailed statistics are used primarily by EMC Customer Service,
for remote analysis of system performance and problems.
Exporting consolidated statistics
The export_consolidated_statistics CLI command provides data
series for a selection of important RecoverPoint operational statistics.
It enables advanced users, customer support representatives, and
implementation specialists to analyze system traffic and workload
trends, to identify correlation between spikes in two or more settings,
and to discover the root causes of high loads and other significant
system behaviors.
With the export_consolidated_statistics command, you specify the
granularity at which to collect statistics (minute, hour, and/or day)
and, for each granularity, the time frame over which to collect the
statistics. The resulting CSV file organizes the output for each entry
according to the standard bottleneck detection settings.
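As an illustrative sketch only, a collection of hourly samples over a recent time frame might be requested along the following lines. The granularity and time-frame parameter names shown here are assumptions (the command is described above only in general terms); the authoritative syntax is in the EMC RecoverPoint CLI Reference Guide:

export_consolidated_statistics granularity=hour from=<start_date_and_time> to=<end_date_and_time> file=<output_file_name>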
The following table lists the statistics that are output by the export_consolidated_statistics command. All statistics are valid for the sampled time interval, that is, the difference between the “time from” and “time to” values on that row.
Table 40  Consolidated statistics output

Incoming writes rate for link (a) (MB/sec): Throughput of writes that arrived at a link on the production side.

Incoming IOs rate for link (IOs/sec): Number of writes that arrived at a link on the production side. Average IO size can be computed as (Incoming writes rate for link) / (Incoming IOs rate for link).

Non-initialization output rate for link (MB/sec): Rate at which normal write traffic was transferred for the link.

Initialization output rate for link (MB/sec): Rate at which data was actually transferred for the link for purpose of synchronization.

Data synchronization rate for link (MB/sec): Rate at which data was checked for possible transfer for the link for purpose of synchronization. In first-time initialization, this is the rate at which the data from the replication set volumes is transferred. In other cases, comparison of signatures increases the rate.

Compression CPU utilization (%): Portion of processor time used to compress the incoming data over a link. This statistic is relevant only for remote links.

Percentage time in transfer (% of time): Portion of time the link was in active transfer state.

Percentage time of initialization (% of time): Portion of time the link was in initializing transfer state.

RPO - lag in time between replicas during transfer after init (sec): Actual recovery point objective (b) for the link, as measured in seconds. It is measured only when the transfer state for the link was active.

RPO - lag in data between replicas during transfer after init (MB): Actual recovery point objective for the link, as measured in megabytes of data. It is measured only when the transfer state for the link was active.

RPO - lag in IOs between replicas during transfer after init (IOs): Actual recovery point objective for the link, as measured in number of IOs. It is measured only when the transfer state for the link was active.

Group-Link utilization (%): Aggregate measure of the portion of RPA capacity used by the link, based on number of IOs, writes throughput, and CPU utilization.

Distributor receiver regulation duration (% of time): Portion of time that the distribution process for a copy was forced to regulate the incoming data rate. A high value indicates slow storage journal volumes relative to the rate of the incoming data.

Distributor phase 1 thread load (% of time): Portion of time used to receive data and write it to the journal volumes for the link. The value is dependent on the performance of the journal volumes.

Distributor phase 1 effective speed (MB/sec): Rate at which data was received and written to the journal volumes for the link. The value is dependent on the performance of the journal volumes.

Distributor phase 2 thread load (% of time): Portion of time used to read data from the journal volumes and write to the replica volumes, using either normal or fast-forward distribution mode. The value is dependent on the performance of the journal and replica volumes.

Distributor phase 2 effective speed (MB/sec): Rate at which data was read from the journal volumes and written to the replica volumes, using either normal or fast-forward distribution mode. The value is dependent on the performance of the journal and replica volumes.

Fast forward distribution duration (% of time): Portion of time at the copy when distribution was in fast-forward mode.

WAN throughput from box (over IP) or Cross-site throughput from box (over FC) (Mb/sec): Outgoing throughput from the RPA to the remote site.

Total incoming writes rate for box (MB/sec): Throughput of writes that arrived at an RPA on the production side. For CLR, this is the sum of throughputs for all CDP and CRR links on this RPA (which is double the actual incoming writes rate).

Incoming IOs rate for box (IOs/sec): Number of writes that arrived at an RPA on the production side. For CLR, this is the sum of incoming IOs for all CDP and CRR links (which is double the actual number of incoming writes). Average IO size can be computed as (Total incoming writes rate for box) / (Incoming IOs rate for box).

Non-initialization output rate for box (average over all period) (MB/sec): Rate at which normal write traffic was transferred by the RPA.

Initialization output rate for box (average over all period) (MB/sec): Rate at which data was actually transferred by the RPA for purpose of synchronization.

Data synchronization rate for box (average over all period) (MB/sec): Rate at which data was checked for possible transfer by the RPA for purpose of synchronization. In first-time initialization, this is the rate at which the data from the replication set volumes is transferred. In other cases, comparison of signatures increases the rate.

Compression CPU utilization (%): Portion of processor time used to compress the incoming data over all links on the RPA.

Replication process CPU utilization – per box (%): Portion of processor time used for the replication process on the RPA, including compression and other replication activities.

Distributor receiver regulation duration (% of time): Portion of time that the distribution process across all copies served by the RPA was forced to regulate the incoming data rate. A high value indicates slow storage journal volumes relative to the rate of the incoming data.

Distributor phase 1 effective speed (MB/sec): Rate at which data was received and written to the journal volumes for all copies served by the RPA. The value is dependent on the performance of the journal volumes.

Distributor phase 2 effective speed (MB/sec): Rate at which data was read from the journal volumes and written to the replica volumes for all copies served by the RPA, using either normal or fast-forward distribution mode. The value is dependent on the performance of the journal and replica volumes.

SAN target thread load (% of time): Portion of time used by the RPA for initial processing of writes, before assigning them to the relevant links.

Box utilization (%): Aggregate measure of the portion of RPA capacity used, based on both IO load and processor utilization.

WAN throughput from site (over IP) or Cross-site throughput from site (over FC) (Mb/sec): Combined outgoing throughput from a site (to the remote site).

Total incoming writes rate for site (MB/sec): Throughput of writes that arrived on the production side. For CLR, this is the sum of throughputs for all CDP and CRR links at the site (which is double the actual incoming writes rate).

Incoming IOs rate for site (IOs/sec): Number of writes that arrived on the production side. For CLR, this is the sum of incoming IOs for all CDP and CRR links at the site (which is double the actual number of incoming writes). Average IO size can be computed as (Total incoming writes rate for site) / (Incoming IOs rate for site).

Non-initialization output rate for site (average over all period) (MB/sec): Rate at which normal write traffic was transferred by all RPAs at the site.

Initialization output rate for site (average over all period) (MB/sec): Rate at which data was actually transferred by all RPAs at the site for purpose of synchronization.

Data synchronization rate for site (average over all period) (MB/sec): Rate at which data was checked for possible transfer by all RPAs at the site for purpose of synchronization. In first-time initialization, this is the rate at which the data from the replication set volumes is transferred. In other cases, comparison of signatures increases the rate.

Compression CPU utilization (%): Portion of processor time used to compress the incoming data over all links for all RPAs at a site.

Distributor receiver regulation duration (% of time): Portion of time that the distribution process across all copies at a site was forced to regulate the incoming data rate. A high value indicates slow storage journal volumes relative to the rate of the incoming data.

Distributor phase 1 effective speed (MB/sec): Rate at which data was received and written to the journal volumes for all copies at the site. The value is dependent on the performance of the journal volumes.

Distributor phase 2 effective speed (MB/sec): Rate at which data was read from the journal volumes and written to the replica volumes for all copies at a site, using either normal or fast-forward distribution mode. The value is dependent on the performance of the journal and replica volumes.

Line latency between sites (msec): Time it takes for data to be transferred from a site to the other site.

Packet loss (% of packets): Measure of the reliability of the line between a site and the other site, as measured in the portion of packets that must be resent. Latency and packet loss impact the effective throughput on a line between two sites.

a. In an RPA, a “link” is a logical channel from production to copy, either local or remote, for a given consistency group.
b. An “actual recovery point objective” is the time/data/writes that were waiting to be transferred over the link to the copy, averaged over the sampled time interval.
Statistics analysis tool
Using the export_consolidated_statistics command in the CLI together with the proprietary RecoverPoint MS Excel-based statistics analysis tool provides an effective and unified way to assess current system performance, as well as to plan for adequate future system capacity.
The statistics analysis tool enables graphical representation of the
statistics in the CSV files created by the
export_consolidated_statistics command.
A typical use of the tool would be to show a plot, over time, of “RPO
- lag in data between replicas during transfer after init” alongside
“WAN throughput from site”. If the result shows that the lag
increases concurrently with a decrease in WAN throughput, then the
user can conclude that in order to meet the RPO objective
consistently, WAN bandwidth or quality of service must be increased.
Adding the “Box utilization” statistic to the same plot can be used to
confirm that the problem is not caused by the alternative explanation
of an overworked appliance.
220
EMC RecoverPoint Release 3.3 Administrator’s Guide
Managing and Monitoring
These data can be analyzed using the statistics analysis tool, which can be configured to show desired time frames, consistency groups, appliances, sites or links, or any combination of these filters, in various graphical ways. The tool is provided both as a useful way to easily access and interpret the data series, and as a reference implementation showing how third parties can build their own tools to process the data in more specialized ways. Users who require more advanced BI-like analysis can also load the CSV file/s into most database systems, typically using database vendor-supplied loading utilities, and leverage SQL or SQL-based reporting engines to analyze the data.
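As a minimal sketch of this approach, the consolidated CSV could be loaded into SQLite (a general-purpose database that is not part of RecoverPoint) and queried directly. The column names in the query are hypothetical; the actual CSV headers depend on the statistics contained in the exported file:

sqlite3 rp_stats.db
sqlite> .mode csv
sqlite> .import consolidated_stats.csv stats
sqlite> SELECT time_from, rpo_lag_mb, wan_throughput FROM stats ORDER BY time_from;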
The statistics analysis tool is available on the Powerlink website, at:
http://Powerlink.EMC.com.
Throttling I/O
A RecoverPoint initialization or synchronization process can
temporarily put a heavy I/O load on a storage device. In some
situations, it is recommended to restrict use of the storage device by
these processes in order to allow other processes (including, for
example, normal application writes to a RecoverPoint replication
volume that belongs to a different consistency group, but which is
located on the same device) to function with minimum disruption.
Use the config_io_throttling command in the CLI to set the
maximum rate at which any RPA can read from any storage device.
Chapter 5
Testing, Failover, and Migration
This section explains how to test and work from replicas, how to recover from disasters, and how to migrate to a different storage system. The topics in this section include:
◆ Use cases ........................................................................................... 224
◆ Bookmarking ...................................................................................... 235
◆ Accessing a replica ............................................................................ 242
◆ Failover commands ............................................................................ 249
Use cases
The following use cases present the most common uses of image
access and failover, with a brief outline of the process for each use
case. The details for each step are given later in this section.
The concept of bookmarking is explained in “Bookmarks” on
page 38. The concepts of image access and failover are explained in
“Image access” on page 73 and “Failover” on page 77.
First-time initialization
When a consistency group is initialized for the first time, the
RecoverPoint system cannot identify which blocks are identical
between the production and replica volumes, and must therefore
mark all blocks for that volume. This is true both following the
creation of a new consistency group and following the addition of a
volume (or volumes) to an existing group. While initializing, the
RecoverPoint system efficiently determines which blocks are actually
different between the production and replica copies, and sends only
the data for those blocks to the replica storage, as the initialization
snapshot.
The volumes at the local and remote site can be initialized while the
host applications are either running or not. Initialization of one
consistency group does not interfere with the operation of other
consistency groups.
Initialization can be carried out automatically over IP or Fibre
Channel.
Alternatively, you can back up the current production data, manually
transfer it to the remote site, and then copy the production image to
the replica volumes, where it becomes the pre-replication image.
The state of data transfer prior to any type of initialization is always
paused.
Note: First-time initialization can cause changes in the partition table of the
replica volume (or volumes). In a Windows environment, you must clear the
OS cache before changing the partition table of a replica volume. To clear the
OS cache, disable the LUN (or LUNs) on which the volume resides, and
re-enable it. You can do so either using the relevant commands in the
RecoverPoint kutils utility (see “Kutils Reference” on page 311), or from the
“Disk drives” interface of the Windows Device Manager.
First-time initialization from backup
It is possible to initialize new consistency groups by creating a
backup of your production volumes, physically transferring them to
the remote site, and copying the backup images onto the remote
storage volumes.
During this process, applications at the production site can be running or shut down. Although transfer is paused during this process, unless the production host is completely shut down during the creation and physical transfer of the production image to the remote site, the host applications keep writing to storage. During this time, the production and replica volumes become inconsistent, and any writes made to the production volumes during this process are synchronized upon completion of the process and the start of transfer. Synchronizing only the changes made to the production volumes during this time results in a relatively small amount of additional traffic, and takes substantially less time than a full synchronization over IP or Fibre Channel.
To initialize a new consistency group from a backup image, when the
production and replica volumes are not consistent:
At the production site:
1. Ensure that all splitters that have access to replication volumes in
the group are attached to those volumes, see “Adding splitters”
on page 128.
2. Create a new consistency group, replication set, and define
replication and journal volumes, see “Creating new consistency
groups” on page 132 and “Modifying existing settings and
policies” on page 158.
3. Select the new consistency group’s name in the Navigation Pane,
right-click, and select Pause Transfer to stop the transfer of
replicated data from production to the replica.
4. Select the new consistency group’s name in the Navigation Pane,
right-click and select Clear Markers to inform the system that the
copy volume at the remote site is known to be identical to its
corresponding production volume, and a full volume sweep
synchronization is not required.
The Clear Markers dialog box is displayed.
In the Clear Markers dialog box, select the remote copy, and click
the OK button.
5. Create a block-based backup of the production volumes.
6. Physically transfer the backup to the remote site.
At the remote site:
7. From the Image Access menu, select Enable Image Access and
specify the image that you wish to access. Do not select the
pre-replication image. After selecting the required image:
Select Logged Access, and click the Next button. In the summary
screen, click the Finish button.
8. From the Image Access menu, select Enable Direct Access to
enable the remote host to directly access the image selected in
Step 7.
CAUTION: In a three-copy configuration, the journal of the third copy is erased, and all history for the third copy is lost.
9. Restore the backup onto the remote replica volumes. The backup
image becomes the pre-replication image.
10. From the Image Access menu, select Disable Image Access, and
check the Start data transfer immediately checkbox to resume
replication. Click the OK button to finish the process.
CAUTION: Upon start of transfer, the system synchronizes the data of the replica volume with the corresponding production data, which has presumably changed in the time it took to create the backup and manually transfer it.
At the local site, in a three-copy configuration:
11. In a three copy configuration, you will also need to start transfer
to the third copy. To do so, select the Start Transfer icon.
First-time failover
If you are performing this procedure on an AIX host, see the EMC RecoverPoint AIX Technical Notes for the correct procedure. Volumes attached to Windows hosts require additional steps when failing over for the first time. Best practice is to run a planned failover as soon as possible after initialization of a consistency group; that is, as soon as the transfer status changes from Initializing to Active.
Windows hosts
Before accessing a replica for the first time, it is necessary to update
the disk information stored by the replica host.
During initialization of a replication volume, the source volume is
replicated in its entirety, including disk information (disk signature,
partition table, and other disk information). This means that the disk
information on the replica disk will have been created by the
production host and not by the replica host. It is therefore necessary
to invalidate the replica host’s disk information.
To update the disk information of a replica:
1. Enable image access to the replica.
2. From the Windows Control Panel, select Computer Management > Disk Management. Examine all listed disks (multiple paths may exist to each disk; consequently, a disk may be listed multiple times). Note the device name or LUN number of the replication volume.
3. In Computer Management, select Device Management > Disk
drives. Disable all instances of the replication volume. Disks can
be identified by their device name or LUN number. Run the Scan
for hardware changes command.
In scripts, volumes can be enabled and disabled as follows (see the example following this procedure):
• In Windows 2000 and 2008, use the kutils disable and enable commands.
• In Windows 2003, use the Microsoft utility devcon.
4. In Computer Management, select Device Management > Disk
drives. Enable all instances of the replica volume. Run the Scan
for hardware changes command. Verify that all relevant disks are
displayed.
5. Mount the replication volume:
a. In Computer Management > Disk Management, run Rescan
disks.
b. Find the replication volume and assign it a drive letter.
c. Verify that the disk is now accessible from the host.
6. From the replica host, unmount the replication volume (unassign
or remove the drive letter).
7. Disable image access (resume distribution).
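For example, on a Windows 2003 replica host, the disk instances identified in steps 3 and 4 could be scripted with devcon along the following lines (disable in step 3, enable in step 4, each followed by a scan for hardware changes). The device instance ID is a placeholder for your replica LUNs; candidate IDs can be listed with devcon findall:

devcon disable "<device_instance_id_of_replica_disk>"
devcon rescan

devcon enable "<device_instance_id_of_replica_disk>"
devcon rescan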
Testing a replica
From time to time, it is a good practice to make sure that replicas can
be used to restore data, recover from disaster, or seamlessly take over
production. In most cases, while testing a replica, applications can
continue to run on the production servers, and replication can
continue as usual. The writes will be stored in the replica journal until
testing is completed. When testing is completed and write access at
the replica disabled, any writes made during testing will be rolled
back; and the writes from production will be distributed from the
journal to the replica. The entire process can be completed without
application downtime and without loss of data at the replica.
To test a replica, follow the steps below. For detailed instructions on the individual commands, refer to “Bookmarking” on page 235 and “Failover commands” on page 249.
1. From the Image Access menu, select Enable Image Access.
If Virtual Access is appropriate, it is the quickest. If you are only
testing images, Virtual Access without Roll image in
background is preferred. However, if you need to test images for
an extended period of time or need maximum performance while
testing, select Logged Access (physical).
2. At the host, mount the replica volume you wish to access. If the
volume is in a volume group managed by a logical volume
manager, import the volume group.
3. If desired, run fsck (chkdsk on Windows) on the replica volumes.
4. Access the volumes and test as desired. If you need to test longer
than is possible with Logged Access (because the journal is full or
will be full) or you require even better performance than Logged
Access, Direct Image Access may be preferable. Refer to “Direct
Image Access” on page 246 for details. Note the drawbacks of
using Direct Image Access.
5. When testing is completed, unmount the replica volumes from
the host. If using logical disk management, deport the volume
groups. Then Disable Image Access at the replica. The writes to
the replica will automatically be undone.
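As a minimal illustration of steps 2 through 5 on a Linux host, the commands might look like the following; the device name and mount point are placeholders, the file system check is run read-only before mounting, and on Windows you would instead assign a drive letter and run chkdsk:

fsck -n /dev/<replica_device>
mount /dev/<replica_device> /mnt/replica_test
(run the desired tests against /mnt/replica_test)
umount /mnt/replica_test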
Offloading a task
The same procedure as for testing (“Testing a replica” on page 228)
can be used to offload a task to a replica. For instance, if you need to
run a large query on a database, and you do not want to tie up your
source, you can run the query on a replica. Of course, this assumes
that you do not need the very latest information (your data will be
out of date by your lag time, possibly a few seconds, plus the length
of time it takes to run the query).
Recovering from a disaster
To recover from a disaster (hardware failure or logical disaster), fail
over to a replica. You have these options:
◆ Fail over temporarily to a replica only to use the replica’s journal to roll back to a previous point in time in the production storage (“Recovering the production source” on page 230).
◆ Fail over to a replica and work from there until the production source is repaired or you have recovered from the disaster at the production site (“Failing over to a replica temporarily” on page 231).
◆ Fail over permanently to a replica, making that the new production source (“Routine maintenance on production system” on page 233). Use this same process to migrate to a new production site.

Recovering the production source
To correct file or logical corruption by rolling back to a previous point in time, use Recover Production. Access an image in the replica, verify that it is not corrupt (that is, that the image predates the corruption), then roll back to that point in time. Use the following procedure. For instructions on the individual commands, refer to “Bookmarking” on page 235 and “Failover commands” on page 249.
1. From the Image Access menu of the replica, select Enable Image
Access.
Use Virtual Access, if practical.
2. From the host, mount the replica volumes. If the volume is in a
volume group managed by a logical volume manager, import the
volume group.
3. If desired, run fsck (chkdsk on Windows).
4. Test the snapshots until you find one that you wish to roll back to
(before the corruption or disaster).
5. Roll to the selected snapshot: use Virtual Access with Roll,
Logged Access, or Roll to Image.
You do not need to wait until rolling to the selected snapshot
finishes.
6. At the copy’s Failover menu, select Recover production.
In the Component Pane for the selected consistency group, the
image status of the production source will be Distributing
Pre-replication image and the role will be Production (being restored).
7. After the transfer status changes to Active, enable image access at
the production source. Select any image after the Pre-replication
image.
8. Unmount the replica volumes from the replica hosts. If the
volume is in a volume group managed by a logical volume
manager, deport the volume group.
When image access is enabled at the production source, Resume
Production will be enabled.
9. At the production source, click Resume Production (Failover
Actions icon).
The production journal is erased. The production source is rolled
back to the selected image and normal replication from the
production source is restored.
Failing over to a replica temporarily
Use the following procedure to temporarily fail over to a replica and
continue working from the replica until the production site is
available or repaired.
1. Enable image access. If there is any chance that the latest image of
the replica is not usable, select Virtual Access with Roll image in
background.
If you do not need to test images, at the replica to which you wish to fail over, select Enable Image Access. When prompted, select the snapshot to which you wish to fail over; the latest snapshot is a logical choice. When prompted, select Logged Access.
It can take a few minutes for the system to roll to the desired
snapshot.
2. From the replica host, mount the replica volumes. If the volume is
in a volume group managed by a logical volume manager, import
the volume group.
3. If desired, run fsck (chkdsk on Windows).
4. If necessary, test and find a usable image (“Testing a replica” on
page 228).
5. At the Failover Actions menu, select Failover to <local replica
name> or Failover to <remote replica name>.
The replica’s journal will be erased. In a CLR (three-copy)
configuration, transfer to the third copy will pause until
production is resumed at the production source.
6. Repair the production site as needed. In the meantime, your
applications and business operations can continue at the replica.
The production journal and the production storage (assuming
they are online) will be kept up-to-date from the replica.
7. When repairs at the production site have been completed, select
Enable Image Access at the production site. Then at the
production site, select Resume Production.
The production journal is erased. If you have three copies
(production, local, and remote), transfer to the third copy will
automatically be resumed.
8. Unmount the replica volumes from the replica hosts. If the
volume is in a volume group managed by a logical volume
manager, deport the volume group.
Routine maintenance on production system
Failing over to a replica (“Failing over to a replica temporarily” on
page 231) is also useful for performing maintenance on the
production source, such as updating the storage system.
Migration
Use the following procedure if you are migrating from one
production site to another or failing over permanently to another site.
Migration requires advanced planning to ensure risk-free migration
without data loss. Contact EMC Customer Service for assistance
before migrating.
For Windows hosts, to ensure an up-to-date image of the file system,
you should flush all file systems that reside on replication volumes.
Note that some applications, such as Exchange, have their own cache,
which should be flushed as well. Flushing the file system does not
flush application level data.
To gracefully shut down source-site host activities:
1. Close all applications that are using the consistency group’s
volumes at the production site. Flush application data as
necessary.
2. On Windows hosts at the source-site, run (for each drive that is in
the consistency group):
kutils flushFS <drive letter>.
3. On the source-site hosts, run (for each drive that is in the
consistency group):
kutils umount <drive letter>.
Note: If the host is boot-from-SAN, shut down the host machine as well.
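For example, if the consistency group contains drives E: and F:, steps 2 and 3 would be run as follows (the drive letters here are placeholders for your environment):

kutils flushFS E:
kutils flushFS F:
kutils umount E:
kutils umount F: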
4. Enable image access. If there is any chance that you do not wish
to use the latest image, select Virtual Access with Roll image in
background. Skip to Step 5.
If you do not need to test images, at the replica you wish to fail
over to, select Enable Image Access. When prompted, select the
snapshot to which you wish to fail over. The last snapshot is a
logical choice. When prompted, select Logged Access.
It can take a few minutes for the system to roll to the desired
snapshot.
5. From the replica host, mount the replica volumes. If the volume is
in a volume group managed by a logical volume manager, import
the volume group.
6. If desired, run fsck (chkdsk on Windows).
7. If necessary, test and find a usable image (“Testing a replica” on
page 228).
8. Click the Failover Actions menu, select Failover to <local replica
name> or Failover to <remote replica name>.
The replica’s journal will be erased. In a three-copy (CLR)
configuration, the transfer to the third replica will pause until
production is resumed at the production source.
9. From the Failover Actions menu, select Set local copy as
production or Set remote copy as production.
Only relevant to CLR (configuration with remote and local
replica): If you fail over to the local replica, a full sweep will be
required to resynchronize the remote replica. If you fail over to
the remote replica, you will not be able to retain both the
production source and the local replica. RecoverPoint does not
support more than one remote replica.
CAUTION: The production journal is erased.
Bookmarking
The concept of bookmarking is explained in “Bookmarks” on
page 38.
Take note of the following:
◆ You can only bookmark a snapshot for a consistency group that is enabled and actively transferring.
◆ latest is a reserved term and therefore cannot be used as a bookmark name.
◆ Some applications support a quiesced state. For best reliability, you should use the quiesced state when bookmarking a snapshot.

Creating a bookmark
To create a bookmark:
1. In the Navigation Pane, select the consistency group you want to
bookmark.
2. Click the Bookmark button.
3. Enter the following information on the Bookmark a snapshot of
<group> dialog:
1. Enter a descriptive name for the bookmark.
2. Set the consolidation policy for the bookmark.
The default consolidation policy for a snapshot is Always
Consolidate, which means that the snapshot is consolidated the
next time that the consolidation process runs.
a. Check Set snapshot consolidation policy.
b. Set the snapshot consolidation policy:
Table 41  Consolidation policies

Never consolidate snapshot: the snapshot is never consolidated.

Snapshot must survive Daily/Weekly/Monthly consolidations:
• Daily: the snapshot remains after daily consolidations, but is consolidated in weekly, monthly, and manual consolidations.
• Weekly: the snapshot remains after daily and weekly consolidations, but is consolidated in monthly and manual consolidations.
• Monthly: the snapshot remains after daily, weekly, and monthly consolidations, but is consolidated in manual consolidations.
4. Click OK.
Applying bookmarks to multiple groups simultaneously
To apply the same bookmark at a single point in time across multiple
consistency groups, use the Parallel bookmark command. When the
command is executed, the system immediately closes a snapshot on
each of the consistency groups specified.
Note: You cannot use latest as a bookmark name, because latest is a reserved
term.
Some applications support a quiesced state. For best reliability, you
should use the quiesced state when bookmarking a snapshot.
To apply parallel bookmarks:
1. In the Navigation Pane, select Consistency Groups. In the
Component Pane, select all the consistency groups you wish to
bookmark simultaneously. All selected consistency groups must
be enabled and transfer must be active.
2. Click Parallel Bookmarks (upper right corner of the Component
Pane).
3. Enter the following information on the Create Parallel Bookmark
dialog:
1. Enter a descriptive name for the bookmark.
2. Set the consolidation policy for the bookmark.
The default consolidation policy for a snapshot is Always Consolidate, which means that the snapshot is consolidated the next time that the consolidation process runs.
1. Check Configure consolidation policy.
2. Set the snapshot consolidation policy. Table 41 on page 236
describes the snapshot consolidation policies.
4. Click OK.
Automatic periodic bookmarking
A group set allows you to automatically bookmark a set of
consistency groups so that the bookmark represents the same
recovery point in each consistency group in the group set. This allows
you to define consistent recovery points for consistency groups that
are distributed across different RPAs.
The automatic periodic bookmark consists of the name you specified
for the group set and an automatically incremented number.
Numbers start at zero, are incremented up to 65535, then begin again
at 0.
The same bookmark name is used across all the groups. To apply
automatic bookmarks, the sources must be at the same site
(replicating in the same direction) and transfer must be enabled for
each consistency group included in the group set.
Group sets
A group set is a set of consistency groups to which the system applies
parallel bookmarks at a user-defined frequency. Group sets are useful
for consistency groups that are dependent on one another or that
must work together as a single unit.
The Group Set Details dialog box allows you to create, edit, or
remove group sets.
Creating a group set
Note: Automatic bookmarking of group sets will succeed only if all groups
are active and all sources are at the same site.
To create a group set:
1. In the Navigation Pane, select Consistency Groups. In the
Component Pane, select all the consistency groups you wish to
bookmark automatically.
2. Click Group Sets (upper right corner of the Component Pane).
The Group Set Details dialog box is displayed.
3. In the Group Set Details dialog box, click Create.
4. Enter a name for the automatic bookmarks. Select the consistency
groups to be in the group set, and specify the bookmarking
frequency. It is recommended that the interval between automatic
bookmarks not be less than 30 seconds.
5. Click OK.
Editing a group set
Editing a group set only allows you to change the bookmarking
frequency.
1. In the Navigation Pane, select Consistency Groups.
2. Click Group sets (upper right corner of the Component Pane).
The Group Set Details dialog box is displayed.
3. In the Group Set Details dialog box, click Edit.
You can edit only the bookmarking frequency.
4. Edit the bookmarking frequency as desired. The interval between
automatic bookmarks should not be less than 30 seconds.
5. Click OK.
Removing a group set
Use the following procedure to remove group sets.
CAUTION: The selected group sets are removed as soon as you click OK.
1. In the Navigation Pane, select Consistency Groups.
2. Click Group sets (upper right corner of the Component Pane).
The Group Set Details dialog box is displayed. All existing group
sets are displayed.
3. Select the Group Set you wish to remove, and click Remove.
4. To remove the selected group sets and close the dialog box, click
OK.
Applying bookmarks using KVSS
The RecoverPoint KVSS utility is a command-line utility that enables
applying bookmarks to Windows 2003 and 2008-based applications
that support Microsoft Volume Shadow Copy Service (VSS).
Microsoft Exchange and SQL are examples of Windows applications
that support Volume Shadow Copy Service.
A single bookmark can be used to bookmark Volume Shadow Copy
Service-aware applications in many consistency groups. Volume
Shadow Copy Service guarantees that the applications are in a
consistent state at the point-in-time when each bookmark is applied
to an image. As a result, recovery using an image with a KVSS
bookmark is faster than recovering from normal RecoverPoint
images.
KVSS bookmarks are applied using the kvss.exe bookmark
command. The working folder for running KVSS commands is
%SystemDriver%/VssKashyaProvider/.
The syntax is as follows:
kvss.exe bookmark
writers=writer1[ writer2…]
groups=group1[ group2…]
bookmark=<bookmark_name>
[policy=never | survive_daily | survive_weekly | survive_monthly | always]
ip=<RecoverPoint_mgmt_ip_address>
[type=FULL|COPY]
where:
writer = a VSS-aware host application
group = RecoverPoint consistency group
bookmark = name by which you can identify the applied bookmark
policy = consolidation policy to set for this snapshot. Valid values are:
◆ never: the snapshot is never consolidated.
◆ survive_daily: the snapshot remains after daily consolidations, but is consolidated in weekly, monthly, and manual consolidations.
◆ survive_weekly: the snapshot remains after daily and weekly consolidations, but is consolidated in monthly and manual consolidations.
◆ survive_monthly: the snapshot remains after daily, weekly, and monthly consolidations, but is consolidated in manual consolidations.
◆ always: the snapshot is consolidated in every consolidation process, whether manual or automatic.

Note: The default policy is always. If the policy parameter is not specified, the snapshot is consolidated in both automatic and manual consolidation processes.
ip = RecoverPoint site-management IP
type = The shadow copy type: either FULL or COPY. This setting is
optional. The default is COPY. The settings full and copy are
implemented by the writer application. Generally, when type = full,
backup logs are truncated; when type = copy, backup logs are not
truncated.
Note: Values should be surrounded by quotation marks. You can use the
vssadmin list writers command to obtain a list of registered writers on the
host machine.
The following is an example of a command used to produce a
bookmark for a Microsoft Exchange application:
kvss.exe bookmark
writers="Microsoft Exchange Writer"
groups="exchange group"
bookmark="exchange hourly snapshot"
policy="survive_daily"
ip=10.10.0.145
Note: To use KVSS in a Microsoft Cluster Server environment with
Symmetrix DMX storage, the SPC-2 flag must be enabled on the Symmetrix
ports.
Accessing a replica
When replicating normally, writes to the production source are also
written to the journal of the replicas. The storage at the replica sites is
not accessible (state of storage = No access), because the snapshots in
the journal are being distributed to the storage at that site.
Figure 7  Normal replication to local and remote replica simultaneously
To do any of the following, you must access the replica:
◆ Test a replica
◆ Roll back the production source to a previous point in time
◆ Fail over to a replica
◆ Migrate permanently to a different production site
To enable a host to access a replica, enable image access to that
replica; then mount the volume. If the access is logged (or virtual
with roll), distribution of writes from the journal to the replica will
stop.
Writes will be collected in the journal until image access is disabled. If
the journal is completely filled with writes, replication will be
disabled.
Enabling image access

1. To enable image access, click the Image Access pull-down of the desired replica.
The Enable Image Access drop-down menu is displayed.
2. Select Enable Image Access.
The Enable Image Access dialog box is displayed.
3. Select by which method you wish to specify which snapshot to access:
• Select an image from the list
• Specify desired point in time
• Directly select the latest image
Then select the specific image you wish to access.
The snapshot you select to access depends on what you want to
achieve:
• To test an image, you may wish to start with the last image
known to be valid.
• To analyze data, you generally want the latest snapshot.
• To fail over to a replica, you generally want the most recent snapshot that you know to be valid. For instance, if you are using Microsoft Volume Shadow Copy Service, you probably want to select the most recent shadow copy. The shadow copies will be bookmarked with the name that you assigned to shadow copies in the Microsoft Volume Shadow Copy Service configuration.
• To restore the production source, select Production Restore.
• Migration should be well planned in advance, and the
snapshot to select for migration should be part of an overall
migration strategy.
After specifying the snapshot, the Image access mode dialog box
is displayed. Select one of the options listed in the following table
for the image access mode.
Table 42  Image access modes

Logged access (physical)
To fail over, migrate, or restore production from this replica, select Logged access.
Logged access rolls backwards (or forwards) to the snapshot (point in time) you wish to recover. There will be a delay until the system rolls to the specified image. The length of delay depends on how far the selected image is from the image currently being distributed to storage.
Once access is enabled, hosts in the SAN will have direct access to the storage volumes, and the RPA will not have access; that is, distribution of images from the journal to storage will be paused.
If you disable image access, the writes to the storage volume while image access was enabled will be rolled back. Then distribution to storage will continue from the current image forward.
If you wish to use the current image as is (with the writes to storage made now), fail over to this image or restore production.

Virtual access (instant)
To test an image or restore a file to a previous state, select Virtual access.
Virtual access creates the required parts of the image you wish to recover in a separate virtual volume (or in memory). Access is very fast, as the system does not actually roll to the image in storage. You can use virtual access in the same way as logged access; however, it is not suitable if you need to run many commands or if you need data from large areas of the replica.
After you have tested an image and found it suitable, you may wish to use the Roll to Image command (Table 43 on page 247) to actually roll to the selected image.
When you disable image access, the virtual volume and all writes to it are discarded.

Roll image in background
To test an image and roll to it in background.
Virtual access with Roll image in background creates the required parts of the image in a virtual volume, but simultaneously rolls to the physical image in background. Once the rollback is completed, the virtual volume is discarded, and the physical volume takes its place. The virtual volume and the physical volume have the same SCSI ID; therefore, the switch from one to the other will be transparent to servers and applications.
If you disable image access, the writes to the volume while image access was enabled will be discarded. Then distribution to storage will continue from the current image forward.
If you wish to use the current image as is (with the writes to storage made now), fail over to this image or restore production.
After selecting the Image access mode and clicking Next, the
Image access mode Summary box is displayed. Check your
choices. If necessary, go back and change any choices that are not
as you desire. When satisfied, click Finish.
4. From the host, mount the volumes you wish to access. If desired,
run fsck (chkdsk on Windows). Even if you use virtual access,
mount the volume. The virtual image has the same SCSI ID as the
actual image.
Possible courses of action at this point:
• Access another image
• Disable image access
• Undo writes
• Enable direct access
• Move to previous point in time
• Move to next point in time
• Fail over to local (or remote) replica
• Recover production
These courses of action are described in Table 43 on page 247 and
Table 44 on page 249.
If you selected virtual access, you can roll to the actual logged image
using Roll to image.
To set this replica as the production site, you must first fail over to the
replica.
Undo information for any changes made to the replica by the host
will be written to the image access log, and automatically undone
when you disable image access.
The quantity of data that can be written by the host application to the
replica journal is limited by the capacity of the journal. About 5% of
the journal is reserved for instant recovery (indexing) information;
and approximately 1 GB is reserved for handling peaks during
distribution. The remaining disk space is available for the journal and
the image access log. The size of the image access log is, by default,
20% of the available disk space; however, this proportion can be
modified (refer to Proportion of journal allocated for image access
log (%) in Table 19 on page 155). The remaining available space is
reserved for the journal.
For virtual access, the maximum size for the image access log is
approximately 40 GB. If the capacity for the image access log is
reached, host access to the replica is blocked and target-side
processing is halted. You can ensure continued use of the image
access log by adding capacity to the journal (up to the limits for
journal or image access log size).
Direct Image Access
Alternatively, you can use the Direct Image Access command, which
does not impose a limit to the amount of data that you can write to
storage. In addition, Direct Image Access gives better system
performance when accessing the replica, because no rollback
information to the image access log is being written in parallel with
the ongoing disk I/Os. Hence, this option may be preferred if you
want to carry out processing that generates a high volume of write
transactions at the replica. It can also be used for testing the
replicated images of BFS groups.
Direct Image Access has the following drawbacks:
◆ Journal is cleared.
◆ After selecting Direct Image Access, you cannot roll back to an earlier image, if in the meantime you discover corrupted data. Moreover, in the event of a disaster at the source side, you will be unable to remove any new data that you have written to the replica (unless you have a third replica with a journal).
◆ Transfer for the consistency group must be paused.
◆ Nonetheless, the system continues to write markers to the production journal volume, and it can use those markers later to resync the replica with the source.
Note: If you wish to preserve a particular image of the replica, to give
yourself added protection, you can back up the image before beginning your
offline processing.
Image Access Enabled mode

After enabling image access by selecting an image, the choices listed in the following table are available.

Table 43   Image access enabled mode

Access Another Image: Select another snapshot to recover. You can roll forward or backward.

Disable Image Access: To disable image access. Unmount the volume from the host at this replica before disabling.
If you were in Logged access mode, any writes made directly to the LUN while image access was enabled will be discarded. Distribution from the journal to the storage will continue from the accessed image forward.
If you were in Virtual access mode, the virtual LUN and any writes to it will be discarded. Distribution will continue from the last snapshot that was distributed before the image access.
If you were in Virtual access with Roll image in background, the virtual LUN, any changes to it, and any writes made directly to storage will be discarded. Distribution will continue from whatever snapshot the system has rolled to.

Undo Writes: To undo the writes recorded in the image access log without disabling image access.

Roll to Image: Only available for Virtual access without Roll image in background: To roll the stored replica to the selected image.

Enable Direct Access: To allow the host at this site to modify the replica. These writes are not logged in the image access log and cannot be undone except by a full sweep. Caution: A full sweep will occur when you disable image access.

Move to previous point in time: Roll the stored image back one snapshot.

Move to next point in time: Roll the stored image forward one snapshot.
Failover commands

After enabling image access, the following possibilities for failing over to the enabled image are available.

Table 44   Failover commands

Fail over to Local/Remote copy: To use the selected (local or remote) replica as the source. Transfer from production will stop. Caution: This command will erase the journal at this site.
If the system has only a local or only a remote copy, but not both, this replica will automatically become the production source and the production source will become the local or remote replica.
If the system has three copies (production, local, and remote), transfer to the third copy will not be resumed until production is restored as the source.
In a three-copy configuration, to convert the current source to the production source, select Set local copy as production or Set remote copy as production.

Recover Production: To repair the production source using the replica as the source. Recover Production is only available if the replica's journal is still intact; therefore, Recover Production is not available if you used Direct Image Access or after distributing a snapshot that is larger than the capacity of the journal (refer to Table 19 on page 155).
Transfer from the production source will be paused. Transfer to a third copy will not resume until production is restored as the source. Host access to the selected replica will be blocked. You will only be able to restore the production source from the selected replica.
While being restored, the role of the production replica will be "Production (being restored)." When the restore is completed, enable image access at the production source, and select the failover option Resume production. The production journal is discarded.

Set Local/Remote copy as production: Only available after failing over to a local or remote replica in a three-copy configuration: To set the current replica as the production source.
If the local replica is converted to the production source and there is a remote replica, the remote replica will require a full sweep. If the remote replica is converted to the production source and there is also a local replica, you must delete either the original production source or the local replica before the remote replica can become a production source. In other words, having two remote replicas is not supported.

Resume Production: Only available after image access was enabled, and after either a failover or a recover production was performed on the selected copy. Restores the production copy as the data source.
6
Notification of Events
This section explains how to configure RecoverPoint event notification. Various events generate entries in the RecoverPoint system log. RecoverPoint notifies you of events by logging them in the Management Application. RecoverPoint can also be configured to notify of events by e-mail, by sending SNMP traps to designated hosts, or by sending events to a syslog server. In addition, by default, RecoverPoint sends system reports to EMC Customer Service.
The topics in this section are:
◆ Configuring event notification....................................................... 252
◆ E-mail notification............................................................................ 253
◆ SNMP notification............................................................................ 255
◆ Syslog notification............................................................................ 259
◆ System reports .................................................................................. 260
◆ System alerts ..................................................................................... 266
◆ Collecting system information ....................................................... 268
Configuring event notification
RecoverPoint supports the following types of event notification:
◆ "E-mail notification"
◆ "SNMP notification"
◆ "Syslog notification"
◆ "System reports"
◆ "System alerts"
◆ "Collecting system information"
E-mail notification
To configure email notification of system events, use the following
procedure.
1. From the System menu, select System Settings > Alert Settings.
2. To enable sending emails, check Email System Enabled.
3. Specify the address of the SMTP server that will be used to send the emails. The server address may be entered in either IP or DNS format.
4. Specify the email address of the sender. This address will be
displayed as the sender of the email.
5. Specify which alerts you wish to send to whom. To do so, click the
Add button. Then fill in the Add New Alert Rule dialog box.
The settings are described in the following table.

Table 45   New Alert Rule settings

Rule Enabled: Enabled: To enable the specified rule.

Topic: Select which events to report, according to the component of the RecoverPoint system where the events occur:
• All Topics
• Site
• RPA
• Group
• Splitter

Level:
Info: Messages are informative in nature, usually referring to changes in the configuration or normal system state.
Warning: Message indicates a warning, usually referring to a transient state or an abnormal condition that does not degrade system performance.
Error: Message indicates an important event that is likely to disrupt normal system behavior and/or performance.

Scope:
Normal: To report only the root cause for an entire set of detailed and advanced events. In most cases, these events are sufficient for effective monitoring of system behavior.
Detailed: This category includes all events, with respect to all components, that are generated for use by users.
Advanced: In specific cases (for instance, for troubleshooting a problem) EMC Customer Service may ask you to retrieve information from the advanced log events. These events contain information that is intended primarily for the technical support engineers.

Type:
Immediate: Each event notification is sent immediately as it occurs.
Daily: Event notifications are sent once a day in a digest.
6. After specifying the Alert Rules, click Add to add the email
addresses to be notified of events matching these alert rules.
To edit an existing alert rule, select the rule and click Edit.
To remove an alert rule, select it and click Remove.
Event notifications are sent by e-mail as specified.
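As a quick sanity check, you can verify from a host on the management network that the SMTP server you specified is reachable on the standard SMTP port (the server name below is illustrative):

   # Test TCP connectivity to the SMTP server on port 25
   nc -vz smtp.example.com 25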
SNMP notification
RecoverPoint supports the standard Simple Network Management
Protocol (SNMP), including support for SNMP version 3 (SNMPv3).
RecoverPoint supports various SNMP queries and can be configured
to generate SNMP traps (notification events), which are sent to
designated network management systems.
To configure SNMP notification of system events, use the following
procedure.
1. From the System menu, select System Settings, and select SNMP
Settings from the Navigation Pane. The SNMP Settings Pane of
the System Settings dialog box is displayed.
2. Enter the desired values in the General Settings group box,
according to the information in the following table.
Table 46   SNMP general settings

Agent Enabled: When selected, enables the RecoverPoint SNMP agent. The agent must be enabled to send SNMP traps (notification events).

Send Event Traps: When selected, the RecoverPoint SNMP agent sends SNMP traps (notification events) to the configured trap destinations.

Event Trap Level:
Error: Only sends important messages indicating an event that is likely to disrupt normal system behavior or performance.
Warning: In addition to Errors, sends warnings, usually referring to a transient state or an abnormal condition that does not degrade system performance.
Info: In addition to Warnings and Errors, sends informative messages, usually referring to changes in the configuration or normal system state.

Trap Destination (local): (optional) The network management server at the local site to which you wish to deliver notifications. The address may be either in IP or DNS format. A DNS address will work only if a DNS server is configured in the RecoverPoint system.

Trap Destination (remote): (optional) The network management server at the remote site to which you wish to deliver notifications. The address may be either in IP or DNS format. A DNS address will work only if a DNS server is configured in the RecoverPoint system.
3. Enter the desired values in the Advanced Settings group box.
If you are using SNMP version 1, enter the read-only community
string (a type of password; but note that it is transmitted in
cleartext).
If you are using SNMP version 3, click Add, and enter user names
and passwords. To remove a user name, click on it and click
Remove. Click Apply to save your choices.
4. Click OK.
SNMP trap configuration
RecoverPoint supports the default MIB-II. The RecoverPoint MIB can
be downloaded from powerlink.emc.com at the following location:
Home > Support > Software Downloads and Licensing >
Downloads P–R > RecoverPoint
The application MIB OID is:
1.3.6.1.4.1.21658
The trap identifiers for RecoverPoint traps are as follows:
1. Info
2. Warning
3. Error
The product ID = Kashya
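For example, assuming the SNMP agent is enabled and you are using SNMP version 1 with the read-only community string public, the RecoverPoint MIB can be queried from a management host with a standard net-snmp client (the address and community string are illustrative):

   # Walk the RecoverPoint application MIB subtree
   snmpwalk -v 1 -c public <site management IP> 1.3.6.1.4.1.21658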
The RecoverPoint SNMP trap variables and their possible values are
listed in the following table.
Table 47   RecoverPoint SNMP trap variables

dateAndTime (OID 3.1.1.1): Date and time that trap was sent.
eventID (OID 3.1.1.2): Unique event identifier; the values are listed in Appendix A, page 277.
siteName (OID 3.1.1.3): Name of site where event occurred.
eventLevel (OID 3.1.1.4): Values: 3 = info, 4 = warning, 5 = warning off, 6 = error, 7 = error off.
eventTopic (OID 3.1.1.5): Values: 1 = site, 2 = K-Box, 3 = group, 4 = splitter, 5 = management.
hostName (OID 3.1.1.6): Name of host.
kboxName (OID 3.1.1.7): Name of K-Box.
volumeName (OID 3.1.1.8): Name of volume.
groupName (OID 3.1.1.9): Name of group.
eventSummary (OID 3.1.1.10): Short description of event.
eventDescription (OID 3.1.1.11): More detailed description of event.
OMSA support

OpenManage Server Administrator (OMSA) support provides RecoverPoint customers with the ability to:
◆ Display Dell hardware event notifications, together with RecoverPoint event notifications, in real-time, in the same management console. The instructions for this procedure follow.
◆ Collect system information that includes Dell hardware configuration information. To do so, see "Collecting system information" on page 268.
Note: This feature is only relevant for systems in which the RPAs are
running on Dell PowerEdge platforms.
RecoverPoint generates events that result in Simple Network
Management Protocol (SNMP) traps. When an event with predefined
characteristics (defined in RecoverPoint and OMSA MIBs) occurs on
your system, the SNMP subagent sends information about the event,
along with trap variables, to a specified management console.
To view OMSA and RecoverPoint events in real-time:
1. Configure the RecoverPoint SNMP agent to send event traps to
your event management console.
Note: In the RecoverPoint Management Application SNMP Settings dialog box, make sure that the Agent Enabled and Send Event Traps checkboxes are checked and that the IP address of the machine that has been dedicated as the event management console is defined as the Trap Destination value; see "SNMP notification" on page 255.
2. Install a MIB browser on the machine dedicated as the event
management console.
Note: MIB browsers are used to manage SNMP-enabled devices and
applications in a network. MIB browsers enable users to load MIBs, issue
SNMP requests, and receive SNMP traps.
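If a dedicated MIB browser is not available, a net-snmp trap daemon can serve as a minimal trap receiver for testing (illustrative; depending on your net-snmp version, you may first need to authorize the community string in snmptrapd.conf):

   # Run the trap daemon in the foreground and log received traps to standard output
   snmptrapd -f -Lo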
3. Open the MIB Browser on your management console, and:
a. Enter the site management IP of the RPA cluster into the
address bar.
Note: The site management IP is a virtual IP address assigned to the
RPA that is currently active. In the event of failure by this RPA, this IP
address dynamically switches to the RPA that assumes operation.
b. Enable the MIB browser’s Trap Receiver.
In the Trap Receiver, both the RecoverPoint and OMSA events are displayed in real-time. RecoverPoint events are preceded by their severity level (Info, Warning, or Error); the Dell OMSA event OIDs have the prefix 1.3.6.1.4.1.674.
See also
The Dell OpenManage™ Server Administrator Version 1.9 SNMP
Reference Guide can be found at the Dell website at:
http://support.dell.com/support/edocs/software/svradmin/1.9.2/en/SNMP/
Syslog notification
To configure notification of system events by syslog, use the
following procedure.
1. From the System menu, select System Settings > Syslog
Settings.
2. To enable sending notification by syslog, check Syslog Enabled.
3. Enter the desired values for the remaining fields in the Syslog
Settings dialog box. Refer to the following table.
Table 48   Syslog settings

Facility: Select one of the available labels to be attached to all messages.

Level:
Info: Informative messages, usually referring to changes in the configuration or normal system state, will be sent in addition to warnings and errors.
Warning: Warnings, usually referring to a transient state or an abnormal condition that does not degrade system performance, will be sent in addition to errors.
Error: Only important messages indicating an event that is likely to disrupt normal system behavior or performance will be sent.

Target Host (local): (optional) Specify the syslog server at the local site to which you wish to deliver notifications. The address may be either in IP or DNS format. A DNS address will work only if a DNS server is configured in the RecoverPoint system.

Target Host (remote): (optional) Specify the syslog server at the remote site to which you wish to deliver notifications. The address may be either in IP or DNS format. A DNS address will work only if a DNS server is configured in the RecoverPoint system.
4. Click OK.
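Note that the receiving syslog server must accept remote messages. A minimal sketch for an rsyslog-based server listening on UDP port 514 follows (an assumption about your syslog implementation; adjust to your environment and restart the syslog service after editing):

   # /etc/rsyslog.conf (excerpt): accept remote syslog messages over UDP
   module(load="imudp")
   input(type="imudp" port="514")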
System reports
The system reports (SyR) mechanism provides one-way
communication between a RecoverPoint installation and the EMC
System Reports database. This mechanism supports two types of
information, system alerts, and system configuration reports.
System reports on the configuration and state of the RecoverPoint
system are sent per site, every Sunday. System alerts are sent in
real-time (at the time that the events occur) to the EMC System
Reports database, allowing EMC to provide pre-emptive support for
RecoverPoint issues.
The system report mechanism filters system alerts (see "System alerts" on page 266) and decides whether a service request should be opened with EMC Customer Service. If a service request is required, the system reports mechanism will automatically open one. One example of a possible alert rule is: if an RPA is down for more than an hour, open a service request.
The system reports mechanism is enabled by default, but can be
disabled at any time through the Management Application or CLI.
By default, the system report and alerts are compressed and
encrypted with RSA encryption using a 256-bit key before they are
sent.
Note: The system report mechanism will only send system alerts and reports
provided the SMTP settings are configured and the Software Serial IDs,
provided with each RecoverPoint license, are entered into the Management
Application GUI or CLI.
By default, system reports are sent through FTPS to SyR's FTPS
server, but they can be configured for transfer through a customer's
ESRS server (by way of SMTP), or a customer's designated SMTP
server. System reports are sent in XML format.
The following sections describe the handling of system reports:
◆ "Before you begin"
◆ "System report operations"
◆ "Best practice"
Before you begin

Before you begin:
◆ Ensure that your Software Serial IDs have been entered into the Management Application, see "Entering software serial IDs" on page 261.
◆ Decide upon a transfer method for the system reports and configure the required transfer settings, see "Configuring a server for transfer" on page 262.

Entering software serial IDs

Two Software Serial IDs, one for each site in the RecoverPoint installation, are supplied with each RecoverPoint license. After the product's installation, you must enter these IDs into the Management Application GUI (or CLI) to enable the sending of system alerts and/or reports to the system report mechanism.
To enter your Software Serial IDs into the Management Application:
1. From the System menu, select System Settings > Account
Settings.
2. Enter the Software Serial ID of the first site in the Software serial
ID (<site1>) field.
3. Enter the Software Serial ID of the second site in the Software
serial ID (<site2>) field.
4. Click the Apply button.
Configuring a server for transfer
By default, system reports are transferred through RecoverPoint’s
built-in FTPS server. If FTPS is your required method of transfer, no
further configuration is necessary and you can skip to the “System
report operations” section.
Note: If you wish to transfer system reports using FTPS, ensure that ports 990
and 989 are open and available for FTPS traffic. If you wish to transfer system
reports using SMTP or ESRS, ensure that port 25 is open and available for
SMTP traffic.
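As a quick check from a host on the management network, you can verify that the relevant ports are reachable before choosing a transfer method (the server names are illustrative placeholders):

   # FTPS control and data ports
   nc -vz ftps.example.com 990
   nc -vz ftps.example.com 989
   # SMTP port on your SMTP or ESRS server
   nc -vz smtp.example.com 25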
To define ESRS or SMTP as your method of transfer, click System in
your main RecoverPoint menu bar and select System Settings >
System Report Settings.
◆ To transfer system reports through an SMTP server:
a. In the Transfer Method section, select the SMTP radio button.
b. In the SMTP Server Address field, specify the IP address or
DNS name of your dedicated SMTP server, in IPv4 or IPv6
format.
c. Click the Apply button.
◆ To transfer system reports through an ESRS gateway:
a. In the Transfer Method section, select the ESRS radio button.
b. In the ESRS Gateway IP Address field, specify the IP address
of your ESRS gateway in IPv4 format.
c. Click the Apply button.
System report operations
Most of the system report operations are performed through the
System Report Settings screen of the RecoverPoint Management
Application GUI.
To access the system report settings, from the main RecoverPoint
menubar, click the System menu, and select System Settings >
System Report Settings.
The following sections describe the handling of system reports:
◆ "Operations in the Management Application"
◆ "Additional CLI options"
Operations in the Management Application
The following operations can be performed through the Management
Application interface.
Enabling system reports
To enable the automatic sending of weekly system reports:
1. Check the System Reports checkbox.
2. Click the Apply button.
Disabling system reports
To disable the automatic sending of weekly system reports:
1. Uncheck the System Reports checkbox.
2. Click the Apply button.
Enabling system alerts
To enable the sending of system alerts within the system reports:
1. Check the System Alerts checkbox.
2. Click the Apply button.
Disabling system alerts
To disable the sending of system alerts within the system reports:
1. Uncheck the System Alerts checkbox.
2. Click the Apply button.
Encrypting system reports
To encrypt the output of the system reports and alerts with RSA
encryption using a 256-bit key before sending:
1. Check the Encrypt checkbox.
2. Click the Apply button.
Compressing system reports
To compress the output of the system reports and alerts before
sending:
1. Check the Compress checkbox.
2. Click the Apply button.
Note: Additional system report operations are available through the
RecoverPoint CLI, see “Additional CLI options” on page 265.
Additional CLI options
The following system report operations can only be performed
through the CLI. See the EMC RecoverPoint CLI Reference Guide for
more information.
Viewing system reports
To view the current system report information, from the
RecoverPoint Command Line Interface, run the get_system_report
CLI command.
Sending system reports to a specified e-mail address
To send the current system report information to a specified e-mail
address, from the RecoverPoint Command Line Interface, run the
get_system_report CLI command.
Displaying the current system report settings
To display the current system report settings, from the RecoverPoint
Command Line Interface, run the get_system_report_settings CLI
command.
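A minimal sketch of such a CLI session, assuming SSH access to the site management IP with an administrative CLI user (the user name and IP address are illustrative):

   ssh admin@10.10.10.100
   # Display the current system report settings
   get_system_report_settings
   # View the current system report information
   get_system_report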
Best practice
The following configuration is the best practice recommended by EMC:
System Reports = enabled
System Alerts = enabled
Encrypt = enabled
Compress = enabled
System alerts
RecoverPoint appliances send system events (whose scope is normal)
to the EMC System Reports database, in real-time, through the
System Alerts mechanism, provided the SMTP settings are
configured and the Software Serial IDs, supplied with each
RecoverPoint license, are entered into the Management Application
or CLI. The system alert mechanism will filter these events, to decide
whether a service request should be opened with EMC Customer
Service. If a service request is required, the system alerts mechanism
will automatically open one.
System alerts are used by EMC to provide pre-emptive support for
RecoverPoint issues, and are deployed according to predefined alert
rules. One example of a possible alert rule is: if an RPA is down for more than an hour, send an alert.
This mechanism is enabled by default, but can be disabled at any
time through the Management Application GUI or the CLI.
The following sections describe the handling of system alerts:
◆ "Before you begin"
◆ "System alert operations"

Before you begin

System alerts are sent through a designated SMTP server. Before you begin, enable e-mail notifications and define the SMTP server through which alerts will be sent (see "E-mail notification" on page 253).
Two Software Serial IDs, one for each site in the RecoverPoint
installation, are supplied with each RecoverPoint license. After the
product’s installation, you must enter these IDs into the Management
Application GUI (or CLI) for the system alert mechanism to work.
To enter your Software Serial IDs into the Management Application:
1. From the System menu, select System Settings > Account
Settings.
2. Enter the Software Serial ID of the first site in the Software serial
ID (<site1>) field.
3. Enter the Software Serial ID of the second site in the Software
serial ID (<site2>) field.
4. Click the Apply button.
System alert operations
All system alert operations can be accessed via the Alert Settings
dialog box of the RecoverPoint Management Application GUI. To
access this dialog box, from the System menu, select System Settings
> Alert Settings.
To view system alerts:
1. Select Logs in the Navigation Pane.
2. Identify the warning and error events whose scope is Normal.
Collecting system information
The past thirty days of system information can be collected from
RPAs and splitters. This information is used to analyze and resolve
support cases, and can be collected through:
◆ the Deployment Manager, see EMC RecoverPoint Deployment Manager Product Guide.
◆ the Management Application GUI, see "How to collect system information" on page 269.
Upon completion of the collection process, an output file is placed
in:
http://[RPA IP address]/info
or
https://[RPA IP address]/info
To retrieve the output file, you must log in as a user with
webdownload permissions. See “Access control” on page 119 for
more information on users with webdownload permissions.
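For example, from a host with network access to the RPA, the output file can be retrieved with a standard HTTP client (illustrative; the RPA address and file name are placeholders, and the credentials must belong to a user with webdownload permissions):

   # List the contents of the info directory, then download the output file
   curl -k -u webdownload_user https://10.10.10.10/info/
   curl -k -u webdownload_user -O https://10.10.10.10/info/<output file>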
Note: The system information collected from RPAs also includes Dell
OMSA hardware configuration information, provided the RPAs are
running on Dell PowerEdge platforms.
The following sections describe the process of system information
collection:
◆ "Process alternatives"
◆ "Process errors"
◆ "Splitter credentials"
◆ "How to collect system information"

Process alternatives
If for any reason (connectivity issues, etc.) the collection process fails
to collect the specified host information, you can collect the
information directly from individual hosts on which the feature is
enabled.
To do so:
◆ For Windows-based hosts, from the Program Files\KDriver\hic directory on the host, run:
host_info_collector
◆ For Solaris- or AIX-based hosts, from the kdriver/info_collector directory on the host, run:
info_collect.sh
Process errors
Errors will occur in the following cases:
◆ If connection with an RPA is lost while info collection is in process, no information is collected. In this case, run the process again. If the collection from the remote site failed because of a WAN failure, run the process locally at the remote site.
◆ If a simultaneous info collection process is being performed on the same RPA, only the collector that established the first connection can succeed.
◆ If an FTP failure occurs, the entire process fails.

Splitter credentials
In order to collect system information for SANTap and CLARiiON
splitters, you must first enter splitter credentials. You can enter
splitter credentials as part of the following procedures, through the
specified interfaces:
◆ When collecting system information through the Collect System Information Wizard, see "How to collect system information" on page 269.
◆ When adding new splitters, through the Add Splitter Wizard, see "Adding splitters" on page 128.
◆ When managing existing splitters, through the Splitter Properties dialog box, see "Manually attaching volumes to splitters" on page 173.
How to collect system information

In the RecoverPoint GUI, system information is collected using the
Collect System Information Wizard.
To collect system information:
1. Select System > Collect System Information from the main
system menu.
The Collect System Information Wizard is displayed.
2. Configure the collection process by entering the desired values
into the fields of the first Collect System Information dialog box
screen. Refer to the following table.
Table 49   Collect system information settings

Include information from: Default=yesterday, the current hour.
Specify the date and time (in GMT or your local time) of the first system information that you want to include in the output. Although the system information of the past thirty days is available for collection, only three days of system information can be collected at a time.
Note: GMT is not adjusted for daylight savings time.

To: Default=today, the current hour.
Specify the date and time (in GMT or your local time) of the latest system information that you want to include in the output. Although the system information of the past thirty days is available for collection, only three days of system information can be collected at a time.
Note: GMT is not adjusted for daylight savings time.

Include system components: The system components whose system information to include in the output. Possible options are RPAs only, Splitters only, or Splitters and RPAs.

Include core files: Optional. Default=disabled.
Whether or not to include core files in the output.
Note: Core files may be large. Consequently, including these files in the collection process may substantially increase collection time.

Include both sites: Optional. Default=disabled.
Whether to include the system information of components from both sites in the output. When disabled, only the system information of the site from which the collection process is triggered is collected.

Copy output file to an FTP server: Optional. Default=disabled.
Whether to copy the collection process output file to an FTP server. When enabled, RecoverPoint will create a copy of the collection process output file, and upload it to the specified FTP server.

FTP server address: Optional. The IP or DNS address of the FTP server to which to upload the collection process output file. For example: 10.10.180.10 or ftp.EMC.com

Port: Optional. The port through which to access the specified FTP server.

Username: Optional. The username to use when logging into the specified FTP server.

Password: Optional. The password to use when logging into the specified FTP server.

Remote path: Optional. The path to the copy of the output file stored on the specified FTP server. For example: / (to access the rootdir)

Override default file name (Not recommended): Optional. Default=disabled.
Whether to override the default file name of the output file placed on the FTP server. When enabled, RecoverPoint renames the output file uploaded to the FTP server according to the new file name specified in the Filename field.
Note: It is recommended to keep this setting disabled, and the default file name as is.

New file name: Optional. Only relevant when the Specify filename setting is enabled.
The new file name for the output file placed on the FTP server.
Note: It is recommended to keep the Specify filename setting disabled, and the default file name as is.
3. Click the Next button, to continue on to the next screen of the
Collect System Information Wizard.
4. If you set the Include system components field to RPAs only, skip
this step. Otherwise, the Splitter Selection screen of the Collect
System Information Wizard is displayed.
In the Splitter Selection screen:
a. Select the splitters whose system information you want to
include in the collection process and click the Next button.
A screen is displayed for each splitter for which login
credentials must be defined.
b. If you have already configured login credentials for all of the
SANTap and CLARiiON splitters that you selected in the last
step (see “Splitter credentials” on page 269), click Next until
the Summary screen is displayed.
Otherwise, enter credentials for each selected splitter, and click the Next button. When credentials have been defined for each selected SANTap and CLARiiON splitter, the Summary screen is displayed.
5. In the Summary screen:
a. Review the displayed information to verify that your configuration settings are correct.
To do so:
– Verify that a green checkmark exists in the Status column
of all listed splitters and RPAs, and that the text Action
succeeded exists in the Details column of all listed splitters
and RPAs.
– If not all splitters and RPAs have green checkmarks in their Status columns, see "Process alternatives" on page 268.
Note: If required, click the Back button to edit your settings.
b. Click the Next button to initiate the collection process.
Note: You can click the Cancel button at any time during the
collection process to immediately stop the process.
The System Information Results screen of the Collect System
Information Wizard is displayed.
6. In the System Information Results screen:
a. Verify that the specified system information has been
successfully collected.
To do so:
– Verify that a green checkmark exists in the Status column
of all listed splitters and RPAs, and that the text Action
succeeded exists in the Details column of all listed splitters
and RPAs.
– If not all splitters and RPAs have green checkmarks in their Status columns, see "Process alternatives" on page 268.
b. Retrieve the locally stored output file.
To do so:
Click the Output file (HTTP) link, or Output file (HTTPS)
link, and enter the username and password of a user with
webdownload privileges. See “Access control” on page 119 for
more information on users with webdownload permissions.
7. Click the Finish button to exit the wizard.
8. If you enabled the Copy output file to an FTP server option in
Step 2, you can now retrieve the remote copy of the output file
from the FTP server. To do so:
a. Open a Web browser window.
b. Enter the FTP server address or DNS name of the FTP server
you specified in Step 2 into the address bar.
c. At the login prompt, enter the Username and Password you
specified in Step 2.
d. Browse to the Remote path that you specified in Step 2.
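Alternatively, a command-line client can retrieve the remote copy directly over FTP (illustrative; use the server address, credentials, and remote path that you entered in Step 2):

   # List the remote path, then download the output file
   curl -u <username> ftp://ftp.EMC.com/<remote path>/
   curl -u <username> -O ftp://ftp.EMC.com/<remote path>/<output file>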
7
Host Cluster Support
The RecoverPoint system supports both local and remote
high-availability host clusters. The topics in this section are:
◆ Configuring RecoverPoint cluster support .................................. 276
Note: When a RecoverPoint installation includes reservation-aware host
clusters and host splitters, you must install a KDriver on any host that can
access any replication volume on the storage, or on any host that could access
a replication volume in the absence of the cluster configuration. You must
then add the splitter to the RecoverPoint configuration, and attach that
splitter to all volumes the splitter could access.
Configuring RecoverPoint cluster support
To configure RecoverPoint for applications running on a host cluster,
follow the instructions for configuring replication, in “Starting
Replication” on page 127. The following changes are required to the
standard procedures:
◆ Place all volumes that are resources of the host cluster in a single consistency group. Simplify management by assigning the name of the host cluster as the name of the consistency group.
◆ When configuring the consistency group, in the Policy Tab, General Settings, enable Reservations Support. In Advanced Settings, verify that Global cluster mode = None.
◆ The journal volumes must not be resources of the host cluster.
◆ In Advanced Policies, Reservations Policy = SCSI-2 may be required. Refer to the instructions for Reservations Policy in "Configuring copy policies" on page 152.
◆ Assign each replication set the same name as its disk resource in the cluster.
Note: If all cluster nodes (hosts) at the side are down, you may not be able to create replication volumes. To correct this problem, either bring up the nodes, or run a rescan_san command (from the CLI), with volumes=FULL.
◆ Best practice: Perform first-time failover as soon as configuration is completed. First-time failover instructions are different for different host clusters. Refer to the Technical Note for your host cluster to perform first-time failover.
Prior to enabling the consistency group, the System Pane may show
an Error on the source-side storage and display the message “Volume
cannot be accessed by any RPA.” The error status may also be
displayed in the Status Tab and Replication Sets Tab display for the
consistency group. The errors are removed upon enabling the
consistency group.
A
Events
This section presents a comprehensive list of the events that may
occur during RecoverPoint operation, as reported in the logs.
◆ Introduction ...................................................................................... 278
◆ Normal events .................................................................................. 279
◆ Detailed events ................................................................................. 301
Introduction
RecoverPoint generates an event log in response to events in the
RecoverPoint system. The events in the event log may be viewed
(“Event log management” on page 210). In addition, RecoverPoint
offers several options (e-mail, SNMP, and syslog) for sending event
notifications (“Notification of Events” on page 251).
Table 50 on page 279 and Table 51 on page 301 list system events and
their description. The events are divided into two tables, according to
scope: normal and detailed.
Normal events
The normal events include both “root-cause” events (a single
description for an event that can spawn many individual events) and
other selected basic events.
Table 50   Listing of normal events and their descriptions
(Each entry lists, in order: Event ID, Topic, Level, Description, Trigger)
1000
Management
Info
User logged in. (User <user>)
User login
1001
Management
Warning
Login failed. (User <user>)
User failed to login
1003
Management
Warning
Failed to generate SNMP trap. (Trap
contents)
System failed to send SNMP trap
1004
Management
Warning
Failed to send email alert to specified
address. (Address <email address>,
Event summary <summary>)
System failed to mail an email alert
1005
Management
Warning
Failed to update file. (File <file>)
Failure to update local config file
(passwords, ssh keys, syslog
configuration, SNMP
configuration)
1006
Management
Info
Settings changed. (User <user>,
Settings <settings>)
User changed settings
1007
Management
Warning
Settings change failed. (User <user>,
Settings <settings>, Reason <reason>)
Failure to change settings
1008
Management
Info
User action succeeded. (User <user>,
Action <action>)
User performed one of the
following actions:
bookmark_image, clear_markers,
set_markers, undo_logged_writes,
set_num_of_streams
1009
Management
Warning
User action failed. (User <user>, Action
<action>, Reason <reason>)
One of the following actions failed:
bookmark_image, clear_markers,
set_markers, undo_logged_writes,
set_num_of_streams
1011
Management
Error
Grace period expired. You must install
an activation code to activate
your RecoverPoint license.
Grace period expired
1014
Management
Info
User bookmarked an image. (Group
<group>, Snapshot <bookmark>)
User bookmarked image
1015
Management
Warning
RPA to storage multipathing problem.
(RPA <RPA>, Volume <volume>)
Single path only or more paths
between RPA and volume is not
available.
1016
Management
Warning Off
RPA multipathing problem fixed. (RPA
<RPA>, Volume <volume>)
All paths between RPA and
volume are available.
1017
Management
Warning
RPA to host multipathing problem. (RPA
<RPA>, Splitter <splitter>)
One or more paths between RPA
and splitter is not available.
1018
Management
Warning Off
RPA multipathing problem fixed. (RPA
<RPA>, Splitter <splitter>)
All paths between RPA and splitter
are available.
1019
Management
Warning
User action performed successfully.
(Markers cleared. Group <group>,
<copy>)
(Replication set attached as clean.
Group <group>)
User cleared markers or attached
replication set as clean.
1021
RPA
Error
An error has occurred in the firmware of
an HBA. Please collect system
information and send the results to
EMC Customer Service as soon as
possible. To collect system information,
log in as boxmgmt and run Collect
System Info from the Diagnostics menu.
An internal error occurred in the
HBA firmware.
3001
RPA
Warning
RPA is no longer a cluster member.
(RPA <RPA>)
An RPA is known to be
disconnected from site control
3005
RPA
Error
Settings conflict between sites.
(Reason <reason>)
Settings conflict between the sites
was discovered
3006
RPA
Error Off
Settings conflict between sites resolved
by user. (Using Site <site> settings)
Settings conflict between the sites
was resolved by user
3020
RPA
Warning Off
The link to RPA at the other site has
been restored.
3021
RPA
Warning
Error occurred in link to RPA at the
other site.
3030
RPA
Warning
RPA switched path to storage. (RPA
<RPA>, Volume <volume>)
A storage path change was
initiated by the RPA
4056
Group
Warning
No image found in journal to match
query. (Group <group>)
No image was found in the journal
to match the query.
4090
Group
Warning
Image access log is 90% full. When log
is full, writing by hosts at target side will
be disabled. (Group <group>)
Image access log is 90% full
4106
Group
Warning
Capacity reached -- cannot write
additional markers for this group to
<production volume>. Starting full
sweep. (Group <group>)
Disk space for markers filled (for
the group)
4100
Group
Warning
Group created. Creating a new group
modifies the load distribution across
RPAs. To balance the write load across
all RPAs, run the balance_load
command in seven days, and apply the
recommendation.
New group has been created.
4117
Group
Warning
Virtual access buffer is 90% full. When
buffer is full, writing by hosts at target
side will be disabled. (Group <group>)
Usage of virtual access buffer has
reached 90%.
4131
Group
Warning
Transfer paused or synchronizing for
unusually long time
For given copy, time that transfer
has been paused or synchronizing
exceeds pre-set value.
4132
Group
Warning Off
Transfer has resumed (following long
pause or synchronization)
For copy on which transfer had
been paused or synchronizing for
unusually long time, transfer has
now restarted.
4133
Group
Error
Copy regulation has started.
User requests copy regulation.
4134
Group
Error Off
Copy regulation has ended due to a
user action or internal timeout.
User requests to end copy
regulation.
4135
Group
Info
Data transfer to copy paused by user.
User pauses transfer to copy.
4136
Group
Info
Data transfer to copy resumed by user.
User resumes transfer to copy.
4137
Group
Info
Snapshot consolidation has been
successful.
System successfully consolidates
snapshots, according to user’s
request.
4138
Group
Warning
Snapshot consolidation has failed.
System does not successfully
consolidate snapshots, according
to user’s request.
4174
Group
Info
Migration of configuration data has
started.
4175
RPA
Info
System has entered RPA addition
maintenance mode.
System begins RPA addition
maintenance mode.
4176
RPA
Info
System has entered major version
upgrade maintenance mode.
System begins major version
upgrade maintenance mode.
4177
RPA
Info
System has entered minor version
upgrade maintenance mode.
System begins major version
upgrade maintenance mode.
4178
RPA
Info
System has entered RPA replacement
maintenance mode.
System enters RPA replacement
maintenance mode.
4179
RPA
Info
System has exited RPA addition
maintenance mode.
System exits RPA addition
maintenance mode.
4180
RPA
Info
System has exited major version
upgrade maintenance mode.
System exits major version
upgrade maintenance mode.
4181
RPA
Info
System has exited minor version
upgrade maintenance mode.
System exits minor version
upgrade maintenance mode.
4182
RPA
Info
System has exited RPA replacement
maintenance mode.
System exits RPA replacement
maintenance mode.
4210
System
Warning
A virtual machine is no longer being
replicated.
The LUNs of the virtual machine
are no longer configured for
replication.
4211
System
Warning
A virtual machine is now partially
replicated.
Some LUNs of the virtual machine
are no longer configured for
replication.
4212
System
Info
A virtual machine is now fully
replicated.
All LUNs of the virtual machine are
now configured for replication.
4213
System
Error
vCenter Server is not accessible.
vCenter Server credentials may be
incorrect or there may be a
problem with the physical
connection between the vCenter
Server and RecoverPoint.
4300
Site
Warning
Writing rate to production journal is
slow
For 90% of last 10 minutes, the
production journal performed
poorly, which is defined as a delta
marker (or, in SANTap,
pwlhandler) flush > 0.5 second.
4301
RPA
Warning
Box is unable to handle incoming data
rate, due to high compression level
WAN compression > 70% of total
CPU capacity averaged over the
last 10 minutes
4302
Group
Warning
Journal is unable to handle incoming
data rate
Accumulator
distributorPhase1TotalTime > 0.65
4303
Group
Warning
Journal and replication volumes at copy are unable to handle incoming data rate
In last 10 minutes of distribution, phase2 time > 80%, and fast forward > 10%
4304
Group
Warning
Remote storage is unable to handle
incoming data rate; regulating
distribution
Accumulator
distributorReceiverRegulation >
0.15
4305
Group
Warning
Remote site is unable to handle
incoming data rate
Incoming data rate exceeds
distribution rate, and no problem
detected with journal performance.
4306
Site
Warning
Data transfer rate between sites is slow
One of the following accumulators
exceeds a threshold:
• vacancyObserverAccumulator
• transmitterReceiverWaitingTime
• transmitterReceiverCreditWaiting
Time
• transmitterMPIWaitTime
4307
Site
Warning
Reading rate from local replication
volumes during resynchronization is
slow
Resync during last 10 minutes,
and reading rate of the local
replication volumes at copy is <
10 Mbytes/sec
4308
Site
Warning
Reading rate from replication volumes at copy during resynchronization is slow
Resync during last 10 minutes, and reading rate of the replication volumes at copy is < 10 Mbytes/sec
4309
RPA
Warning
Box utilization reached 80%
Utilization exceeds 80%
4310
Group
Warning
Link utilization reached 80%
Utilization exceeds 80%
4311
Group
Info
Load balancing recommendation
User ran the balance_load CLI
command.
4400
Site
Warning Off
Writing rate to production journal is
normal
Thirty minutes after occurrence of
Event 4300, the production journal
is successfully handling incoming
writes.
4401
RPA
Warning Off
Box handling incoming data rate
successfully
Thirty minutes after occurrence of
Event 4301, RPA is successfully
handling incoming data.
4402
Group
Warning Off
Journal handling incoming data rate
successfully
Thirty minutes after occurrence of
Event 4302, journal is successfully
handling incoming data.
4403
Group
Warning Off
Journal and replication volumes at copy
handling incoming data rate
successfully
Thirty minutes after occurrence of
Event 4303, journal and replication
volumes are successfully handling
incoming data.
4404
Group
Warning Off
Remote storage no longer regulating
distribution
Thirty minutes after occurrence of
Event 4304, remote storage is
successfully handling incoming
data.
4405
Group
Warning Off
Remote site handling incoming data
rate successfully
Thirty minutes after occurrence of
Event 4305, remote site is
successfully handling incoming
data.
4406
Site
Warning Off
Data transfer rate between sites is
normal
Thirty minutes after occurrence of
Event 4306, data transfer between
sites is no longer slow.
4407
Site
Warning Off
Reading rate from local replication
volumes during synchronization is
normal
Thirty minutes after occurrence of
Event 4307, reading rate from
local replication volumes during
synchronization is no longer slow.
4408
Site
Warning Off
Reading rate from replication volumes
at copy during synchronization is
normal
Thirty minutes after occurrence of
Event 4308, reading rate from
replication volumes at copy during
synchronization is no longer slow.
4409
RPA
Warning Off
Box utilization is normal
Thirty minutes after occurrence of
Event 4310, and current RPA
utilization is < 80%.
4410
Group
Warning Off
Link utilization is normal
Thirty minutes after occurrence of
Event 4310, and current link
utilization is < 80%.
5008
Splitter
Warning
Host shut down. (Host Splitter
<splitter>)
Host shutdown/restarted
5010
Splitter
Warning
Splitter stopped -- depending on policy,
writing by host may be disabled for
some groups, and a full sweep may be
required for other groups. (Splitter
<splitter>)
Splitter stopped by user without
detaching volumes; policy
implemented per volume
5011
Splitter
Info
Splitter stopped (Splitter <splitter>); full
sweep will be required.
Splitter stopped by user after
removing volumes; volumes
disconnected
5012
Splitter
Warning
Splitter stopped (Splitter <splitter>);
writes to replication volumes disabled.
Splitter stopped; host access to all
volumes disabled
5017
Splitter
Error Off
Splitter version is supported.
Splitter version is supported.
5018
Splitter
Error
Splitter version is not supported.
Splitter version is not supported.
5052
Group
Info
RecoverPoint was successfully
configured to replicate synchronously to
one of the replicas of this group, but
since then the splitter at the replica site
has been replaced. The new splitter
does not support synchronous
replication, and consequentially, all
replication has been stopped. Replace
your current splitter with one that
supports synchronous replication. If
problem persists, contact EMC
Customer Service.
Splitter was downgraded to a
version that does not support
synchronous replication.
5053
Group
Error
RecoverPoint is configured to replicate
synchronously to a replica of this group,
and the splitter at the replica site now
supports synchronous replication.
Splitter was upgraded to a version
that does support synchronous
replication.
5054
Group
Error
Consistency group is configured with a
LUN greater than 2 TB and a CLARiiON
splitter version that does not support
the LUN.
Splitter was downgraded to a
version that does not support
LUNs greater than 2 TB.
5055
Group
Error Off
Consistency group was configured with
a LUN greater than 2 TB and a
CLARiiON splitter version that did not
support the LUN. Now the splitter
version supports the LUN.
Splitter was upgraded to a version
that supports LUNs greater than 2
TB.
10000
-
Info
Changes occurring in system.
Analyzing...
-
10001
-
Info
System changes have occurred.
System is now stable.
-
10002
-
Info
System activity has not stabilized -- issuing an intermediate report.
-
10101
-
Error
Cause of system activity unclear. To
obtain more information, filter events
log using Detailed scope.
-
10102
-
Info
Site control recorded internal changes
that do not impact system operation.
-
10201
-
Info
Settings have changed.
-
10202
-
Info
System changes have occurred at the
other site.
-
10203
-
Error
RPA cluster was down.
-
10204
-
Error
One or more RPAs are disconnected
from the RPA cluster.
-
10205
-
Brief Error
An error in communication has
occurred in an internal process.
-
10206
-
Info
Internal process was restarted.
-
10207
-
Error
Internal process was restarted.
-
10210
-
Error
Initialization is experiencing high load
conditions.
-
10211
-
Error
Temporary problem in Fibre Channel
link between splitters and RPAs.
-
10212
-
Error Off
Temporary problem in Fibre Channel
link between splitters and RPAs -- resolved.
-
10501
-
Info
Synchronization completed.
-
10502
-
Info
Access to target-side image enabled
-
10503
-
Error
Transferring latest snapshot before
pausing transfer (no data loss).
-
10504
-
Info
Journal cleared.
-
10505
-
Info
Undoing of writes to image access log
completed.
-
10506
-
Info
Roll to physical image complete -- logged access to physical image now
enabled.
-
10507
-
Info
Due to system changes, the journal was
temporarily out of service. Journal is
now available.
-
10508
-
Info
All data flushed from local-side RPA;
automatic failover will proceed.
-
10509
-
Info
Initial long resync completed.
-
10510
-
Info
Following a pause transfer, system now
cleared to restart transfer.
-
10511
-
Info
Finished recovering replication backlog. -
12001
-
Error
Splitter is down.
-
12002
-
Error
Error occurred in all links to the other
site; the other site may be down.
-
12003
-
Error
Error occurred in link to RPA at the
other site.
-
12004
-
Error
Error in data link over WAN -- all RPAs
unable to transfer replicated data to
other site.
-
12005
-
Error
Error in data link over WAN -- RPA
unable to transfer replicated data to
other site.
-
12006
-
Error
RPA is disconnected from the RPA
cluster.
-
12007
-
Error
All RPAs are disconnected from the
RPA cluster.
-
12008
-
Error
RPA is down.
-
12009 | - | Error | Group entered high load. If high load persists, consider running the balance_load command and applying the load balancing recommendation, or manually modifying the preferred RPA of each group according to the recommendation. | -
12010 | - | Error | Journal error -- full sweep to be performed after error is corrected. | -
12011 | - | Error | Image access log or virtual buffer is full -- writing by hosts at target side is disabled. | -
12012 | - | Error | Unable to enable virtual access to image. | -
12013 | - | Error | Unable to enable access to specified image. | -
12014 | - | Error | Fibre Channel link between all RPAs and all splitters and storage is down. | -
12016 | - | Error | Fibre Channel link between all RPAs and all storage is down. | -
12022 | - | Error | Fibre Channel link between RPA and splitters or storage volumes (or both) is down. | -
12023 | - | Error | Fibre Channel link between RPA and all splitters and storage is down. | -
12024 | - | Error | Fibre Channel link between RPA and all splitters is down. | -
12025 | - | Error | Fibre Channel link between RPA and all storage is down. | -
12026 | - | Error | Error occurred in link to RPA at the other site. | -
12027 | - | Error | All replication volumes attached to the consistency group (or groups) are not accessible. | -
12029 | - | Error | Fibre Channel link between all RPAs and one or more volumes is down. | -
12033 | - | Error | Repository volume is not accessible; data may be lost. | -
12034 | - | Error | Writes to storage occurred without corresponding writes to RPA. | -
12035 | - | Error | Error in WAN link to RPA cluster at other site. | -
12036 | - | Error | Renegotiation of transfer protocol requested. | -
12037 | - | Error | All volumes attached to the consistency group (or groups) are not accessible. | -
12038 | - | Error | All journal volumes attached to the consistency group (or groups) are not accessible. | -
12039 | - | Error | Long resync started. | -
12040 | - | Error | System has detected bad sectors in volume. | -
12041 | - | Error | Splitter is up. | -
12042 | - | Error | Splitter write may have failed (while group was transferring data). Synchronization will be required. | -
12043 | - | Error | Splitter writes may have failed. | -
12044 | - | Error | Problem with IP link between RPAs (in at least one direction). | -
12045 | - | Error | Problem with all IP links between RPAs. | -
12046 | - | Error | Problem with IP link between RPAs. | -
12047 | - | Error | RPA network interface card (NIC) problem. | -
12048 | - | Error | Splitter version is not supported. | -
12049 | - | Info | RPA has entered maintenance mode. | -
12050 | - | Warning | RecoverPoint has dynamically started replicating asynchronously to one of the replicas of this group. The group will now be initialized. During initialization, data is not transferred synchronously. | -
12054 | - | Error | RPA to storage connectivity failure. | -
12055 | - | Error | RPA to CLARiiON storage connectivity failure. | -
12056 | - | Error | RPA to CLARiiON storage/splitter connectivity failure. | -
12072 | - | Error | Fibre Channel link between <RPAs> is down. | -
14001 | - | Error Off | Splitter is up, and version is supported. | -
14002 | - | Error Off | All WAN links to other site restored. | -
14003 | - | Error Off | The link to RPA at the other site has been restored. | -
14004 | - | Error Off | Data link over WAN restored -- all RPAs able to transfer replicated data to other site. | -
14005 | - | Error Off | Data link over WAN restored -- RPA able to transfer replicated data to other site. | -
14006 | - | Error Off | Connection of RPA to the RPA cluster is restored. | -
14007 | - | Error Off | Connection of all RPAs to the RPA cluster is restored. | -
14008 | - | Error Off | RPA is up. | -
14009 | - | Error Off | Group exited high load -- initialization completed. If high load persists, consider running the balance_load command and applying the load balancing recommendation, or manually modifying the preferred RPA of each group according to the recommendation. | -
14010 | - | Error Off | Journal error corrected -- full sweep required. | -
14011 | - | Error Off | Image access log or virtual buffer no longer full. | -
14012 | - | Error Off | Virtual access to image enabled. | -
14013 | - | Error Off | No longer trying to access a diluted image. | -
14014 | - | Error Off | Fibre Channel link between all RPAs and all splitters and storage is restored. | -
14016 | - | Error Off | Fibre Channel link between all RPAs and all storage is restored. | -
14022 | - | Error Off | Fibre Channel link that was down between RPA and splitters or storage volumes (or both) is restored. | -
14023 | - | Error Off | Fibre Channel link between RPA and all splitters and storage is restored. | -
14024 | - | Error Off | Fibre Channel link between RPA and all splitters is restored. | -
14025 | - | Error Off | Fibre Channel link between RPA and all storage is restored. | -
14026 | - | Error | The link to RPA at the other site has been restored. | -
14027 | - | Error Off | Access to all volumes attached to the consistency group (or groups) is restored. | -
14029 | - | Error Off | Fibre Channel link between all RPAs and one or more volumes is restored. | -
14033 | - | Error Off | Access to repository volume is restored. | -
14034 | - | Error Off | Replication consistency in writes to storage has been restored. | -
14035 | - | Error Off | WAN link to RPA at other site is restored. | -
14036 | - | Error Off | Renegotiation of transfer protocol completed. | -
14037 | - | Error Off | Access to all replication volumes attached to the consistency group (or groups) has been restored. | -
14038 | - | Error Off | Access to all journal volumes attached to the consistency group (or groups) is restored. | -
14039 | - | Info | Long resync completed. | -
14040 | - | Error Off | System has detected correction of bad sectors in volume. | -
14041 | - | Error Off | System has detected that volume is no longer read only. | -
14042 | - | Error Off | Synchronization in progress to restore any failed writes in group. | -
14043 | - | Error Off | Synchronization in progress to restore any failed writes. | -
14044 | - | Error Off | Problem with IP link between RPAs (in at least one direction) corrected. | -
14045 | - | Error Off | All IP links between RPAs restored. | -
14046 | - | Error Off | IP link between RPAs restored. | -
14047 | - | Error Off | RPA network interface card (NIC) problem corrected. | -
14049 | - | Info | RPA is out of maintenance mode. | -
14050 | - | Info | RecoverPoint has dynamically resumed synchronous replication to one of the replicas of this group. The group will now be initialized. During initialization, data is not transferred synchronously. | -
14054 | - | Error Off | End of RPA to storage connectivity failure. | -
14055 | - | Error Off | End of RPA to CLARiiON storage connectivity failure. | -
14056 | - | Error Off | End of RPA to CLARiiON storage/splitter connectivity failure. | -
14072 | - | Error Off | Fibre Channel link between <RPAs> has been restored. | -
16000 | - | Error | Transient root cause. | -
16001 | - | Error | Splitter was down. Problem has been corrected. | -
16002 | - | Error | Error occurred in all WAN links to other site. Problem has been corrected. | -
16003 | - | Error | Error occurred in link to the RPA cluster at the other site. Problem has been corrected. | -
16004 | - | Error | Error occurred in data link over WAN -- all RPAs were unable to transfer replicated data to other site. Problem has been corrected. | -
16005 | - | Error | Error occurred in data link over WAN -- RPA was unable to transfer replicated data to other site. Problem has been corrected. | -
16006 | - | Error | RPA was disconnected from the RPA cluster. Connection has been restored. | -
16007 | - | Error | All RPAs were disconnected from the RPA cluster. Problem has been corrected. | -
16008 | - | Error | RPA was down. Problem has been corrected. | -
16009 | - | Error | Group entered high load. Problem has been corrected. If high load persists, consider running the balance_load command and applying the load balancing recommendation, or manually modifying the preferred RPA of each group according to the recommendation. | -
16010 | - | Error | Journal error occurred. Problem has been corrected -- full sweep required. | -
16011 | - | Error | Image access log or virtual buffer was full -- writing by hosts at target side was disabled. Problem has been corrected. | -
16012 | - | Error | Was unable to enable virtual access to image. Problem has been corrected. | -
16013 | - | Error | Was unable to enable access to specified image. Problem has been corrected. | -
16014 | - | Error | Fibre Channel link between all RPAs and all splitters and storage was down. Problem has been corrected. | -
16016 | - | Error | Fibre Channel link between all RPAs and all storage was down. Problem has been corrected. | -
16022 | - | Error | Fibre Channel link between RPA and splitters or storage volumes (or both) was down. Problem has been corrected. | -
16023 | - | Error | Fibre Channel link between RPA and all splitters and storage was down. Problem has been corrected. | -
16024 | - | Error | Fibre Channel link between RPA and all splitters was down. Problem has been corrected. | -
16025 | - | Error | Fibre Channel link between RPA and all storage was down. Problem has been corrected. | -
16026 | - | Error | Error occurred in link to the RPA cluster at the other site. Problem has been corrected. | -
16027 | - | Error | All volumes attached to the consistency group (or groups) were not accessible. Problem has been corrected. | -
16029 | - | Error | Fibre Channel link between all RPAs and one or more volumes was down. Problem has been corrected. | -
16033 | - | Error | Repository volume was not accessible. Problem has been corrected. | -
16034 | - | Error Off | Writes to storage occurred without corresponding writes to RPA. Problem has been corrected. | -
16035 | - | Error | Error occurred in link to the RPA cluster at the other site. Problem has been corrected. | -
16036 | - | Error | Renegotiation of transfer protocol was requested, and has already been completed. | -
16037 | - | Error | All replication volumes attached to the consistency group (or groups) were not accessible. Problem has been corrected. | -
16038 | - | Error | All journal volumes attached to the consistency group (or groups) were not accessible. Problem has been corrected. | -
16039 | - | Info | System ran long resync. | -
16040 | - | Error | System had detected bad sectors in volume. Problem has been corrected. | -
16041 | - | Error | System had detected that volume is read only. Problem has been corrected. | -
16042 | - | Error | Splitter write may have failed (while group was transferring data). Problem has been corrected. | -
16043 | - | Error | Splitter writes may have failed. | -
16044 | - | Error | There was a problem with an IP link between RPAs (in at least one direction). Problem has been corrected. | -
16045 | - | Error | There was a problem with all IP links between RPAs. Problem has been corrected. | -
16046 | - | Error | There was a problem with an IP link between RPAs. Problem has been corrected. | -
16047 | - | Error | There was an RPA network interface card (NIC) problem. Problem has been corrected. | -
16048 | - | Brief Error | Splitter version was not supported. Problem has been corrected. | -
16049 | - | Info | RPA temporarily entered maintenance mode, but has since exited. | -
16050 | - | Warning | RecoverPoint had dynamically started replicating asynchronously to one of the replicas of this group, but has since resumed synchronous replication. Consequently, the group has been initialized twice. During initialization, data was not transferred synchronously. If this is not the expected behavior, contact EMC Customer Service. | -
16054 | - | Brief Error | End of brief RPA to storage connectivity failure. | -
16055 | - | Brief Error | End of brief RPA to CLARiiON storage connectivity failure. | -
16056 | - | Brief Error | End of brief RPA to CLARiiON storage/splitter connectivity failure. | -
16072 | - | Error | Fibre Channel link between <RPAs> was down, but the problem has been corrected, and the link is back up again. | -
18001 | - | Error | Splitter was up and supported, but it is now down or not supported. | -
18002 | - | Error | All links to other site were temporarily restored, but problem has returned. | -
18003 | - | Error | Link to the RPA cluster at the other site was temporarily restored, but the problem has returned. | -
18004 | - | Error Off | Data link was temporarily restored, but problem has returned -- all RPAs are unable to transfer replicated data to other site. | -
18005 | - | Error Off | Data link was temporarily restored, but problem has returned -- RPA is currently unable to transfer replicated data to other site. | -
18006 | - | Error Off | Connection of RPA to the RPA cluster was temporarily restored, but problem has returned. | -
18007 | - | Error Off | All RPAs were temporarily restored to the RPA cluster, but problem has returned. | -
18008 | - | Error Off | RPA was temporarily up, but problem has returned -- RPA is down. | -
18009 | - | Error Off | Group temporarily exited high load, but problem has returned. | -
18010 | - | Error Off | Journal error was temporarily corrected, but problem has returned. | -
18011 | - | Error Off | Image access log or virtual buffer was temporarily no longer full, and writing by hosts at target side was re-enabled -- but problem has returned. | -
18012 | - | Error Off | Virtual access to image was temporarily enabled, but problem has returned. | -
18013 | - | Error Off | Access to image was temporarily enabled, but problem has returned. | -
18014 | - | Error Off | Fibre Channel link between all RPAs and all splitters and storage was temporarily restored, but problem has returned. | -
18016 | - | Error Off | Fibre Channel link between all splitters and all storage was temporarily restored, but problem has returned. | -
18022 | - | Error Off | Fibre Channel link that was down between RPA and splitters or storage volumes (or both) was temporarily restored, but problem has returned. | -
18023 | - | Error Off | Fibre Channel link between RPA and all storage was temporarily restored, but problem has returned. | -
18024 | - | Error Off | Fibre Channel link between RPA and all splitters was temporarily restored, but problem has returned. | -
18025 | - | Error Off | Fibre Channel link between RPA and all storage was temporarily restored, but problem has returned. | -
18026 | - | Error | Link to the RPA cluster at the other site was temporarily restored, but the problem has returned. | -
18027 | - | Error Off | Access to all journal volumes attached to the consistency group (or groups) was temporarily restored, but problem has returned. | -
18029 | - | Error Off | Fibre Channel link between all RPAs and one or more volumes was temporarily restored, but problem has returned. | -
18033 | - | Error Off | Access to repository volume was temporarily restored, but problem has returned. | -
18034 | - | Error Off | Replication consistency in writes to storage and writes to RPAs was temporarily restored, but problem has returned. | -
18035 | - | Error Off | Link to the RPA cluster at the other site was temporarily restored, but the problem has returned. | -
18036 | - | Error Off | Negotiation of transfer protocol was completed, but is now requested again. | -
18037 | - | Error Off | Access to all volumes attached to the consistency group (or groups) was temporarily restored, but problem has returned. | -
18038 | - | Error Off | Access to all replication volumes attached to the consistency group (or groups) had been temporarily restored, but problem has returned. | -
18039 | - | Info | Long resync was completed, but has now restarted. | -
18040 | - | Error Off | User marked volume as OK, but bad sectors problem persists. | -
18041 | - | Error Off | User marked volume as OK, but read-only problem persists. | -
18042 | - | Error Off | Synchronization had restored any failed writes in group, but problem has returned. | -
18043 | - | Error Off | Internal problem. | -
18044 | - | Error Off | Problem with IP link between RPAs (in at least one direction) was corrected, but problem has returned. | -
18045 | - | Error Off | Problem with all IP links between RPAs (in at least one direction) was corrected, but problem has returned. | -
18046 | - | Error Off | Problem with IP link between RPAs was corrected, but problem has returned. | -
18047 | - | Error Off | RPA network interface card (NIC) problem was corrected, but problem has returned. | -
18049 | - | Info | RPA temporarily exited maintenance mode, but has since re-entered. | -
18050 | - | Info | RecoverPoint had dynamically resumed synchronous replication to one of the replicas of this group, but has since started replicating asynchronously again. Consequently, the group has been initialized twice. During initialization, data was not transferred synchronously. If this is not the expected behavior, contact EMC Customer Service. | -
18054 | - | Error | Recurring RPA to storage connectivity failure. | -
18055 | - | Error | Recurring RPA to CLARiiON storage connectivity failure. | -
18056 | - | Error | Recurring RPA to CLARiiON storage/splitter connectivity failure. | -
18072 | - | Error | Fibre Channel link between <RPAs> was temporarily restored, but the problem has returned, and the link is back down again. | -
Detailed events
Detailed events contain more information than normal-scope events.
Table 51  Listing of detailed events and their descriptions

Event ID | Topic | Level | Description | Trigger
1002 | Management | Info | User logged out. (User <user>) | User logged off from the system
1010 | Management | Warning | Grace period expires in 1 day. You must install an activation code to activate your RecoverPoint license. | 1 day prior to grace period expiration
1012 | Management | Warning | License expires in 1 day. You must obtain a new RecoverPoint license. | 1 day prior to RecoverPoint license expiration
1013 | Management | Error | License expired. You must obtain a new RecoverPoint license. | RecoverPoint license expired
2000 | Site | Info | Site management running on <RPA>. | Site control open; RPA has become cluster leader
3000 | RPA | Info | RPA has become a cluster member. (RPA <RPA>) | RPA connected to site control
3002 | RPA | Warning | Site management switched over to this RPA. (RPA <RPA>, Reason <reason>) | Leadership is transferred from RPA to RPA
3007 | RPA | Warning Off | RPA is up. (RPA <RPA>) | RPA that was previously down came up
3008 | RPA | Warning | RPA appears to be down. (RPA <RPA>) | RPA suspects that other RPA is down
3011 | RPA | Info | RPA access to volumes restored. (RPA <RPA>, Volume <volume>, Volume Type <type>) | Volumes that were inaccessible became accessible
3012 | RPA | Warning | RPA unable to access volumes. (RPA <RPA>, Volume <volume>, Volume Type <type>) | Volumes ceased to be accessible to the RPA
3013 | RPA | Warning Off | RPA access to <repository volume> restored. (RPA <RPA>, Volume <volume>) | Repository volume that was inaccessible became accessible
3014 | RPA | Warning | RPA unable to access <repository volume>. (RPA <RPA>, Volume <volume>) | Repository volume became inaccessible to a single RPA
3020 | RPA | Warning Off | WAN link to RPA at other site restored. (RPA at other site: <RPA>) | RPA regained the WAN connection with an RPA at the other site
3021 | RPA | Warning | Error in WAN link to RPA at other site. (RPA at other site: <RPA>) | RPA lost the WAN connection with an RPA at the other site
3022 | RPA | Warning Off | LAN link to RPA restored. (RPA <RPA>) | RPA regained the LAN connection with an RPA at the local site
3023 | RPA | Warning | Error in LAN link to RPA. (RPA <RPA>) | RPA lost the LAN connection with an RPA at the local site, without losing connection through the repository volume
3035 | RPA | Info | An internal process restarted, starting control process. | A control process is triggered (for various reasons)
4000 | Group | Info | Group capabilities OK. (Group <group>) | Capabilities are full and previous capabilities are unknown
4001 | Group | Warning | Group capabilities minor problem. (Group <group>) | Capabilities are either: 1) not full temporarily on RPA on which the group is currently running, or 2) not full indefinitely on the RPA on which the group is not running
4003 | Group | Error | Group capabilities problem. (Group <group>) | Capabilities are not full indefinitely on the RPA on which the group is running
4007 | Group | Info | Pausing data transfer. (Group <group>, Reason: <reason>) | Stop transfer by user
4008 | Group | Warning | Pausing data transfer. (Group <group>, Reason: <reason>) | Transfer stopped temporarily by system
4009 | Group | Error | Pausing data transfer. (Group <group>, Reason: <reason>) | Transfer stopped indefinitely by system
4010 | Group | Info | Starting data transfer. (Group <group>) | Start transfer requested by user
4015 | Group | Info | Transferring latest snapshot before pausing transfer (no data loss). (Group <group>) | In total storage disaster -- flushing buffer before stopping replication
4016 | Group | Warning | Transferring latest snapshot before pausing transfer (no data loss). (Group <group>) | In total storage disaster -- flushing buffer before stopping replication
4017 | Group | Error | Transferring latest snapshot before pausing transfer (no data loss). (Group <group>) | In total storage disaster -- flushing buffer before stopping replication
4018 | Group | Warning | Transfer of latest snapshot from source complete (no data loss). (Group <group>) | In total storage disaster -- last snapshot from source site is available at target site
4019 | Group | Warning | Group in high load -- transfer to be paused temporarily. (Group <group>) | Disk manager high load
4020 | Group | Warning Off | Group is no longer in high load. (Group <group>) | End of disk manager high load
4021 | Group | Error | Journal full -- initialization paused. To complete initialization, enlarge journal or allow long resync. (Group <group>) | In initialization -- journal is full and long resync is not allowed
4022 | Group | Error Off | Initialization resumed. (Group <group>) | End of situation, in initialization, where journal is full and long resync is not allowed
4023 | Group | Error | Journal full -- transfer paused. To restart transfer, first disable access to image. (Group <group>) | Access to image is enabled and journal is full
4024 | Group | Error Off | Transfer restarted. (Group <group>) | End of situation where access to image is enabled and journal is full
4025 | Group | Warning | Group in high load -- initialization to be restarted. (Group <group>) | Group in high load; initialization to be restarted
4026 | Group | Warning Off | Group is no longer in high load. (Group <group>) | End of high load for group
4027 | Group | Error | Group in high load -- Journal full, roll to physical image paused, transfer paused. (Group <group>) | No space left to which to write during roll
4028 | Group | Error Off | Group is no longer in high load. (Group <group>) | Added journal capacity or disabled image access
4040 | Group | Error | Journal error -- full sweep to be performed. (Group <group>) | Journal volume error
4041 | Group | Info | Group activated. (Group <group>, RPA <RPA>) | Group replication-ready; i.e., replication could take place if other factors are OK, such as network, RPAs, storage access
4042 | Group | Info | Group deactivated. (Group <group>, RPA <RPA>) | Group deactivated as a result of user action
4043 | Group | Warning | Group deactivated. (Group <group>, RPA <RPA>) | Group temporarily deactivated by system
4044 | Group | Error | Group deactivated. (Group <group>, RPA <RPA>) | Group deactivated indefinitely by system
4051 | Group | Info | Disabling access to image -- resuming distribution. (Group <group>) | Access to image is disabled (i.e., distribution is resumed) by user
4054 | Group | Error | Enabling access to image. (Group <group>) | Access enabled to image indefinitely by system
4057 | Group | Warning | Specified image removed from journal. Try a later image. (Group <group>) | Specified image was removed from the journal (i.e., FIFO)
4062 | Group | Info | Access enabled to latest image. (Group <group>, Failover site <site>) | Access enabled to latest image during automatic failover
4063 | Group | Warning | Access enabled to latest image. (Group <group>, Failover site <site>) | Access enabled to latest image during automatic failover
4064 | Group | Error | Access enabled to latest image. (Group <group>, Failover site <site>) | Access enabled to latest image during automatic failover
4080 | Group | Warning | Current lag exceeds maximum lag. (Group <group>, Lag <lag>, Maximum lag <max lag>) | Group's lag exceeds maximum lag (when not regulating application)
4081 | Group | Warning Off | Current lag within policy. (Group <group>, Lag <lag>, Maximum Lag <max_lag>) | Group's lag drops from above the maximum lag to below 90% of maximum
4082 | Group | Warning | Starting full sweep. (Group <group>) | Group's markers set
4083 | Group | Warning | Starting volume sweep. (Group <group>, Pair <pair>) | Volume's markers set
4084 | Group | Info | Markers cleared. (Group <group>) | Group's markers cleared
4085 | Group | Warning | Unable to clear markers. (Group <group>) | Attempt to clear group's markers failed
4086 | Group | Info | Initialization started. (Group <group>) | Initialization started
4087 | Group | Info | Initialization completed. (Group <group>) | Initialization completed
4091 | Group | Error | Image access log is full -- writing by hosts at target side disabled. (Group <group>, Site <site>) | Image access log is full
4095 | Group | Info | Writing image access log to storage -- writes to log cannot be undone. (Group <group>) | Started marking to retain writes to image access log
4097 | Group | Warning | Maximum journal lag exceeded. Distribution in fast-forward -- older images removed from journal. (Group <group>) | Fast-forward started (causing loss of snapshots from before maximum journal lag was exceeded)
4098 | Group | Warning Off | Maximum journal lag within limit. Distribution normal -- rollback information retained. (Group <group>) | Five minutes passed since fast-forward stopped
4099 | Group | Info | Initializing in long resync mode. (Group <group>) | Started long resync
4107 | Group | Info | Integrity check completed; no inconsistencies found. | Integrity check completed successfully, and no inconsistencies were found
4108 | Group | Info | Integrity check completed, inconsistencies found. | Integrity check completed successfully, but inconsistencies were found
4109 | Group | Error | Integrity check was aborted by system. | The preferred RPA setting is modified or image access is enabled for a copy of the specified consistency group
4110 | Group | Info | Enabling virtual access to image. (Group <group>) | User initiated enabling virtual access to an image
4111 | Group | Info | Virtual access to image enabled. (Group <group>) | Virtual access to an image has been enabled
4112 | Group | Info | Rolling to physical image. (Group <group>) | Rolling to the image (in background) while virtual access to the image is enabled
4113 | Group | Info | Roll to physical image stopped. (Group <group>) | Rolling to the image (i.e., in background, while virtual access to the image is enabled) is stopped
4114 | Group | Info | Roll to physical image complete -- logged access to physical image now enabled. (Group <group>) | System completed roll to physical image
4115 | Group | Error | Unable to enable access to virtual image, due to partition table error. (The partition table on at least one of the volumes in group <group> has been modified since logged access was last enabled to a physical image. To enable access to a virtual image, first enable logged access to a physical image.) | Attempt to pause on a virtual image is unsuccessful (due to a change in the partition table of a volume (or volumes) in the group)
4116 | Group | Error | Virtual access buffer is full -- writing by hosts at target side is disabled. (Group <group>) | Attempt to write to the virtual image is unsuccessful (because virtual access buffer usage is 100%)
4118 | Group | Error | Unable to enable virtual access to an image. (Group <group>) | Attempt to enable virtual access to the image is unsuccessful (due to insufficient memory)
4119 | Group | Error | Initiator issued an out of bounds I/O. Contact Technical Support. (Initiator <initiator wwn>, Group <group>, Volume <volume>) | Configuration problem
4120 | Group | Warning | Journal usage (with logged access enabled) now exceeds this threshold. (Group <group>, <journal usage threshold>) | Journal usage (with logged image access enabled) has passed a specified threshold
4121 | Group | Error | Unable to gain permissions to write to replica. | RPAs unable to write to replication or journal volumes because they do not have proper permissions
4122 | Group | Error Off | Trying to regain permissions to write to replica. | User has indicated that the permissions problem has been corrected
4123 | Group | Error | Unable to access volumes -- bad sectors encountered. | RPAs unable to write to replication or journal volumes due to bad sectors on the storage
4124 | Group | Error Off | Trying to access volumes that previously had bad sectors. | User has indicated that the bad sectors problem has been corrected
4125 | Group | Error | Journal capacity is currently insufficient for the required protection window. | User specified a required protection window, but journal does not support rollback of that size (even though it is full)
4126 | Group | Error Off | Journal capacity is currently sufficient for the required protection window. | User specified a required protection window; journal did not support rollback of that size, but now does
4127 | Group | Warning | Journal capacity is predicted to be insufficient for the required protection window. | User specified a required protection window; system predicts that journal will not support rollback of that size
4128 | Group | Warning Off | Journal capacity is predicted to be sufficient for the required protection window. | User specified a required protection window; system predicted that journal would not support rollback of that size, but now predicts that it will
4129 | Group | Warning | Image access is enabled on group copy for unusually long time. | For given copy, time that image access has been enabled on same image exceeds pre-set value
4130 | Group | Warning Off | Image access on group copy is now disabled (or is enabled on a different image). | For copy, where image access had been enabled on same image for unusually long time, user has now disabled image access (or enabled access to a different image)
5000 | Splitter | Info | Splitters attached to volume. (Splitter <splitter>, Volume <volume>) | User attached a splitter to a volume
5001 | Splitter | Info | Splitters detached from volume. (Splitter <splitter>, Volume <volume>) | User detached a splitter from a volume
5002 | Splitter | Error | RPA unable to access splitter. (Splitter <splitter>, RPA <RPA>) | RPA is unable to access a splitter
5003 | Splitter | Error Off | RPA access to splitter restored. (Splitter <splitter>, RPA <RPA>) | RPA can access a splitter that was previously inaccessible
5004 | OBSOLETE | - | - | -
5005 | OBSOLETE | - | - | -
5006 | OBSOLETE | - | - | -
5007 | OBSOLETE | - | - | -
5013 | Splitter | Error | Splitter is down. (Splitter <splitter>) | Connection to splitter lost with no warning; splitter crashed, or connection is down
5015 | Splitter | Error Off | Splitter is up. (Splitter <splitter>) | Connection to splitter regained after splitter crash
5016 | Splitter | Warning | Splitter has restarted. (Splitter <splitter>) | Boot timestamp of splitter has changed
5030 | Splitter | Error | Splitter write is suspected of possible failure. (Splitter <splitter>, Group <group>) | Splitter write succeeded to RPA, but not necessarily to storage
5031 | Splitter | Warning | Splitter not splitting to replication volumes -- volume sweep(s) will be required. (Host <Host>, Volumes <Volume Names>, Groups <Groups>) | Splitter not splitting to replication volumes
5032 | Splitter | Info | Splitter splitting to replication volumes. (Host <Host>, Volumes <Volume Names>, Groups <Groups>) | Splitter started splitting to replication volumes
5035 | Splitter | Info | Writes to replication volumes disabled. (Splitter <splitter>, Volumes <Volume Names>, Groups <Groups>) | Writes to replication volumes disabled
5036 | Splitter | Warning | Writes to replication volumes disabled. (Host <Host>, Volumes <Volume Names>, Groups <Groups>) | Writes to replication volumes disabled
5037 | Splitter | Error | Writes to replication volumes disabled. (Splitter <splitter>, Volumes <Volume Names>, Groups <Groups>) | Writes to replication volumes disabled
5038 | Splitter | Info | Splitter delaying writes. (Splitter <splitter>, Volumes <volumes>, Groups <groups>) | -
5039 | Splitter | Warning | Splitter delaying writes. (Splitter <splitter>, Volumes <volumes>, Groups <groups>) | -
5040 | Splitter | Error | Splitter delaying writes. (Splitter <splitter>, Volumes <volumes>, Groups <groups>) | -
5041 | Splitter | Info | Splitter not splitting to replication volumes. (Splitter <splitter>, Volumes <volumes>, Groups <groups>) | Splitter not splitting to replication volumes as result of user decision
5042 | Splitter | Warning | Splitter not splitting to replication volumes. (Splitter <splitter>, Volumes <volumes>, Groups <groups>) | Splitter not splitting to replication volumes
5043 | Splitter | Error | Splitter not splitting to replication volumes. (Splitter <splitter>, Volumes <volumes>, Groups <groups>) | Splitter not splitting to replication volumes due to system action
5045 | Splitter | Warning | Simultaneous problems reported in splitter and RPA. Full-sweep resynchronization will be required upon restarting data transfer. | Marking backlog on splitter lost (as result of concurrent double disaster to splitter and RPA)
5046 | Splitter | Warning | Transient error -- re-issuing splitter write. | -
5050 | Splitter | Warning | Failed to collect system information. Verify that the correct login credentials have been defined for this splitter. | -
5051 | Splitter | Warning | No login credentials have been defined for this splitter. Define login credentials to extend the period in which system information is saved, from three days to thirty days. For support purposes, it is recommended that you do so as soon as possible. | -
6000 | Group | Error | An unrecognizable error has occurred. The specified image cannot be accessed. Try accessing a different image. If you cannot access any other images, contact EMC Customer Service. | -
6001 | Group | Error Off | The system has stopped trying to access an inaccessible image of a distributed group. No user action is required. | -
B
Kutils Reference

This section presents information on the syntax and use of each of the commands available as part of the kutils utility.
◆ Introduction
◆ flushFS
◆ manage_auto_host_info_collection
◆ mount
◆ showFS
◆ show_vol_info
◆ show_vols
◆ sqlRestore
◆ sqlSnap
◆ start
◆ stop
◆ umount
Introduction
The kutils utility is installed automatically when you install a
RecoverPoint splitter on a host (see the EMC RecoverPoint Deployment
Manager Product Guide). When using fabric-based splitters, a
standalone version of kutils can be installed separately on hosts.
The following sections describe the use of this utility:
◆ "Usage"
◆ "Path designations"
◆ "Commands"

Usage
A kutils command is always introduced with the kutils key word.
When this key word is entered by itself, the kutils utility returns usage
notes, as follows:
C:\program files\kdriver\kutils> kutils
Usage: kutils <command> <arguments>
Examples of the command usage are provided for each command.
For commands that are available only on hosts running Windows, a
Windows-type system prompt is always shown. For other
commands, the example may use either a Windows or Unix system
prompt.
Note: Mount points are not native to Windows. Although Windows supports reparse points, which allow using Unix-like mount points in the NTFS file system, kutils only supports the use of drive letters.
Path designations
The path to a device can be designated in the following ways:
◆ device path
Example:
"SCSI\DISK&VEN_EMC&PROD_MASTER_RIGHT&REV_0001\5&133EF78A&0&000"
◆ storage path
Example:
"SCSI#DISK&VEN_EMC&PROD_MASTER_RIGHT&REV_0001#5&133EF78A&0&000#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}"
◆ volume path
Example:
"\\?\Volume{33b4a391-26af-11d9-b57b-505054503030}\"
The particular designation used is noted in the description for each
command.
In addition, some commands (e.g., showDevices, showFS) return the
symbolic link for a device. The symbolic link generally provides
additional information about the characteristics of the specific device.
The following are examples of symbolic links:
"\Device\0000005c"
"\Device\EmcPower\Power2"
"\Device\Scsi\ql23001Port2Path0Target0Lun2"
Commands
The following sections present descriptions and examples of each of
the kutils commands:
◆ "flushFS"
◆ "manage_auto_host_info_collection"
◆ "mount"
◆ "showFS"
◆ "show_vol_info"
◆ "show_vols"
◆ "sqlRestore"
◆ "sqlSnap"
◆ "start"
◆ "stop"
◆ "umount"
flushFS
This command initiates an OS-flush of the file system.
Parameters
drive_letter
Drive designation for the file system that is to be flushed.
Usage
This command is available only on hosts running Windows.
Examples
To initiate an OS-flush of the file system on the device designated as drive E:
C:\program files\kdriver\kutils> kutils flushFS E:
Flushing buffers for drive E:... Flushed.
Related commands
None
manage_auto_host_info_collection
This command is used to display the current setting for automatic host info collection (HIC), or to enable or disable the feature.
Parameters
setting
Possible values are ENABLE and DISABLE.
Usage
This command is not included in the kutils standalone version.
When no parameter is appended to the command, it displays the current setting.
Examples
To display the current automatic host info collection setting:
C:\program files\kdriver\kutils> kutils manage_auto_host_info_collection
Automatic host info collection enabled.
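To change the setting, append ENABLE or DISABLE to the command. The following is a sketch based on the parameter values documented above; the exact confirmation message returned by kutils may differ:
C:\program files\kdriver\kutils> kutils manage_auto_host_info_collection DISABLE
C:\program files\kdriver\kutils> kutils manage_auto_host_info_collection ENABLE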
Related commands
None
mount
This command mounts a file system.
Parameters
drive_letter
Drive designation for the file system that is to be mounted.
path_to_device (optional)
Volume path for the device.
Usage
This command is available only on hosts running Windows. For Windows 2003 and later, use mountvol.exe, which comes with the Windows operating system.
You must specify a volume path for the path_to_device parameter whenever the host has no previous record of a device mapping to the designated drive.
If for any reason the mount operation fails, follow this procedure:
1. Create a text file, rescan.txt, that includes the following single line:
rescan
2. Run the command:
diskpart.exe /s rescan.txt
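For example, both steps can be run from a Windows command prompt as follows (a sketch; rescan.txt can be created in any directory to which you have write access):
C:\> echo rescan> rescan.txt
C:\> diskpart.exe /s rescan.txt
After the rescan completes, retry the kutils mount command.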
Examples
To mount a device:
C:\program files\kdriver\kutils> kutils mount E:
Mounting drive E: as
"\\?\Volume{33b4a391-26af-11d9-b57b-505054503030}\"...
Mounted.
or
C:\program files\kdriver\kutils> kutils mount E:
\\?\Volume{33b4a391-26af-11d9-b57b-505054503030}\
Mounting drive E: as
"\\?\Volume{33b4a391-26af-11d9-b57b-505054503030}\"...
Mounted.
Related commands
umount
showFS
This command presents the drive designation, and, as available, the
device path, storage path, and symbolic link, for each mounted physical
device.
Parameters
None
Usage
This command is available only on hosts running Windows.
Examples
To show the mounted devices according to drive letter:
C:\program files\kdriver\kutils> kutils showFS
Obtaining mapping... This could take several minutes.
Drive C:
\\.\PHYSICALDRIVE0:
"IDE\DISKMAXTOR_6E040L0__________________________NAR61H
A0\314530564C454547202020202020202020202020"
Drive E:
\\.\PHYSICALDRIVE5:
"SCSI\DISK&VEN_EMC&PROD_POWER&\0000284500461065"
"SCSI#DISK&VEN_EMC&PROD_POWER&#0000284500461065#{53f563
07-b6bf-11d0-94f2-00a0c91efb8b}"
"\Device\EmcPower\Power2"
Related commands
None
show_vol_info
This command presents information on the specified volume,
including: RecoverPoint name (if created in RecoverPoint), storage
path, size, vendor, and product.
Parameters
volume_name
The RecoverPoint name or storage path of the volume for which you
want to display information.
Usage
Both the volume's RecoverPoint name and its storage path are
legitimate values for the volume_name parameter.
This command is not included in the kutils standalone version.
Examples
To show information about a specific volume:
$:/kdriver/bin/kenv.sh/kdriver/bin/kutils# kutils show_vol_info Vol2
Name      Vol2
Path      \??\SCSI#Disk&Ven_EMC&Prod_Power&#0000284500461069#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}
Size      3450
Vendor    EMC
Product   SYMMETRIX
Related commands
show_vols
show_vols
This command presents information on all volumes (RecoverPoint
and non-RecoverPoint) to which the host has access, including:
RecoverPoint name (if created in RecoverPoint), size, and storage path.
Parameters
None
Usage
In the information returned by the command, a “-” is displayed
under name for any volume that has not been created in the
RecoverPoint context.
This command is not included in the kutils standalone version.
Examples
To display information on all volumes to which the host has access:
C:\program files\kdriver\kutils> kutils show_vols
name    Size       Path
-       39205MB    \??\IDE#DiskMaxtor_6E040L0__________________________NAR61HA0#314530564c454547202020202020202020202020#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}
Vol6    862MB      \??\SCSI#Disk&Ven_EMC&Prod_Power&#0000284500461003#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}
Vol1    3450MB     \??\SCSI#Disk&Ven_EMC&Prod_Power&#0000284500461065#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}
.
.
.
-       7MB        \??\SCSI#Disk&Ven_EMC&Prod_Power&#00002845004611920300010d00#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}
Related commands
show_vol_info
sqlRestore
This command is used to restore a snapshot previously created by the
sqlSnap command.
Additional information essential to the understanding and use of this
command is presented in EMC Deploying with Microsoft SQL Server
Technical Notes.
Parameters
database
Name of the SQL Server database that is to be restored. Multiple databases can be specified, separated by commas.
Note: When there is more than one instance of Microsoft SQL on the same server, you can use the format <instance_name>.<database_name> to specify the database.
metadata_drive
Directory from which the VDI metadata is read when restoring.
Note: This directory must reside on one of the volumes being replicated.
Alternatively, the above parameters can be incorporated into the
following single parameter:
file
Name of a configuration file with the following format:
database=<db1[,db2,…]>
metadata_drive=<drive>
Usage
This command is available only on hosts running Windows.
When restoring from a configuration file, the command reads only
the database and metadata_drive parameters from within the file.
Example
To restore the VDI storage:
C:\program files\kdriver\kutils> kutils sqlRestore
database=db1,db2 metadata_drive=E
Alternatively, the syntax can specify a configuration file, from which
the command reads only the first two parameters:
C:\program files\kdriver\kutils> kutils sqlRestore
file=sqlparams.file
where the structure of sqlparams.file is as follows:
db1,db2
E:\
Related commands
sqlSnap
sqlSnap
This command performs a VDI-based SQL Server snapshot.
It includes a backup operation, used to put SQL Server into a
quiescent state for taking a snapshot.
Additional information essential to the understanding and use of this
command is presented in EMC Deploying with Microsoft SQL Server
Technical Notes.
Parameters
database
Name of the SQL Server database that is to be replicated. Multiple databases can be specified, separated by commas.
Note: When there is more than one instance of Microsoft SQL on the
same server, you can use the format
<instance_name>.<database_name> to specify the database.
metadata_drive
Directory in which the VDI metadata is stored (and from which it
is read when restoring).
Note: This directory must reside on one of the volumes being replicated.
group
Name of a RecoverPoint consistency group to which the image
(related to this snapshot) to be bookmarked belongs. You can
specify multiple groups.
tag
The label text to be used when bookmarking the VDI-enabled
snapshot, or the name of the snapshot from which to restore.
policy
Snapshot consolidation policy to set for this snapshot. Valid
values are:
• never; Snapshot is never consolidated.
• survive_daily; Snapshot remains after daily consolidations, but
is consolidated in weekly, monthly and manual
consolidations.
• survive_weekly; Snapshot remains after daily and weekly
consolidations, but is consolidated in monthly and manual
consolidations.
• survive_monthly; Snapshot remains after daily, weekly and
monthly consolidations, but is consolidated in manual
consolidations.
• always; Snapshot is consolidated in every consolidation
process, whether manual or automatic.
ip
IP address of the local RecoverPoint cluster management
interface.
drives_to_flush
Drive (or drives) where database files are stored (or to which they
should be restored). Each drive is designated by a drive letter,
e.g., “E”. When performing a backup, the system performs a
file-system flush operation just before bookmarking a snapshot.
You can specify multiple drives by using a comma to separate the
drive designations.
Alternatively, all of the above parameters can be incorporated into
the following single parameter:
file
Name of a configuration file with the following format:
database=<db1[,db2,…]>
metadata_drive=<drive>
group=<group1[,group2,…]>
tag=<bookmark_name>
policy=<policy_name>
ip=<mgmt_ip>
drives_to_flush=<drive1[,drive2,…]>
Usage
This command is available only on hosts running Windows.
To bookmark a VDI snapshot (while using the kutils sqlSnap utility), you must have IP connectivity to the RecoverPoint cluster at at least one of the sites. You can verify this by pinging the RecoverPoint Cluster Management IP addresses from the host.
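For example, from the host command prompt (192.168.0.1 here is the illustrative cluster management IP used in the examples below; substitute your own):
C:\> ping 192.168.0.1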
Examples
To take the VDI snapshot:
C:\program files\kdriver\kutils> kutils sqlSnap
database=db1,db2 metadata_drive=E group=group1,group2
tag=Hourly_VDI_9-12-04_2315 policy=survive_daily
ip=192.168.0.1 drives_to_flush=E,F
This would be equivalent to running the following command:
C:\program files\kdriver\kutils> kutils sqlSnap
file=sqlparams.file
where the structure of sqlparams.file is as follows:
database=db1,db2
metadata_drive=E:\
group=group1,group2
tag=Hourly_VDI_9-12-04_2315
policy=survive_daily
ip=192.168.0.1
drives_to_flush=E,F
In response (to either command), RecoverPoint immediately takes a
bookmarked VDI-enabled snapshot, and returns a report on the
result of the operation (i.e., success or error).
Related commands
sqlRestore
start
This command causes the host to split writes via the host-based splitter.
Parameters
None
Usage
This command is not included in the kutils standalone version.
Examples
To split writes via the host splitter:
$:/kdriver/bin/kenv.sh/kdriver/bin/kutils# kutils start
Splitter started Successfully
Related commands
stop
stop
This command causes the host-based splitter to stop splitting writes.
Parameters
None
Usage
This command is not included in the kutils standalone version.
Examples
To stop splitting writes via the host splitter:
C:\program files\kdriver\kutils> kutils stop
Stopping Service Succeeded
Related commands
start
umount
This command unmounts a file system.
Parameters
drive_letter
Drive designation for the file system that is to be unmounted.
Usage
This command is available only on hosts running Windows 2008. For Windows 2003, use mountvol.exe, which comes with the Windows operating system.
Examples
To unmount a device:
C:\program files\kdriver\kutils> kutils umount E:
Unmounting drive E:... unmounted from
"\\?\Volume{33b4a391-26af-11d9-b57b-505054503030}\"
Related commands
mount
C
Troubleshooting

This section presents the user actions necessary to mitigate events that may occur during RecoverPoint operation: it describes these occurrences, how to identify them, and how to mitigate them, from a user's perspective.
◆ My host applications are hanging
◆ My copy is being regulated
◆ My copy has entered a high load state
◆ My RPA keeps rebooting
My host applications are hanging
Some RecoverPoint users set a policy that enables RecoverPoint to
control the acknowledgement of writes back to the host in the case of
bottlenecks or insufficient resources that would otherwise prevent
RecoverPoint from replicating the data.
If your host applications experience delays, loss of client connectivity,
or slow response times, check whether the Allow Regulation setting
in the Consistency group Policy tab is checked.
See “Application regulation” on page 49 for more information.
This section answers the questions:
◆ "When does application regulation happen?"
◆ "How does application regulation work?"
◆ "How do I know application regulation is happening?"
◆ "What can I do to stop my group from being regulated?"
When does application regulation happen?
Application regulation happens when a user enables the Allow Regulation Consistency Group Protection policy setting in the RecoverPoint Management Application, or sets the regulate_application parameter in the set_policy CLI command.

How does application regulation work?
The system slows host applications when approaching the lag policy limit (see "RPO control" on page 53). When the system cannot replicate the current incoming write rate while guaranteeing the lag setting, the system slows host applications to guarantee that the RPO is always enforced. Additionally, if there is a bottleneck in the system, the system regulates the host applications instead of entering a high load state (see "My copy has entered a high load state" on page 334).

How do I know application regulation is happening?
If your host applications experience delays, loss of client connectivity, or slow response times, check whether the Allow Regulation checkbox of the consistency group Policy tab is checked. If it is, your host applications are being regulated to ensure an RPO.
What can I do to stop my group from being regulated?
To come out of this state, uncheck the Allow Regulation checkbox.
Note: Before unchecking this checkbox, make sure you are familiar with all of the contents of "My host applications are hanging" on page 330, "Application regulation" on page 49, and "Allow Regulation" on page 146, and understand all of the implications of doing so.
My copy is being regulated
RecoverPoint includes a smart mechanism that protects the system from adverse effects and over-consumption of system resources when a system component is operating improperly in the system. This mechanism is referred to as control action regulation.

This section answers the questions:
◆ "When does control action regulation happen?"
◆ "How do I know control action regulation is happening?"
◆ "How does control action regulation work?"
◆ "How do I release a copy from control action regulation?"
◆ "How do I verify that regulation is over?"

When does control action regulation happen?
Control action regulation happens when a system component is operating improperly in the system, and jittering (quickly changing) between two states for a set period of time.

How do I know control action regulation is happening?
You know the control action regulation mechanism has been enabled when:
◆ Event number 4133 (Copy regulation has started) is displayed in the event log.
◆ In the consistency group Status Tab of the GUI, the Role of a copy becomes Regulated and is displayed in red.

How does control action regulation work?
When control action regulation happens, to allow the environment to stabilize, the control action regulation mechanism places the copy in a Regulated state, in which the system protects itself by closing the link to the copy, limiting any adverse effects and over-consumption of system resources. The copy stays in the state it was in before regulation began for 30 minutes, or until corrective action is taken.
How do I release a copy from control action regulation?
When control action regulation happens, you can:
◆ Release all groups at all copies from this state by running the unregulate_all_copies command from the CLI (see the sketch following this list).
◆ Check previous event logs. Look for repetitive errors that may indicate a specific problem in the system.
◆ Check SAN/IP events outside of RecoverPoint, as instabilities may not originate from RecoverPoint.
◆ If regulation persists, collect all system information (see "Collecting system information" on page 238), and contact EMC Customer Support for further instruction.
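The following is a minimal sketch of the first option; the prompt shown is illustrative, and you should open the RecoverPoint CLI session in whatever way is standard in your environment:
> unregulate_all_copies
Afterward, use the indicators in "How do I verify that regulation is over?" to confirm that the copy is no longer regulated.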
How do I verify that regulation is over?
The following indicators can help you verify that your copy is no longer being regulated:
◆ Event 4132 (Copy regulation has ended due to a user action or internal timeout.) is displayed in the event log.
◆ The Role of a copy is no longer displayed in red in the consistency group Status Tab.
My copy has entered a high load state
High load is a system state that indicates resource depletion during
replication.
There are two kinds of high loads in RecoverPoint:
◆ "What is a permanent high load?" - In these cases, RecoverPoint stops and waits for a user action in order to come out of high load.
◆ "What is a temporary high load?" - In these cases, RecoverPoint tries to recover from the high load and will keep trying until the condition that triggered the high load changes.
This section answers the questions:
◆ "How do I know a copy is experiencing a high load?"
◆ "When do permanent high loads occur?"
◆ "How do permanent high loads work?"
◆ "How can I tell a copy is under permanent high load?"
◆ "What can I do to come out of permanent high load?"
◆ "How do I verify that a permanent high load is over?"
◆ "When do temporary high loads occur?"
◆ "How do temporary high loads work?"
◆ "How can I tell a copy is under temporary high load?"
◆ "What should I know about temporary high loads?"
◆ "How do I verify that a temporary high load is over?"

How do I know a copy is experiencing a high load?
You know that a copy has entered a high load state when:
◆ Warning events are logged specifying that the replica is experiencing high load.
◆ In the consistency group Status Tab of the GUI, the Transfer state of a copy becomes High load.

What is a permanent high load?
A permanent high load is a system state that happens during replication, when the size of the journal, or the queue of snapshots waiting for distribution of the journal at the replica copy, is insufficient.
When do permanent high loads occur?
A permanent high load generally happens in one of two cases:
◆ When a user accesses a replica in logged or virtual access mode for a long time (see “Image access” on page 53) and the queue of snapshots waiting for distribution in the journal reaches its maximum capacity (see “The distribution phase” on page 70). In this case, the system pauses transfer and waits for user input.
◆ When the system is in initialization mode and the journal volume has reached its maximum capacity while the Allow distribution of snapshots that are larger than capacity of journal volumes setting is disabled (also known as a long initialization, or long resync).
How do permanent high loads work?
When any of the events described in “When do permanent high loads occur?” on page 335 occurs, the system stops transfer and waits for user input.

How can I tell a copy is under permanent high load?
The following indicators are displayed when your copy is experiencing a permanent high load:
◆ Warning events are logged specifying that the replica is experiencing high load.
◆ The Transfer state is displayed as High load. You can display the transfer state:
  • By running the get_group_states command in the RecoverPoint Command Line Interface (a sketch follows this list).
  • In the consistency group Status Tab of the RecoverPoint Management Application.
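For example, a minimal sketch of checking the transfer state, assuming you are already logged in to the RecoverPoint Command Line Interface (the output layout depends on your system and is not reproduced here):

   get_group_states

In the output, locate the consistency group and copy in question and check its Transfer state; a value of High load corresponds to the indicator described above.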
What can I do to come out of permanent high load?
To release a copy from a permanent high load:
◆ If the user accessed a replica in logged or virtual access mode for a long time (see “Image access” on page 53) and the queue of snapshots waiting for distribution in the journal reached its maximum capacity (see “The distribution phase” on page 70), disable image access or enable direct access (see “Image access modes” on page 74; a CLI sketch follows this list).
◆ If the system was in initialization mode and the journal volume became full while the Allow distribution of snapshots that are larger than capacity of journal volumes setting was disabled, see “Long initializations” on page 85.
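A minimal sketch of the first corrective action from the RecoverPoint Command Line Interface. The command names below are assumptions for illustration only and may differ in your CLI version, and any required group or copy parameters are omitted; the documented procedure is in “Image access modes” on page 74:

   disable_image_access     (assumed command name; releases the accessed image so distribution can resume)
   enable_direct_access     (assumed command name; switches the replica to direct access)

If these commands are not available, disable image access or enable direct access for the copy from the RecoverPoint Management Application.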
How do I verify that a permanent high load is over?
The following indicators can help you verify that your copy is no longer experiencing a permanent high load:
◆ Warning events are logged specifying that the replica is no longer experiencing high load.
◆ The Transfer state is no longer displayed as High load. You can display the transfer state:
  • By running the get_group_states command in the RecoverPoint Command Line Interface.
  • In the consistency group Status Tab of the RecoverPoint Management Application.
What is a temporary high load?
A temporary high load is a system state that happens during replication when the RPA resources at the production site are insufficient.
When do temporary high loads occur?
Temporary high loads occur when:
◆ Write loads are uncommonly heavy for an uncommonly long duration (extreme cases)
◆ The replica or journal volumes are not fast enough to handle distribution
◆ The WAN is too slow
◆ The compression level is too high
How do temporary high loads work?
When any of the events described in “When do temporary high loads occur?” on page 336 occurs, transfer is paused and then started again immediately. If resources are still low, the system waits five minutes and then tries to pause and start transfer again, repeating until the required resources are available.
Note: Upon every start of transfer, a short initialization occurs.
How can I tell a copy is under temporary high load?
The following indicators are displayed when your copy is experiencing a temporary high load:
◆ Warning events are logged specifying that the replica is experiencing high load.
◆ The Transfer state is displayed as High load, followed by a progress status. You can display the transfer state:
  • By running the get_group_states command in the RecoverPoint Command Line Interface.
  • In the consistency group Status Tab of the RecoverPoint Management Application.
What should I know about temporary high loads?
Temporary high loads are a common occurrence and are expected to happen from time to time. If a high load lasts for an extreme period of time or occurs too frequently (and will eventually impact the business RPO), contact EMC Customer Support for a mitigation plan.

How do I verify that a temporary high load is over?
The following indicators can help you verify that your copy is no longer experiencing a temporary high load:
◆ Warning events are logged specifying that the replica is no longer experiencing high load.
◆ The Transfer state is no longer displayed as High load. You can display the transfer state:
  • By running the get_group_states command in the RecoverPoint Command Line Interface.
  • In the consistency group Status Tab of the RecoverPoint Management Application.
My RPA keeps rebooting
Reboot regulation is a state of regulation that allows the system to
detach an RPA from its RPA cluster in the event of frequent
unexplained reboots or internal failures.
This section answers the questions:
◆ “When does reboot regulation happen?”
◆ “How does reboot regulation work?”
◆ “How do I know reboot regulation is happening?”
◆ “What should I do to stop reboot regulation?”
When does reboot regulation happen?
Reboot regulation happens when an RPA is rebooting frequently and unexpectedly, or is undergoing a repeated internal failure.

How does reboot regulation work?
When an RPA behaves in the manner described in “When does reboot regulation happen?” on page 339, the system detaches the RPA from the RPA cluster.

How do I know reboot regulation is happening?
Reboot regulation is happening when the red icon is frequently displayed in the Connectivity column of the RPAs tab. In addition, a message is displayed when the user logs in to the RPA as the boxmgmt user.

What should I do to stop reboot regulation?
To stop reboot regulation, contact EMC Customer Service for further instructions.