

MIMIX® Availability™
Version 7.1
MIMIX Operations–5250
Notices
MIMIX Operations - 5250 User Guide
January 2014
Version: 7.1.19.00
© Copyright 1999, 2014 Vision Solutions®, Inc. All rights reserved.
The information in this document is subject to change without notice and is furnished under a license
agreement. This document is proprietary to Vision Solutions, Inc., and may be used only as authorized in our
license agreement. No portion of this manual may be copied or otherwise reproduced without the express
written consent of Vision Solutions, Inc.
Vision Solutions provides no expressed or implied warranty with this manual.
The following are trademarks or registered trademarks of their respective organizations or companies:
• MIMIX and Vision Solutions are registered trademarks and AutoGuard, Data Manager, Director, Dynamic
Apply, ECS/400, GeoCluster, IntelliStart, Integrator, iOptimize, iTERA, iTERA Availability, MIMIX AutoNotify,
MIMIX Availability, MIMIX Availability Manager, MIMIX DB2 Replicator, MIMIX Director, MIMIX dr1, MIMIX
Enterprise, MIMIX Global, MIMIX Monitor, MIMIX Object Replicator, MIMIX Professional, MIMIX Promoter,
OMS/ODS, RecoverNow, Replicate1, RJ Link, SAM/400, Switch Assistant, Vision AutoValidate, and Vision
Suite are trademarks of Vision Solutions, Inc.
• Double-Take Share, Double-Take Availability, and Double-Take RecoverNow—DoubleTake Inc.
• AIX, AIX 5L, AS/400, DB2, eServer, IBM, Informix, i5/OS, iSeries, OS/400, Power, System i, System i5,
System p, System x, System z, and WebSphere—International Business Machines Corporation.
• Adobe and Acrobat Reader—Adobe Systems, Inc.
• HP-UX—Hewlett-Packard Company.
• Teradata—Teradata Corporation.
• Intel—Intel Corporation.
• Java, all Java-based trademarks, and Solaris—Sun Microsystems, Inc.
• Linux—Linus Torvalds.
• Internet Explorer, Microsoft, Windows, and Windows Server—Microsoft Corporation.
• Mozilla and Firefox—Mozilla Foundation.
• Netscape—Netscape Communications Corporation.
• Oracle—Oracle Corporation.
• Red Hat—Red Hat, Inc.
• Sybase—Sybase, Inc.
• Symantec and NetBackup—Symantec Corporation.
• UNIX and UNIXWare—the Open Group.
All other brands and product names are trademarks or registered trademarks of their respective owners.
If you need assistance, contact Vision Solutions’ CustomerCare team at:
CustomerCare
Vision Solutions, Inc.
Telephone: 1.800.337.8214 or 1.949.724.5465
Email: support@visionsolutions.com
Web Site: www.visionsolutions.com/Support/Contact-CustomerCare.aspx
Contents
Who this book is for................................................................................................... 11
What is in this book ............................................................................................. 11
The MIMIX documentation set .................................................................................. 11
Sources for additional information............................................................................. 13
How to contact us...................................................................................................... 14
Chapter 1   MIMIX overview .................................................................................... 15
MIMIX concepts......................................................................................................... 17
Product concepts................................................................................................. 17
System role concepts .......................................................................................... 18
Journaling concepts ............................................................................................ 19
Configuration concepts........................................................................................ 20
Process concepts ................................................................................................ 21
Additional switching concepts ............................................................................. 22
Best practices for maintaining your MIMIX environment ........................................... 23
Authority to products and commands........................................................................ 23
Accessing the MIMIX Main Menu.............................................................................. 24
Chapter 2   MIMIX policies ...................................................................................... 26
Environment considerations for policies.................................................................... 27
Policies in environments with more than two nodes or bi-directional replication. 27
When to disable automatic recovery for replication and auditing ........................ 28
Disabling audits and recovery when using the MIMIX CDP feature .............. 29
Setting policies - general ........................................................................................... 29
Changing policies for an installation .................................................................... 29
Changing policies for a data group...................................................................... 30
Resetting a data group-level policy to use the installation level value ................ 30
Policies which affect an installation ........................................................................... 31
Changing retention criteria for procedure history ................................................ 31
Policies which affect replication................................................................................. 32
Errors handled by automatic database recovery ................................................. 33
Errors handled by automatic object recovery ...................................................... 34
Policies which affect auditing .................................................................................... 36
Policies for auditing runtime behavior ................................................................. 36
Policies for submitting audits automatically ......................................................... 37
When automatically submitted audits run...................................................... 38
Changing auditing policies ........................................................................................ 41
Changing when automatic audits are allowed to run........................................... 41
Changing scheduling criteria for automatic audits......................................... 41
Changing the selection frequency of priority auditing categories ........................ 42
Changing the audit level policy when switching .................................................. 43
Changing the system where audits are performed.............................................. 43
Changing retention criteria for audit history......................................................... 43
Restricting auditing based on the state of the data group ................................... 44
Preventing audits from running ........................................................................... 45
Disabling all auditing for an installation ......................................................... 46
Disabling all auditing for a data group ........................................................... 46
Disabling automatically submitted audits....................................................... 46
Policies for switching with model switch framework .................................................. 48
Specifying a default switch framework in policies ............................................... 48
Setting policies for MIMIX Switch Assistant ......................................................... 49
Setting policies when MIMIX Model Switch Framework is not used.................... 49
Policy descriptions..................................................................................................... 50
Chapter 3   Checking status in environments with application groups ................... 60
Checking application group status ............................................................................ 60
Resolving problems reported in the Monitors field .............................................. 61
Resolving problems reported in the Notifications field ........................................ 63
Resolving problems reported in Status columns ................................................. 64
Resolving a procedure status problem .......................................................... 64
Resolving an *ATTN status for an application group..................................... 65
Resolving other common status values for an application group .................. 66
Status for Work with Node Entries ............................................................................ 66
Status for Work with Data Resource Group Entries .................................................. 68
Verifying the sequence of the recovery domain ........................................................ 70
Changing the sequence of backup nodes ................................................................. 71
Examples of changing the backup sequence ...................................................... 73
Chapter 4   Working with status of procedures and steps ...................................... 77
Displaying status of procedures ................................................................................ 78
Displaying status of the last run of all procedures ............................................... 78
Displaying available status history of procedure runs ......................................... 79
Resolving problems with procedure status................................................................ 80
Responding to a procedure in *MSGW status..................................................... 81
Resolving a *FAILED or *CANCELED procedure status..................................... 82
Displaying status of steps within a procedure run ..................................................... 83
Resolving problems with step status ......................................................................... 85
Responding to a step with a *MSGW status ....................................................... 87
Resolving *CANCEL or *FAILED step statuses .................................................. 88
Acknowledging a procedure ...................................................................................... 89
Running a procedure................................................................................................. 90
Resuming a procedure ........................................................................................ 91
Overriding the attributes of a step ................................................................. 91
Canceling a procedure .............................................................................................. 92
Chapter 5   Monitoring status with MIMIX Availability Status ................................. 93
Checking replication status from the MIMIX Availability Status display .................... 95
Checking audit and notification status from the MIMIX Availability Status display.... 96
Checking status of supporting services from the MIMIX Availability Status display.. 96
Chapter 6   Working with data group status ........................................................... 98
The Work with Data Groups display.......................................................................... 99
Problems reflected in the Audits/Recov./Notif. field .......................................... 101
Problems reflected in the Data Group column .................................................. 101
Resolving problems highlighted in the Data Group column......................... 102
Manager problems reflected in the Source and Target columns....................... 103
Replication problems reflected in the Source and Target columns ................... 103
Setting the automatic refresh interval ................................................................ 104
Working with the detailed status of data groups...................................................... 105
Displaying data group detailed status ............................................................... 105
Merged view ................................................................................................ 106
Object detailed status views ........................................................................ 110
Database detailed status views ................................................................... 112
Identifying replication processes with backlogs....................................................... 115
Data group status in environments with journal cache or journal state ................... 117
Resolving a problem with journal cache or journal state ................................... 119
Chapter 7   Working with audits ............................................................................ 121
Auditing overview .................................................................................................... 122
Components of an audit .................................................................................... 122
Phases of audit processing ............................................................................... 123
Object selection methods for automatic audits.................................................. 123
How priority auditing determines what objects to select.............................. 124
How audits are submitted automatically ............................................................ 124
Audit status and results ..................................................................................... 125
Audit compliance ............................................................................................... 125
Guidelines and considerations for auditing ............................................................. 126
Auditing best practices ...................................................................................... 126
Considerations for specific audits...................................................................... 127
Recommendations when checking audit results ............................................... 127
Displaying audit runtime status ............................................................................... 129
Running an audit immediately ........................................................................... 131
Resolving audit problems .................................................................................. 133
Checking the job log of an audit ........................................................................ 135
Ending audits..................................................................................................... 136
Displaying audit history ........................................................................................... 137
Audits with no selected objects ......................................................................... 139
Working with audited objects................................................................................... 139
Displaying audited objects from a specific audit run ......................................... 141
Displaying a customized list of audited objects ................................................. 141
Working with audited object history......................................................................... 142
Displaying the audit history for a specific object................................................ 143
Displaying audit compliance.................................................................................... 144
Determining whether auditing is within compliance........................................... 145
Displaying scheduling information for automatic audits .......................................... 147
Chapter 8   Working with system-level processes ................................................ 149
Displaying status of system-level processes........................................................... 149
Resolving *ACTREQ status for a system manager ........................................... 151
Checking for a system manager backlog .......................................................... 151
Starting a system manager or a journal manager ............................................. 152
Ending a system manager or a journal manager .............................................. 152
Starting collector services ................................................................................. 152
Ending collector services................................................................................... 153
Starting target journal inspection processes ..................................................... 153
Ending target journal inspection processes....................................................... 154
Displaying status of target journal inspection .......................................................... 155
Displaying results of target journal inspection ......................................................... 156
Displaying details associated with target journal inspection notifications.......... 157
Displaying messages for TGTJRNINSP notifications.................................. 157
Identifying the last entry inspected on the target system ........................................ 158
Chapter 9   Working with notifications and recoveries .......................................... 159
What are notifications and recoveries ..................................................................... 159
Displaying notifications............................................................................................ 160
What information is available for notifications ................................................... 160
Detailed information..................................................................................... 161
Options for working with notifications ................................................................ 162
Notifications for newly created objects .................................................................... 163
Displaying recoveries .............................................................................................. 164
What information is available for recoveries...................................................... 165
Detailed information..................................................................................... 166
Options for working with recoveries .................................................................. 166
Orphaned recoveries ......................................................................................... 167
Determining whether a recovery is orphaned.............................................. 167
Removing an orphaned recovery ................................................................ 168
Chapter 10   Starting and ending replication ......................................................... 169
Before starting replication........................................................................................ 171
Commands for starting replication........................................................................... 171
What is started with the STRMMX command.................................................... 171
STRMMX and ENDMMX messages............................................................ 172
What is started by the default START procedure for an application group ....... 172
Choices when starting or ending an application group...................................... 172
What occurs when a data group is started .............................................................. 174
Journal starting point identified on the STRDG request .................................... 175
Journal starting point when the object send process is shared ................... 175
Clear pending and clear error processing ......................................................... 175
Starting MIMIX......................................................................................................... 179
Starting an application group................................................................................... 180
Starting selected data group processes .................................................................. 181
Starting replication when open commit cycles exist ................................................ 183
Checking for open commit cycles...................................................................... 183
Resolving open commit cycles .......................................................................... 183
Before ending replication......................................................................................... 184
Commands for ending replication............................................................................ 184
Command choice by reason for ending replication ........................................... 184
Additional considerations when ending replication............................................ 186
Ending immediately or controlled ...................................................................... 186
Controlling how long to wait for a controlled end to complete ..................... 187
Ending all or selected processes....................................................................... 187
When to end the RJ link .................................................................................... 188
What is ended by the ENDMMX command ....................................................... 188
What is ended by the default END procedure for an application group ............ 189
What occurs when a data group is ended ............................................................... 190
Ending MIMIX.......................................................................................................... 192
Ending with default values................................................................................. 192
Ending by prompting the ENDMMX command.................................................. 192
After you end MIMIX products ........................................................................... 193
Ending an application group.................................................................................... 194
Ending a data group in a controlled manner ........................................................... 195
Preparing for a controlled end of a data group .................................................. 195
Performing the controlled end ........................................................................... 195
Confirming the end request completed without problems ................................. 196
Ending selected data group processes ................................................................... 198
What replication processes are started by the STRDG command.......................... 199
What replication processes are ended by the ENDDG command .......................... 203
Chapter 11   Resolving common replication problems .......................................... 207
Working with message queues ............................................................................... 208
Working with the message log ................................................................................ 209
Working with user journal replication errors ............................................................ 210
Working with files needing attention (replication and access path errors)......... 210
Working with journal transactions for files in error....................................... 213
Placing a file on hold ......................................................................................... 214
Ignoring a held file ............................................................................................. 214
Releasing a held file at a synchronization point ................................................ 215
Releasing a held file .......................................................................................... 215
Releasing a held file and clearing entries.......................................................... 216
Correcting file-level errors ................................................................................. 216
Correcting record-level errors............................................................................ 217
Record written in error ................................................................................. 217
Working with tracking entries .................................................................................. 219
Accessing the appropriate tracking entry display .............................................. 219
Holding journal entries associated with a tracking entry ................................... 221
Ignoring journal entries associated with a tracking entry................................... 222
Waiting to synchronize and release held journal entries for a tracking entry .... 222
Releasing held journal entries for a tracking entry ............................................ 223
Releasing and clearing held journal entries for a tracking entry........................ 223
Removing a tracking entry................................................................................. 223
Working with objects in error ................................................................................... 224
Using the Work with DG Activity Entries display ............................................... 225
Retrying data group activity entries ................................................................... 227
Retrying a failed data group activity entry ................................................... 227
Determining whether an activity entry is in a delay/retry cycle .......................... 228
Removing data group activity history entries........................................................... 229
Chapter 12   Starting, ending, and verifying journaling .......................................... 230
What objects need to be journaled.......................................................................... 231
Authority requirements for starting journaling.................................................... 232
MIMIX commands for starting journaling................................................................. 233
Journaling for physical files ..................................................................................... 235
Displaying journaling status for physical files .................................................... 235
Starting journaling for physical files ................................................................... 235
Ending journaling for physical files .................................................................... 236
Verifying journaling for physical files ................................................................. 237
Journaling for IFS objects........................................................................................ 238
Displaying journaling status for IFS objects ...................................................... 238
Starting journaling for IFS objects ..................................................................... 238
Ending journaling for IFS objects ...................................................................... 239
Verifying journaling for IFS objects.................................................................... 240
Journaling for data areas and data queues............................................................. 241
Displaying journaling status for data areas and data queues............................ 241
Starting journaling for data areas and data queues .......................................... 241
Ending journaling for data areas and data queues............................................ 242
Verifying journaling for data areas and data queues ......................................... 243
Chapter 13   Switching ........................................................................................... 244
About switching ....................................................................................................... 244
Planned switch .................................................................................................. 245
Unplanned switch .............................................................................................. 246
Switching application group environments with procedures.............................. 247
Switching data group environments with MIMIX Model Switch Framework ...... 248
Switching an application group................................................................................ 250
Switching a data group-only environment ............................................................... 251
Switching to the backup system ........................................................................ 251
Synchronizing data and starting MIMIX on the original production system ....... 252
Switching to the production system ................................................................... 252
Determining when the last switch was performed ................................................... 253
Checking the last switch date ............................................................................ 253
Problems checking switch compliance.................................................................... 254
Performing a data group switch............................................................................... 255
Switch Data Group (SWTDG) command................................................................. 257
Chapter 14   Less common operations .................................................................. 259
Starting the TCP/IP server ...................................................................................... 260
Ending the TCP/IP server........................................................................................ 261
Working with objects ............................................................................................... 262
Displaying long object names............................................................................ 262
Considerations for working with long IFS path names ................................ 262
Displaying data group spooled file information.................................................. 262
Viewing status for active file operations .................................................................. 263
Displaying a remote journal link .............................................................................. 264
Displaying status of a remote journal link................................................................ 265
Identifying data groups that use an RJ link ............................................................. 267
Identifying journal definitions used with RJ ............................................................. 268
Disabling and enabling data groups ........................................................................ 269
Procedures for disabling and enabling data groups .......................................... 270
Determining if non-file objects are configured for user journal replication............... 271
Determining how IFS objects are configured .................................................... 271
Determining how data areas or data queues are configured ............................ 272
Using file identifiers (FIDs) for IFS objects .............................................................. 273
Operating a remote journal link independently........................................................ 274
Starting a remote journal link independently ..................................................... 274
Ending a remote journal link independently ...................................................... 274
Chapter 15   Troubleshooting - where to start ....................................................... 276
Gathering information before reporting a problem .................................................. 278
Obtaining MIMIX and IBM i information from your system ................................ 278
Reducing contention between MIMIX and user applications................................... 279
Data groups cannot be ended ................................................................................. 280
Verifying a communications link for system definitions ........................................... 281
Verifying the communications link for a data group................................................. 282
Verifying all communications links..................................................................... 282
Checking file entry configuration manually.............................................................. 283
Data groups cannot be started ................................................................................ 285
Cannot start or end an RJ link................................................................................. 286
Removing unconfirmed entries to free an RJ link.............................................. 286
RJ link active but data not transferring .................................................................... 287
Errors using target journal defined by RJ link.......................................................... 288
Verifying data group file entries............................................................................... 289
Verifying data group data area entries .................................................................... 289
Verifying key attributes ............................................................................................ 289
Working with data group timestamps ...................................................................... 291
Automatically creating timestamps .................................................................... 291
Creating additional timestamps ......................................................................... 291
Creating timestamps for remote journaling processing ..................................... 292
Deleting timestamps .......................................................................................... 293
Displaying or printing timestamps ..................................................................... 293
Removing journaled changes.................................................................................. 294
Performing journal analysis ..................................................................................... 295
Removing journal analysis entries for a selected file ........................................ 297
Appendix A   Interpreting audit results - supporting information ........................... 299
Interpreting results for configuration data - #DGFE audit........................................ 300
When the difference is “not found” .......................................................................... 302
Interpreting results of audits for record counts and file data ................................... 303
What differences were detected by #FILDTA.................................................... 303
What differences were detected by #MBRRCDCNT ......................................... 304
Interpreting results of audits that compare attributes .............................................. 306
What attribute differences were detected .......................................................... 306
Where was the difference detected................................................................... 308
What attributes were compared ........................................................................ 309
Appendix B   IBM Power™ Systems operations that affect MIMIX ....................... 310
MIMIX procedures when performing an initial program load (IPL) .......................... 310
MIMIX procedures when performing an operating system upgrade........................ 311
Prerequisites for performing an OS upgrade on either system ......................... 312
MIMIX-specific steps for an OS upgrade on the backup system....................... 313
MIMIX-specific steps for an OS upgrade on the production system with switching ......... 315
MIMIX-specific steps for an OS upgrade on the production system without switching............................................................................................................................ 316
MIMIX procedures when upgrading hardware without a disk image change .......... 318
Considerations for performing a hardware system upgrade without a disk image change ......... 318
MIMIX-specific steps for a hardware upgrade without a disk image change..... 319
Hardware upgrade without a disk image change - preliminary steps .......... 319
Hardware upgrade without a disk image change - subsequent steps ......... 320
MIMIX procedures when performing a hardware upgrade with a disk image change ......... 321
Considerations for performing a hardware system upgrade with a disk image change ......... 321
MIMIX-specific steps for a hardware upgrade with a disk image change.......... 322
Hardware upgrade with a disk image change - preliminary steps ............... 322
Hardware upgrade with a disk image change - subsequent steps .............. 323
Handling MIMIX during a system restore ................................................................ 325
Prerequisites for performing a restore of MIMIX ............................................... 325
Index ...................................................................................................................... 326
Who this book is for
The MIMIX Operations - 5250 book describes how to perform routine operational
tasks and basic troubleshooting for MIMIX® Enterprise™ and MIMIX® Professional™
from a 5250 emulator.
What is in this book
The MIMIX Operations - 5250 book provides these distinct types of information:
• A summary of concepts within MIMIX
• Application group and data group status and troubleshooting
• Audit status, troubleshooting, scheduling, and history
• Procedures for starting, ending, and switching replication
• Procedures for starting, ending, and verifying journaling
• Procedures for handling MIMIX when performing operations such as IPLs or hardware and operating system upgrades.
The MIMIX documentation set
The following documents about MIMIX® Availability™ products are available:
Using License Manager
License Manager currently supports MIMIX® Availability™, iTERA Availability™,
and iOptimize™. This book describes software requirements, system security, and
other planning considerations for installing software and software fixes for Vision
Solutions products that are supported through License Manager. The preferred
way to obtain license keys and install software is by using Vision AutoValidate™
and the product’s Installation Wizard. However, if you cannot use the wizard or
AutoValidate, this book provides instructions for obtaining licenses and installing
software from a 5250 emulator. This book also describes how to use the
additional security functions from Vision Solutions which are available for License
Manager and MIMIX and implemented through License Manager.
MIMIX Administrator Reference
This book provides detailed conceptual, configuration, and programming
information for MIMIX® Enterprise™ and MIMIX® Professional™. It includes
checklists for setting up several common configurations, information for planning
what to replicate, and detailed advanced configuration topics for custom needs. It
also identifies what information can be returned in outfiles if used in automation.
MIMIX Operations with IBM i Clustering
This book is for administrators and operators in an IBM i clustering environment
who either use the basic support for IBM i clustering provided within MIMIX or who
use MIMIX® Global™ to integrate cluster management with MIMIX logical
replication or supported hardware-based replication techniques. This book
focuses on addressing problems reported in MIMIX status and basic operational
procedures such as starting, ending, and switching.
MIMIX Operations - 5250
This book provides high level concepts and operational procedures for managing
your high availability environment using MIMIX® Enterprise™ or MIMIX®
Professional™ from a 5250 emulator. This book focuses on tasks typically
performed by an operator, such as checking status, starting or stopping
replication, performing audits, and basic problem resolution.
Using MIMIX Monitor
This book describes how to use the MIMIX Monitor user and programming
interfaces available with MIMIX® Enterprise™ or MIMIX® Professional™. This
book also includes programming information about MIMIX Model Switch
Framework and support for hardware switching.
Using MIMIX Promoter
This book describes how to use MIMIX commands for copying and reorganizing
active files. MIMIX Promoter is available with MIMIX® Enterprise™ and as a no-charge feature for MIMIX® Professional™.
MIMIX for IBM WebSphere MQ
This book identifies requirements for the MIMIX for MQ feature which supports
replication in IBM WebSphere MQ environments. This book describes how to
configure MIMIX for this environment and how to perform the initial
synchronization and initial startup. Once configured and started, all other
operations are performed as described in the MIMIX Operations - 5250 book.
Sources for additional information
This book refers to other published information. The following information, plus
additional technical information, can be located in the IBM System i and i5/OS
Information Center.
From the Information center you can access these IBM Power™ Systems topics,
books, and redbooks:
• Backup and Recovery
• Journal management
• DB2 Universal Database for IBM Power™ Systems Database Programming
• Integrated File System Introduction
• Independent disk pools
• OptiConnect for OS/400
• TCP/IP Setup
• IBM redbook Striving for Optimal Journal Performance on DB2 Universal Database for iSeries, SG24-6286
• IBM redbook AS/400 Remote Journal Function for High Availability and Data Replication, SG24-5189
• IBM redbook Power™ Systems iASPs: A Guide to Moving Applications to Independent ASPs, SG24-6802
The following information may also be helpful if you replicate journaled data areas, data queues, or IFS objects:
• DB2 UDB for iSeries SQL Programming Concepts
• DB2 Universal Database for iSeries SQL Reference
• IBM redbook AS/400 Remote Journal Function for High Availability and Data Replication, SG24-5189
How to contact us
For contact information, visit our Contact CustomerCare web page.
If you are current on maintenance, support for MIMIX products is also available when
you log in to Support Central.
It is important to include product and version information whenever you report
problems.
CHAPTER 1
MIMIX overview
This book provides operational information and procedures for using MIMIX®
Enterprise™ and MIMIX® Professional™ through the 5250 emulator user interface.
For simplicity, this book uses the term MIMIX to refer to the functionality provided by
either product unless a more specific name is necessary.
MIMIX® Availability™ version 7.1 provides high availability for your critical data in a
production environment on IBM Power™ Systems through real-time replication of
changes and the ability to quickly switch your production environment to a ready
backup system. These capabilities allow your business operations to continue when
you have planned or unplanned outages in your System i environment. MIMIX also
provides advanced capabilities that can help ensure the integrity of your MIMIX
environment.
Replication: MIMIX continuously captures changes to critical database files and
objects on a production system, sends the changes to a backup system, and applies
the changes to the appropriate database file or object on the backup system. The
backup system stores exact duplicates of the critical database files and objects from
the production system.
MIMIX uses two replication paths to address different pieces of your replication
needs. These paths operate with configurable levels of cooperation or can operate
independently.
• The user journal replication path captures changes to critical files and objects configured for replication through a user journal. When configuring this path, shipped defaults use the remote journaling function of the operating system to simplify sending data to the remote system. In previous versions, MIMIX DB2 Replicator provided this function.
• The system journal replication path handles replication of critical system objects (such as user profiles, program objects, or spooled files), integrated file system (IFS) objects, and document library objects (DLOs) using the system journal. In previous versions, MIMIX Object Replicator provided this function.
Configuration choices determine the degree of cooperative processing used between
the system journal and user journal replication paths when replicating database files,
IFS objects, data areas, and data queues.
Switching: One common use of MIMIX is to support a hot backup system to which
operations can be switched in the event of a planned or unplanned outage. If a
production system becomes unavailable, its backup is already prepared for users. In
the event of an outage, you can quickly switch users to the backup system where they
can continue using their applications. MIMIX captures changes on the backup system
for later synchronization with the original production system. When the original
production system is brought back online, MIMIX assists you with analysis and
synchronization of the database files and other objects.
Automatic verification and correction: MIMIX enables earlier and easier detection
of problems known to adversely affect maintaining availability and switch-readiness of
your replication environment. MIMIX automatically detects and corrects potential
problems during replication and auditing. MIMIX also helps to ensure the integrity of
your MIMIX configuration by automatically verifying that the files and objects being
replicated are what is defined to your configuration.
MIMIX is shipped with these capabilities enabled. The best practices they incorporate for maintaining availability and switch-readiness are key to keeping your MIMIX environment ready to protect your data. User interfaces allow you to fine-tune these capabilities to the needs of your environment.
Analysis: MIMIX also provides advanced analysis capabilities through the MIMIX
portal application for Vision Solutions Portal (VSP). When using the VSP user
interface, you can see what objects are configured for replication as well as what
replicated objects on the target system have been changed by people or programs
other than MIMIX. (Objects changed on the target system affect your data integrity.)
You can also check historical arrival and backlog rates for replication to help you
identify trends in your operations that may affect MIMIX performance.
Uses: MIMIX is typically used among systems in a network to support a hot backup
system. Simple environments have one production system and one backup system.
More complex environments have multiple production systems or backup systems.
MIMIX can also be used on a single system.
You can view the replicated data on the backup system at any time without affecting
productivity. This allows you to generate reports, submit (read-only) batch jobs, or
perform backups to tape from the backup system. In addition to real-time backup
capability, replicated databases and objects can be used for distributed processing,
allowing you to off-load applications to a backup system.
The topics in this chapter include:
• “MIMIX concepts” on page 17 summarizes key concepts that you need to know about MIMIX.
• “Best practices for maintaining your MIMIX environment” on page 23 summarizes recommendations from Vision Solutions.
• “Authority to products and commands” on page 23 identifies authority levels to MIMIX functions when additional security features provided by Vision Solutions are used.
• “Accessing the MIMIX Main Menu” on page 24 describes the MIMIX Basic Main menu and the MIMIX Intermediate Main Menu. The MIMIX Basic Main menu is used to access the MIMIX Availability Status (WRKMMXSTS) display; see the command sketch after this list.
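For example, an operator might reach that display from a 5250 command line as in the following sketch. The installation library name MIMIX is an assumption; substitute the library name chosen for your installation.

   /* Add the MIMIX installation library to the library list.          */
   /* The library name MIMIX is an assumption; use the name chosen     */
   /* for your installation.                                           */
   ADDLIBLE LIB(MIMIX)

   /* Open the MIMIX Availability Status display. Prompt the command   */
   /* with F4 if you want to review its parameters first.              */
   WRKMMXSTS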
MIMIX concepts
The following subtopics organize the basic concepts associated with MIMIX® into
related groups. More detailed information is available in the MIMIX Administrator
Reference book.
Product concepts
MIMIX installation - The network of IBM Power™ Systems systems that transfer data
and objects among each other using functions of a common MIMIX product. A MIMIX
installation is defined by the way in which you configure the MIMIX product for each of
the participating systems. A system can participate in multiple independent MIMIX
installations.
Replication - The activity that MIMIX performs to continuously capture changes to
critical database files and objects on a production system as they occur, send the
changes to a backup system, and apply the changes to the appropriate database file
or object on the backup system.
Switch - The process by which a production environment is moved from one system
to another system and the production environment is made available there. A switch
may be performed as part of a planned event such as for system maintenance, or an
unplanned event such as a power or equipment failure. MIMIX provides customizable
functions for switching.
Audits - Audits are predetermined programs that are used to check for differences in
replicated objects and other conditions between systems. Audits run and can correct
detected problems automatically. Policies control when audits run and many other
aspects of how audits are performed. Additional auditing concepts and
recommendations are described in the auditing chapter of this book.
Automatic recovery - MIMIX provides a set of functions that can automatically
correct problems detected in a MIMIX installation during database
replication, object replication, and auditing. During these activities, when MIMIX
detects any of a set of scenarios known to interfere with maintaining your MIMIX
environment, it will automatically start recovery actions to correct them. Through
policies, you have the ability to disable automatic recovery in any of these areas at the
installation or data group level.
Application group - A MIMIX construct used to group and control resources from a
single point in a way that maintains relationships between them. The use of
application groups is best practice for MIMIX® Professional™ and MIMIX®
Enterprise™ and required for MIMIX® Global™.
Data group - A MIMIX construct that is used to control replication activities. A data
group is a logical grouping of database files, data areas, objects, IFS objects, DLOs,
or a combination thereof that defines a unit of work by which MIMIX replication activity
is controlled. A data group may represent an application, a set of one or more
libraries, or all of the critical data on a given system. Application environments may
define a data group as a specific set of files and objects.
Prioritized status - MIMIX assigns a priority to status values to ensure that problems
with the highest priorities, those for detected problems or situations that require
immediate attention or intervention, are reflected on the highest level of the user
interface. Additional detail and lower priority items can be viewed by drilling down to
the next level within the interfaces. Those interfaces are the Work with Systems
display and, depending on your configuration, either the Work with Application Groups
display or the Work with Data Groups display.
Policies - A policy is a mechanism used to enable, disable, or provide input to a
function such as replication, auditing, or MIMIX Model Switch Framework. For most
policies, the initially shipped values apply to an installation. However, policies can be
changed and most can also be overridden for individual data groups. Policies that
control when audits are automatically performed can be set only for each specific
combination of audit rule and data group.
Notifications - A notification is the resulting automatic report associated with an
event that has already occurred. The severity of a notification is reflected in the
overall status of the installation. Notifications can be generated by a process,
program, command, or monitor. Because the originator of notifications varies, it is
important to note that notifications can represent both real-time events and
events that occurred in the past but, due to scheduling, are being reported in the
present.
Recoveries - The term recovery is used in two ways. The most common use refers to
the recovery action taken by a replication process or an audit to correct a detected
difference when automatic recovery policies are enabled. The second use refers to a
temporary report that provides details about a recovery action in progress that is
created when the recovery action starts and is removed when it completes.
System role concepts
MIMIX uses several pairs of terms to refer to the role of a system within a particular
context. These terms are not interchangeable.
Production system and backup system - These terms describe the role of a system
relative to the way applications are used on that system.
A production system is the system currently running the production workload for the
applications. In normal operations, the production system is the system on which the
principal copy of the data and objects associated with the application exist.
A backup system is the system that is not currently running the production workload
for the applications. In normal operations, the backup system is the system on which
you maintain a copy of the data and objects associated with the application. These
roles are not always associated with a specific system. For example, if you switch
application processing to the backup system, the backup system temporarily
becomes the production system.
Typically, for normal operations in a basic two-system environment, replicated data
flows from the system running the production workload to the backup system.
Source system and target system - These terms identify the direction in which an
activity occurs between two participating systems.
A source system is the system from which MIMIX replication activity between two
systems originates. In replication, the source system contains the journal entries.
Information from the journal entries is either replicated to the target system or used to
identify objects to be replicated to the target system.
A target system is the system on which MIMIX replication activity between two
systems completes.
Management system and network system - These terms define the role of a
system relative to how the products interact within a MIMIX installation. These roles
remain associated with the system within the MIMIX installation to which they are
defined. One system in the MIMIX installation is designated as the management
system and the remaining one or more systems are designated as network systems.
A management system is the system in a MIMIX installation that is designated as the
control point for all installations of the product within the MIMIX installation. The
management system is the location from which work to be performed by the product
is defined and maintained. Often the system defined as the management system also
serves as the backup system during normal operations.
A network system is any system in a MIMIX installation that is not designated as the
management system (control point) of that MIMIX installation. Work definitions are
automatically distributed from the management system to a network system. Often a
system defined as a network system also serves as the production system during
normal operations.
Journaling concepts
MIMIX uses journaling to perform replication and to support newer analysis
functionality.
Journaling and object auditing - Journaling and object auditing are techniques that
allow object activity to be logged to a journal. Journaling logs activity for selected
objects of specific object types to a user journal. Object auditing logs activity for all
objects to the security audit journal (QAUDJRN, the system journal), including those
defined to a user journal. MIMIX relies on these techniques and the entries placed in
the journal receivers for replicating logged activity.
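As a conceptual illustration of the difference, the following IBM i commands sketch how activity for one object could be logged by each technique. The library, file, and journal names are placeholders, and MIMIX configuration normally starts journaling and sets object auditing for you, so treat this only as a hedged example.

   /* Journaling: log record-level activity for one physical file to a */
   /* user journal (all names here are placeholders).                  */
   STRJRNPF FILE(APPLIB/ORDERS) JRN(APPLIB/APPJRN) IMAGES(*BOTH)

   /* Object auditing: log change activity for the same object to the  */
   /* security audit journal (QAUDJRN, the system journal).            */
   CHGOBJAUD OBJ(APPLIB/ORDERS) OBJTYPE(*FILE) OBJAUD(*CHANGE)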
Journal - An IBM i system object that identifies the objects being journaled and the
journal receivers associated with the journal. The system journal is a specialized
journal on the system which MIMIX uses.
Journal receiver - An IBM i system object that is associated with a journal and
contains the log of all activity for objects defined to the journal.
Journal entry - A record added to a journal receiver that identifies an event that
occurred on a journaled object. MIMIX uses file and record level journal entries to
recreate the object on a designated system.
Remote journaling - A function of IBM i that allows you to establish journals and
journal receivers on one system and associate them with specific journals and journal
receivers on another system. Once the association is established, the operating
system can use the pair of journals to replicate journal entries in one direction, from
the local journal to the remote journal on the other system. In some configurations,
MIMIX uses remote journaling for transferring data to be replicated from the source
system to the target system.
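For reference, the IBM i remote journal function described above is established with commands such as the following. The relational database entry and journal names are placeholders, and in a MIMIX environment the RJ link configuration performs this work for you; verify the parameters with the command prompter on your IBM i release.
   ADDRMTJRN RDB(TARGETSYS) SRCJRN(MYLIB/MYJRN) TGTJRN(MYLIB/MYJRN)
   CHGRMTJRN RDB(TARGETSYS) SRCJRN(MYLIB/MYJRN) TGTJRN(MYLIB/MYJRN) JRNSTATE(*ACTIVE) DELIVERY(*ASYNC)
The first command associates a local journal with a remote journal on the system identified by the relational database entry; the second activates the connection and selects asynchronous delivery of journal entries.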
Configuration concepts
MIMIX configuration provides considerable flexibility to enable supporting a wide
variety of customer environments. Configuration is implemented through sets of
related commands. The following terms describe configuration concepts.
Definitions - MIMIX uses several types of named definitions to identify related
configuration choices.
• System definitions identify systems that participate in a MIMIX installation. Each system definition identifies one system.
• Transfer definitions identify the communications path and protocol to be used between systems.
• Journal definitions identify journaling environments that MIMIX uses for replication. Each journal definition identifies a system and characteristics of the journaling environment on that system.
• Data group definitions identify the characteristics of how replication occurs between two systems. Each data group definition determines the direction in which replication occurs between the systems, whether that direction can be switched, and the default processing characteristics for replication processes.
• Application group definitions identify whether the replication environment does or does not use IBM i clustering. When clustering is used, the application group also defines information about an application or proprietary programs necessary for controlling operations in the clustering environment.
Data group entries - A data group entry is a configuration construct that identifies a
source of information to be replicated by or excluded from replication by a data group.
Each entry identifies at least one object and its location on the source system.
Classes of data group entries are based on object type. MIMIX uses data group
entries to determine whether a journal entry should be replicated. Data groups that
replicate from both the system journal and a user journal can have any combination of
data group entries.
Remote journal link (RJ link) - An RJ link is a MIMIX configuration element that
identifies an IBM i remote journaling environment used by user journal replication
processes. An RJ link identifies the journal definitions that define the source and
target journals, primary and secondary transfer definitions for the communications
path used by MIMIX, and whether the IBM i remote journal function sends journal
entries asynchronously or synchronously.
Cooperative processing - Cooperative processing refers to MIMIX techniques that
efficiently replicate certain object types by using a coordinated effort between the
system journal and user journal replication paths. Configuration choices in data group
definitions and data group entries determine the degree of cooperative processing
used between the system journal and user journal replication paths when replicating
database files, IFS objects, data areas, and data queues.
Tracking entries - Tracking entries identify objects that can be replicated using
advanced journaling techniques and assist with tracking the status of their replication.
A unique tracking entry is associated with each IFS object, data area, and data queue
that is eligible for replication using advanced journaling. IFS tracking entries identify
eligible, existing IFS objects while object tracking entries identify eligible, existing data
areas and data queues.
Process concepts
The following terms identify MIMIX processes. Some, like the system manager, are
required to allow MIMIX to function. Others, like procedures, are used only when
invoked by users.
Replication path - A replication path is a series of processes used for replication that
represent the critical path on which data to be replicated moves from its origin to its
destination. MIMIX uses two replication paths to accommodate differences in how
replication occurs for user journal and system journal entries. These paths operate
with configurable levels of cooperation or can operate independently.
• The user journal replication path captures changes to critical files and objects configured for replication through a user journal. When configuring this path, shipped defaults use the remote journaling function of the operating system to simplify sending data to the remote system. The changes are applied to the target system.
• The system journal replication path handles replication of critical system objects (such as user profiles, program objects, or spooled files), integrated file system (IFS) objects, and document library objects (DLOs) using the system journal. Information about the changes is sent to the target system where it is applied.
System manager - The system manager is a pair of communications jobs between
the management system and a network system which must be active to enable
replication. The system manager monitors for configuration changes and
automatically moves any configuration changes to the network system. Dynamic
status changes are also collected and returned to the management system. The
system manager also gathers messages and timestamp information from the network
system and places them in a message log and timestamp file on the management
system. In addition, the system manager performs periodic maintenance tasks,
including cleanup of the system and data group history files.
Journal manager - The journal manager is a job on each system that MIMIX uses to
maintain the journaling environment on that system. By default, MIMIX performs both
change management and delete management for journal receivers used by the
replication process.
Collector services - A group of jobs that are necessary for MIMIX to track historical
data and to support using the MIMIX portal application within the Vision Solutions
Portal. One or more collector service jobs collect and combine MIMIX status from all
systems.
Cluster services - When MIMIX Global is configured for IBM i clustering, MIMIX uses
the cluster services function provided by IBM i to integrate the system management
functions needed for clustering. Cluster services must be active in order for a cluster
node to be recognized by the other nodes in the cluster. MIMIX integrates starting and
stopping cluster services into status and commands for controlling processes that run
at the system level.
Target journal inspection - A MIMIX process that reads a journal on a system being
used as the target system for replication. The process identifies people or processes
other than MIMIX that accessed replicated objects on the target system. Users can
access the resulting information from the Replicated Objects portlet within the MIMIX
portal application in Vision Solutions Portal.
Procedures and steps - Procedures and steps are a highly customizable means of
performing operations for application groups. A set of default procedures for each
application group provides the ability to start, end, perform pre-check activity for
switching, and switch the application group. Each operation is performed by a
procedure that consists of a sequence of steps and multiple jobs. Each step calls a
predetermined step program to perform a specific sub-task of the larger operation.
Steps also identify runtime attributes for handling before and after the program call
within the context of the procedure.
Log space - A MIMIX object that provides an efficient storage and manipulation
mechanism for replicated data that is temporarily stored on the target system during
the receive and apply processes.
Additional switching concepts
The following concepts are specific to switching.
Environments configured with application groups perform switching through
procedures.
Planned switch - An intentional change to the direction of replication for any of a
variety of reasons. You may need to take the system offline to perform maintenance
on its hardware or software, or you may be testing your disaster recovery plan. In a
planned switch, the production system (the source of replication) is available. When
you perform a planned switch, replication is ended on both the source and target
systems. The next time you start replication, it will be set to replicate in the opposite
direction.
Unplanned switch - A change to the direction of replication in response to a problem.
Most likely the production system is no longer available. When you perform an
unplanned switch, you must initiate the switch from the target system. Replication is
ended on the target system. The next time you start replication, it will be set to
replicate in the opposite direction.
MIMIX Model Switch Framework - A set of programs and commands that provide a
consistent framework to be used when performing planned or unplanned switches in
environments that do not use application groups. Typically, a model switch framework
is customized to your environment through its exit programs.
MIMIX Switch Assistant - An interface that guides you through switching
using your default MIMIX Model Switch Framework. MIMIX Switch Assistant is
accessed from the MIMIX Basic Main Menu and does not support application groups.
Best practices for maintaining your MIMIX environment
MIMIX is shipped with default settings that incorporate many best practices for
maintaining your environment. Others may require changing policies and adopting
best practices within your organization. Best practices include:
• Allow MIMIX to automatically correct differences detected during database and object replication processes that would otherwise result in errors. If MIMIX is unable to perform the recovery, the problem is reported as a replication error (a file is placed in held error or an object is in error).
• Allow MIMIX to automatically perform audits and to automatically recover any differences detected by audits. Best practice is to allow regularly scheduled audits of all objects configured for replication and daily audits of prioritized categories of replicated objects. User interfaces summarize audit results and indicate whether MIMIX is unable to recover an object.
• Perform all audits with the audit level set at level 30 immediately prior to a planned switch to the backup system and before switching back to the production system.
• Perform switches on a regular basis. Best practice is to switch every three to six months. You need to set aside time for performing planned switches. Environments that continue to use MIMIX Switch Assistant can use policies so that compliance with regular switching is automatically reported in the user interface.
Authority to products and commands
If your MIMIX environment takes advantage of the additional security available in the
product and command authority functions which Vision Solutions provides through
License Manager, you may need a higher authority level in order to perform MIMIX
daily operations.
A MIMIX administrator can change your authorization level to commands and
displays. Authorization levels typically fall into these categories:
• Viewing information requires display (*DSP) authority.
• Controlling operations requires operator (*OPR) authority.
• Creating or changing configuration requires management (*MGT) authority.
For example, consider audits. You can view an audit if you have display authority,
perform audits if you have operator authority, and change policies that affect how
auditing is performed if you have management authority.
For more information about these provided security functions, see the Using License
Manager book.
Accessing the MIMIX Main Menu
The MIMIX command accesses the main menu for a MIMIX installation. The MIMIX
Main Menu has two assistance levels, basic and intermediate. The command defaults
to the basic assistance level, shown in Figure 1, with its options designed to simplify
day-to-day interaction with MIMIX. Figure 2 shows the intermediate assistance level.
The options on the menu vary with the assistance level. In either assistance level, the
available options also depend on the MIMIX products installed in the installation
library and their licensing. The products installed and the licensing also affect
subsequent menus and displays.
Accessing the menu - If you know the name of the MIMIX installation you want, you
can use the name to library-qualify the command, as follows:
Type the command library-name/MIMIX and press Enter. The default name of
the installation library is MIMIX.
If you do not know the name of the library, do the following:
1. Type the command LAKEVIEW/WRKPRD and press Enter.
2. Type a 9 (Display product menu) next to the product in the library you want on the
Vision Solutions Installed Products display and press Enter.
Changing the assistance level - The F21 key (Assistance level) on the main menu
toggles between basic and intermediate levels of the menu. You can also specify
the Assistance Level (ASTLVL) parameter on the MIMIX command.
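As an illustration only, the assistance level can be set directly on the command. The ASTLVL parameter is named in the text above, but the specific values shown here are assumptions; prompt the command with F4 to see the values supported by your installation.
   MIMIX ASTLVL(*BASIC)
   installation-library/MIMIX ASTLVL(*INTERMED)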
Figure 1.   MIMIX Basic Main Menu

                              MIMIX Basic Main Menu
                                                           System:   SYSTEM1
 MIMIX
 Select one of the following:
     1. Work with application groups                       WRKAG
     2. Start MIMIX
     3. End MIMIX
     4. Switch all application groups
     5. Start or complete switch using Switch Asst.
     6. Work with data groups                              WRKDG
    10. Availability status                                WRKMMXSTS
    11. Configuration menu
    12. Work with monitors                                 WRKMON
    13. Work with messages                                 WRKMSGLOG
    14. Cluster menu
                                                                     More...
 Selection or command
 ===>
 F3=Exit   F4=Prompt   F9=Retrieve   F21=Assistance level   F12=Cancel
 (C) Copyright Vision Solutions, Inc., 1990, 2014.
Note: On the MIMIX Basic Main Menu, options 5 (Start or complete switch using
Switch Asst.) and 10 (Availability Status) are not recommended for
installations that use application groups.
Figure 2.   MIMIX Intermediate Main Menu

                          MIMIX Intermediate Main Menu
                                                           System:   SYSTEM1
 MIMIX
 Select one of the following:
     1. Work with data groups                              WRKDG
     2. Work with systems                                  WRKSYS
     3. Work with messages                                 WRKMSGLOG
     4. Work with monitors                                 WRKMON
     5. Work with application groups                       WRKAG
     6. Work with audits                                   WRKAUD
     7. Work with procedures                               WRKPROC
    11. Configuration menu
    12. Compare, verify, and synchronize menu
    13. Utilities menu
    14. Cluster menu
                                                                     More...
 Selection or command
 ===>
 F3=Exit   F4=Prompt   F9=Retrieve   F21=Assistance level   F12=Cancel
 (C) Copyright Vision Solutions, Inc., 1990, 2014.
CHAPTER 2
MIMIX policies
Each MIMIX policy is a mechanism used to enable, disable, or provide input to a
function such as replication, auditing, or MIMIX Model Switch Framework. A policy
may also determine how you are notified about certain problems that may occur.
For most policies, the initially shipped values apply to an installation. However,
policies can be changed and most can also be overridden for individual data groups.
When a policy is set for a data group, it takes precedence over the installation policy.
Some policies, such as ones that control when audits are automatically submitted,
apply to individual audit rules for specific data groups.
Policies must be changed from the management system. Changing policies requires
that you have management-level authority to the Set MIMIX Policy (SETMMXPCY)
command.
You can set policies from a command line or from the Work with Audits, the MIMIX
Availability Status, and the Work with DG Definitions displays.
The topics in this chapter include:
• “Environment considerations for policies” on page 27 describes additional considerations for setting policies for environments with more than two nodes or bi-directional replication. Also, applications and features can conflict with policy-controlled automatic recovery functions.
• “Setting policies - general” on page 29 provides basic procedures for changing policies. Other topics in this chapter include more in-depth procedures for specific policy-controlled functionality.
• “Policies which affect an installation” on page 31 identifies the policies that are set for an installation and which cannot be overridden by a data group-level setting. Also, this includes procedures for changing retention criteria for procedure history.
• “Policies which affect replication” on page 32 identifies the policies associated with automatic error detection and correction during replication and identifies the common object and file error situations that can be automatically recovered.
• “Policies which affect auditing” on page 36 identifies policies that influence audit runtime behavior and control scheduling for automatically submitted audits. Shipped audits and their descriptions and default scheduling details are included.
• “Changing auditing policies” on page 41 provides additional information and procedures for changing policies associated with auditing. This includes changing the auditing level before switching, changing automatic audit scheduling, changing audit history retention, restricting auditing based on the state of data groups, and disabling auditing.
• “Policies for switching with model switch framework” on page 48 identifies the policies associated with model switch framework and includes instructions for changing these policies.
• “Policy descriptions” on page 50 describes policies used by MIMIX.
Environment considerations for policies
Default settings for policies are chosen to address the needs of a broad set of
customer environments. However, in more complex environments, you need to
consider the effect of policies. Also, applications and other MIMIX features in some
environments can conflict with automatic recovery actions during replication and with
auditing.
Policies in environments with more than two nodes or bi-directional replication
Policy values may affect data throughout your entire environment, not just a single
installation or data group. This is of particular concern in environments that have more
than two systems (nodes) or which have replication occurring simultaneously in more
than one direction (bi-directional). Specifically, be aware of the following:
• In these environments, the value *DISABLED for the Objects only on target policy is recommended. When the policy is disabled, audits will detect that objects exist only on the target system but will not attempt to correct them. The commands used by an audit are aware of all objects on the target system, not just those which originate from the source system of the data group associated with the audit. In these environments, the values *DELETE and *SYNC must be used with care. When the policy value is Delete, audits will delete objects which may have originated from systems not associated with the data group being audited. When the policy value is Synchronize, audits will synchronize the objects to the source system of the data group being audited, which may not be the source system from which they originated.
• Synchronization of user profiles and authorization lists associated with an object will occur unless the user profiles and authorization lists are explicitly excluded from the data group configuration. In the environments mentioned, this may result in user profiles and authorization lists being synchronized to other systems in your configuration. This behavior occurs whenever any of the automatic recovery policies are enabled (database, object, audit). To prevent this from occurring, you must explicitly exclude the user profiles and authorization lists from replication for any data group for which you do not want them synchronized.
• In a simultaneously bi-directional environment, determine which system ‘wins’ in the event of a data conflict, that is, which system will be considered as having the correct data. Choose one direction of replication that will be audited and allow auditing for those data groups. Disable audits for data groups that replicate in the opposite direction. For example, data groups AB and BA are configured for bi-directional replication between system A and system B. Data group AB replicates from system A to system B and data group BA replicates in the opposite direction. System B is also the management system for this installation. You choose system A as the winning system and want to permit auditing in the direction from A to B. The Audit level policy for data group AB must be set to a level that permits audits to run (level 10 or higher). The Audit level policy for data group BA must be set to disable audits. The results of audits of data group AB will be available on system B, because system B is the management system and default policy values cause rules to be run from the management system.
• In environments with three or more systems in the same installation, you need to evaluate each pair of systems. For each pair of systems, evaluate the directions in which replication is permitted. If any pair of systems supports simultaneous bi-directional replication, determine the winning system in each pair and determine the direction to be audited. Set the audit level policy to permit auditing for the data group that replicates in the chosen direction. Disable auditing for the data group which replicates in the other direction. You may also want to consider changing the values of the Run rule on system policy for the installation or the audited data groups to balance processing loads associated with auditing.
• In environments that permit multiple management systems in the same installation, in addition to evaluating the direction of replication permitted within each pair of systems, you must also consider whether the systems defined by each data group are both management systems. If any pair of systems supports simultaneous bi-directional replication, choose the winning system and change the Audit level policies for each data group so that only one direction is audited. You may need to change the Run rule on system policy to prevent certain data groups from being audited from specific management systems.
When to disable automatic recovery for replication and auditing
At times, you may need to disable automatic recoveries during replication and
auditing for certain data groups because a feature in use or an application being
replicated may interact with auditing in an undesirable way.
Features - Do not use automatic recoveries during auditing and replication in any
data group that is using functions provided by the MIMIX CDP™ feature. This feature,
which requires an additional license key, permits you to perform operations
associated with maintaining continuous data protection. By configuring a recovery
window for a data group, you introduce an automatic delay into when the apply
processes complete replication. By setting a recovery point for a data group, you
identify a point that, when reached, will cause the apply processes to be suspended.
In both cases, source system changes have been transferred to the target system but
have not been applied. In such an environment, comparisons will report differences
and automatic recoveries will attempt recovery for items that have not completed
replication. To prevent this from occurring, disable comparisons and automatic
recoveries for any data group which uses the MIMIX CDP feature. For details, see
“Disabling audits and recovery when using the MIMIX CDP feature” on page 29.
Applications - At times, data groups for some applications will encounter problems if
the application cannot acquire locks on objects that are defined to MIMIX. These data
groups may need to be excluded from auditing. MIMIX acquires locks occasionally to
save and restore objects within the replication environment. Some applications may
fail when they cannot acquire a lock on an object. Refer to our Support Central for
FAQs that list specific applications whose data groups should be excluded from
auditing. For those excluded data groups, you can still run compares to determine if
objects are not synchronized between source and target systems. Care must be
taken to recover from these unsynchronized conditions. The applications may need to
be ended prior to manually synchronizing the objects.
To exclude a data group from audits, use the instructions in “Preventing audits from
running” on page 45.
Disabling audits and recovery when using the MIMIX CDP feature
The functions provided by the MIMIX CDP™ feature (which requires an additional license key) create an environment in which
source system changes have been transferred to the target system but have not been
applied. Any data group which uses this feature must disable automatic comparisons
and automatic recovery actions for the data group.
Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name of the data group
that uses the MIMIX CDP feature.
3. Press Enter to see all the policies and their current values.
4. For Automatic object recovery, specify *DISABLED.
5. For Automatic database recovery, specify *DISABLED.
6. For Automatic audit recovery, specify *DISABLED.
7. For Audit level, select *DISABLED.
8. To accept the changes, press Enter.
Setting policies - general
Policies must be changed from the management system. Changing policies requires
that you have management-level authority to the Set MIMIX Policy (SETMMXPCY)
command.
The following are the basic procedures for setting policies; a command-line sketch of the same changes follows them.
Changing policies for an installation
This procedure changes a policy value at the installation level. The installation level
value will be overridden if a data group level policy has been specified with a value other
than *INST.
Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. Specify a value for the policy you want. Use F1 (Help) to view descriptions of
possible values.
5. To accept the changes, press Enter.
Changing policies for a data group
Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. Press Enter to see all the policies and their current values.
4. Specify a value for the policy you want defined for the data group. Use F1 (Help)
to view descriptions of possible values.
5. To accept the changes, press Enter.
Resetting a data group-level policy to use the installation level value
Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. Press Enter to see all the policies and their current values.
4. For the policy you want to reset, specify *INST.
5. To accept the changes, press Enter.
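The same changes can be made without stepping through the prompts. The following sketch assumes parameter keywords (DGDFN for the data group definition and AUDLVL for the Audit level policy) and policy values for illustration only; prompt SETMMXPCY with F4 to confirm the keywords and values available on your service pack.
To set a policy at the installation level:
   installation-library/SETMMXPCY DGDFN(*INST) AUDLVL(*LEVEL30)
To override the policy for a single data group (full three-part name):
   installation-library/SETMMXPCY DGDFN(NAME SYSTEM1 SYSTEM2) AUDLVL(*LEVEL10)
To reset the data group to the installation value:
   installation-library/SETMMXPCY DGDFN(NAME SYSTEM1 SYSTEM2) AUDLVL(*INST)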
Policies which affect an installation
While many policies can be set for an installation, the policies in Table 1 cannot be
overridden for an individual data group. At the data group level, these policies always
have a value of *INST.
Table 1.   Policies that can be set only at the installation level and shipped default values.

  Policy                               Shipped Values – Installation
  Independent ASP library ratio        5
  Procedure history retention
    • Minimum days                     7
    • Minimum runs per procedure       1
    • Min. runs per switch procedure   1
Changing retention criteria for procedure history
The procedure history retention policy determines how long to retain historical
information about procedure runs that completed, completed with errors, or that failed
or were canceled and have been acknowledged.
Environments configured with application groups use procedures to control
operations such as starting, ending, or switching. History information for a procedure
includes timestamps indicating when the procedure was run and detailed information
about each step within the procedure. The policy specifies how many days to keep
history information and the minimum number of runs to keep. You can specify a
different number of runs to keep for switch procedure runs than what is kept for other
types of procedures.
Each procedure run is evaluated individually against the policy and its history
information is retained until the specified minimum days and minimum runs are both
met. When a procedure run exceeds these criteria, system manager cleanup jobs will
remove the historical information for that procedure run from all systems. The values
specified at the time the cleanup jobs run are used for evaluation.
To change the procedure history retention policy for the installation, do the following:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value *INST is specified for the Data group definition prompt.
3. Press Enter to see all the policies and their current values.
4. Locate the Procedure history retention policy. The current values are displayed.
Specify values for the elements you want to change.
5. To accept the changes, press Enter.
Policies which affect replication
Table 2 identifies the policies which can affect replication and their shipped default
values.
Table 2.   Policies associated with replication and shipped default values.

                                          Shipped Values                 Replication Processes
  Policy                                  Installation   Data Groups     System Journal  User Journal
  Data group definition                   *INST          Name (1)        Yes             Yes
  Automatic system journal recovery       *ENABLED       *INST (1)       Yes (2)         –
  Automatic user journal recovery         *ENABLED       *INST           –               Yes (2)
  System journal recovery notify
  on success                              *YES           *INST           Yes             –
  User journal recovery notify
  on success                              *YES           *INST           –               Yes
  DB apply cache                          *DISABLED      *INST           –               Yes
  Access path maintenance (3)                                            –               Yes
    • Optimize for DB apply               *DISABLED      *INST
    • Maximum number of jobs              99             *INST
  Synchronize threshold size              9,999,999      *INST           Yes             Yes
  Number of third delay retry attempts    100            *INST           Yes             –
  Third delay retry interval              15             *INST           Yes             –

  1. A data group definition value of *INST indicates the policy is installation-wide. A name indicates the policies are in effect only for the specified data group.
  2. When this policy is enabled, the other policies in the same column are in effect unless otherwise noted.
  3. This policy is available only on systems running service pack 7.1.15.00 or higher. When running on earlier levels, the Parallel AP maintenance provides similar functionality. For more information about both access path maintenance functions, see the MIMIX Administrator Reference book.
MIMIX can automatically attempt to correct problems it encounters during replication
when the policies for Automatic system journal recovery and Automatic user journal
recovery are enabled. The following topics identify what errors can be recovered in
this way:
• “Errors handled by automatic database recovery” on page 33
• “Errors handled by automatic object recovery” on page 34
Errors handled by automatic database recovery
MIMIX can detect and correct the most common file error situations that occur during
database replication. When the Automatic database recovery policy is enabled,
database replication processes detect the types of errors listed in Table 3. When an
error is detected, MIMIX automatically attempts to correct the error by starting a job to
perform an appropriate recovery action.
The recovery action also sends a report of a recovery in progress to the user
interface. The reports are on the Work with Recoveries display (WRKRCY command).
When the recovery action completes, the report is removed.
The DB rcy. notify on success policy determines whether a successful recovery
generates an informational notification.
Only when all recovery options are exhausted without success is a file placed in hold
error (*HLDERR) status. Recovery actions that end in an error do not generate a
separate error notification because the error is already reflected in MIMIX status.
Table 3.   Errors detected and corrected during database replication when automatic database recovery is enabled.

  File level errors and unique-key record level errors - Typically invoked when there is a missing library, file, or member. Also invoked when an attempt to write a record to a file results in a unique key violation. Without database autonomics, these conditions result in the file being placed in *HLDERR status.

  Record level errors - Invoked when the database apply process detects a data-level issue while processing record-level transactions. Without database autonomics, any configured collision resolution methods may attempt to correct the error. Otherwise, these conditions result in the file being placed in *HLDERR status.

  Errors on IFS objects configured for user journal replication - Invoked during the priming of IFS tracking entries when replicated IFS objects are determined to be missing from the target system. Priming of tracking entries occurs when a data group is started after a configuration change or when Deploy Data Grp. Configuration (DPYDGCFG) is invoked.

  Errors on data area and data queue objects configured for user journal replication - Invoked during the priming of object tracking entries when replicated data area and data queue objects are determined to be missing from the target system. Priming of tracking entries occurs when a data group is started after a configuration change or when the Deploy Data Grp. Configuration (DPYDGCFG) command is invoked.

  Errors when DBAPY cannot open the file or apply transactions to the file - Invoked when a temporary lock condition or an operating system condition exists that prevents the database apply process (DBAPY) from opening the file or applying transactions to the file. Without database autonomics, users typically have to release the file so the database apply process (DBAPY) can continue without error.
Errors handled by automatic object recovery
MIMIX can detect and correct the most common object error situations that occur
during replication. When the Automatic object recovery policy is enabled, object
replication processes detect the types of errors listed in Table 4. When an error is
detected, MIMIX automatically attempts to correct the error by starting a job to
perform an appropriate recovery action.
Unless the object is explicitly excluded from replication for a data group, the
autonomic recovery action will synchronize the object to ensure that it is on the target
system.
Note: Object automatic recovery does not detect or correct the following problems:
• Missing spooled files on the target system.
• Files and objects that are cooperatively processed. Although the files and
objects are not addressed, problems with authorities for cooperatively
processed files and objects are addressed.
• Activity entries that are “stuck” in a perpetual pending status (PR, PS, PA, or
PB).
The recovery action also sends a report of a recovery in progress to the user
interface. In a 5250 emulator, the reports are on the Work with Recoveries display
(WRKRCY command). When the recovery action completes, the report is removed.
The Obj. rcy. notify on success policy determines whether a successful recovery
generates an informational notification.
Only when all recovery options are exhausted without success is an activity entry
placed in error status. Recovery actions that end in an error do not generate a
separate error notification because the error is already reflected in MIMIX status.
Table 4.   Errors detected and recoveries attempted by object autonomics during object replication.

  Missing objects on target system (1) - An object (library-based, IFS, or DLO) exists on the source system and is within the name space for replication, but MIMIX detects that the object does not exist on the target system. Without object automatic recovery, this results in a failed activity entry.
  Notes:
  • Missing spooled files are not addressed.
  • Missing objects that are configured for cooperative processing are not synchronized. However, any problems with authorities (*AUTL or *USRPRF) for the missing objects are addressed.

  Missing parent objects on target system (1) - Any operation against an object whose parent object is missing on the target system. Without object autonomics, this condition results in a failed activity entry due to the missing parent object.

  Missing *USRPRF objects on target system (1) - Any operation that requires a user profile object (*USRPRF) that does not exist on the target system. Without object autonomics, this results in authority or object owner issues that cause replication errors.

  Missing *AUTL objects on target system (1) - Any operation that requires an authority list (*AUTL) that does not exist on the target system. Without object autonomics, this results in authority issues that cause replication errors.

  In-use condition - Applications which hold persistent locks on objects can result in object replication errors if the configured values for delay/retry intervals are exceeded. Default values in the data group definition provide approximately 15 minutes during which MIMIX attempts to access the object for replication. If the object cannot be accessed during this time, the result is activity entries with errors of Failed Retrieve (for locked objects on the source system) and Failed Apply (for locked objects on the target system) and a reason code of *INUSE.
  Notes:
  1. The Number of third delay/retries policy and the Third retry interval policy determine whether automatic recovery is attempted for this error.
  2. Automatic recovery for this error is not attempted when the objects are configured for cooperative processing.

  (1) The synchronize command used to automatically recover this problem during replication will correct this error any time the command is used.
Policies which affect auditing
Policies for auditing are divided into these subsets:
• Policies that affect the behavior of all audits in an installation. These policies can be overridden at the data group level. When set for a specific data group, these policies affect all audits for the data group.
• Policies that affect when audits automatically run and how those audits select objects. These policies are set for each unique combination of audit and data group.
Policies for auditing runtime behavior
The policies identified in Table 5 affect all audit runs regardless of whether the audit
was automatically submitted or manually invoked. These policies can be set for the
installation as well as overridden for an individual data group. The shipped default
values for both levels are indicated.
When the Set MIMIX Policies (SETMMXPCY) command specifies a data group
definition value of *INST, the policies being changed are effective for all data groups in
the installation, unless a data group-level override exists. When the data group
definition specifies a name, policies which specify the value *INST inherit their value
from the installation-level policy value and policies which specify other values are in
effect for only the specified data group.
Table 5.   Shipped default values of policies associated with auditing runtime behavior.

                                          Shipped Values
  Policy                                  Installation   Data Groups
  Data group definition                   *INST          Name
  Automatic audit recovery                *ENABLED       *INST
  Audit notify on success                 *RULE          *INST
  Notification severity                   *RULE          *INST
  Object only on target action            *DISABLED      *INST
  Journal attribute differences action
    • MIMIX configured higher             *CHGOBJ        *INST
    • MIMIX configured lower              *NOCHG         *INST
  User journal apply threshold action     *END           *INST
  Maximum rule runtime                    1440           *INST
  Audit warning threshold (1)             7              *INST
  Audit action threshold (1)              14             *INST
  Audit level                             *LEVEL30       *INST
  Run rule on system                      *MGT           *INST
  Action for running audits
    • Inactive data group                 *NOTRUN (2)    *INST
    • Repl. process in threshold          *NOTRUN        *INST
  Audit history retention
    • Minimum days                        7              *INST
    • Minimum runs per audit              1              *INST
    • Object details                      *YES           *INST
    • DLO and IFS details                 *YES           *INST
  Synchronize threshold size              9,999,999      *INST
  CMPRCDCNT commit threshold              *NOMAX         *INST

  1. These policies are not limited to recovery actions.
  2. This is the default shipped value on systems running MIMIX service pack 7.1.12.00 or higher. For earlier software levels, the shipped default value is *NONE.
Policies for submitting audits automatically
The Audit rule, Audit schedule, and Priority audit policies control when audits are
automatically submitted. These policies do not have a shipped value for the
installation level. The shipped values for the data group level are listed in Table 6.
If the Audit level policy is disabled, all auditing is disabled, regardless of the values
specified for Audit schedule and Priority audit policies. This includes manually
submitted audits.
Each shipped audit rule has default values for submitting priority audits as well as
scheduled audits. The shipped values for a rule are used for all new data groups.
When you specify names for Data group definition and Audit rule on the
SETMMXPCY command, you can adjust the values for a specific audit of a single
data group.
Table 6.   Shipped default values of policies for automatically submitting audits.

                                          Shipped Values
  Policy                                  Installation   Data Groups
  Data group definition                   *INST          Name
  Audit rule                              –              Varies by rule
  Audit schedule                          –
    State                                                *ENABLED (1)
    Frequency                                            *WEEKLY (1)
    Scheduled date
    Scheduled day                                        *SUN (2)
    Scheduled time                                       Varies by rule, see Table 7.
    Relative day of month
  Priority audit                          –
    State                                                *ENABLED (3)
    Start after                                          030000 (3)
    Start until                                          080000
    New objects selected                                 *DAILY
    Changed objects selected                             *DAILY
    Unchanged objects selected                           *WEEKLY
    Audited with no differences                          *MONTHLY

  1. The State element in the Audit schedule policy is available in MIMIX version 7.1.12.00 and higher. For data groups that existed before upgrading to version 7.1.12.00, if the Frequency specified was a value other than *NONE, that value is preserved by the upgrade process and the State is set to *ENABLED. If the Frequency value was *NONE, it is changed to *WEEKLY and the State set to *DISABLED.
  2. The shipped default for Scheduled day changed in MIMIX version 7.1. For data groups created after installing version 7.1, the shipped default is *SUN (previously, it was *ALL). For data groups that existed before upgrading to version 7.1, the previous value for Scheduled day remains unchanged.
  3. The Priority audit policy is new in MIMIX version 7.1. The State element for the Priority audit policy is available in MIMIX version 7.1.12.00 and higher. For data groups that existed before upgrading from any version 7.0 level to version 7.1.12.00 or higher, State is set to *DISABLED and Start after is set to 030000. For data groups that existed before upgrading from versions 7.1.01.00 through 7.1.11.00 to version 7.1.12.00 or higher, if the Start after value specified was a value other than *NONE, that value is preserved by the upgrade process and the State is set to *ENABLED. However, if the Start after value was *NONE, it is changed to 030000 and State is set to *DISABLED.
When automatically submitted audits run
For each audit rule, its shipped values enable both prioritized audits and scheduled
audits to run automatically. A prioritized audit starts one or more times an hour every
day during the time range specified in the Priority audit policy. A scheduled audit runs
once at its specified time on the days or dates for its frequency as specified in the
Audit schedule policy. For scheduled audits, the shipped value for start time of each
audit rule is staggered, beginning at 2 a.m. Table 7 shows the default times for priority
audits versus scheduled audits.
Table 7.   MIMIX rules and their shipped default times for the Audit schedule (SCHEDULE) policy.

The shipped priority start range is n/a for #DGFE (1) and 3 a.m. to 8 a.m. for all other audits.

  #DGFE (scheduled 2:00 a.m., job sdn_DGFE) - Checks configuration for files using cooperative processing. Uses the Check Data Group File Entries (CHKDGFE) command.
  #OBJATR (scheduled 2:05 a.m., job sdn_OBJATR) - Compares all attributes for all object types supported for replication. Uses the Compare Object Attributes (CMPOBJA) command.
  #FILATR (scheduled 2:10 a.m., job sdn_FILATR) - Compares all file attributes. Uses the Compare File Attributes (CMPFILA) command.
  #IFSATR (scheduled 2:15 a.m., job sdn_IFSATR) - Compares IFS attributes. Uses the Compare IFS Attributes (CMPIFSA) command.
  #FILATRMBR (scheduled 2:20 a.m., job sdn_MBRATR) - Compares basic file attributes at the member level. Uses the Compare File Attributes (CMPFILA) command.
  #DLOATR (scheduled 2:25 a.m., job sdn_DLOATR) - Compares all DLO attributes. Uses the Compare DLO Attributes (CMPDLOA) command.
  #MBRRCDCNT (scheduled 2:30 a.m., job sdn_RCDCNT) - Compares the number of current records (*CURRDS) and the number of deleted records (*NBRDLTRCDS) for physical files that are defined to an active data group. Uses the Compare Record Counts (CMPRCDCNT) command. Note: Equal record counts suggest but do not guarantee that files are synchronized. This audit does not have a recovery phase. Differences detected by this audit appear as not recovered in the Audit Summary.
  #FILDTA (2) (scheduled 2:35 a.m., job sdn_FILDTA) - Compares file contents. Uses the Compare File Data (CMPFILDTA) command.

  1. The #DGFE audit is not eligible for prioritized auditing because it checks configuration data, not objects.
  2. The #FILDTA audit and the Compare File Data (CMPFILDTA) command require TCP/IP communications as their communications protocol.
Changing auditing policies
This topic describes how to change specific policies that affect auditing behavior and
when automatic audits will run. MIMIX service providers are specifically trained to
provide a robust audit solution that meets your needs.
Changing when automatic audits are allowed to run
Policies control aspects of when both prioritized auditing and scheduled auditing are
automatically submitted. To effectively audit your replication environment you may
need to fine-tune when one or both types of audits are submitted.
For both types of auditing, consider:
• How much time or system resource can you dedicate to audit processing each day, week, or month?
• How often should all data within the database be audited? Business requirements as well as time and system resources need to be considered.
• Does automatic scheduling conflict with regularly scheduled backups?
• Are there jobs running at the same time as audits that could lock files needing to be accessed during recovery?
For scheduled auditing (which selects all objects), also consider:
• Are there a large number of objects to be compared?
• Are there a large number of objects for which a rule is expected to attempt recovery?
• Specific audits may have additional needs. See “Considerations for specific audits” on page 127.
• While you may decide to vary the scheduled times, it is recommended that you maintain the same relative order indicated in “When automatically submitted audits run” on page 38.
Changing scheduling criteria for automatic audits
Both scheduled audits and priority audits have scheduling information. A change to an
audit’s scheduling information is effective immediately. If an audit is in progress at the
time its scheduling information is changed, the change is effective on the next
automatic run of the audit.
Do the following from the management system:
1. Do one of the following to access the Schedule view of the Work with Audits
display:
• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Schedule view.
• Enter the command:
installation-library/WRKAUD VIEW(*SCHEDULE)
2. Type 37 (Change audit schedule) next to the audit you want to change and press
Enter.
3. The Set MIMIX Policies (SETMMXPCY) command appears, showing the selected
audit rule and data group. The current values for the Audit schedule and Priority
audit policies are displayed. Do one of the following:
• To change when MIMIX is scheduled to run the audit to check all configured objects, specify the values you want for elements of the Audit schedule policy.
• To change when MIMIX is allowed to submit priority-based runs of the audit every day, specify values for the Start after and Start until elements of the Priority audit policy.
4. To make the changes effective, press Enter.
Changing the selection frequency of priority auditing categories
When priority auditing is used, you can control how often objects within priorities are
eligible for selection. Objects which had differences in their previous audit are always
selected. For other priority classes, you can change how often objects within the class
are eligible for selection by a prioritized audit. For descriptions of the priority classes
with changeable frequencies, see the Priority audit policy description.
If an audit is in progress at the time its category frequency information is changed, the
change is effective on the next automatic run of the audit.
Do the following from the management system:
1. Do one of the following to access the Work with Audits display:
• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Schedule view.
• Enter the command:
installation-library/WRKAUD
2. Type 37 (Change audit schedule) next to the audit you want to change and press
Enter.
3. The Set MIMIX Policies (SETMMXPCY) command appears, showing the selected
audit rule and data group. Page Down to see the current values of the Priority
audit policy.
4. Specify values in the following prompts that indicate how often objects in each
category are eligible for selection by a priority audit.
• New objects selected
• Changed objects selected
• Unchanged objects selected
• Audited with no diff.
5. To make the changes effective, press Enter.
Changing the audit level policy when switching
Regardless of the level you use for daily operations, Vision Solutions strongly
recommends that you perform audits at audit level 30 before the following events to
ensure that 100 percent of the data is valid on the target system:
• Before performing a planned switch to the backup system.
• Before switching back to the production system.
For more information about the risks associated with lower audit levels, see
“Considerations for user-defined rules” on page 652.
From a 5250 emulator, do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. For Audit level, specify *LEVEL30. Then press Enter.
Changing the system where audits are performed
The Run rule on system policy determines the system on which audits run. The
shipped default is to run all audits for the installation from the management system.
When changing the value of this policy, also consider your switching needs. See the
Run rule on system policy description for additional information.
Note: This procedure changes a policy value at the installation level. The installation
level value can be overridden by a data group level policy value. Therefore, if
a data group has a value other than *INST for this policy, that value remains in
effect.
To change the policy for the installation, do the following:
1. On the management system, type the following command and press F4 (Prompt):
installation-library/SETMMXPCY
2. Verify that the value *INST appears for the Data group definition.
3. Locate the Run rule on system policy. Specify the value you want.
4. Press Enter.
Changing retention criteria for audit history
The Audit history retention policy determines whether to retain information about the
results of completed audits and the objects that were audited. The policy specifies
how many days to keep history information and how many audit runs to keep, as well
as whether details about audited library-based objects and audited DLO and IFS
objects are to be kept with the history information. Each audit is evaluated individually
against the policy values.
The policy is checked when an audit runs to determine whether to keep details about
the objects audited by that run. The policy is also checked when system manager
cleanup jobs run to determine if any audit has history information which exceeds both
specified retention criteria. The policy value in effect at the time each check occurs
determines the result.
To change the audit history retention policy, do the following:
1. From the MIMIX Intermediate Main Menu, select option 6 (Work with Audits) and
press Enter.
2. Determine whether to change the policy for the installation or at the data group
level. From the Work with Audits display, do one of the following:
• To change the policy for all audits in the installation, press F16 (Inst. policies). Then, press Enter when the Set MIMIX Policies (SETMMXPCY) command appears.
• To change the policy for all audits for a specific data group, type 36 (Change DG policies) next to any audit for the data group you want and press Enter.
3. Locate the Audit history retention policy. The current values for the level you
chose in Step 2 are displayed. Specify values for the elements you want to
change.
Note: When large quantities of objects are eligible for replication, specifying
*YES to retain either Object details or DLO and IFS details may use a
significant amount of disk storage. Consider the combined effect of the
quantity of replicated objects for each data group, the number of days to
retain history, the number of audits to retain, and the frequency in which
audits are performed.
4. To accept the changes, press Enter.
Restricting auditing based on the state of the data group
You may want to control when audits are allowed to run based on the state of the data
group at the time of the audit request. For example, if you end MIMIX so that a batch
process can run, you may want to prevent audits from running while data groups are
inactive. If a data group process has a backlog during peak activity, you may want to
prevent audits from running while the backlog exists. Or, you may want to prevent
only automatic recovery from occurring during a backlog or when the data group is
inactive. The Action for running audits policy provides the ability to define what audit
activity will be permitted based on the state of the data group at the time of audit
request. This policy can be set for an installation or for a specific data group.
Note: For installations running service pack 7.1.12.00 and higher, most audits check
for threshold conditions in all database and object replication processes,
including the RJ link. #FILDTA audits only check for threshold warning
conditions in the RJ link and database replication processes. #DLOATR audits
only check for threshold warning conditions in object replication processes.
For installations running earlier service packs, only database and object apply
processes are checked for thresholds.
Restricting audit activity in an installation based on data group state: Do the
following from the management system:
Note: This procedure changes a policy value at the installation level. The installation
level value can be overridden by a data group level policy value. Therefore, if
a data group has a value other than *INST for this policy, that value remains in
effect.
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. For Action for running audits, do the following:
a. Specify the value you want for Inactive data group that indicates the audit
actions to permit when the data group is inactive.
b. Specify the value you want for Repl. process in threshold that indicates the
audit actions to permit when any replication process checked by an audit has
reached its configured threshold.
5. To accept the changes, press Enter.
Restricting audit activity for a specific data group based on its state: Do the
following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. Press Enter to see all the policies and their current values.
4. For Action for running audits, do the following:
a. Specify the value you want for Inactive data group that indicates the audit
actions to permit when the data group is inactive.
b. Specify the value you want for Repl. process in threshold that indicates the
audit actions to permit when any replication process checked by an audit has
reached its configured threshold.
5. To accept the changes, press Enter.
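As a command-line sketch, the data group level change might look like the following. The AUDACT keyword and the values shown for its two elements are assumptions based on the policy and element names in this procedure; prompt SETMMXPCY with F4 to confirm the actual keywords and allowed values before using them.

   /* Sketch only: for data group CRITICAL between SYSA and SYSB,         */
   /* restrict audit actions when the data group is inactive or when a    */
   /* replication process is in threshold.                                */
   /* AUDACT and the element values shown are assumed names.              */
   SETMMXPCY DGDFN(CRITICAL SYSA SYSB) AUDACT(*CMPONLY *NONE)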
Preventing audits from running
There may be scenarios when you need to disable auditing completely for either an
installation or a specific data group. Auditing may not be desirable on a test data
group or during system or network maintenance.
The Audit level policy can be used to disable all auditing, including manually invoked
audits. The Audit level can be set for an installation or for specific data groups. Note
that an explicitly set value for a data group will override the installation value and may
still allow an audit to run.
You can also prevent audits for a data group from being submitted automatically but
still allow them to be invoked manually. Automatic submission can be prevented for a
specific audit of a data group by values specified for its priority audit and audit
schedule policies.
In addition to auditing, automatic recovery during replication may need to be
prevented from running due to issues with applications or MIMIX features. For more
information, see “When to disable automatic recovery for replication and auditing” on
page 28.
Disabling all auditing for an installation
Note: This procedure changes a policy value at the installation level. The installation
level value can be overridden by a data group level policy value. Therefore, if
a data group has a value other than *INST for this policy, that value remains in
effect.
Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. Specify *DISABLED for the Audit level policy.
5. To accept the changes, press Enter.
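For example, steps 1 through 5 can be entered as a single command. The AUDLVL keyword is an assumption based on the Audit level policy name, and DGDFN is assumed for the Data group definition parameter; the prompted command shows the actual keywords.

   /* Sketch only: disable all auditing, including manually invoked       */
   /* audits, for the installation.  AUDLVL and DGDFN are assumed names.  */
   SETMMXPCY DGDFN(*INST) AUDLVL(*DISABLED)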
Disabling all auditing for a data group
Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. Press Enter to see all the policies and their current values.
4. Specify *DISABLED for the Audit level policy.
5. To accept the changes, press Enter.
Disabling automatically submitted audits
You can control whether each audit for a data group can be submitted automatically
by priority or by schedule. The Priority audit and Audit schedule policies act
independently so that you can have both, one, or neither type of automatic auditing.
Disabling a scheduled audit: Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. For Audit rule, specify the name of the MIMIX rule.
4. Press Enter to see the current values for the Audit schedule policy.
5. Do one of the following:
a. For installations running version 7.1.12.00 or higher, specify *DISABLED for
the State prompt.
b. For installations running earlier software levels, specify *NONE for the
Frequency prompt.
6. To accept the changes, press Enter.
Disabling a prioritized audit: Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. For Audit rule, specify the name of the MIMIX rule.
4. Press Enter to see the current values for the Priority audit policy.
5. Do one of the following:
a. For installations running version 7.1.12.00 or higher, specify *DISABLED for
the State prompt.
b. For installations running earlier software levels, specify *NONE for the Start
after prompt.
6. To accept the changes, press Enter.
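Both procedures can also be sketched as single commands. The AUDRULE, AUDSCD, and PRIAUD keywords are assumptions based on the prompt names used above; prompt SETMMXPCY with F4 to confirm the actual keywords for the Audit rule, Audit schedule, and Priority audit parameters.

   /* Sketch only, for 7.1.12.00 or higher.  Keyword names are assumed.   */
   /* Disable the scheduled run of the #FILDTA audit for data group       */
   /* CRITICAL.                                                           */
   SETMMXPCY DGDFN(CRITICAL SYSA SYSB) AUDRULE(#FILDTA) AUDSCD(*DISABLED)

   /* Disable priority-based runs of the same audit.                      */
   SETMMXPCY DGDFN(CRITICAL SYSA SYSB) AUDRULE(#FILDTA) PRIAUD(*DISABLED)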
Policies for switching with model switch framework
In environments that do not use application groups, MIMIX Switch Assistant (which
implements MIMIX Model Switch Framework) is usually used for switching. MIMIX
Model Switch Framework cannot be used to switch application groups.
Table 8 identifies the policies associated with switching using MIMIX Model Switch
Framework and the shipped default values of those policies.
For these policies, MIMIX Switch Assistant uses only the policy values specified for
the installation. If MIMIX cannot determine whether a MIMIX Model Switch
Framework is defined, the switch framework policy is *DISABLED.
If the SETMMXPCY command specifies a data group name, the switch framework is
required to be *INST. The switch thresholds are *DISABLED by default but can be
changed.
The policies in Table 8 have no effect on application group switching.
Table 8.   Shipped values of policies used by MIMIX Switch Assistant.

                                                  Shipped Values
 Policy                                  Installation      Data Groups
 Data group definition                   *INST             Name 1
 Switch warning threshold                90                *DISABLED
 Switch action threshold                 180               *DISABLED
 Default model switch framework          MXMSFDFT          *INST

 1. A data group definition value of *INST indicates the policy is installation-wide. A name indicates
    the policies are in effect only for the specified data group.
Specifying a default switch framework in policies
MIMIX Switch Assistant requires that you have a configured MIMIX Model Switch
Framework and that you specify it in the default model switch framework policy for the
installation. You may also want to adjust policies for thresholds associated with MIMIX
Switch Assistant.
If you do not have a configured MIMIX Model Switch Framework, contact your
Certified MIMIX Consultant.
From a 5250 emulator, do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. At the Default model switch framework prompt, specify the name of the switch
framework to use for switching this installation.
5. To accept the changes, press Enter.
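For example, if the shipped framework name is in use, the following command is equivalent to the steps above. DFTMSF matches the policy keyword named in “Policy descriptions”; DGDFN is an assumed keyword name for the Data group definition parameter.

   /* Use the MXMSFDFT model switch framework for switching this          */
   /* installation.  DGDFN is an assumed keyword name.                    */
   SETMMXPCY DGDFN(*INST) DFTMSF(MXMSFDFT)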
Setting policies for MIMIX Switch Assistant
If the installation-level default model switch framework policy is disabled, you must
change the policy in order to use MIMIX Switch Assistant.
Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. Specify values for the following fields:
a. For Switch warning threshold, the value 90 is recommended.
b. For Switch action threshold, the value 180 is recommended.
c. For Default model switch framework, specify the name of your MIMIX Model
Switch Framework.
5. To accept the changes, press Enter.
Setting policies when MIMIX Model Switch Framework is not used
If you do not use MIMIX Model Switch Framework for switching, you disable the
default model switch framework policy at the installation level.
Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. At the Default model switch framework prompt, specify *DISABLED.
5. To accept the change, press Enter.
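The equivalent command-line form is shown below; as in the previous example, DGDFN is an assumed keyword name.

   /* MIMIX Model Switch Framework is not used for switching in this      */
   /* installation.                                                       */
   SETMMXPCY DGDFN(*INST) DFTMSF(*DISABLED)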
Policy descriptions
There are minor differences in the names of policies between user interfaces for a
5250 emulator and Vision Solutions Portal. The names shown here are those used in
the 5250 emulator. For a complete description of all policy values, see online help for
the command.
Data group definition - Select the scope of the policies to be set. When the value
*INST is specified, the policies being set by the command apply to all systems and
data groups in the installation, with the exception of any policy for which a data group-level override exists. When a three-part qualified name of a data group is specified,
the policies being set by the command apply to only that data group and override the
installation-level policy values.
Audit rule - Select the MIMIX rule for which an audit schedule will be set for the
specified data group definition. The Audit schedule policy determines when this rule
will audit the data group. The audit rule must specify the value *NONE when changing
any policy except the audit schedule.
Automatic object recovery — Determines whether to enable functions that
automatically start recovery actions to correct detected common object errors that
occur during replication from the system journal.
Automatic database recovery — Determines whether to enable functions that
automatically start recovery actions to correct detected common file errors that occur
during replication from the user journal.
Automatic audit recovery — Determines whether to enable audits to start
automatic recovery actions to correct differences detected during their compare
phase.
Object recovery notify on success — Determines whether automatic object
recovery actions send an informational (*INFO) notification upon successful
completion. This policy is only valid when the Automatic object recovery policy is
enabled.
Database recovery notify on success — Determines whether automatic database
recovery actions send an informational (*INFO) notification upon successful
completion. This policy is only valid when the Automatic database recovery policy is
enabled.
Audit notify on success — Determines whether activity initiated by audits,
including recovery actions, should automatically send an informational (*INFO)
notification upon successful completion. If an audit is run when the Automatic audit
recovery policy is disabled, successful notifications are sent only for the compare
phase of the audit.
Notification severity — Determines the severity level of the notifications sent when
a rule ends in error. This policy determines the severity of the notification that is sent,
not the severity of the error itself. The policy is in effect whether the rule is invoked
manually or automatically.
This policy is useful for setting up an order precedence for notifications at the data
group level. For example, if you set this policy for data group CRITICAL to be
*ERROR when the value for the installation-level policy is *WARNING, any error
notifications sent from data group CRITICAL will have a higher severity than those
from other data groups.
Object only on target action — Determines how the recovery action for specific
audits should handle objects that are configured for replication but exist only on the
target system. The following rules check for the only-on-target error: #OBJATR,
#IFSATR, #DLOATR, #FILATR, and #FILATRMBR. When the Automatic audit
recovery (AUDRCY) policy is enabled, these rules use the value from this policy to
attempt recovery for this error.
See “Policies in environments with more than two nodes or bi-directional replication”
on page 27 for additional information.
Journaling attribute difference action — Determines the recovery action to take for
scenarios in which audits have detected differences between the actual and
configured values of journaling attributes for objects journaled to a user journal. This
type of difference can occur for the Journal Images attribute and the Journal Omit
Open/Close attribute. Differences found on either the source or target object are
affected by this policy.
MIMIX configured higher
Determines the recovery for correcting a difference in which the MIMIX
configuration specifies an attribute value that results in a higher number of journal
transactions than the object's journaling attribute.
MIMIX configured lower
Determines the recovery action for correcting a difference in which the MIMIX
configuration specifies an attribute value that results in a lower number of journal
transactions than the object's journaling attribute.
DB apply threshold action — Determines what action to pass to the Compare File
Data (CMPFILDTA) command or the Compare Record Count (CMPRCDCNT)
command when it is invoked with *DFT specified for its DB apply threshold
(DBAPYTHLD) parameter. The command’s parameter determines what to do if the
database apply session backlog exceeds the threshold warning value configured for
the database apply process. This policy applies whenever these commands are used
and the backlog exceeds the threshold.
The shipped default for this policy causes the requested command to end and may
cause the loss of repairs in progress or inaccurate counts for members. You can also
set this policy to allow the request to continue despite the exceeded threshold.
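For example, the following request honors this policy if an apply backlog exceeds its threshold. DBAPYTHLD(*DFT) is taken from this description; the DGDFN and FILE parameters are shown only to make the sketch complete and are assumptions, so prompt CMPFILDTA with F4 for its actual required parameters.

   /* Sketch only: compare file data for data group CRITICAL and let the  */
   /* DB apply threshold action policy decide what happens if the         */
   /* database apply backlog exceeds its threshold warning value.         */
   /* DGDFN and FILE are assumed names.                                   */
   CMPFILDTA DGDFN(CRITICAL SYSA SYSB) FILE(*ALL) DBAPYTHLD(*DFT)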
DB apply cache — Determines whether to use database (DB) apply cache to
improve performance for database apply processes.1 When this policy is enabled,
MIMIX uses buffering technology within database apply processes in data groups that
specify *YES for journal on target (JRNTGT). This policy is not used by data groups
which specify JRNTGT(*NO) or by data groups whose target journals use journal
caching or journal standby functionality provided by the IBM feature for High
Availability Journal Performance (IBM i option 42).
1. This policy is not available in MIMIX Availability Manager.
Note: When DB apply cache is used, before and after journal images are sent to the
local journal on the target system. This will increase the amount of storage
needed for journal receivers on the target system if before images were not
previously being sent to the journal.
Access path maintenance — Determines whether MIMIX can optimize access path
maintenance during database apply processing as well as the maximum number of
jobs allowed per data group when performing delayed maintenance. Enabling
optimized access path maintenance improves performance for the database apply
process. To make any change to this policy effective, end and restart the database
apply processes for the affected data groups.
This policy and the access path maintenance function it controls are available on
systems running 7.1.15.00 or higher and replace the parallel AP maintenance
(PRLAPMNT) policy and its related function offered in earlier software levels. For
more information about either method of optimizing access path maintenance, see
the MIMIX Administrator Reference book.
Optimize for DB apply
Specify whether to enable optimized access path maintenance. When enabled,
the database apply processes are allowed to temporarily change the value of the
access path maintenance attribute for eligible replicated files on the target system.
Eligible files include physical files, logical files, and join logical files with keyed
access paths that are not unique and that specify *IMMED for their access path
maintenance.
Maximum number of jobs
Specify the maximum number of access path maintenance jobs allowed for a data
group when optimized access path maintenance is enabled. The actual number of
jobs varies as needed between a minimum of one job and the specified value. The
default value is 99.
Maximum rule runtime — Determines the maximum number of minutes an audit can
run when the Automatic audit recovery policy is enabled. The compare phase of the
audit is always allowed to complete regardless of this policy’s value. The elapsed time
of the audit is checked when the recovery phase starts and periodically during the
recovery phase. When the time elapsed since the rule started exceeds the value
specified, any recovery actions in progress will end. This policy has no effect on the
#MBRRCDCNT audit because it has no recovery phase. The shipped default for this
policy of 1440 minutes (24 hours) prevents running multiple instances of the same
audit within the same day. Valid values are 60 minutes through 10080 minutes (1
week).
Audit warning threshold — Determines how many days can elapse after an audit
was last performed before an indicator is set. When the number of days that have
elapsed exceeds the threshold, the indicator is set to inform you that auditing needs
your attention. The shipped default value of 7 days is at the limit of best practices for
auditing.
Note: It is recommended that you set this value to match the frequency with which
you perform audits. It is possible for an audit to be prevented from running for
several days due to environmental conditions or the Action for running audit
policy. You may not notice that the audit did not run when expected until the
Audit warning threshold is exceeded, potentially several days later. If you run
all audits daily, specify 1 for the Audit warning threshold policy. If you do not
run audits daily, set the value to what makes sense in your MIMIX
environment. For example, if you run the #FILDTA audit once a week and run
all other audits daily, the default value of 7 would cause all audits except
#FILDTA to have exposure indicated. The value 1 would be appropriate for the
daily audits but the #FILDTA audit would be identified as approaching out of
compliance much of the time.
Audit action threshold — Determines how many days can elapse after an audit was
last performed before an indicator is set. When the number of days that have elapsed
exceeds the threshold, the indicator is set to inform you that action is required
because the audit is out of compliance. The shipped default of 14 days is the
suggested value for this threshold, which is 7 days beyond the limit of best practices
for auditing.
Note: It is recommended that you set this value to match the frequency with which
you perform audits. It is possible for an audit to be prevented from running for
several days due to environmental conditions or the Action for running audit
policy. You may not notice that the audit did not run when expected until the
Audit action threshold is exceeded, potentially several days later. If you run all
audits daily, specify 1 for the Audit action threshold policy. If you do not run
audits daily, set the value to what makes sense in your MIMIX environment.
For example, if you run the #FILDTA audit once a week and run all other
audits daily, the default value of 14 would cause all audits except #FILDTA to
have exposure indicated. The value 2 would be appropriate for the daily audits
but the #FILDTA audit would be identified as approaching out of compliance
much of the time.
Audit level — Determines the level of comparison that an audit will perform when a
MIMIX rule which supports multiple levels is invoked against a data group. The policy
is in effect regardless of how the rule is invoked. The amount of checking performed
increases with the level number. This policy makes it easy to change the level of audit
performed without changing the audit scheduling or rules. No auditing is performed if
this policy is set to *DISABLED.
The audit level you choose for audits depends on your environment, and especially
on the data compared by the #FILDTA, #DLOATR, and #IFSATR audits. When
choosing a value, consider how much data there is to compare, how frequently it
changes, how long the audit runs, how often you run the audit, and how often you
need to be certain that data is synchronized between source and target systems.
Note: Best practice is to use level 30 to perform the most extensive audit. If you use
a lower level, consider its effect on how often you need to guarantee data
integrity between source and target systems.
Regardless of the level you use for daily operations, Vision Solutions strongly
recommends that you perform audits at audit level 30 before the following events to
ensure that 100 percent of the data is valid on the target system:
•  Before performing a planned switch to the backup system.
•  Before switching back to the production system.
For additional information, see “Guidelines and considerations for auditing” on
page 126 and “Changing auditing policies” on page 41.
Run rule on system — Determines the system on which to run audits. This policy is
used when audits are invoked with *YES specified for the value of the Use run rule on
system policy (USERULESYS) parameter on the Run Rule (RUNRULE) or Run Rule
Group (RUNRULEGRP) command. When *YES is specified in these commands, this
policy determines the system on which to run audits. While this policy is intended for
audits, any rule that meets the same criteria will use this policy.
The policy’s shipped default value, *MGT, runs audits from the management system.
In multi-management environments where both systems defined to a data group are
management systems, the value *MGT will run audits only on the target system.
You can also set the policy to run audits from the network system, the source or target
system, or from a list of system definitions. When both systems of a data group are in
the specified list, the target system is used.
When choosing the value for the Run rule on system policy, also consider your
switching needs.
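For example, a manually requested audit can defer to this policy as shown below. USERULESYS is named by this description as a Run Rule parameter; the RULE and DGDFN keywords are assumptions, so prompt RUNRULE with F4 to confirm them.

   /* Sketch only: run the #FILATR audit for data group CRITICAL on       */
   /* whichever system the Run rule on system policy selects.             */
   /* RULE and DGDFN are assumed names.                                   */
   RUNRULE RULE(#FILATR) DGDFN(CRITICAL SYSA SYSB) USERULESYS(*YES)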
Action for running audits — Determines the type of audit actions permitted when
certain conditions exist in the data group. If a condition exists at the time of an audit
request, audit activity is restricted to the specified action. If multiple conditions exist
and the values specified are different, only the most restrictive of the specified actions
is allowed. If none of the conditions are present, the audit requests are performed
according to other policy values in effect.
Inactive data group
Specify the type of auditing actions allowed when any replication process required
by the data group is inactive. For example, a data group of TYPE(*ALL) is
considered inactive if any of its database or object replication processes is in a
state other than active. This element has no effect on the #FILDTA and
#MBRRCDCNT audits because these audits can run only when the data group is
active.
Repl. process in threshold
Specify the type of auditing actions allowed when a threshold warning condition
exists for any process used in replicating the class of objects checked by an
audit1. If a checked process has reached its configured warning value, auditing is
restricted to the specified actions. Most audits check for threshold conditions in all
database and object replication processes, including the RJ link. #FILDTA audits
only check for threshold warning conditions in the RJ link and database replication
processes. #DLOATR audits only check for threshold warning conditions in object
replication processes.
1. This behavior applies to instances running service pack 7.1.12.00 or higher. Instances running
   earlier service packs check for thresholds on only the database apply and object apply processes.
Audit history retention — Determines criteria for retaining historical information
about audit results and the objects that were audited. History information for an audit
includes timestamps indicating when the audit was performed, the list of objects that
were audited, and result statistics. Each audit, a unique combination of audit rule and
data group, is evaluated separately and its history information is retained until the
specified minimum days and minimum runs are both met. When an audit exceeds
these criteria, system manager cleanup jobs will remove the historical information for
that audit from all systems and will remove the audited object details from the system
on which the audit request originated. The values specified at the time the cleanup
jobs run are used for evaluation.
Minimum days
Specify the minimum number of days to retain audit history for each completed
audit. Valid values range from 0 through 365 days. The shipped default is 7 days.
Minimum runs per audit
Specify the minimum number of completed audits for which history is to be retained.
Valid values range from 1 through 365 runs. The shipped default is 1 completed
audit.
Object details
Specify whether to retain the list of audited objects and their audit status for each
completed audit of library-based objects. The specified value in effect at the time
an audit runs determines whether object details for that run are retained. The
specified value has no effect on cleanup of details for previously completed audit
runs. Cleanup of retained details occurs at the time of audit history cleanup. The
shipped default is *YES.
DLO and IFS details
Specify whether to retain the list of audited objects and their audit status for each
completed audit of DLO and IFS objects. The specified value in effect at the time
an audit runs determines whether object details for that run are retained. The
specified value has no effect on cleanup of details for previously completed audit
runs. Cleanup of retained details occurs at the time of audit history cleanup. The
shipped default is *YES.
Note: When large quantities of objects are eligible for replication, specifying *YES to
retain either Object details or DLO and IFS details may use a significant
amount of disk storage. Consider the combined effect of the quantity of
replicated objects for all data groups, the number of days to retain history, the
number of audits to retain, and the frequency with which audits are performed.
Synchronize threshold size — Determines the threshold, in megabytes (MB), to
use for preventing the synchronization of large objects during recovery actions. When
any of the Automatic system journal recovery, Automatic user journal recovery, or
Automatic audit recovery policies are enabled, all initiated recovery actions use this
policy value for the corresponding synchronize command's Maximum sending size
(MB) parameter. This policy is useful for preventing performance issues when
synchronizing large objects.
Number of third delay retry attempts — Determines the number of times to retry a
process during the third delay/retry interval. This policy is used when the Automatic
system journal recovery policy is enabled. Object replication processes use this policy
value when attempting recovery of an in-use condition that persists after the data
group’s configured values for the first and second delay/retry intervals are exhausted.
The shipped default is 100 attempts.
This policy and its related policy, Third delay retry interval, can be disabled so that
object replication does not attempt the third delay/retry interval but still allow
recoveries for other errors.
Third delay retry interval — Determines the delay time (in minutes) before retrying a
process in the third delay/retry interval. This policy is used when the Automatic
system journal recovery policy is enabled. Object replication processes use this policy
value when attempting recovery of an in-use condition that persists after the data
group’s configured values for the first and second delay/retry intervals are exhausted.
The shipped default is 15 minutes.
Switch warning threshold — Determines how many days can elapse after the last
switch was performed before an indicator is set for the installation. When the number
of days that have elapsed exceeds this threshold, the indicator is set to inform you
that switching may need your attention. The shipped default is 90 days, which is
considered at the limit of best practices for switching.
The indicator is associated with the Last switch field. The Last switch field identifies
when the last completed switch was performed using the default model switch
framework (DFTMSF) policy.
Switch action threshold — Determines how many days can elapse after the last
switch was performed before an indicator is set for the installation. When the number
of days that have elapsed exceeds this threshold, the indicator is set to inform you
that action is required. The shipped default of 180 days is the suggested value for this
threshold, which is beyond the limit of best practices for switching.
The indicator is associated with the Last switch field. The Last switch field identifies
when the last completed switch was performed using the default model switch
framework (DFTMSF) policy.
Default model switch framework — Determines the default MIMIX Model Switch
Framework to use for switching. This value is used by configurations which switch via
model switch framework. The shipped default value is MXMSFDFT, which is the
default model switch framework name for the installation. If the default name is not
being used, this value should be changed to the name of the MIMIX Model Switch
Framework used to switch the installation.
Independent ASP library ratio — Determines the number for n in a ratio (n:1) of
independent ASP libraries (n) on the production system to SYSBAS libraries on the
backup system1. For each switchable independent ASP defined to MIMIX by a device
resource group, a monitor with the same name as the resource group checks this
ratio. When the number of independent ASP libraries falls to a level that is below the
specified ratio, the monitor sends a notification to inform you that action may be
required. This signals that your recovery time objective could be in jeopardy because
of a prolonged independent ASP switch time.
1. The library ratio monitor and the policy it uses require a license key for MIMIX® Global™.
CMPRCDCNT commit threshold — Determines the threshold at which a request to
compare record counts (CMPRCDCNT command or #MBRRCDCNT audit) will not
perform the comparison due to commit cycle activity on the source system. The value
specified is the maximum number of uncommitted record operations that can exist for
files waiting to be applied at the time the compare request is invoked. Each database
apply session is evaluated against the threshold independently. As a result, it is
possible that record counts will be compared for files in one apply session but will not
be compared for files in another apply session. For additional information see the
MIMIX Administrator Reference book.
Procedure history retention — Specifies criteria for retaining historical information
about procedure runs that completed or completed with errors. History information for
a procedure includes timestamps indicating when the procedure was run and detailed
information about each step within the procedure. Each procedure run, a unique
combination of procedure name and application group, is evaluated separately and its
history information is retained until the specified minimum days and minimum runs are
both met. When a procedure run exceeds these criteria, system manager cleanup
jobs will remove the historical information for that procedure run from all systems. The
values specified at the time the cleanup jobs run are used for evaluation.
Minimum days
Specifies the minimum number of days to retain procedure run history. The default
value is 7.
Minimum runs per procedure
Specifies the minimum number of completed procedure runs for which history is to
be retained. This value applies to procedures of all types other than *SWTPLAN
and *SWTUNPLAN. The default value is 1.
Min. runs per switch procedure
Specifies the minimum number of completed switch procedure runs for which
history is to be retained. This value applies to procedures of type *SWTPLAN and
*SWTUNPLAN that are used to switch an application group. The default value is
12.
Audit schedule — Determines the scheduling information that MIMIX uses to
automatically submit audit requests for the specified data group and rule that will
check all objects selected by data group configuration entries. Only configuration
entries associated with the specified type of rule are used.
To allow an audit to be automatically submitted, *ENABLED must be specified for
State1. Changes to this policy are effective immediately. If an audit is in progress at
the time of the change, the change will be reflected in the next scheduled run of the
audit.
Scheduled dates are entered and displayed in job date format. When the job date
format is Julian, the equivalent month and day are used to determine when to
schedule audit requests.
State1
Specify whether scheduled auditing is enabled or disabled for this data group and
audit rule.
1. The State element is available in installations running MIMIX version 7.1.12.00 or higher. In installations running earlier software levels, scheduled auditing requires specifying a value other than
*NONE for Frequency and specifying values for Scheduled time and either Scheduled date or
Scheduled day. Frequency is qualified by the values specified in the other elements.
Frequency
Specify how often the audit request is submitted. The values specified for other
elements further qualify the specified frequency.
Scheduled date
Select a value or specify a date, in job date format, on which the audit request is
submitted.
Scheduled day
Select the day or days of the week on which the audit request is submitted. If
today is the day of the week that is specified and the scheduled time has not
passed, the audit request is submitted today. Otherwise, the job is submitted on
the next occurrence of the specified day. For example, if it is 11:00 a.m. on a
Friday when you set the audit schedule to specify Friday for Scheduled day and
12:00:00 for Scheduled time, the audit request is submitted today. If you are
setting the policy at 4:00 p.m. on a Friday or at 11:00 a.m. on a Monday, the audit
request is submitted the following Friday.
Scheduled time
Select a value or specify a time in 24-hour format at which the audit request is
submitted on the scheduled date or day. Although the time can be specified to the
second, the activity involved in submitting a job and the load on the system may
affect the exact time at which the job is submitted.
Time can be specified with or without a time separator.
Without a time separator, specify a string of 4 or 6 digits (hhmm or hhmmss)
where hh = hours, mm = minutes, and ss = seconds. Valid values for hh range
from 00 to 23. Valid values for mm and ss range from 00 to 59.
With a time separator, specify a string of 5 or 8 digits where the time separator
specified for your job is used to separate the hours, minutes, and seconds. If this
command is entered from the command line, the string must be enclosed in
apostrophes. If a time separator other than the separator specified for your job is
used, this command will fail.
Relative day of month
Select a value or specify one or more numbers with which to qualify what day a
monthly audit request is submitted, relative to its occurrence in the month. A
relative day is only valid when the schedule Frequency is Monthly and Scheduled
day is a value other than None.
For example, if Frequency is Monthly, Scheduled day is Tuesday and Thursday,
and Relative day of month is 1, the audit request is submitted on the first Tuesday
and first Thursday of every month. If both 1 and 4 are specified for relative day,
the audit request is submitted on the first Tuesday, first Thursday, fourth Tuesday,
and fourth Thursday of the month.
Priority audit — Determines when priority-based audit requests for the specified
data group and rule are allowed to automatically start and how often replicated
objects are eligible for auditing based on their priority classification. The #DGFE rule
does not support priority auditing.
To allow priority-based auditing to be performed, *ENABLED must be specified for
State.1 Changes to this policy are effective immediately. If an audit is in progress at
the time of the change, the change will be reflected in the next priority-based run of
the audit.
State1
Specify whether priority auditing is enabled or disabled for this data group and
audit rule.
Start after
Select a value or specify a time after which priority-based audits are allowed to
start. This is the beginning of a range of time during which priority-based audits
can start each day. The value *ANY allows priority-based audits to run repeatedly
throughout the day.
Note: Times specified for the Start after and Start until elements are in 24-hour format
and can be specified with or without a time separator. Without a time
separator, specify a string of 4 or 6 digits (hhmm or hhmmss) where hh =
hours, mm = minutes, and ss = seconds. Valid values for hh range from 00
to 23. Valid values for mm and ss range from 00 to 59. With a time
separator, specify a string of 5 or 8 digits where the time separator
specified for your job is used to separate the hours, minutes, and seconds.
If this command is entered from the command line, the string must be
enclosed in apostrophes. If a time separator other than the separator
specified for your job is used, this command will fail.
Start until
Specify the end of the time range during which priority-based audits are allowed to
start. Priority-based audits can start until this time. This value is ignored when
Start after is *ANY.
New objects selected
Select the frequency at which new objects are considered for auditing. A new
object is one that has not been audited since it was created.
Changed objects selected
Select the frequency at which changed objects are considered for auditing. A
changed object is one that has been modified since the last time it was audited.
Unchanged objects selected
Select the frequency at which unchanged objects are considered for auditing. An
unchanged object is one that has not been modified since the last time it was
audited.
Audited with no diff.
Select the frequency at which objects with no differences are considered for
auditing. An object with no differences is one that has not been modified since the
last time it was audited and has been successfully audited on at least three
consecutive audit runs.
1. The State element is available in installations running MIMIX 7.1.12.00 or higher. In installations
running earlier software levels, priority auditing requires a value other than *NONE for Start after.
CHAPTER 3
Checking status in environments with application groups
Monitoring status of environments that use application groups begins at the level of
the application group and may include investigation into additional displays for more
detailed information. The following displays are typically used:
•  Work with Procedure Status (WRKPROCSTS command)
•  Work with Application Groups (WRKAG command)
•  Work with Node Entries (WRKNODE command)
•  Work with Data Rsc. Grp. Ent. (WRKDTARGE command)
•  Work with Data Groups (WRKDG command)
Note: This chapter does not include status for application groups that are configured
for an IBM i clustering environment. If you are using clustering or have
MIMIX® Global™ configured, see the MIMIX Operations with IBM i Clustering
book for status information within a clustering environment.
Checking application group status
The status view of the Work with Application Groups display provides a summary of
all status associated with an environment configured with application groups.
1. Do one of the following to access the Work with Application Groups display:
•  Select option 1 (Work with application groups) from the MIMIX Basic Main Menu.
•  Select option 5 (Work with application groups) from the MIMIX Intermediate Main Menu.
•  Enter the command: WRKAG
2. If necessary, use F10 to access the status view.
Figure 3.   Status view of Work with Application Groups display

                         Work with Application Groups
                                                             System:   SYSA
 Monitors . . . . . :   *ACTIVE
 Notifications . . :    *NONE

 Type options, press Enter.
   1=Create   2=Change   4=Delete    5=Display   6=Print   9=Start
   10=End     12=Node entries   13=Data resource groups   15=Switch

      App         App     App Node  Data Rsc    Data Node  Repl.     Proc.
 Opt  Group       Status  Status    Grp Status  Status     Status    Status
 __   SAMPLEAG            *ACTIVE                          *ACTIVE   *COMP
 __   __________
                                                                      Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit    F4=Prompt    F5=Refresh   F6=Create    F9=Retrieve   F10=View config
 F12=Cancel   F13=Repeat   F18=Subset   F23=More options   F24=More keys
All status columns except the App Status column are summations of multiple
processes. Investigation into lower-level displays may be necessary to determine the
cause of a problem.
Ideal status conditions exist when the fields and columns have the following values:
•  The Monitors field is *ACTIVE.
•  The Notifications field is *NONE.
•  The Proc. Status column is *COMP.
•  For a non-cluster application group, the App Node Status and Repl. Status fields
   are *ACTIVE. The App Status, Data Rsc Grp Status, and Data Node Status
   columns will always be blank.
For any other status values, see the following:
•  “Resolving problems reported in the Monitors field” on page 61
•  “Resolving problems reported in the Notifications field” on page 63
•  “Resolving problems reported in Status columns” on page 64
Resolving problems reported in the Monitors field
The Monitors field located in the upper right corner of the Work with Application
Groups display summarizes the status of the MIMIX monitors on the local system.
Each node or system in the product configuration has MIMIX monitors which run on
that system to check for specific potential problems. A status of *ACTIVE indicates
that all enabled monitors on the local system are active.
Table 9 shows possible status values for the Monitors field that require user action.
For a complete list of possible values, press F1 (Help).
Table 9.   Monitor field status values that may require user action

 Monitor Status   Description

 *ATTN        Either one or more monitors on the local system failed or there are both
              active and inactive monitors on the local system.

 *INACTIVE    All enabled monitors on the local system are inactive.
Do the following:
1. Press F14 (Monitors) to display the list of monitors on the local system on the
Work with Monitors display.
2. Check the Status column for status values of FAILED, FAILED/ACT, and
INACTIVE.
3. If the monitor is needed on the local system as indicated in Table 10, use option 9
(Start) to start the monitor.
Table 10.   Possible monitors and the nodes on which they should be active

 journal-name - remote journal link monitor
    Checks the journal message queue for indications of problems with the remote
    journal link. A monitor exists for both the local and remote system of the RJ link.
    When and where needed: the Primary node and the current Backup node of
    application groups which perform logical replication.

 MMIASPMON - independent ASP threshold monitor
    Checks the QSYSOPR message queue for indications that the independent ASP
    threshold has been exceeded. This monitor improves the ability to detect overflow
    conditions that put your high availability solution at risk due to insufficient storage.
    When and where needed: all nodes which control an independent ASP.

 MMNFYNEWE - monitor for new object notification entries
    Monitors the source system for the newly created libraries, folders, or directories
    that are not already included or excluded for replication by a data group
    configuration.
    When and where needed: the Primary node when the application group is
    configured for logical replication.

 short-data-group-name_PAPM - Parallel access path maintenance group monitor
    When this monitor exists, there are always associated monitors of one of the
    following types:
    •  short-data-group-namePAPMnnn - Parallel access path maint monitor nnn
    •  short-data-group-nameJobname - Parallel access path maint monitor job-name
    When and where needed: the Target node of data group replication processes
    when the Parallel access path maintenance policy has been enabled.
    Note: These monitors and the policy which enables them are only available on
    systems running software levels earlier than 7.1.15.00. The replacement for this
    function on systems running 7.1.15.00 or higher does not use monitors. For more
    information about optimizing access path maintenance, see the MIMIX
    Administrator Reference book.
Resolving problems reported in the Notifications field
The Notifications field located in the upper right corner of the Work with Application
Groups display summarizes the status of notifications that exist for the MIMIX
installation. Notifications are sent by MIMIX processes, such as monitors or audits, to
inform you of potential problems. A value of *NONE indicates that no new notifications
exist.
Table 11 shows possible status values for the Notifications field that require user
action.
Table 11.   Notification field status values that may require user action

 Notification Status   Description

 *ERROR       Action is required. At least one new notification exists with a severity of
              *ERROR.

 *WARNING     At least one new notification exists with a severity of *WARNING, which
              indicates that the operation may be successful but an error exists. There
              are no new notifications with a severity of *ERROR.

 *INFO        At least one new notification exists with a severity of *INFO. There are
              no new notifications with a severity of *ERROR or *WARNING.
Do the following:
1. Press F15 (Notifications) to display the list of notifications for the installation on
the Work with Notifications display.
2. Use option 5 (Display) to view any notifications with a status of *NEW.
3. Take any further action indicated to resolve the problem.
4. When the problem is resolved, use either option 46 (Acknowledge) or option 4
(Remove) to address the notification itself. Notifications can only be removed
from the system on which they originated.
Resolving problems reported in Status columns
Except for the App Status column, all other columns on the Work with Application
Groups display represent summations of status for multiple nodes or multiple data
resource groups associated with the application groups. Investigation into lower-level
displays may be necessary to determine the cause of the problem.
Troubleshooting Tip: When investigating problems, begin with the Proc. Status
column. A problem with procedure status can affect values in other columns.
When any procedure status problems are resolved, refresh the display. Then
check the other columns beginning with the left-most column that is reporting a
problem. Resolve the most severe problem in that column first, then refresh the
display. Investigate problems in the remaining columns from left to right.
To address the most common problems with status for application groups, do the
following:
1. Resolve any problems reported in the Proc. Status column using Table 12.
2. Resolve any *ATTN status problems first, using Table 13.
3. Then address less severe problems, using Table 14.
For a complete list of status values for each column, press F1 (Help).
Resolving a procedure status problem
The Proc. Status column represents a summary of the most recent run of all
procedures defined for the application group.
Table 12.   Procedure Status values that require attention

 Column Value   Description and Action

 *ACTIVE      One or more of the last started runs of the procedures are still active or
              queued. Wait for the procedure to complete. Do not attempt to correct
              other status problems reported on the display until the procedure
              completes. Use option 21 (Procedure status) to view the status of the
              last started runs of procedures for the application group.

 *ATTN        One or more of the last started runs of the procedures for the application
              group have a status that requires attention. Use option 21 (Procedure
              status) to view the status of the last started runs of the procedures for
              the application group. The resulting procedures shown on the Work with
              Procedure Status display which have status values of *ATTN,
              *CANCELED, *FAILED, *MSGW, or *PENDCNL require user action.
              Also, it may be necessary to check the status of the steps within the
              procedure to resolve a step problem before the procedure can continue.
              Do not attempt to correct other status problems reported on the Work
              with Application Groups display until the procedure problems have been
              resolved. For detailed information, see “Working with status of
              procedures and steps” on page 77.
Note: The status *COMP indicates that the most recently started run of each
procedure for the application group has completed as directed. This includes
procedures that completed with errors and canceled or failed procedures
whose status has been acknowledged by user action. For any individual
procedure that completed with errors, user action is recommended to
investigate the cause of the error and assess its implications.
Resolving an *ATTN status for an application group
The value *ATTN can appear in each column of the Work with Application Groups
display to indicate that user action is required to correct a problem.
Important! Check the status of the Proc. Status column and address any problem
indicated by *ATTN or *ACTIVE status before attempting to resolve any problem
reported in other columns. Use “Resolving a procedure status problem” on
page 64.
If there are no procedure problems, each of the other columns with an *ATTN status
must be addressed individually, starting from the left-most column.
Table 13.   Resolving *ATTN status for columns (except Proc. Status) on the Work with
            Application Groups display

 *ATTN Status in Column   Description and Actions for *ATTN Status

 App Node Status     The App Node Status column is a summary of the status of the
                     nodes associated with the application group. The status includes
                     the MIMIX system manager, journal manager, target journal
                     inspection, and collector services jobs for the nodes in the
                     application group.
                     *ATTN indicates that the node status and the MIMIX manager
                     status values do not match. Investigate the status of the
                     associated nodes and MIMIX managers using option 12 (Node
                     entries). For additional information see “Status for Work with
                     Node Entries” on page 66.

 Replication Status  The Replication Status column is a summary status of data
                     replication activity for the data resource groups associated with
                     an application group.
                     *ATTN indicates that data replication for at least one data group
                     for the data resource groups has a status that does not match the
                     status of the appropriate data resource group, has a failed state,
                     an error condition, is active with an incorrect source system, has
                     audit errors, or has pending recoveries. To determine the cause,
                     use option 13 (Data resource groups) to identify the data resource
                     group where the problem exists. For more information, see
                     “Status for Work with Data Resource Group Entries” on page 68.
Resolving other common status values for an application group
Table 14 lists other common problems with application group status and identifies how
to begin their resolution.
Table 14.   Other problem statuses which may appear in multiple columns on the Work with
            Application Groups display

 Column Value   Description and Action

 *ATTN        Each column has a unique recovery. See “Resolving an *ATTN status for
              an application group” on page 65.

 *INACTIVE    The current status of the resource group or node is inactive. This status is
              possible in the Repl. Status column.
              •  If all columns with a status value are *INACTIVE, the application group
                 may have been ended intentionally. Use option 9 (Start) to start the
                 application group.
              •  If this value appears only in the App Node Status column, the
                 application resource group nodes are all inactive and all MIMIX
                 manager jobs are also inactive. Use option 12 (Node entries) to
                 investigate further. For more information see “Status for Work with
                 Node Entries” on page 66.
              •  If this value appears only in the Repl. Status column, logical replication
                 is not active. Use option 13 (Data resource groups) to investigate. For
                 more information see “Status for Work with Data Resource Group
                 Entries” on page 68.

 *UNKNOWN     The current status is unknown. The local node is a network node in a
              non-cluster application group and does not participate in the recovery
              domain. Its status cannot be determined.
              When this status appears in all columns, do one of the following:
              •  Enter the command WRKSYS. On the Work with Systems display,
                 check the status of Cluster Services for the local system definition. If
                 necessary, use option 9 (Start) to start cluster services.
              •  Sign on to a node that is active and use the WRKAG command to
                 check the application group status. If the status is still *UNKNOWN,
                 use option 12 (Node entries) to check the status of Cluster Services on
                 the node.
Status for Work with Node Entries
The Work with Node Entries display lists the nodes associated with an application
group or a data resource group. The Resource group and Type fields at the top of the
display indicate what the nodes are associated with.
Figure 4.   Status view of Work with Node Entries display for an application group which
            does not participate in a cluster

                            Work with Node Entries
                                                             System:   SYSA
 Application group  . . . . . :   SAMPLEAG

 Type options, press Enter.
   1=Add   2=Change   4=Remove   5=Display   6=Print   9=Start   10=End

                  -------------Current-------------           Manager
 Opt  Node        Role       Sequence   Data Provider         Status
 __   SYSB        *PRIMARY               *PRIMARY             *ACTIVE
 __   SYSA        *BACKUP    1           *PRIMARY             *ACTIVE
 __   ________
                                                                      Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F6=Add   F7=Systems   F9=Retrieve
 F10=View config   F11=Sort by node   F12=Cancel   F18=Subset   F24=More keys
For each node listed, check the Manager Status column for status values that require
attention. For a complete list of status values for each field and column, press F1
(Help).
Manager Status - This column indicates the status of all of the MIMIX system
manager, journal manager, target journal inspection, and collector services jobs for
the specified node.
Table 15.   Manager Status values that require user action

 Status Value   Description and Action

 *ATTN        At least one of the system manager, journal manager, target journal
              inspection, or collector services jobs for the node has failed.
              When all the nodes listed do not have the same value, use F7 (Systems)
              to access the Work with Systems display.
              •  Check the status of the system and journal managers, target journal
                 inspection, and collector services.
              •  Use option 9 (Start) to start the managers and services that are not
                 active on the node.

 *INACTIVE    All system manager, journal manager, target journal inspection, and
              collector services jobs for the specified node are inactive. This may be
              intentional when MIMIX is ended to perform certain activities.
              Use F7 (Systems) to access the Work with Systems display.
Status for Work with Data Resource Group Entries
The Work with Data Resource Group Entries display lists the data resource groups
associated with an application group. Each entry identifies a data resource group and
the summary of the replication status from its associated data groups.
Figure 5.   The Work with Data Resource Group Entries display for an application group
            that does not participate in a cluster

                          Work with Data Rsc. Grp. Ent.
                                                             System:   SYSA
 Application group  . . . . . :   SAMPLEAG

 Type options, press Enter.
   1=Add     2=Change   4=Remove   5=Display   6=Print   8=Data groups
   9=Start   10=End     12=Node entries   14=Build environment   15=Switch

                             Resource
      Resource               Group       Node        Replication
 Opt  Group        Type      Status      Status      Status
 __   AGRSGRP      *DTA                               *ACTIVE
 __   __________
                                                                      Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F6=Add   F9=Retrieve   F12=Cancel
 F13=Repeat   F18=Subset   F19=Load   F21=Print list
Resource Group Status - This column identifies the status of the data resource
group. In environments that do not include IBM i clustering, this column is always
blank.
Node Status - This column identifies the status of the nodes for the data resource
group. In environments that do not include IBM i clustering, this column is always
blank.
Replication Status - The value in this column is a summary status of data replication
activity for the data resource group. The status includes the status of all data group
processes, replication direction, replicated object and file entries, audits, and
recoveries.
Table 16 identifies the replication status values that require user action.
Table 16. Replication Status values that require user action
*ATTN
One or more of the following problems exist for data groups within the
data resource group.
• The source system of an active data group is not the primary node of its application group.
• A data group has a failed state, an error condition, audit errors, or
pending recoveries.
To prevent damage to data in your environment, it is important that you
begin by determining which system should be the source for the data
groups. Do the following:
1. From this display, use option 8 (Data groups) to check which system
is the current source for the data group.
2. Determine which node has the role of current primary for the
application group. From the Work with Application Groups display,
use option 12 (Node entries), then check the current node role.
If the current primary node is correct and a data group with an incorrect
source system is active, end the data group and contact CustomerCare.
If the data groups in question have the correct source system but the
primary node for the application group is not correct, you need to
change the recovery domain for the application group to make the
correct node become primary. Use “Changing the sequence of backup
nodes” on page 71.
Once you have ensured that the data groups have the correct source
system, resolve any error conditions reported on the Work with Data
Groups display.
Note: Not all data groups should necessarily be active. Only the data groups
currently being used for data replication should be active. You will need
to look at the current node roles and data providers for the node entries
to determine which data groups should be active.
*INACTIVE
All replication in the data resource group is inactive. This may be normal
if replication was ended to perform certain activities. Use option 8 (Data
groups) to access the Work with Data Groups display.
Verifying the sequence of the recovery domain
Ensuring that the sequence of the current backup nodes is set properly is critical to a successful and predictable switch process. The current sequence of backup nodes should match your recovery guidelines.
Do the following to confirm the sequence of the current backup nodes before
performing a switch and before removing or restoring a backup node from the cluster.
1. From the MIMIX Intermediate Main Menu, type 5 (Work with application groups)
and press Enter.
2. From the Work with Application Groups display, type 12 (Node entries) next to the
application group you want and press Enter.
3. The Work with Node Entries display appears, showing current information for the
nodes. Confirm that the current backup nodes have the sequence order that you
expect.
Note: It is important that you are viewing current information on the status view
of the display. Figure 6 shows an example of how the resulting Work with
Node Entries display appears with current status information. If you see
configured information instead, press F10 (View status).
4. If you need to change the sequence of current backup nodes, use “Changing the
sequence of backup nodes” on page 71.
Figure 6. Example of displaying the current sequence information for backup nodes
[Screen example: the status view of the Work with Node Entries display for application group APP1, showing NODEA as the current *PRIMARY node and NODEB, NODEC, and NODED as current *BACKUP nodes with sequence values 1, 2, and 3, a data provider of NODEA, and a Manager Status of *ACTIVE.]
Changing the sequence of backup nodes
Use this procedure if you need to change the sequence of the current backup nodes.
This procedure may change the configured sequence for multiple nodes so that you
can achieve the desired sequence for backup nodes. The changes are not effective
until Step 5 is performed.
Do the following from an active application group:
1. From the Work with Application Groups display, type 12 (Node entries) next to the
application group you want and press Enter.
2. The Work with Node Entries display appears. Using F10 to toggle between the configuration and status views, confirm that the node with the configured role
of *PRIMARY is the same node that is shown as the current *PRIMARY role.
• If the same node is identified as *PRIMARY for the current role and the configured role, skip to Step 4.
• If the configured *PRIMARY node does not match the current *PRIMARY node, perform Step 3 to correct this situation before making any changes to the configured sequence of backup nodes.
Figure 7 is an example of how configuration information appears on the Work with
Node Entries display.
3. Perform this step only if you need to correct the configured primary node to match
the current primary node. This step will demote the configured primary node to a
backup, then promote the correct node to become the configured primary node.
Do the following:
a. From the configuration view of the Work with Node Entries display, type 2
(Change) next to the configured primary node and press Enter.
b. On the Change Node Entry (CHGNODE) display, specify *BACKUP for Role
and press Enter. Then specify *FIRST for List position and press Enter.
c. On the Work with Node Entries display, press F5 (Refresh) to view changes.
All nodes in the configured view should have *BACKUP roles.
d. If necessary, toggle to the status view to confirm which node is the current
primary node. Type 2 (Change) next to the current primary node and press
Enter.
e. On the Change Node Entry (CHGNODE) display, specify *PRIMARY for Role
and press Enter. Then press Enter two more times.
f. On the Work with Node Entries display, press F5 (Refresh) to view changes.
You should see the correct node as the configured primary node.
Note: The numbering for the backup sequence may not update; however, the
relative order for the configured backup sequence remains unchanged.
Gaps in configured sequence numbers are ignored when switching to a
backup. As long as the relative order is correct, it is not necessary to
change the configured sequence of backup nodes just to remove gaps
in numbering.
g. If the configured backup sequence is what you expect, skip to Step 5 to make
the change effective.
4. To change the sequence of backup nodes, do the following:
a. From the configured view of the Work with Node Entries display, type 2
(Change) next to the backup node whose sequence you want to change.
b. On the Change Node Entry (CHGNODE) display, specify *BACKUP for Role
and press Enter. Then specify either *FIRST or a number for List position and
press Enter.
Note: If you specify a number, it cannot already be used in the configured
sequence list.
c. On the Work with Node Entries display, press F5 (Refresh) to view changes.
d. Repeat Step 4 until the correct sequence is shown on the configuration view.
Note: Gaps in configured sequence numbers are ignored when switching to a
backup. For example, in a configuration with two backup nodes, there is
no operational difference between a backup sequence of 1, 2 and a
backup sequence of 2, 5 as long as the same nodes are specified in the
same relative order.
5. To make the changes to the backup order effective, do the following:
a. Press F12 (Cancel) to return to the Work with Application Groups display.
b. Type 9 (Start) next to the application group you want and press F4 (Prompt).
c. On the Start Application Group (STRAG) display, specify *CONFIG for Current
node roles and press Enter.
d. The Procedure prompt appears. If needed, specify a different value and then
press Enter.
6. Confirm that the node entries have changed. Type 12 (Node entries) next to the
application group and press Enter. If necessary, use F10 to access the status
view. The current backup nodes should be in the new order.
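If you prefer the command line, the example tables that follow refer to this request as STRAG ROLE(*CONFIG). As a sketch only, the equivalent request looks like the following; prompt the command with F4 to supply the application group and the procedure, since their keyword names are not shown in this topic:

  STRAG ROLE(*CONFIG)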
Figure 7. Example of displaying the configured sequence information for backup nodes
[Screen example: the configuration view of the Work with Node Entries display for application group APP1, showing NODEA as the configured *PRIMARY node and NODEB, NODEC, and NODED as configured *BACKUP nodes with sequence values 1, 4, and 5 and a data provider of *PRIMARY.]
Examples of changing the backup sequence
The following examples illustrate problems with the current backup sequence and
how to correct them.
Example 1 - Changing the backup sequence when the primary node is correct
Table 17 shows a four-node environment where the current backup sequence does
not reflect the desired behavior in the event of a switch. Also, the relative order of the
configured backup sequence does not match the relative order of either the current
sequence or the desired sequence.
Table 17. Example 1, showing discrepancies in backup sequences
Desired order: NODEA as *PRIMARY, with NODED, NODEB, and NODEC as backups in sequence 1, 2, and 3 (the configuration achieved at the end of Table 18).
Initial values, Example 1 (Work with Node Entries):
Status view (current): NODEA *PRIMARY; NODEB *BACKUP, sequence 1; NODED *BACKUP, sequence 2; NODEC *BACKUP, sequence 3; the Data Provider column shows *PRIMARY for each node.
Configured view: NODEA *PRIMARY; NODEC *BACKUP, sequence 1; NODEB *BACKUP, sequence 2; NODED *BACKUP, sequence 3; the Data Provider column shows *PRIMARY for each node.
Each row in Table 18 shows a change to be made to the nodes on the configured view
of the Work with Node Entries display. The rows are in the order that the changes
need to occur to correct this example configuration to the desired order.
Table 18. Order in which to change nodes to achieve the desired configuration for example 1
1. Node to change: NODEB. Change to: Role = *BACKUP, Position = *FIRST.
Effect on configured order: NODEA *PRIMARY; NODEB *BACKUP, sequence 1; NODEC *BACKUP, sequence 2; NODED *BACKUP, sequence 3.
Notes: Intermediate step.
2. Node to change: NODED. Change to: Role = *BACKUP, Position = *FIRST.
Effect on configured order: NODEA *PRIMARY; NODED *BACKUP, sequence 1; NODEB *BACKUP, sequence 2; NODEC *BACKUP, sequence 3.
Notes: Desired configuration, but it is not effective until STRAG ROLE(*CONFIG) is performed.
Example 2 - Correcting the configured primary node and changing the backup sequence
Table 19 shows a four-node environment where the current backup sequence does not reflect the desired behavior in the event of a switch. Also, the current and configured primary node do not match. The configured primary node must be corrected first, before attempting to correct any backup node sequence problems.
Table 19. Example 2, showing discrepancies in primary node and backup sequences
Desired order: NODEA as *PRIMARY, with NODED, NODEB, and NODEC as backups in sequence 1, 2, and 3 (the configuration achieved at the end of Table 20).
Initial values, Example 2 (Work with Node Entries):
Status view (current): NODEA *PRIMARY; NODEB *BACKUP, sequence 1; NODED *BACKUP, sequence 2; NODEC *BACKUP, sequence 3.
Configured view: NODEB *PRIMARY; NODEC *BACKUP, sequence 1; NODEA *BACKUP, sequence 2; NODED *BACKUP, sequence 3.
Each row in Table 20 shows a change to be made to the nodes on the configured view
of the Work with Node Entries display. The rows are in the order that the changes
need to occur to correct this example configuration to the desired order.
Table 20. Order in which to change nodes to achieve the desired configuration for example 2
1. Node to change: NODEB. Change to: Role = *BACKUP, Position = *FIRST.
Effect on configured order: NODEB *BACKUP, sequence 1; NODEC *BACKUP, sequence 2; NODEA *BACKUP, sequence 3; NODED *BACKUP, sequence 4.
Notes: Intermediate step.
2. Node to change: NODEA. Change to: Role = *PRIMARY.
Effect on configured order: NODEA *PRIMARY; NODEB *BACKUP, sequence 1; NODEC *BACKUP, sequence 2; NODED *BACKUP, sequence 3.
Notes: Intermediate step; corrects the configured *PRIMARY. The sequence number for Backup 3 may appear as 4; the relative order is equivalent.
3. Node to change: NODED. Change to: Role = *BACKUP, Position = *FIRST.
Effect on configured order: NODEA *PRIMARY; NODED *BACKUP, sequence 1; NODEB *BACKUP, sequence 2; NODEC *BACKUP, sequence 3.
Notes: Desired configuration, but it is not effective until STRAG ROLE(*CONFIG) is performed.
CHAPTER 4
Working with status of procedures and steps
This chapter describes how to work with procedures and steps. Procedures are used
to perform operations for application groups. All procedures are associated with an
application group. This chapter does not apply to configurations that do not use
application groups.
When working with status of procedures and steps, it is important to understand how
multiple jobs are used to process the steps in a procedure. A procedure uses multiple
asynchronous jobs to run the programs identified within its steps. Starting a procedure
starts one job for the application group and an additional job for each of its data
resource groups. These jobs operate independently and persist until the procedure
ends. Each persistent job evaluates each step in sequence for work to be performed
within its domain. When a job for a data resource group encounters a step that acts
on data groups, it spawns an additional job for each subordinate data group. Each
spawned data group job performs the work for that step and then ends.
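For example, if an application group has two data resource groups and a step within the procedure acts on three data groups in each of them, starting the procedure starts three persistent jobs: one for the application group and one for each data resource group. When the data resource group jobs reach that step, each spawns three data group jobs, and each of those jobs ends as soon as its work for that step is done.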
This chapter contains the following topics:
• “Displaying status of procedures” on page 78 describes how to display the status of procedure runs, including the most recent run as well as runs kept for their status history.
• “Resolving problems with procedure status” on page 80 describes the conditions which cause each procedure status value and the actions required to resolve problem statuses. This includes how to resolve procedure inquiry messages and failed or canceled procedures.
• “Displaying status of steps within a procedure run” on page 83 describes how to display status of steps within a procedure as well as the differences between the collapsed and expanded views of the Work with Step Status display.
• “Resolving problems with step status” on page 85 describes the conditions which cause each step status value and the actions required to resolve problem statuses. This includes how to resolve step inquiry messages and failed or canceled steps.
• “Acknowledging a procedure” on page 89 describes how to manually change a procedure with a status of *CANCELED, *FAILED, or *COMPERR to an acknowledged status.
• “Running a procedure” on page 90 describes how to start a user procedure and the parameter that controls the step at which the procedure begins.
• “Canceling a procedure” on page 92 describes how to cancel an active procedure.
Displaying status of procedures
You can view the status of runs of procedures from the Work with Procedure Status
display. The term “the last run” of a procedure refers to the most recently started run
of a procedure, which may be in progress or may have completed. Also, the status of
other previously performed runs of procedures may be available, subject to the
current settings of the Procedure history retention policy.
The Work with Procedure Status display lists procedures in reverse chronological
order so that the most recently started procedures are at the top of the list.
Procedures that have never been requested to run do not appear on this display.
Figure 8 shows an example of the Work with Procedure Status display subsetted to
show only runs of a specific procedure and application group.
F11 toggles between views that show the Start time column and columns for the
Duration of the procedure and the Node on which the procedure was started.
Timestamps are in the local job time. If you have not already ensured that the systems
in your installation use coordinated universal time, see “Setting the system time zone
and time” on page 311.
Figure 8. A subsetted view of the Work with Procedure Status display.
[Screen example: the Work with Procedure Status display on SYSTEMA, subsetted to two runs of procedure SWTPLAN (type *SWTPLAN) for application group SAMPLEAG, both with a status of *COMPLETED and start times of 03/01/10 11:25:05 and 03/01/10 11:04:58.]
Displaying status of the last run of all procedures
To display the status of the last run of all procedures for an application group, do the
following:
1. From the MIMIX Basic Main Menu, select option 1 (Work with application groups).
2. The Work with Application Groups display appears. Type 21 (Procedure status)
next to the application group you want and press Enter.
The last run of all procedures for the application group are listed on the Work with
Procedure Status display.
3. Locate the procedure you want and check the value of the Status column.
Displaying available status history of procedure runs
To display status of all available runs of a selected procedure, do the following:
1. From the MIMIX Basic Main Menu, select option 1 (Work with application groups).
2. The Work with Application Groups display appears. Type 20 (Procedures) next to
the application group you want and press Enter.
3. The Work with Procedures display appears, listing all procedures for the selected
application group. Type 14 (Procedure status) next to the procedure you want and
press Enter.
All available runs for the selected procedure are listed on the Work with Procedure
Status display. The most recently started procedure runs are at the top of the list,
and may still be active.
4. Locate the run of the procedure you want and check the value of the Status
column.
Note: To view status of all runs of all procedures for all application groups, press F20 (Procedure status) from the Work with Application Groups display, press F14 (Procedure status) from the Work with Procedures display, or enter the command WRKPROCSTS.
Resolving problems with procedure status
Table 21 identifies the possible status values that can appear on the Work with
Procedure Status display and identifies the action to take to resolve reported
problems.
Table 21. Procedure status values with action required
Category: Active
*ACTIVE
The procedure is currently running. No steps require attention.
*ATTN
The procedure requires attention. Either there is a step with a status of
*MSGW, or there is an active step and one or more steps with step
status values of *ATTN, *CANCEL, *FAILED, or *IGNERR.
Action Required: Determine the status of each step and the action
required to correct that status. See “Resolving problems with step
status” on page 85.
*MSGW
A step within the procedure is waiting for a response to an inquiry
message. The procedure cannot process the step or any subsequent
steps without a reply to the message.
Action Required: Display and respond to the inquiry message using
“Responding to a procedure in *MSGW status” on page 81.
*PENDCNL
A request to cancel the procedure is in progress. When the activity for
the steps in progress at the time of the cancel request ends, the
procedure status changes to *CANCELED.
*QUEUED
A request to run the procedure is currently waiting on the job queue.
When the procedure becomes an active job, the procedure status
changes to *ACTIVE.
Category: Resumable
*CANCELED
Either the procedure was canceled and did not complete, or steps within
the procedure were canceled as a response to inquiry messages from
the steps. The procedure was partially performed.
Action Required: Use “Resolving a *FAILED or *CANCELED
procedure status” on page 82 to determine the state of your
environment and whether to resume the procedure or to acknowledge
its status.
*FAILED
The procedure failed. Jobs for one or more steps had errors. Those steps were configured to end if they failed. The procedure was partially performed.
Action Required: Use “Resolving a *FAILED or *CANCELED procedure status” on page 82 to determine the state of your environment and whether to resume the procedure or to acknowledge its status.
Category: Acknowledged
*ACKCANCEL
The procedure was canceled and a user action acknowledged the
cancellation so that the procedure can no longer be resumed.
*ACKFAILED
The procedure failed and a user action acknowledged the failure so that
the procedure can no longer be resumed.
*ACKERR
The procedure completed with errors and a user action acknowledged
the procedure. It is assumed that the user reviewed the steps with
errors. A status of completed with errors is only possible when the steps
with errors had been configured (within the procedure) to ignore errors
or a user’s response to a step in message wait status was to ignore the
error and continue running the procedure. After the procedure is acknowledged, its status changes to *ACKERR.
Category: Completed
*COMPERR
The procedure completed with errors. One or more steps had errors and
were configured to continue processing after an error.
Action Recommended: Investigate the cause of the error and assess
its implications.
*COMPLETED
The procedure completed successfully.
Responding to a procedure in *MSGW status
A procedure in *MSGW status is effectively paused at a known point in its processing
as a result of a runtime attribute on one of its steps. The procedure sent an inquiry
message because a step specified *MSGW for its Action before step (BEFOREACT)
attribute. All jobs for the procedure have completed processing all previous steps and
are waiting to run the step’s program. An operator response is required.
To respond to a procedure in *MSGW status, do the following from the Work with
Procedure Status display:
1. To see which step is waiting, type 8 (Step status) next to the procedure and press
Enter.
2. The Work with Step Status display appears. The information on this display can be
used to determine which step is waiting to start. You will see steps with values of
*COMP, *IGNERR, or *DSBLD followed by no status for all remaining steps. The
first step with no status is the step that is waiting to start. Based on that step,
determine how to respond to the message and whether you are ready to respond.
3. You cannot display or respond to the procedure message from the Work with Step
Status display. Press F12 to return to the Work with Procedure Status display.
4. Type 11 (Display message) next to the procedure in *MSGW status and press
Enter.
5. You will see the message “Procedure name for application group name requires
response. (G C).” Do one of the following:
• A response of G (Go) is required to start processing the step. Type G and press Enter.
• A response of C (Cancel) will cancel the procedure. Type C and press Enter.
Resolving a *FAILED or *CANCELED procedure status
When a procedure fails or is canceled, subsequent attempts to run the same
procedure will fail until user action is taken. You need to determine the best course of
action for your environment based on the implications of the partially performed
procedure. This topic will assist you in evaluating the cause of the failure or
cancellation, as well as the state of other steps within the procedure.
Important! Steps with failed or canceled jobs need to be resolved. Other
asynchronous jobs may have successfully processed the same step and
continued on to process other subsequent steps before the procedure ended. The
actions taken by those steps as well as by completed steps which preceded the
problem are not reversed. Some steps may not have been processed at all.
Do the following from the Work with Procedure Status display:
1. Type 8 (Step status) next to the *FAILED or *CANCELED run of the procedure
and press Enter.
2. The Work with Step Status display appears. Look for steps with a status of
*CANCEL, *FAILED, or *ATTN. Also use F7 (Expand) to see status for the jobs
which processed the steps.
A procedure with *FAILED status did not complete due to errors. In the collapsed status view, one or more steps will have a status of *ATTN or *FAILED. Other jobs
may have processed subsequent steps before the procedure ended. In the
expanded view, look for one or more jobs with a status of *FAILED. For detailed
information use “Resolving problems with step status” on page 85.
A procedure with *CANCELED status did not complete due to user action. Any of
the following may have occurred:
• A user canceled an inquiry message sent by the procedure because a step was configured to wait for a reply before starting. This scenario is identified by the absence of steps with status values of *FAILED, *CANCEL, or *ATTN. Instead, you will see steps with values of *COMP, *IGNERR, or *DSBLD followed by no status for all remaining steps. The first step with no status is the step that waited to start. Continue with Step 3.
• A user canceled an inquiry message sent by a step which had a job that ended in error. At least one step in the collapsed view will have a status of *ATTN or *CANCEL. One or more steps will have a job with a status of *CANCEL in the expanded view. Other jobs may have processed subsequent steps before the procedure ended. For detailed information use “Resolving problems with step status” on page 85.
• A user canceled the procedure by using option 12 (Cancel) from the Work with Procedure Status display or by using the Cancel Procedure (CNLPROC) command. Steps in the collapsed view could have any status except *ACTIVE or *MSGW. Determine if there are any jobs with status values of *FAILED or *CANCEL in the expanded view. Other jobs may have processed subsequent steps before the procedure ended. For detailed information use “Resolving problems with step status” on page 85.
3. After you have completed your evaluation and have taken any needed corrective
action to resolve why jobs failed or were canceled, determine how to best
complete the procedure. Choices are:
• Resume the procedure. If you resume a failed procedure, processing will begin with the step that failed. If you resume a canceled procedure, processing will begin with steps following the canceled step. Optionally, if you were unable to resolve a problem for a step in error, you can override the attributes of that step for when the procedure is resumed. See “Resuming a procedure” on page 91.
• Acknowledge the procedure status. Procedures with a status of *CANCELED or *FAILED can be acknowledged (set to *ACKCANCEL or *ACKFAILED, respectively) to indicate that you have investigated the problem steps and want to run the procedure again starting at its first step. This option should only be used after you have evaluated the effect of activity performed by the procedure. See “Acknowledging a procedure” on page 89.
Displaying status of steps within a procedure run
The Work with Step Status display provides access to detailed information about
status of steps for a specific run of a procedure for an application group.
Timestamps are in the local job time. If you have not already ensured that the systems in your installation use coordinated universal time, see the topic on setting the system time in the MIMIX Administrator Reference book.
To display step status for a procedure run, do the following:
1. Use one of the following to access the run of the procedure you want:
• “Displaying status of the last run of all procedures” on page 78
• “Displaying available status history of procedure runs” on page 79
2. From the Work with Procedure Status display, type 8 (Step status) next to the run
of the procedure you want and press Enter.
3. Press F7 (Expand) to view status of the individual jobs used to process each step.
The steps listed on the Work with Step Status display appear in sequence number
order as defined by steps in the procedure. If the procedure is in progress, the display
shows status for the steps that have run, the start time and status of the step that is in
progress, and blank status and start time for steps that have not yet run.
Collapsed view - Figure 9 shows the initial collapsed view of the Work with Step
Status display. In this view, each step of the procedure is shown as a single row and
step status represents the summary of all jobs used by the step.
Figure 9. Collapsed view of the Work with Step Status display.
[Screen example: the collapsed view of the Work with Step Status display on SYSTEMA for procedure SWTPLAN (type *SWTPLAN) of application group SAMPLEAG, with a procedure status of *COMPLETED. Each step program (MXCHKCOM, MXCHKCFG, ENDUSRAPP, MXENDDG, MXENDRJLNK, MXAUDACT, MXAUDCMPLY, MXAUDDIFF) appears as a single row with its type, node type, start time, duration, a status of *COMP, and a Jobs Pend value of *NO.]
Expanded view - Figure 10 shows an example of an expanded view. In the expanded
view, step programs of type *AGDFN will have one row for each node on which the
step runs. Steps which run step programs at the level of the data resource group or
data group are expanded to have multiple rows so that the status of the step for each
data resource group or data group is visible. For step programs of type
*DTARSCGRP, there will be a summary row for the application group followed by a
row for each data resource group within the application group. For step programs of
type *DGDFN, there will be a summary row for the application group, then for each
data resource group, there is a summary row for the data resource group followed by
a row for each of its data groups. Summary rows are identified by a dash (-) in the
columns that are being summarized.
Also, for step programs of type *AGDFN, the Data Rsc. Grp. column and the Data
Group column will always be blank. For step programs of type *DTARSCGRP, the
Data Group column will always be blank.
Figure 10. Expanded view of the Work with Step Status display.
[Screen example: the expanded view of the Work with Step Status display for procedure SWTPLAN, showing rows for step programs MXCHKCOM and MXCHKCFG expanded by data resource group (DRG1, DRG2) and data group (DG1A, DG1B, DG1C, DG2A) on nodes LTIAS01 and LTIAS02, each with its start time, duration, and a status of *COMP.]
Resolving problems with step status
When working with step status, it is important that you understand how multiple jobs
are used to process the steps in a procedure. At any given time, job activity may be in
progress for multiple steps. Or, one job may have failed processing a step while other
jobs may have already processed that step and continued beyond it.
Important! Before you take action to resolve a problem with status for a step, be
sure you understand the current state of your environment as a result of
completed steps and steps in progress, as well as the effect of any action you
take.
Table 22 identifies the possible status values that can appear on the Work with Step
Status display and the action to take to resolve reported problems.
Table 22. Step status values with action required
blank
The procedure has started but processing has not yet started for the step.
*ATTN
The step requires attention. The value *ATTN can only appear in the collapsed
view or on a summary row in the expanded view. If the procedure status is
considered active, at least one job submitted by this step has a status of
*FAILED, *CANCEL or *MSGW. If the procedure status is *FAILED or
*CANCELED, this step has at least one job that has not started or has a status
of *CANCEL or *FAILED.
Action Required: Use F7 to see the expanded view. Determine the specific
data resource group or data group for which the problem status exists. Then
address the status indicated for that job.
*ACTIVE
The step is currently running.
*COMP
The step has successfully completed.
*DSBLD
The step has been disabled and did not run.
*CANCEL or *FAILED
One or more jobs used by the step ended in error. In the expanded view of status, the job is identified as *CANCEL or *FAILED. The status is due to the error action specified for the step.
• For *CANCEL status, a user action canceled the step. The step ran, ended in error, and issued an inquiry message. The user’s response to the message was Cancel.
• For *FAILED status, the step ran and one or more jobs ended in error. The Action on error attribute specified to quit the job.
The type of step program used by the step determines what happens to other
jobs for the step and whether subsequent steps are prevented from starting,
as follows:
• If the step program is of type *DGDFN, jobs that are processing other data
groups within the same data resource group continue. When they
complete, the data resource group job ends. Subsequent steps that apply
to that data resource group or its data groups will not be started. However,
subsequent steps will still be processed for other data resource groups and
their data groups.
• If the step program is of type *DTARSCGRP, subsequent steps that apply
to that data resource group or its data groups will not be started. Jobs for
other data resource groups may still be running and will process
subsequent steps that apply to their data resource groups and data groups.
• If the step program is of type *AGDFN, subsequent steps that apply to the
application group will not be started. Jobs for data resource group or data
group steps may still be running and will process subsequent steps that
apply to their data resource groups and data groups.
When all asynchronous jobs for the procedure finish, the procedure status is set to *CANCELED or *FAILED, accordingly. If both canceled and failed steps exist when the procedure ends, the procedure status will be *FAILED.
Action Required: Determine the cause of the problem using “Resolving
*CANCEL or *FAILED step statuses” on page 88.
*IGNERR
The step ran and an error occurred, but processing ignored the error and
continued.
Action Recommended: Use option 8 (Work with job) to determine the cause
of the failure. Consider whether any changes are needed to your procedure or
step or to your operating environment to prevent this error from occurring
again.
*MSGW
The step ran and issued a message that is waiting to be answered. One or
more jobs for the step ended in error. Step attributes require that an operator
respond to the message.
Action Required: Determine which job issued the message, investigate the
problem, and then respond to the inquiry message using “Responding to a
step with a *MSGW status” on page 87.
Responding to a step with a *MSGW status
When a step or a job for a step has a status of *MSGW, it is the result of an error
condition. An inquiry message was sent because the step specified *MSGW for its
Action on error attribute. An operator response is required before any additional
processing for the job can occur.
To respond to a step in *MSGW status, do the following from the Work with Step
Status display:
1. To see which job is waiting, use F7 to view the Expanded view.
2. To view information about what caused the job to end in error, type 8 (Work with
job) next to job with *MSGW status and press Enter.
3. On the Work with Job display, type 10 (Display job log, if active, on job queue, or
pending) and press Enter.
4. The job log is displayed. Use F1 to view details of any of the messages. Find the
error that caused the job to end. You will see the inquiry message in the job log;
however you cannot respond to it from here.
5. Press F12 twice to return to the Work with Step Status display.
6. Type 11 (Display message) next to the step job in *MSGW status and press Enter.
7. You will see the message “Error in step at sequence number number in procedure
name. (R C I).” Do one of the following:
• A response of R (Retry) will retry processing the step program within the same job. Type R and press Enter.
• A response of C (Cancel) will set the job status to *CANCEL as indicated in the expanded view of step status. Subsequent steps are handled in the same manner as if the Action on error had specified the value *QUIT. Type C and press Enter.
• A response of I (Ignore) will set the job status to *IGNERR as indicated in the expanded view of step status, and processing continues as if the job had not ended in error. Type I and press Enter.
Resolving *CANCEL or *FAILED step statuses
Evaluate the cause of the failure or cancellation, as well as the state of other steps
within the procedure. All steps with failed or canceled jobs need to be resolved.
Important! For any step which ended in error, other asynchronous jobs may have
successfully processed the same step and continued on to process other
subsequent steps. The actions taken by those steps as well as by completed
steps which preceded the problem cannot be reversed.
Do the following from the Work with Step Status display:
1. Use F7 to view the Expanded view.
2. All steps which have a job that has a step status of *CANCEL or *FAILED must be
evaluated and the cause of the problem must be resolved. To view information
about why a job had an error processing a step, do the following:
a. Type 8 (Work with job) next to the job you want and press Enter.
b. On the Work with Job display, type 4 (Work with spooled file) and press Enter.
c. Display the spooled file for the job and check for the cause of the error.
d. Evaluate whether any immediate action is needed due to the condition which
caused the error. Consider the nature and severity of the error.
3. If the procedure is still active and you need to take corrective action or perform
additional investigation, cancel the procedure using F15 (Cancel proc.). Any steps
that are currently running will complete, then the procedure status is set to
*CANCELED.
4. Check which steps have completed, failed, were canceled, or have not yet started.
Then evaluate the current state of your environment as a result. If needed, take
corrective action that is appropriate for the extent of the errors and the extent to
which steps completed.
Note: It is strongly recommended that you cancel the procedure, if it is active,
before attempting any corrective action.
5. Determine how to best complete the procedure in the current state of your
environment. When the procedure is *FAILED or *CANCELED, your choices are:
• Resume the procedure from the point where the procedure ended. If you resume a failed procedure, processing will begin with the step that failed. If you resume a canceled procedure, processing will begin with steps following the canceled step. Optionally, if you were unable to resolve a problem for a step in error, you can override the attributes of that step for when the procedure is resumed. See “Resuming a procedure” on page 91.
• Acknowledge the procedure status, which allows a *CANCELED or *FAILED procedure to be run again starting with its first step. This choice indicates you have investigated the problem steps and want to run the procedure again
starting at its first step. This option should only be used after you have
evaluated the effect of activity performed by the procedure. See
“Acknowledging a procedure” on page 89.
Acknowledging a procedure
Acknowledging a procedure allows you to manually change the status of procedures
that either failed or have errors in order to control where the next attempt to run the
procedure will start. Procedures with a status of *CANCELED, *FAILED, or
*COMPERR can be acknowledged (set to *ACKCANCEL, *ACKFAILED, or
*ACKERR, respectively) to indicate that you have investigated the problem steps.
Acknowledging a *CANCELED or *FAILED procedure allows you to rerun the procedure from its first step. Once acknowledged, a procedure with either of these statuses cannot be
resumed from the point where the procedure ended. This is appropriate when you
have determined that your environment will not be harmed if the next attempt to run
starts at the first step.
A *COMPERR procedure that is acknowledged (*ACKERR) can never be resumed
because the procedure completed. By acknowledging a procedure with this status,
you are confirming the problems have been reviewed.
The last run of a procedure with a status of *ACKCANCEL or *ACKFAILED and the
last run of the set of start/end/switch procedures can be returned to their previous
status (*CANCELED or *FAILED, respectively). The next attempt to run the procedure
will resume at the failed or canceled step or at the first step that has not been started.
Note: Acknowledging the last run of a failed or canceled procedure will acknowledge
all previous failed or canceled runs of the procedure.
Important! Before changing status of a procedure, it is important that you evaluate
and understand the effect of the partially performed procedure on your environment.
Changing procedure status does not reverse the actions taken by preceding steps
that completed or the actions performed by other asynchronous jobs which did
complete the same step and then processed subsequent steps. It may not be
appropriate for the next run of the procedure to begin with the first step, for example,
if the failure occurred in a step which synchronizes data or changes states of MIMIX
processes. Likewise, it may not be appropriate to return to the previous status to
resume a procedure run that was not recently run.
To change the status of a procedure, do the following:
1. From the Work with Procedure Status display type 13 (Change status) next to the
failed or canceled procedure you want and press Enter.
2. The Change Procedure Status (CHGPROCSTS) display appears. Specify the
value you want for the Status prompt and press Enter.
3. If you specified *ACK in Step 2, the Start time prompt appears, displaying the
timestamp of the selected procedure run. Do one of the following:
• To acknowledge only the selected failed or canceled run, press Enter.
• To acknowledge all previously failed or canceled runs of the selected procedure, specify *ALL for Start time and press Enter.
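As a command-line sketch only, the same request can be made with the Change Procedure Status (CHGPROCSTS) command. The keyword names shown below for the Status and Start time prompts are assumptions for illustration; press F4 (Prompt) on the command to see the actual parameter names and to identify the procedure run:

  CHGPROCSTS STATUS(*ACK) STARTTIME(*ALL)  /* keyword names assumed; verify with F4 (Prompt) */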
Running a procedure
The procedure type determines what command to use to run the procedure. For an
application group, multiple procedures of type *USER can run at the same time if they
have unique names. Only one run of a uniquely named procedure of type *USER can
occur at a time.
All other procedure types must be invoked by the application group command
associated with the procedure type. For example, a procedure of type *START can
only be invoked by the Start Application Group (STRAG) command.
Where should the procedure begin? The value specified for the Begin at step
(STEP) parameter on the request to run the procedure determines the step at which
the procedure will start. The status of the last run of the procedure determines which
values are valid.
The default value, *FIRST, will start the specified procedure at its first step. This value
can be used when the procedure has never been run, when its previous run
completed (*COMPLETED or *COMPERR), or when a user acknowledged the status
of its previous run which failed, was canceled, or completed with errors
(*ACKFAILED, *ACKCANCEL, or *ACKERR respectively).
Other values are for resolving problems with a failed or canceled procedure. When a
procedure fails or is canceled, subsequent attempts to run the same procedure will
fail until user action is taken. You will need to determine the best course of action for
your environment based on the implications of the canceled or failed steps and any
steps which completed.
The value *RESUME will start the last run of the procedure beginning with the step at
which it failed, the step that was canceled in response to an error, or the step
following where the procedure was canceled. The value *RESUME may be
appropriate after you have investigated and resolved the problem which caused the
procedure to end. Optionally, if the problem cannot be resolved and you want to
resume the procedure anyway, you can override the attributes of a step before
resuming the procedure.
The value *OVERRIDE will override the status of all runs of the specified procedure
that did not complete. The *FAILED or *CANCELED statuses of these runs are changed to acknowledged (*ACKFAILED or *ACKCANCEL) and a new run of the
procedure begins at the first step.
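For example, because procedures of type *START are run by the Start Application Group (STRAG) command, the Begin at step value is supplied on that command. The following lines are a sketch only; prompt the command with F4 to supply the application group and any other required parameters:

  STRAG STEP(*FIRST)
  STRAG STEP(*RESUME)
  STRAG STEP(*OVERRIDE)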
To run a procedure of type *USER, do the following:
1. From the Work with Procedures or Work with Procedure Status display type 9
(Run) next to the user procedure you want and press F4 (Prompt).
2. Specify the value you want for Begin at step and press Enter.
To run a procedure type other than *USER, do the following:
From a command line, enter the application group command associated with the
procedure type. For example, a procedure of type *START can only be invoked by
the Start Application Group (STRAG) command.
To resume a procedure with a status of *CANCELED or *FAILED, see “Resuming a procedure” on page 91.
Resuming a procedure
To resume a procedure with a status of *CANCELED or *FAILED, do the following:
1. Investigate and resolve problems for steps with errors. See “Resolving problems
with step status” on page 85.
2. Optional: If the problem cannot be resolved, and you want to resume the
procedure anyway, use the Override Step (OVRSTEP) command to change the
configured value of the step for when the procedure is resumed. See “Overriding
the attributes of a step” on page 91.
3. For a procedure of type *USER, from the Work with Step Status display use F14
(Resume proc.). For all other procedure types, from a command line, enter the
appropriate application group command and specify *RESUME as the value for
Begin at step (STEP).
Overriding the attributes of a step
The attributes of a step can be overridden to change the configured value of the step
for the current run of the procedure by using the Override Step (OVRSTEP)
command. The attributes determine whether the step is run or actions if the step
errors for the current run of the procedure when it is resumed.
The OVRSTEP command can be used for a procedure that has a status of active
(*ACTIVE, *ATTN, *MSGW, *PENDCNL, or *QUEUED), *CANCELED or *FAILED
and steps that have a status of *CANCEL or *FAILED. The overridden values apply
only for the current run of the procedure when it is resumed.
Note: Regardless of procedure status, attributes cannot be overridden for a required
MIMIX step or any step with a step status of *COMP or *IGNERR.
A procedure with a status of *CANCELED or *FAILED requires user action to resolve
a problem. If the problem cannot be resolved and you want to resume the procedure
anyway, you can use the OVRSTEP command to disable the step in error or specify
the error action to occur when the step is retried.
Important! Overriding the attributes of a step should only be done after you have
considered how rerunning the step impacts your environment. It is important that
you understand the implications for steps which preceded the cancellation or
failure in the last run of the procedure. Processing for steps that completed is not
reversed.
The changes made when using the OVRSTEP command will only apply to the current
run of the procedure. The attributes that can be changed will vary depending on the
statuses of the specified procedure and step. Consider the following:
• When the specified procedure has a status of *ACKCANCEL, *ACKFAILED, *ACKERR, *COMPLETED, or *COMPERR, no attributes can be overridden on any step in the procedure.
• When the specified procedure has a status that is considered active (*ACTIVE, *ATTN, *MSGW, *PENDCNL, or *QUEUED), only the Action on error (ERRACT) can be overridden.
• When the specified procedure has a status that can be resumed (*CANCELED or *FAILED), the Action before step (BEFOREACT), Action on error (ERRACT), or State (STATE) can be overridden only on steps that have not yet run, that failed, or that were canceled.
Do the following from the Work with Step Status display:
1. Press F7 (Expand) to view status of the individual jobs used to process each step.
2. Type 13 (Override step) next to the step you want and press Enter.
3. On the Override Step (OVRSTEP) display, specify the values you want and press
Enter. From the Work with Step Status display, use F14 (Resume proc.) to resume
the procedure. See “Resuming a procedure” on page 91.
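As a sketch only, the same override can be requested directly with the OVRSTEP command, for example to change only the error action before resuming. Only the ERRACT keyword is documented in this topic; supply the procedure, application group, and step identifiers through F4 (Prompt), since their keyword names are not shown here:

  OVRSTEP ERRACT(*QUIT)  /* remaining parameters supplied through F4 (Prompt) */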
Canceling a procedure
Use this procedure to cancel a procedure with a status that is considered active. This
includes procedure statuses of: *ACTIVE, *ATTN, *MSGW, *PENDCNL, and
*QUEUED.
Important! Use this command with caution. Processing ends without reversing
any actions performed by completed steps, which may leave your environment in
an undesirable state. For example, ending a switch procedure could result in
partially switched data.
The status of the procedure will be changed to *PENDCNL. If there are any inquiry
messages waiting for an operator response, they are processed as if the response
was Cancel. When all activity for currently running steps end, the status of the
procedure will be automatically changed to *CANCELED.
To cancel an active procedure, do one of the following:
• From the Work with Procedure Status display, type 12 (Cancel) next to the procedure you want and press Enter.
• From the Work with Step Status display, press F15 (Cancel proc.).
A procedure that has been canceled can be resumed later, as long as its status has
not been changed to *ACKCANCEL. When a canceled procedure is resumed,
processing begins immediately after the point where it was ended.
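As a command-line sketch only, the Cancel Procedure (CNLPROC) command noted above can be used instead of option 12; press F4 (Prompt) on the command to identify the procedure run to cancel, since its parameter keyword names are not shown in this topic:

  CNLPROC  /* press F4 (Prompt) to identify the procedure run */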
CHAPTER 5
Monitoring status with MIMIX Availability Status
The MIMIX Availability Status display is useful in environments that do not use
application groups.
Note: The MIMIX Availability Status should not be used in environments that use
application groups.
The MIMIX Availability Status display, shown in Figure 11, provides one location for quickly assessing the overall state of an entire MIMIX installation. The status values are prioritized and are a composite view reflecting both source and target systems. In addition to determining status, unique
features of this display enable its use as the starting point for performing routine
actions and resolving problems.
To access this display, do one of the following:
• Select option 1 on the MIMIX Basic Main Menu.
• Enter the command WRKMMXSTS and press Enter.
Figure 11 shows the MIMIX Availability Status display.
Figure 11. MIMIX Availability Status window. This example shows that MIMIX is active but
the installation is not complying with best practices for switching (red) and audits (yellow).
Additional fields - In the upper right corner of the display, additional fields report
information that is relevant to maintaining the installation.
Recoveries - Identifies the total number of recoveries in progress for the
installation. Active recoveries represent problems detected and being corrected
by MIMIX AutoGuard. Before certain activity, such as ending MIMIX, it is important
that there are no recoveries in progress in the installation. If more than 9999
recoveries exist, the field displays ++++.
Last switch - This field is only displayed when there is a value specified for the
Default model switch framework policy. The date indicates when the last
completed switch was performed using the switch framework specified in the
policy. If you have not yet performed a switch using the switch framework defined
in policies, this date is when the MIMIX environment was first started or when the
system managers were started and explicitly reset the configuration.
Activity/Status - The main area of the display provides a reporting area for status of activity in key areas: Replication, Audits and notifications, and Services.
For each activity area, status represents a summation of multiple processes. The text
shown within each activity area changes to identify the most severe problem within its
processes. Text, as well as background color, also identify the summarized status and
indicate what action is appropriate.
Blue indicates there are no problems with the activity and that no action is
required.
Yellow indicates warnings that may need your attention.
Red indicates errors or inactive processes that require immediate action.
Options - On this display, the activity you select with an option and the status of the activity determine what you see as the result of using the option. This behavior is
unlike that of options on other MIMIX displays. The following subtopics describe the
results of using the available options.
Option 5 (Display details) from the MIMIX Availability Status display results in a
display showing detailed status for the selected activity. Take option 5 next to the
item to access detailed information for the activity.
• For Replication, the result is the Work with Data Groups display.
• For Audits and notifications, the result is the Summary view of the Work with Audits display. (To see details for notifications, press F20 (Command line), then enter the command WRKNFY.)
• For Services, the result is the Work with Systems display for status of the MIMIX managers. (To see details for monitors, press F4 (MIMIX Menu), then use option 12 (Work with monitors).)
Option 9 (Troubleshoot) from the MIMIX Availability Status display results in the
appropriate display to use as a starting point for troubleshooting the stated
problem for the selected activity. The stated problem reflects the highest severity
problem present. Other, less severe problems may exist; they may be reflected on
the subsequent display, but they are not reflected on the MIMIX Availability Status
display until higher severity problems are resolved. Take option 9 next to the item
to access detailed information for the activity.
• For Replication, the result is the Work with Data Groups display.
• For Audits and notifications, the result is dependent on the severity of the stated problem. All auditing conditions are prioritized before any notifications. For audits with status conditions, the result is the Summary view of the Work with Audits display. For audits with compliance conditions, the result is the Compliance view of the Work with Audits display. For notifications with errors, the result is the Work with Notifications display.
• For Services, the result is dependent on the severity of the stated problem. All system manager, journal manager, and target journal inspection errors are prioritized before any monitor errors. For system manager, journal manager, and target journal inspection errors, the result is the Work with Systems display. For monitor errors, the result is the Work with Monitors display.
Checking replication status from the MIMIX Availability
Status display
The first activity listed on the MIMIX Availability Status display is Replication, as
shown in Figure 11. The replication area summarizes status of replication activity for
all data groups in the installation. This includes processes required for replication and
also reflects potential problems.
Status values are shown by color while message text within the highlighted area
indicates the nature of any problem.
Blue - There are no problems with replication processes and no action is required.
Yellow - Warnings exist that may need your attention. Possible causes include:
• A file is being synchronized by MIMIX AutoGuard. This condition usually resolves itself.
• A process has a backlog which has reached its threshold.
• An object on the target system is not journaled as expected.
• Journal state or cache is not as expected.
Red - Conditions exist that require immediate action or a switch is in progress.
Possible scenarios that require immediate action include:
• Error conditions
• Processes required for replication are not active
• Some objects are not journaled and therefore cannot be replicated
• Journal state or cache is not as expected.
Status may change due to warnings or problems with any of the replication
processes, with replication errors associated with data group entries (file, object, IFS
tracking, and object tracking), or with a change in switch status.
To begin resolving problems, use option 9 (Troubleshoot) to access the Work with
Data Groups display, from which you can view detailed information and take action.
See “The Work with Data Groups display” on page 99 for more information.
Note: Replication status can indicate action required (red) while a switch is in
progress. When you are ready to switch from the backup system to the
production system, press F4 (MIMIX Menu). From there, use option 5 to
continue switching.
Checking audit and notification status from the MIMIX
Availability Status display
The middle activity listed on the MIMIX Availability Status display is Audits and
notifications, as shown in Figure 11. This activity area summarizes status of all audit
activity, problems with audit results, audit compliance, and new notifications for a
MIMIX installation.
Status values are shown by color while message text within the highlighted area
indicates the nature of any problem.
Blue - No action is required. No audits are active, have differences, or are out of
compliance, and there are no new error or warning notifications.
Yellow - An audit or notification may need your attention. An out-of-compliance
audit is running its compare phase, an audit is approaching an out-of-compliance
state, or a new warning notification exists.
Red - A condition exists that requires immediate action. An audit has failed, has
unresolved differences, is out of compliance, or was prevented from running
because of policy values; or a new error notification exists.
Status may change due to the highest severity condition with audits, audit results,
audit compliance, or new notifications.
To begin resolving problems, use option 9 (Troubleshoot) to access the appropriate
display for the indicated problem.
• For audit status problems, see “Resolving audit problems” on page 133.
• To resolve audit compliance problems, the audits must be run. See “Running an audit immediately” on page 131 and “Displaying audit compliance” on page 144.
• For additional information about notifications, see “Displaying notifications” on page 160.
Checking status of supporting services from the MIMIX
Availability Status display
The last activity listed on the MIMIX Availability Status display is Services, as shown
in Figure 11. This area summarizes status and also reflects potential problems with
system managers, journal managers, target journal inspection, collector services, and
all enabled monitors for the installation.
Status values are shown by color while message text within the highlighted area
indicates the nature of any problem.
Blue - There are no problems for the managers, target journal inspection, collector
services, and monitors. No action is required.
Red - A system manager, journal manager, target journal inspection, collector
service, or a monitor is in a state that requires immediate action. The status text
indicates which problem occurred and where you can see detailed information.
To begin resolving problems, use option 9 (Troubleshoot) to access the appropriate
display.
When the text in the Services area indicates a problem with system managers, journal
managers, target journal inspection, or collector services, option 9 will access the
Work with Systems display, from which you can view detailed information and take
action. See “Working with system-level processes” on page 149 for more information.
When the text in the Services area indicates a problem with a monitor, option 9 will
access the Work with Monitors display. For more information about working with
monitors, see the Using MIMIX Monitor book.
CHAPTER 6
Working with data group status
This chapter describes common MIMIX operations that help keep your MIMIX
environment running. In order for MIMIX to provide a hot backup of your critical
information, all processes associated with replication must be active at all times.
Supporting service jobs must also be active. MIMIX allows you to display and monitor
the statuses of these processes.
The topics included in this chapter are:
• “The Work with Data Groups display” on page 99 describes the errors reported on this display and provides procedures for resolving them.
• “Working with the detailed status of data groups” on page 105 describes how to access detailed status for a data group.
• “Identifying replication processes with backlogs” on page 115 describes what fields to check for detailed status of a data group.
The Work with Data Groups display
From the Work with Data Groups display you can start and end replication, track
replication status, and perform a data group switch. You can also work with files,
objects, and tracking entries in error, and access displays for data group entries and
tracking entries.
Do one of the following to access the Work with Data Groups display:
• From the MIMIX Intermediate Main menu, select option 1 (Work with data groups) and press Enter.
• From the MIMIX Availability Status display, type 5 (Display details) next to Replication and press Enter.
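Depending on your installation, a command-line route may also be available. The command name shown here is an assumption and should be verified in your environment before use:
WRKDG
Pressing Enter without additional parameters would list the data groups in the installation.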
Figure 12. Sample Work with Data Groups display. The display uses letters and colored highlighting to call your attention to warning and problem conditions. This example shows items in
color which would appear with color highlighting on the display. If you are viewing this page in
printed form, the color may not be shown.
[Screen image: the Work with Data Groups display on system CHICAGO. The Audits/Recov./Notif. field shows 001 / 002 / 003. Data groups APP1, APP2, APP3, CRITICALAP, and RJAPP4 are listed with source system LONDON and target system CHICAGO, with status letters in the source and target Mgr, DB, Obj, and DA columns and counts in the DB and Obj error columns. Options include 5=Display definition, 8=Display status, 9=Start DG, 10=End DG, 12=Files needing attention, 13=Objects in error, 14=Active objects, 15=Planned switch, and 16=Unplanned switch. Function keys include F5=Refresh, F7=Audits, F8=Recoveries, F9=Automatic refresh, F10=Legend, F16=DG definitions, and F23=More options.]
For each data group listed, you can see the current source system and target system
processes, and the number of errors reported.
The following fields and columns are available.
Audits/Recov./Notif. - This field is located in the upper right corner of the Work with Data
Groups display. The first number is the total number of audits that require action to
correct a problem or that require your attention to prevent a situation from becoming a
problem. The second number indicates the number of active recoveries, including
those resulting from audits. The third number indicates the number of new notifications
that require action or attention. If more than 999 items exist in any field, the field will
display +++. When a field is highlighted in red, a problem exists. When a field is
highlighted in yellow, at least one out-of-compliance audit is currently active or an
audit is approaching out of compliance. For details, see “Problems reflected in the
Audits/Recov./Notif. field” on page 101.
Data group - When a data group name is highlighted, a problem exists. For details,
see “Problems reflected in the Data Group column” on page 101.
Source - The following columns provide summaries of processes that run on the
source system. For details about status values, see “Replication problems reflected in
the Source and Target columns” on page 103.
Mgr - Represents a summation of the system manager and the journal manager
processes on the source system of the data group.
DB - Represents the status of the remote journal link. It is possible to have an
active status in this column even though the data group has not been started.
When the RJ link is active, database changes will continue to be sent to the target
system. MIMIX can read and apply these changes once the data group is started.
For data groups configured for source-send replication, this represents the status
of the database send process.
Obj - Represents a summation of the object processes that run on the source
system. These include the object send, object retrieve and container send
processes.
DA - This column represents the status of the data area polling process when the
data group replicates data areas through the data area poller. This column does
not contain data when data areas are replicated through the user journal with
advanced journaling or through the system journal.
Target - The following columns provide summaries of processes that run on the target
system. For details about status values, see “Replication problems reflected in the
Source and Target columns” on page 103.
Mgr - Represents a summation of the system manager, journal manager, and
target journal inspection processes on the target system of the data group. Target
journal inspection status includes status of inspection jobs for both target journals
(user and system) for the data group.
DB - Represents the summation of status for the database reader process, the
database apply process, and access path maintenance jobs1. For data groups
configured for source-send replication, this column represents the summation of
the status of database apply processes and access path maintenance jobs.
Obj - Represents the object apply processes.
Errors - When any errors are indicated in the following columns (DB and Object), they
are highlighted in red.
DB - Represents the sum of the number of database files, IFS objects, and *DTAARA
and *DTAQ objects that are on hold due to errors, plus the number of logical (LF)
and physical (PF) files that have access path maintenance1 failures for the data
group. To work with a subsetted list of file errors and access path errors, use
option 12 (Files needing attention). For a subsetted list of IFS object errors, use
option 51 (IFS tracking entries not active). For a subsetted list of *DTAARA and
*DTAQ errors, use option 53 (Object tracking entries not active).
1. Access path maintenance status and errors are reported on the Work with Data Groups display
only in installations running MIMIX 7.1.15.00 or higher. Access path maintenance jobs run only if
the access path maintenance (APMNT) policy is enabled.
Obj - Represents a count of the number of objects for which at least one activity
entry is in a failed state. To work with a subsetted list, use option 13 (Objects in
error).
For additional information, see “Working with files needing attention (replication and
access path errors)” on page 210, “Working with tracking entries” on page 219, and
“Working with objects in error” on page 224.
Problems reflected in the Audits/Recov./Notif. field
When the Audits field is highlighted in reverse red, at least one audit has failed, has
unresolved differences, is out of compliance, or was not run due to a policy. When it is
highlighted in reverse yellow, at least one out-of-compliance audit is currently active
or an audit is approaching out of compliance. For more information about audits, see
“Displaying audit runtime status” on page 129.
The Recov. (recoveries) field indicates the number of active recoveries, including
those resulting from audits. Active recoveries are an indication of problems detected
by MIMIX AutoGuard which is attempting to correct them. For more information about
recoveries, see “Displaying recoveries” on page 164.
When the Notif. (notifications) field is highlighted in reverse red, at least one new
notification with a severity of *ERROR exists. When it is highlighted in reverse yellow,
at least one new notification with a severity of *WARNING exists. For more
information about notifications, see “Displaying notifications” on page 160.
Problems reflected in the Data Group column
When a data group name is highlighted in color, journaling problems exist that affect
replication of one or more types of data.
Table 23. Conditions which highlight the data group name in color.

Red - One of the following conditions exists:
• Files, IFS tracking entries, or object tracking entries defined to the data group are not journaled or not journaled correctly on the source system.
• The source side journal is in standby or inactive state.

Yellow - One of the following conditions exists:
• Files, IFS tracking entries, or object tracking entries defined to the data group are not journaled or not journaled correctly on the target system. This is only enforced if the data group is set up to journal on the target system as defined in the data group definition.
• Data group file entries, IFS tracking entries, or object tracking entries are on hold for reasons other than an error.
• The journal cache value for the source journal does not match the configured value in the journal definition.
• The journal cache value for the target journal does not match the expected cache value and the database apply session is active. If another data group is using the journal definition as a source journal, the actual journal cache value may be different than the configured value.
• The target journal state value for the target journal does not match the expected state value and the database apply session is active. If another data group is using the journal definition as a source journal, the actual state may be different than the configured value.
Note: In a cooperative processing environment, files, IFS tracking entries, or object
tracking entries being added dynamically to the configuration for user journal
replication may reflect an intermediate state of not journaled until they have been
synchronized and become active to MIMIX.
Resolving problems highlighted in the Data Group column
In most environments, the most likely causes indicated in Table 23 are problems with
journaling. Problems associated with journal state or journal cache are only reported
in data groups which are configured to use those high availability journal performance
enhancements.
Journaling problems: If the data group name is highlighted in red or yellow, do the
following to check for and resolve journaling problems:
1. Check for not journaled conditions for each of the following:
• To determine which files are not journaled, use option 17 (File entries) for the data group. Then press F10 (journaled view) to see journaling status.
• To determine which IFS tracking entries are not journaled, use option 50 (IFS tracking entries) for the data group. Then press F10 (journaled view) to see journaling status.
• To determine which object tracking entries are not journaled, use option 52 (object tracking entries) for the data group. Then press F10 (journaled view) to see journaling status.
2. To start journaling for a file or a tracking entry, use option 9 (Start journaling).
3. To verify that journaling has started, use option 11 (Verify journaling).
Journal cache or journal state problems: If the data group name is highlighted in
red or yellow, do the following to check for and resolve problems:
1. From the Work with Data Groups display, use option 8 (Display status).
2. From the Data Group Status display, press F8 (Database).
3. The Jrn State and Cache Src and Tgt fields are located in the upper left corner of
the Data Group Database Status display. For each system (Src or Tgt), the status of
the journal state is shown first, followed by the status of the journal cache. The
example below shows v as a placeholder value in all four status positions. If any of
these fields are highlighted, there is a problem. Use “Resolving a problem with journal
cache or journal state” on page 119.
Jrn State and Cache
Src: v v
Tgt: v v
Manager problems reflected in the Source and Target columns
The status of needed system-level processes is reflected in the Mgr column for the
source and target system. The managers must be active for replication to occur. For
any status other than A (active), use “Working with system-level processes” on
page 149.
Replication problems reflected in the Source and Target columns
The status of each process is represented by a status letter and the color of the box
surrounding the letter. Table 24 describes the letters and colors used for status of the
replication process summaries shown in the Source and Target columns.
Table 24. Possible status values for source and target process summaries

I - Inactive (highlighted red) – The process is currently not active.
L - Inactive RJ link (highlighted red) – The RJ link is currently not active. This status is only displayed in the database source column when a data group uses MIMIX RJ support.
A - Active (highlighted blue) – The process is currently active. For the database source column, this value indicates that the send/receive processes are active.
C - RJ Catch-up mode (highlighted blue) – The remote journal is currently in catch-up mode. This status can only be displayed in the database source column for data groups that use remote journaling. Catch-up mode indicates that the operating system is transferring journal entries from the source system journal to the remote journal as quickly as possible. When the database reader process is active, MIMIX processes the journal entries as they reach the target system.
R - Active RJ link (highlighted blue) – The RJ link is currently active. This status is only displayed in the database source column when a data group uses MIMIX RJ support.
U - Unknown (highlighted white) – The status of the process cannot be determined, possibly because of an error or communications problem.
J - RJ Link in Threshold (highlighted turquoise) – The RJ link has fallen behind its configured threshold. View detailed status to determine the extent of the backlog.
T - Threshold reached (highlighted turquoise) – A process has fallen behind a configured threshold. View detailed status to determine which process has exceeded its backlog threshold and to determine the extent of the backlog. See “Working with the detailed status of data groups” on page 105.
X - Switch mode (highlighted red) – The data group is in the middle of switching the data source system and status may not be retrievable or accurate.
P - Partially active (highlighted red) – At least one subprocess is active, but one or more subprocesses are not active. This status is only displayed in process columns that represent multiple processes. The data group name may also be shown in a highlighted field of red. In the Target DB column, partial status is also possible when all other processes, including database apply, are active but access path maintenance1 is enabled and does not have at least one active job.
D - Disabled – The process is currently not active and the data group is disabled. Note: The status value for a disabled data group is the letter D displayed in standard format. No colored blocks are used.
W - Waiting at a recovery point (highlighted red) – The process is currently suspended at a recovery point.

1. Access path maintenance is available only on installations running 7.1.15.00 or higher.
Note: Use F10 (Legend) to view a pop-up window that displays the status values
and colors. To remove the pop-up window, press Enter or F12 (Cancel).
Setting the automatic refresh interval
You can control how frequently the data shown on the Work with Data Groups display
is refreshed by doing the following:
1. Press F9 (Automatic refresh).
2. The Automatic Refresh Value pop-up appears. Specify how long you want the
system to wait before refreshing the information and press Enter.
The status displayed will automatically refresh when the specified interval passes. To
end the automatic refresh process, press Enter.
Working with the detailed status of data groups
Basic support for detailed data group status is available in the 5250 emulator
interface.
The Data Group Status display (DSPDGSTS command) uses multiple views to
present status of a single data group. The views identify and provide status for each of
the processes used by the data group. Error conditions for the data group as well as
process statistics and information about the last entry processed by each replication
process are included. Some fields are repeated on more than one view.
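The display can also be requested directly from a command line with the DSPDGSTS command. The parameter name and the data group name shown here are assumptions for illustration only; prompt the command with F4 to see the parameters used in your installation:
DSPDGSTS DGDFN(CRITICALAP LONDON CHICAGO)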
The data group configuration determines what fields are visible. If the data group is
database only, the object fields are not shown. Similarly, if the data group is object
only, the database fields are not shown.
Displaying data group detailed status
Detailed status is available for one data group at a time. There are multiple ways of
locating and subsetting to the data group.
Do the following to access detailed status for a data group:
1. Use one of the following to locate the data group you want:
• To select a data group from a list of all data groups in the installation, select option 6 (Work with data groups) on the MIMIX Basic Main Menu and press Enter.
• To select a data group from a subsetted list for an application group, from the Work with Application Groups display use option 13 (Data resource groups) to select a resource group. On the resulting display use option 8 (Data groups).
2. The Work with Data Groups display appears. Type an 8 (Display status) next to
the data group you want and press Enter.
3. The Data Group Status display shows a merged view of data group activity on the
source and target systems. (See Figure 13.)
Only fields for the type of information replicated by the data group are displayed.
For example, if the data group replicates only objects from the system journal, you
will only see fields for system journal replication. If the data group replicates from
both the system journal and the user journal, you will see fields for both. To see
additional status information for object processes or database processes, do the
following:
• If the data group contains object information, press F7 (Object) to view additional object status displays. The Data Group Object Status display appears.
• If the data group contains database information, press F8 (Database) to view additional database status displays. The Data Group Database Status display appears. Tracking entry information for advanced journaling is also available.
4. For object information, there are three views. For database information, there are
four views available. Use F11 to change between views.
Note: If the data group contains both database and object information, you can
toggle between object details and database details by using the F7 and F8
keys.
Merged view
The initial view displayed is the merged view. This view summarizes status for the
replication paths configured for the data group. The status of each process is
represented by the color of the box surrounding the process and a status letter. Table
25 shows possible status values.
Figure 13 shows a sample of the merged view of the Data Group Status display. The
data group in this view is configured for user journal replication using remote
journaling and for system journal replication. Also, access path maintenance is
enabled.
Figure 13. Merged view of data group status. The inverse highlighted blocks are not shown in
this example.
[Screen image: the merged view of the Data Group Status display for data group CRITICALAP. The top of the display shows database errors (1), objects in error/active (4/0), elapsed time, transfer definition PRIMARY-A, and RJ link state *ASYNCPEND. The Source Statistics area (system LONDON-A) shows journal manager, RJLNK monitor, database Link, and object Send status with receiver, sequence number, date, time, and transactions-per-hour values, plus entries not read and estimated time to read; the object send job is *SHARED. The Target Statistics area (system CHICAGO-A) shows journal manager, database reader, access path maintenance, RJLNK monitor, system and user target journal inspection, database apply, and object apply status with last received sequence numbers, unprocessed entry counts, transactions per hour, and estimated time to apply.]
Note: Journal sequence numbers shown in the Source Statistics and Target
Statistics areas may be truncated if the journal supports *MAXOPT3 for the
receiver size and the journal sequence number value exceeds the available
display field. When truncation is necessary, the most significant digits (left-most)
are omitted. Truncated journal sequence numbers are prefixed by '>'. This is
shown in Figure 13.
Table 25. Possible values for detailed status. Not all statuses are used by each process.

Red – When displayed on the Data group, Database errors, or Objects in error fields, a problem exists that requires action.
Red - I – The process is inactive.
Red - W – The process is suspended at a recovery point. This status is only available for apply processes.
Yellow – When displayed on the Data group field, a problem exists that may require attention.
Yellow - P – One or more of the processes is active but others are inactive. On the merged view, this status is only possible for the Object Send field.
Turquoise - T – The process has a backlog which exceeds its configured threshold. On fields which summarize status for multiple processes, use F7 and F8 to view the specific threshold. The -T is not shown in statistical fields. If a threshold condition persists over time, refer to the MIMIX Administrator Reference book for information about possible resolutions.
White - U – The status of the process is unknown.
Blue - A – The process is active.
Blue - C – The RJ Link is in catch-up mode. This status is only possible for the Database Link process in the merged view and the RJ link field in some database views.
Green - D – The data group is disabled. This also means the data group is currently inactive.
Top left corner: The top left corner of the Data Group Status display identifies the
data group, the elapsed time, and the status of the transfer definition in use. The
elapsed time is the amount of time that has elapsed since you accessed this display
or used the F10 (Restart statistics) key.
Top right corner: The top right corner of the display identifies the number of errors
identified by MIMIX. If the workstation supports colors, the number of files and objects
in error will be displayed in red.
• The Database errors field identifies the number of errors in user journal replication processes. This includes all file entries, IFS tracking entries, and object tracking entries in error. When access path maintenance1 is enabled, this also includes the number of logical and physical files that have access path maintenance failures for the data group.
• The Objects in error/active fields indicate the number of objects that are failed and the number of objects with pending activity entries. The first number in these fields indicates the number of objects defined to the data group that have a status of *FAILED. The second number indicates the number of objects with active (pending) activity entries.
• The State field identifies the state of the remote journal link. The values for the state field are the same as those which appear on the Work with RJ Links display. This field is not shown if the data group uses source-send processes for user journal replication.
1. Access path maintenance is available only on installations running MIMIX 7.1.15.00 or higher.
Source statistics: The middle of the display shows status and summarized statistics
for the journals being used for replication and the processes that read from them. The
following process fields are possible:
System - Identifies the current source system definition. The status value is an
indication of the success in communicating with that system.
Jrn Mgr - Displays the status of the journal manager process for the source
system.
DA Poll - Displays the status of the data area poller. This field is present only if the
data group replicates data areas using this process.
RJLNK Mon - Displays status of the RJLNK monitor on the source system. This
field is present only for data groups that use remote journaling.
Database (Link or Send) - Identifies the status of the process which transfers user
journal entries from the source system to the target system.
Link - Displayed when the data group is configured for remote journaling. The
status is that of the RJ link.
Send - Displayed when the data group is configured for MIMIX source-send
processes. The status is that of the database send process.
Object Send - Displays a summation of status from the object send, object
retrieve, and container send processes. The highest priority status from each
process determines the status displayed. Use F7 (Object view) to see the
individual processes. When the data group uses a shared object send job, either
the value *SHARED or a three-character job prefix is displayed below the Send
process status. The value *SHARED indicates that the data group uses the MIMIX
generated shared object send prefix for this source system. A three-character
prefix indicates this data group uses a shared object send job on this system that
is shared only with other data groups which specify the same prefix.
For the Database and Object processes, additional fields identify current journal
information, the last entry that has been read by the process, and statistics related to
arrival rate, entries not read, and estimating the time to read.
Current - For the Database Send and Object Send processes, this identifies the
last entry in the currently attached journal receiver. This information is used to
show the arrival rate of entries to the journals.
Note: If the data group uses remote journaling, current information is displayed
in two rows, Source jrn and RJ tgt jrn. The source journal sequence
number refers to the last sequence number in the local journal on the
source system. The remote journaling target journal sequence number
refers to the last sequence number in the associated remote journal on the
target system.
Transactions per hour - For current journal information, this is based on the
number of entries to arrive on the journal over the elapsed time the statistics have
been gathered. For last read information, this is based on the actual number of
entries that have been read over the elapsed time the statistics have been
gathered.
Last Read - Identifies the journal entry that was last read and processed by the
object send, database send, or database reader.
Transactions per hour - For current journal fields, this is based on the number of
entries to arrive on the journal over the elapsed time the statistics have been
gathered. For last read fields, this is based on the actual number of entries that
have been read over the elapsed time the statistics have been gathered and will
change due to elapsed time and the rate at which entries arrive in the journal.
Entries not read - This is a calculation of the number of journal entries between the
last read sequence number and the sequence number of the last entry in the
current receiver for the source journal. An asterisk (*) preceding this field indicates
that the journal receiver sequence numbers have been reset between the last
entry in the current receiver and the last read entry.
Estimated time to read - This is a calculation using the entries not read and the
transactions per hour rate. This calculation is intended to provide an estimate of
the length of time it may take the process (database reader, database send, or
object send) to complete reading the journal entries.
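As an illustration (the numbers here are hypothetical, not taken from the figures): if the Entries not read field shows 50,000 and the last read rate is 20,000 transactions per hour, the estimated time to read is approximately 50,000 / 20,000 = 2.5 hours. Because the rate is recalculated as statistics are gathered, the estimate changes over time.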
Target statistics: The lower part of the display shows status and summarized
statistics for all target system processing. The following process fields are possible:
System - Identifies the current target system definition. The status value is an
indication of the success in communicating with that system.
Jrn Mgr - Displays the status of the journal manager process for the target system.
DB Rdr - Displays status of the database reader. This field is present only for data
groups that use remote journaling.
AP Maint - Displays status of the access path maintenance1 processes. This field
is only present when optimized access path maintenance has been enabled.
RJLNK Mon - Displays status of the RJLNK monitor on the target system. This
field is present only for data groups that use remote journaling.
Sys Jrn Insp - Displays the status of target journal inspection for the system
journal (QAUDJRN) on the target system of the data group. This field is displayed
when the journal definition for the system journal on the current target system
permits target journal inspection and the data group is enabled and has been
started at least once.
1. Access path maintenance is available only on installations running MIMIX 7.1.15.00 or higher. In
earlier levels of MIMIX, if parallel access path maintenance is enabled, its status is displayed in
the Prl AP Mnt field that appears in this location.
User Jrn Insp - Displays the status of target journal inspection for the user journal
on the target system of the data group. This field is displayed when the journal
definition for the user journal on the current target system permits target journal
inspection and the data group is enabled, performs user journal replication,
permits journaling on target, and has been started at least once.
DB Apply and Obj Apply - Each field displays the combined status for the apply
jobs in use by the process. For each process, additional fields show statistics for
the last received journal sequence number, number of unprocessed entries,
approximate number of transactions per hour being processed, and the
approximate amount of time needed to apply the unprocessed transactions for all
database or object apply sessions.
Object detailed status views
Figure 14, Figure 15, and Figure 16 show samples of the information available when
you use F7 (Object) to view the detailed object information. Use F11 to move between
the three views of detailed object status. On each view, you can use the F1 (Help) key
to see a description of that view’s contents.
In all object views, journal sequence numbers may be truncated if the journal supports
*MAXOPT3 for the receiver size and the journal sequence number value exceeds the
available display field. When truncation is necessary, the most significant digits (leftmost) are omitted. Truncated journal sequence numbers are prefixed by '>'.
The possible status values are indicated in Table 25, along with additional status
indications that are unique to several system journal replication processes, described
below.
The Min, Act, and Max fields for the Retrieve, Send, and Apply processes indicate the
minimum, active, and maximum number of jobs for each process. The number of
active jobs varies based on the workload. The active count is highlighted with color for
the following conditions:
Red - The number of active jobs is zero (0).
Yellow - The number of active jobs is greater than zero (0) but less than the
minimum number of processes.
Turquoise - The process has a backlog that exceeds its configured threshold.
When this occurs, the backlog field for the process is also highlighted in the color
turquoise.
Blue - The number of active jobs is equal to or greater than the minimum number
of processes.
Figure 14 and Figure 17 show the active count highlighted.
Figure 14. Data group detail status, object view 1.
[Screen image: the Data Group Object Status display (object view 1) for data group CRITICALAP on system CHICAGO. It shows the send process and journal manager status, current and last read journal receiver information with entries not read and estimated time to read, the Object Retrieve/Container Send area with Min/Act/Max job counts, retrieve and send backlogs, and containers sent per hour, and the Object Apply area with Min/Act/Max job counts, apply backlog, active objects, last applied sequence number, and entries applied per hour.]
Figure 15. Data group detail status, object view 2.
[Screen image: the Data Group Object Status display (object view 2) for data group CRITICALAP. The top of the display shows the same send process, journal manager, and journal receiver information as view 1; the Object Apply area shows the Min/Act/Max job counts, apply backlog, and the sequence number, type, and name of the last applied object.]
Figure 16. Data group detail status, object view 3.
[Screen image: the DG Object Journal Entry Detail display (object view 3) for data group CRITICALAP. For the source system (LONDON-A) it shows the current, last read, and last received journal entries with entry code, sequence number, receiver, date, and time. For the target system (CHICAGO-A) it shows Object Send and Object Apply entry detail, including the active and processed entries with their sequence numbers, object types, and object names.]
Database detailed status views
Figure 17, Figure 18, Figure 19, and Figure 20 show samples of the information
available when you use F8 (Database) to view the detailed database information. On
each view, you can use the F1 (Help) key to see a description of that view’s contents.
In database views that include sequence numbers, the journal sequence numbers
may be truncated if the journal supports *MAXOPT3 for the receiver size and the
journal sequence number value exceeds the available display field. When truncation
is necessary, the most significant digits (left-most) are omitted. Truncated journal
sequence numbers are prefixed by '>'.
Most fields that display status of a process have some or all of the possible values
indicated in Table 25. Possible values for the Jrn State and Cache (Src and Tgt) fields
are indicated in Table 27 (journal state) and Table 28 (journal cache).
The data group configuration determines whether the Send process field is replaced
by the RJ Link field. When remote journaling is configured, the RJ Link and DB Rdr
fields are shown.
The AP Maint field is displayed on views 1 and 2 (Figure 17 and Figure 18) only when
the access path maintenance1 policy is enabled. When present, this field displays the
status of the access path maintenance job that persists while the database apply
process is active.
1. Access path maintenance is available only on installations running 7.1.15.00 or higher.
In the top right corner of database views 1 and 2 (Figure 17 and Figure 18), these
fields display combined counts of replicated entries and errors for file entries, IFS
tracking entries, and object tracking entries:
• File and Tracking entries
• Not journaled Src Tgt - If the number of not journaled errors on either system exceeds 99,999, that system’s field displays +++++.
• Held due to error
• Access path maint. errors
• Held for other reasons
Database view 4 (Figure 20) separates this information into columns for file entries,
IFS tracking entries, and object tracking entries.
If a data group has multiple database apply sessions you will see an entry for each
session in the Apply Status column on database views 1, 2, and 3 (Figure 17, Figure
18, and Figure 19). Each session has its own status value. In these sample figures
there is only one apply session (A) which is active (-A).
Figure 17. Data group detail status—database view 1.
In this example, the Link status of -A and the presence of the Reader status indicate that the
data group uses remote journaling and access path maintenance. The display also shows that
journal standby state is active and journal caching is not active. The unprocessed entry count
indicates that the final journal entry has not been applied. The > character preceding
sequence numbers for the apply session indicates truncated sequence numbers that are associated with *MAXOPT3 support.
[Screen image: the Data Group Database Status display (database view 1) for data group CRITICALAP on system CHICAGO. The top of the display shows the Jrn State and Cache fields (Src: A N, Tgt: A N), the RJ link, access path maintenance, journal manager, and database reader status, and counts of file and tracking entries that are not journaled, held due to error, have access path maintenance errors, or are held for other reasons. Receiver, sequence number, date, and time are shown for the source journal, RJ target journal, and last read entry. The Database Apply area shows, for apply session A, the received and processed sequence numbers, unprocessed entry count, transactions per hour, estimated time to apply, and open commit indicator.]
Figure 18. Data group database status—view 2.
In this example, the Link status of A and the presence of the Reader status indicate that the
data group uses remote journaling. The display also shows that access path maintenance is
used and active, and that journal standby state is active and journal caching is not active.
[Screen image: the Data Group Database Status display (database view 2) for data group CRITICALAP. The top of the display is the same as view 1; the Database Apply area shows, for apply session A, the received and apply point sequence numbers, clock time difference, hold MIMIX log sequence number, and open commit ID.]
Figure 19. Data group database status, view 3.
[Screen image: the DG Database Jrn Entry Detail display (database view 3) for data group CRITICALAP. For the source system (LONDON-A) it shows the current, RJ target, last read, and last received journal entries with entry code, sequence number, receiver, date, and time. For the target system (CHICAGO-A) the Database Apply area shows the last applied entry for apply session A with its sequence number, date, time, object, library, and member.]
Figure 20. Data group detail status—database view 4. In this example, the combined number
of file and tracking entries shown in Figure 17 and Figure 18 are separated into separate columns for file entries, IFS tracking entries, and object tracking entries.
[Screen image: the File and Tracking Entry Status display (database view 4) for data group CRITICALAP. Separate columns for file entries, IFS tracking entries, and object tracking entries show the number of entries, not journaled on source, not journaled on target, held due to error, access path maintenance errors, and held for other reasons.]
Identifying replication processes with backlogs
If replication processes are active and have no reported error conditions, a replication
process that has exceeded its backlog threshold will have a status that reflects this
condition. However, if a replication process is inactive or has an error condition with a
higher priority status, the threshold condition will not be visible in the process status
until the process is started or the problem is resolved. Also, a backlog may exist but
not be large enough to exceed the threshold setting, or the threshold warning setting
may have been disabled (set to *NONE).
Do the following to check for a backlog condition:
1. To access the details for a data group, use the procedure in “Displaying data
group detailed status” on page 105.
2. Use F7 or F8 on the Data Group Status display to locate the appropriate view for
the process you want to check. Table 26 identifies this information and the
appropriate fields for each process.
Table 26. Location of fields which identify backlogs and threshold conditions for replication processes

RJ Link
Description: The backlog is the quantity of source journal entries that have not been transferred from the local journal on the source system to the remote journal on the target system. The time difference between the last entry in each journal can also be an indication of a backlog.
View: Merged view, Database views 1 and 2
Fields highlighted when threshold exceeded: Differences between journal entries identified by Source Jrn and RJ Tgt Jrn for the database link.
Fields to check for backlog: RJ tgt jrn Sequence # (1); RJ tgt jrn Date and Time (2)

DB Reader or DB Send
Description: The backlog is the quantity of journal entries that are waiting to be read by the process. The time difference between the last entry that was read by the process and the last entry in the journal on the source system can also be an indication of a backlog. This may be a temporary condition due to maximized log space capacity. If the log space capacity was reached, the database reader job will be idle until the database apply job is able to catch up. If the condition is unable to resolve itself, action may be required.
View: Merged view, Database views 1 and 2
Fields highlighted when threshold exceeded: For remote journaling configurations, differences between journal entries identified by Source Jrn and Last Read. For MIMIX source-send configurations, differences between journal entries identified by Current and Last Read.
Fields to check for backlog: Entries not read Sequence # (1); Last Read Date and Time (2)

DB Apply
Description: The backlog is the number of entries waiting to be applied to the target system. Each apply session is listed as a separate entry with its own backlog.
View: Database views 1, 2, and 3
Fields highlighted when threshold exceeded: Unprocessed Entry Count
Fields to check for backlog: Apply Status; Unprocessed Entry Count

Object Send
Description: The backlog is the quantity of journal entries that have not been read from the system journal. The time difference between the last entry that was read by the process and the last entry in the system journal can also be an indication of a backlog. Multiple data groups sharing the object send job is one possible cause of a persistent backlog.
View: Merged view, Object views 1, 2, and 3
Fields highlighted when threshold exceeded: Differences between transactions identified for Object Current and Last Read
Fields to check for backlog: Entries not read Sequence # (1); Last Read Date and Time (2)

Object Retrieve
Description: The backlog is the number of entries for which MIMIX is waiting to retrieve objects.
View: Object views 1 and 2
Fields highlighted when threshold exceeded: Retrieve Backlog
Fields to check for backlog: Retrievers, Act column; Retrieve Backlog

Container Send
Description: The backlog is the number of packaged objects for entries that are waiting to be sent to the target system.
View: Object views 1 and 2
Fields highlighted when threshold exceeded: Container Send Backlog
Fields to check for backlog: Senders, Act column; Container Send Backlog

Object Apply
Description: The backlog is the number of entries waiting to be applied to the target system.
View: Object views 1 and 2
Fields highlighted when threshold exceeded: Apply Backlog
Fields to check for backlog: Applies, Act column; Apply Backlog

Notes:
1. When highlighted, the threshold journal entry quantity criterion is exceeded.
2. When highlighted, the threshold time criterion is exceeded.
Data group status in environments with journal cache or
journal state
Additional information is reported within data group status for data groups configured
to use MIMIX support for IBM’s High Availability Journal Performance feature (IBM i
option 42), which provides journal standby state and journal caching. When these
high availability journal performance enhancements are in use, conditions that
require action or attention are reflected in these locations:
• The data group name is highlighted on the Work with Data Groups display. The possible problems associated with journal cache or journal state are identified in Table 23 in topic “Problems reflected in the Data Group column” on page 101.
• The Jrn State and Cache (Src and Tgt) fields within the data group detailed status are highlighted. These fields are on database views 1 and 2 of the Data Group Database Status display (Figure 17 and Figure 18 respectively, shown in “Database detailed status views” on page 112). The possible values for the Jrn State and Cache (Src and Tgt) fields are indicated in Table 27 (journal state) and Table 28 (journal cache).
The Jrn State and Cache (Src and Tgt) fields reflect the actual journal standby state
and journal caching values for the journals when the IBM high availability performance
enhancements are installed on the systems defined to the data group. These fields
appear on database views 1 and 2 (Figure 17 and Figure 18). The target journal state
and cache values are set on the journal when the database apply session is started.
Journal State - The status values indicate the actual state value for the source and
target journals. Table 27 shows the possible values for each field.
Journal Cache - The status values indicate the actual cache value for the source and
target journals. Table 28 shows the possible values for each field.
For each system (Src or Tgt) status of the journal state is shown first, followed by the
status of the journal cache. If a problem exists with journal state or journal cache, the
data group name is also highlighted with the same color. For information about
resolving journal cache or journal state problems, see “Resolving a problem with
journal cache or journal state” on page 119.
Table 27. Possible status values for Journal State fields

Either system:
  White U - Unknown. MIMIX was not able to retrieve values, possibly because the journal environment has not yet been built.
  No color A - Journal state is active.
  No color X - The required IBM feature, IBM i option 42 - High Availability Journal Performance, is not installed on this system.
  No color S - Journal is in standby state as expected.
Source:
  Red S - Source journal is in standby state but that state is not expected.
  Red I - Source journal is in inactive state but that state is not expected.
Target:
  Yellow S - Target journal state or cache is not as expected and the database apply session is active.
  Yellow I - Target journal state is inactive but that state is not expected.
  (blank) - The IBM feature is installed but the data group is configured to not journal on the target system.

Table 28. Possible status values for Journal Cache fields

Either system:
  White U - Unknown. MIMIX was not able to retrieve values, possibly because the journal environment has not yet been built.
  No color X - The required IBM feature, IBM i option 42 - High Availability Journal Performance, is not installed on this system.
  No color Y - Caching is active.
  No color N - Caching is not active.
Source:
  Yellow Y - Source journal cache value is not as expected.
  Yellow N - Source journal cache value is not as expected.
Target:
  Yellow Y - Target journal cache value is not as expected and the database apply session is active.
  Yellow N - Target journal cache value is not as expected and the database apply session is active.
  (blank) - The IBM feature is installed but the data group is configured to not journal on the target system.
Resolving a problem with journal cache or journal state
Problems with journal state or journal cache can cause the name of a data group to
be highlighted on the Work with Data Groups display. If the data group name is
highlighted in red or yellow, do the following to check for and resolve problems:
1. From the Work with Data Groups display, use option 8 (Display status).
2. From the Data Group Status display, press F8 (Database).
3. The Jrn State and Cache Src and Tgt fields are located in the upper left corner of the Data Group Database Status display. For each system (Src or Tgt), the status of the journal state is shown first, followed by the status of the journal cache. The example below shows v for value in all four status positions. Based on the status
displayed in these fields, you can take the actions described in the following steps
to correct the problem:
Jrn State and Cache
Src: v v
Tgt: v v
4. Source system journal state (first Src: value) - If the source system state is red
and the value for the journal state is standby (S) or inactive (I), the journal state
must be changed and all data replicated through the user journal must be
synchronized. Do the following:
a. Press F12 (Cancel) to return to the Work with Data Groups display. Note which
system is specified as the source system for the data group.
b. Use option 45 (Journal Definitions) to view the journal definitions used for the
data group in error.
c. On the Work with Journal Definitions display, determine the journal name and
library specified for the system that is the source system for the data group.
d. Specify the name and library of the source system journal in the following
command:
CHGJRN JRN(library/name) JRNSTATE(*ACTIVE)
e. All data replicated through the user journal must be synchronized. For detailed
information about synchronizing a data group, refer to your Runbook or to the
MIMIX Administrator Reference book.
5. Source system journal cache (second Src: value) - If the source system cache
is yellow, the actual status does not match the configured value in the journal
definition used on the source system. Do the following:
a. Press F12 (Cancel) to return to the Work with Data Groups display. Note which
system is specified as the source system for the data group.
b. Use option 45 (Journal Definitions) to view the journal definitions used for the
data group in error.
c. On the Work with Journal Definitions display, use option 5 (Display) next to the journal definition listed for the source system.
d. Check the value of the Journal caching (JRNCACHE) parameter.
e. Determine which value is appropriate for journal cache, the configured value or
the actual status value. Once you have determined this, either change the
journal definition value or change the journal cache (CHGJRN command) so
that the values match.
6. Target system state (first Tgt: value) or Target system cache (second Tgt:
value) - If the target system state or cache is yellow, the actual value for state or
cache does not match the configured value. Do the following:
a. Press F12 (Cancel) to return to the Work with Data Groups display. Note which
system is specified as the target system for the data group.
b. Use option 45 (Journal Definitions) to view the journal definitions used for the
data group in error.
c. On the Work with Journal Definitions display, use option 5 (Display) next to the journal definition listed for the target system.
d. Check the value of the following parameters, as needed:
• Target journal state (TGTSTATE)
• Journal caching (JRNCACHE)
e. Determine why the actual status of the journal state or journal cache does not
match the configured value of the journal definition used on the target system.
f. Determine which values are appropriate for journal state and journal cache, the
configured value or the actual status value. Once you have determined this,
either change the journal definition value or change the journal state or cache
(CHGJRN command) so that the values match.
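For example, if you determine in steps 5e and 6f that the journal definition values are correct and the actual values on the journal should change to match them, commands similar to the following can be used. This is a sketch only; the journal library and name are placeholders, substitute the state and caching values that match your configuration, and note that standby state and journal caching require IBM i option 42:
/* Change the journal state to match the configured Target journal state value */
CHGJRN JRN(JRNLIB/JOURNAL) JRNSTATE(*STANDBY)
/* Change journal caching to match the configured Journal caching value */
CHGJRN JRN(JRNLIB/JOURNAL) JRNCACHE(*YES)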
CHAPTER 7
Working with audits
Audits are defined by and invoked through rules and influenced by policies. Aspects
of audits include schedules, status, reported results, and their compliance status.
MIMIX is shipped so that auditing can occur automatically. For day-to-day operations,
auditing requires minimal interaction to monitor audit status and results. MIMIX user
interfaces separate audit runtime status, compliance status, and scheduling
information onto different views to simplify working with audits. Compliance errors and
runtime errors require different actions to correct problems.
This chapter provides information and procedures to support day-to-day operations
as well as to change aspects of the auditing environment. The following topics are
included.
• “Auditing overview” on page 122 describes concepts associated with auditing and describes the differences between automatic priority audits and automatic scheduled audits.
• “Guidelines and considerations for auditing” on page 126 identifies considerations for specific audits, auditing best practices, and recommendations for checking the audit results.
• “Displaying audit runtime status” on page 129 identifies the Audit Summary interfaces and provides procedures for common activities with audits, such as running audits immediately and resolving reported problems.
• “Displaying audit history” on page 137 describes how to display history for specific audits of a data group.
• “Working with audited objects” on page 139 describes how to display a list of objects compared by one or more audits.
• “Working with audited object history” on page 142 describes how to access the audit history for a specific object.
• “Displaying audit compliance” on page 144 identifies the Audit Compliance interfaces and describes how to determine if an audit has a compliance problem.
• “Displaying scheduling information for automatic audits” on page 147 describes how to access the Audit Schedule interfaces, how to display when prioritized audits will run, and how to display when scheduled audits will run.
Auditing overview
All businesses run under rules and guidelines that may vary in the degree and in the
methods by which they are enforced. In a MIMIX environment, auditing provides rules
and enforcement of practices that help maintain availability and switch-readiness at
all times.
Not using audits, or limiting their use, does little to confirm the integrity of your data. These
approaches can mean lost time and issues with data integrity when you can least
afford them.
In reality, successful auditing means finding the right balance somewhere between
these approaches:
• Audit your entire replication environment every day. The benefit of this approach is knowing that your data integrity exposure is limited to data that changed since the last audit. The trade-off with this approach can be the time and resources required to perform audits.
• Audit only replicated data that “needs” auditing. This approach can be faster and use fewer resources because each audit typically has fewer objects to check. The trade-offs are determining what needs auditing and knowing when objects were last audited.
MIMIX makes auditing easy by automatically auditing all objects periodically and
auditing a subset of objects every day. MIMIX also provides the ability to fine-tune
aspects of auditing behavior and their automatic submission and the ability to
manually invoke an audit at any time.
Components of an audit
Together, three components identify a unique audit. Each component must exist to
allow an audit to run.
Rule - A program by which an audit is defined and invoked. Each rule shipped with
MIMIX pre-defines a compare command to be invoked and the possible actions
that can be initiated, if needed, to correct detected problems. When invoked, each
rule can check only the class of objects associated with its compare command.
Names of rules shipped with MIMIX begin with the pound sign (#) character.
Data group - A data group provides the context of what to check and how results
are reported. Multiple audits (rules) exist for each data group.
Note: Audits are not allowed to run against disabled data groups.
Schedule - Each unique combination of audit rule and data group has its own
schedule, by which it is automatically submitted to run. MIMIX ships default
scheduling information associated with each shipped rule. Scheduling can be
adjusted for individual audits through policies. A manually invoked audit can be
thought of as an immediate override of scheduling information.
Although people use the terms “audit” and “rule” interchangeably, a rule is a
component of an audit. The process of auditing runs a rule program.
Phases of audit processing
The process of auditing consists of a compare phase and a recovery phase.
In the compare phase of an audit, the identified audit rule initiates a specific compare
command against the data group. The Audit level policy determines if an audit is
allowed to run and how aggressively an audit checks your environment during its
compare phase. If a shipped audit rule provides more than one audit level, each level
provides increasingly more checking capability.
If there are detected differences when the compare phase completes, the audit enters
its recovery phase to start automatic recovery actions as needed. MIMIX attempts to
correct the differences and sends generated reports, called recoveries, to the user
interface. MIMIX removes these generated reports when the recovery action
completes successfully. If the recovery job fails to correct the problem, MIMIX
removes the recovery and sends an error notification to the user interface.
Most audit rules support a recovery phase. MIMIX is shipped with defaults that enable
audits to enter the recovery phase automatically when needed. The recovery phase
can be optionally disabled in the Automatic audit recovery policy.
Object selection methods for automatic audits
MIMIX provides two approaches to performing audits automatically. The biggest
difference between these approaches is how objects are selected to be audited. The
other significant difference is when each type of audit is allowed to run.
• In scheduled object auditing, an audit run selects all objects that are configured for the data group and within the class of objects checked by the audit. MIMIX automatically runs an audit according to its specified scheduling criteria. Each time a scheduled audit runs, all eligible configured objects are selected.
• In prioritized object auditing, an audit run selects replicated objects according to their internally assigned priority category and an auditing frequency assigned to the category. The result is often a subset of the objects replicated by the data group. Each time a prioritized audit runs, its subset of objects selected to check may be unique. MIMIX automatically runs a prioritized audit periodically within its specified time range every day. It may run approximately once per hour or more often during its time range.
An audit that is manually invoked from the Work with Audits display in a 5250
emulator is an immediate run of a scheduled audit. Priority audits cannot be manually
invoked from this display. From Vision Solutions Portal, you have the ability to perform
an immediate run of either method of auditing.
Prioritized auditing can reduce the impact of auditing on resources and performance.
This benefits customers who cannot complete IFS audits, cannot audit every day, or
do not audit at all because of time or resource issues.
When both types of auditing are used, you can achieve a balance between verifying
data integrity and resources. Either or both types of automatic auditing can be
disabled, although that is not recommended.
How priority auditing determines what objects to select
MIMIX determines the auditing priority of each replicated object based on its most
recent change, most recent audit, and the frequency specified for auditing priority
categories. At any time, every replicated object falls within one of several
predetermined categories. Objects in each category are eligible for selection
according to the frequency assigned to their category.
Each prioritized audit runs approximately once per hour, or more often, every day
during its time range specified in the Priority audit policy. Each time the audit starts, it
selects only the objects eligible in each category.
Table 29. Priority auditing categories

Objects not equal - Objects that had any value other than equal (*EQ) in their most recent audit. This includes objects for which a detected difference was automatically resolved. Objects in this category have the highest priority and are always selected.

New objects - A new object is one that has not been audited since it was created.

Changed objects - A changed object is one that has been modified since the last time it was audited.

Unchanged objects - An unchanged object is one that has not been modified since the last time it was audited.

Audited with no differences - An object with no differences is one that has not been modified since the last time it was audited and has been successfully audited with no changes on at least three consecutive audit runs. Objects remain in this category until a change occurs.

Objects in the new, changed, unchanged, and audited-with-no-differences categories are eligible for selection according to the category frequency specified in the Priority audit policy.

Note: The #FILDTA audit always selects all members of a file for which auditing is less than 100 percent complete. This occurs in all of the above object selection categories.
Initially, the objects selected by a prioritized audit may be nearly the same as those
selected by a scheduled audit. However, over time the number of objects selected by
a prioritized audit stabilizes to a subset of those selected by a scheduled audit.
When both scheduled and priority audits are allowed for the same rule and data
group, MIMIX may not start a prioritized audit if the scheduled audit will start in the
near future.
How audits are submitted automatically
When MIMIX is started (STRMMX command), all system-level processes necessary
for replication and auditing are started, including the master monitor. On each system,
the master monitor starts job scheduling activities for auditing. This ensures that
audits are submitted automatically according to the policies in effect for when to run
priority audits and scheduled audits.
The time specified in policies is local to each system. At the appropriate time for each
audit, a job is initiated on each system in the data group. MIMIX uses the Run rule on
system policy to determine where the audit should run and immediately ends the audit
job if it is not on the appropriate system.
For a scheduled audit, the Audit schedule policy determines the time and frequency of
when the audit runs. A scheduled audit can be set to run on specific dates or days of
the week, or on relative days of the month.
For a prioritized audit, the Priority audit policy determines the range of time during
which the audit can start each day. A prioritized audit can run multiple times during the
specified range, approximately once per hour or more often.
If you start replication through procedures or processes that invoke the Start Data
Group (STRDG) command, you also need to ensure that the master monitor is started
on all systems in your installation (STRMSTMON command) so that automatic
auditing can occur.
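For example, a sketch of starting replication for one data group and then ensuring the master monitor is running might look like the following. The data group name is a placeholder, the commands are shown without their optional parameters, the DGDFN parameter keyword is assumed from how data group definitions are specified elsewhere in this book, and the commands would normally be qualified with your installation library:
/* Start replication for the data group */
STRDG DGDFN(EMP AS01 AS02)
/* Start the master monitor so that automatic audits are submitted */
STRMSTMON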
Audit status and results
When audits complete or end in error, their status is reported in the audit summary
user interfaces. In a 5250 emulator, this is on the Work with Audits display (WRKAUD
command). In Vision Solutions Portal, this is the Audits portlet. A summary of all audit
status also “bubbles up” to the level of data group interfaces.
The information available about each audit identifies the status of actions performed
by its rule, how the audit selected objects for comparison, the audit’s compliance
status, policy values which affect the actions of each phase, and scheduling
information. When a phase completes, its timestamps and statistics are also
available.
When audit recoveries are enabled, you can use the Notification severity policy to control the severity level of the notifications that are returned when the rule ends in error.
You can also view job logs associated with notifications and recoveries. Job logs are
accessible from the system on which the audit comparison or recovery job ran.
Audit compliance
Compliance is an indication of whether an audit ran within the time frame of the
compliance thresholds set in auditing policies.
For audits configured for scheduled object auditing or both scheduled and prioritized
object auditing, compliance status is based on the last run of a scheduled audit or a
user-invoked audit. For audits configured for only prioritized object auditing,
compliance status is based on the last run, which may have been a prioritized audit or
a user-invoked audit. A user-invoked audit or a scheduled audit checked all objects
that are configured for the data group and within the class of objects checked by the
audit whereas a prioritized audit may have checked only a subset of those objects.
Guidelines and considerations for auditing
Auditing is most effective when it is performed regularly and you take action to
investigate and resolve any reported differences that cannot be automatically
corrected.
Auditing best practices
Regular auditing helps you detect problems in a timely manner and can help you to
address detected problems during normal operations instead of during a crisis. Policy
values for auditing are shipped with defaults set to values that Vision Solutions
recommends as best practice. New data groups and new installations will
automatically use these policy values. If you determine that default policy values do
not meet your auditing needs, you can customize the policy settings. Auditing best
practices include:
Automatically auditing: MIMIX is shipped so that auditing occurs automatically.
• Allow both priority audits and scheduled audits to run automatically. This provides a balance between checking all objects periodically and checking a subset of objects every day. You can adjust the Priority audit and Audit schedule policies that control when each type of audit is automatically submitted to meet the needs of your environment.
• Allow audits to perform the most extensive comparison possible. The shipped value (level 30) for the Audit level policy enables this. If you choose to run audits at a lower audit level, be aware of the risks, especially when switching.
• Allow audits to perform automatic recovery actions. This provides automatic correction of detected problems. Recovery is possible when the Automatic audit recovery policy is enabled.
• Allow MIMIX to run all audits even if you do not replicate certain object types (such as DLOs). This ensures that if you add new objects in the future, you will be automatically auditing them. Audits that do not have any objects to check complete quickly with little use of system resources.
Manually auditing: In addition, manually invoke audits in these conditions:
• Before switching, run all audits at audit level 30.
• If you make configuration changes, run the #DGFE audit to check actual configuration data against what is defined to your configuration. For more information, see “Considerations for specific audits” on page 127.
Where to run audits: Run audits from a management system. For most environments, the management system is also the target system. If you cannot run rules from the management system due to physical constraints or because of complex configurations, you can change the Run rule on system policy to meet your needs.
Considerations for specific audits
#DGFE audit - This audit is not eligible for prioritized auditing because it checks
configuration data, not objects. As a result, configuration problems for a data group
can only be detected when a scheduled audit or a manually invoked audit runs.
Run the #DGFE audit during periods of minimal MIMIX activity to ensure that
replication is caught up and that added or deleted objects are reflected correctly in the
journal. If the audit is run during peak activity, its results may contain errors or indicate
that files are in transition.
In addition to regularly scheduled audits, check your configuration using the #DGFE
audits for your data groups whenever you make configuration changes, such as
adding an application or creating a library. Running the audit prior to audits that
compare attributes ensures that those audits will compare the objects and attributes
you expect to be present in your environment.
#DLOATR audit - This audit supports multiple levels of comparisons. The level used
is controlled by the value of the Audit level policy in effect when the audit runs. The
#DLOATR audit compares attributes as well as data for objects defined to a data
group when audit level 20 or 30 is used. Audit level 10 compares only attributes.
When data is compared the audit may take longer to run and may affect performance.
#FILDTA audit - This audit supports multiple levels of comparisons. The level used is controlled by the value of the Audit level policy in effect when the audit runs. The #FILDTA audit compares all data for file members defined to a data group only when audit level 30 is used. Level 10 and level 20 compare 5 percent
and 20 percent of data, respectively. Lower audit levels may take days or weeks to
completely audit file data. New files created during that time may not be audited.
Regardless of the audit level you use for regular auditing, Vision Solutions strongly
recommends running a level 30 audit before switching.
#IFSATR audit - This audit supports multiple levels of comparisons. The level used is
controlled by the value of the Audit level policy in effect when the audit runs. The
#IFSATR audit compares data when audit level 20 or 30 is used. At level 10, only
attributes are compared. Regardless of the audit level you use for regular auditing,
Vision Solutions strongly recommends running a level 30 audit before switching.
#MBRRCDCNT audit - This audit compares the number of current records
(*CURRDS) and the number of deleted records (*NBRDLTRCDS) for physical files
that are defined to an active data group. Equal record counts suggest but do not
guarantee that files are synchronized.
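If you want to manually spot-check the record counts that this audit compares, the IBM Display File Description command reports current and deleted record counts for the members of a file on each system. This is a sketch only; the library and file names are placeholders, and it is simply an alternative way to view the same counts, not part of the audit itself:
/* Show member descriptions, including current and deleted record counts */
DSPFD FILE(MYLIB/MYFILE) TYPE(*MBR)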
The #MBRRCDCNT audit does not have a recovery phase. Differences detected by
this audit appear as not recovered in the Audit Summary.
In some environments using commitment control, the #MBRRCDCNT audit may be
long-running. Refer to the MIMIX Administrator Reference book for information about
improving the performance of this audit.
Recommendations when checking audit results
Consider these recommendations when you check results of audits:
• Always review the results of the audits. Audit results reflect only what was actually compared. Some objects may not have been compared due to object activity or due to the audit level policy value in effect, even when no differences (*NODIFF) are reported. You may need to take actions other than running an audit to correct detected issues. For example, you may need to change a procedure so that target system objects are only updated by replication processes.
• Be aware of priority auditing behavior. Priority audits differ from other audits in how they select objects to audit and in the number of objects selected. Be aware of the implications of those differences when checking audit results. Priority audits select replicated objects based on their auditing eligibility. As a result, priority audits cannot check newly created source objects until after their create transactions have been replicated. Priority audits can return results indicating that zero (0) objects were selected. This occurs when no objects were eligible for selection by an audit.
• Deleted objects reported as not found. Audits can report not found conditions for objects that have been deleted. A not found condition is reported when a delete transaction is in progress for an object eligible for selection when the audit runs. This is more likely to occur when there are replication errors or backlogs at the time the audit runs.
• Fixing one error may expose another. It may take multiple iterations of running audits with recoveries before the results are clean. Recovering from one error may result in a different error surfacing the next time the audit is performed. For example, a recovery that adds data group file entries may result in detecting a database relationship difference (*DBRIND) error the next time the audit is performed, where the root problem is that a library of logical files is not identified for replication.
• Watch for trends in the audit results. Trends may indicate situations that need further investigation. For example, objects that are being recovered for the same reason every time you run an audit can be an indication that something in your environment is affecting the objects between audits. In this case, investigating the environment for the cause may determine that a change is needed in the environment, in the MIMIX configuration, or in both. Trends may also indicate a MIMIX problem, such as reporting an object as being recovered when it was not. Report these scenarios to MIMIX CustomerCare. You can do this by creating a new case using the Case Management page in Support Central.
Displaying audit runtime status
The audit summary view of the Work with Audits display shows audit runtime status in
the Audit Status column. F11 toggles between variations of audit summary views.
Do the following:
1. Do one of the following to access the Summary view of the Work with Audits
display:
• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Audit summary view.
• Enter the command: installation-library/WRKAUD VIEW(*AUDSTS)
2. The Work with Audits display appears. If audit compliance problems exist, you
may see a different view of the Work with Audits display. Use F10 to access the
Summary view.
3. Check the value shown in the Audit Status column. Press F1 (Help) for a
description of status values.
4. To view additional information about an audit, use option 5 (Display).
On the summary view of the Work with Audits display, audits are sorted and displayed
so that the highest severity item is at the top of the list.
In addition to audit runtime status, the initial summary view (Figure 21) also includes
the full name of the data group and the following information:
The Object Diff column identifies the number of audited objects with differences
remaining after the audit completed.
The Objects Selected column indicates how objects were selected for auditing in the most recent run of the audit.
Figure 21. Audit Summary view - data group definition columns. The Work with Audits display lists each audit with its Audit Status (for example *NOTRUN, *CMPACT, *RCYACT, *QUEUED, or *NODIFF), the audit rule (such as #OBJATR, #DLOATR, #FILATR, #FILATRMBR, #FILDTA, #IFSATR, #MBRRCDCNT, or #DGFE), the data group definition (name and systems), the Object Diff count, and the Objects Selected value (*PTY or *ALL). Options include 5=Display, 6=Print, 7=History, 8=Recoveries, 9=Run rule, 10=End, 14=Audited objects, and 46=Mark recovered. Function keys include F10=Compliance summary, F11=Last run, F14=Audited objects, and F16=Inst. policies.
Note: Audit runtime status and compliance status values are prioritized and are also
“bubbled up” to the next higher level in the user interface, which is the
installation. In a 5250 emulator, audit status is included in the summarized
replication status displayed on the Work with Application Groups display. The
Work with Data Groups display provides an indication of the number of audits
that require action or attention.
The additional view of audit summary information (Figure 22) displays policies in effect when the audit was last run.
Figure 22. Audit Summary view - last run columns. This view lists each audit with its Audit Status, audit rule, data group name, the Last Run values for Recovery (for example *ENABLED) and Level (for example *LEVEL30), and the Object Diff count. The options and function keys are the same as in Figure 21.
The Last Run columns show the values of policies in effect at the time the audit was
last run through its compare phase.
Recovery identifies the value of the automatic audit recovery policy. When this
policy is enabled, after the comparison completes, MIMIX automatically starts
recovery actions to correct differences detected by the audit. Recovery may also
indicate a value of *DISABLED if a condition checked by the Action for running
audits (RUNAUDIT) policy existed and the policy value for that condition specified
*CMP, preventing audit recoveries from running.
Level identifies the value of the audit level policy. The audit level determines the
level of checking performed during the compare phase of the audit. If an audit was
never run, the value *NONE is displayed in both columns.
Running an audit immediately
You always have the option of running an audit immediately. You can do this by
running the MIMIX rule associated with the audit. From a 5250 emulator, audits
invoked in this manner always select all replication-eligible objects associated with
the class of object for the audit. When running an audit immediately from Vision
Solutions Portal, you have the ability to select whether the audit will select all
replication-eligible objects or only prioritized objects.
In most cases, you want to run the audit from the management system. Policies
determine whether a request to run an audit can be performed on the requesting
system.
Most users should perform this procedure from the management system.
To run a rule immediately, do the following:
1. From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and
press Enter.
2. Type option 9 (Run rule) next to the audit you want and press Enter.
Note: Audits are not allowed to run against disabled data groups.
For more information, see “Resolving audit problems” on page 133.
Resolving audit problems
When viewing results of audits, the starting point is the Summary view of the Work
with Audits display. You may also need to view the output file or the job log, which are
only available from the system where the audits ran. In most cases, this is the
management system.
Do the following from the management system:
1. Do one of the following to access the Work with Audits display.
• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Audit summary view.
• From a command line, enter WRKAUD VIEW(*AUDSTS)
2. Check the Audit Status column for values shown in Table 30. Audits with potential
problems are at the top of the list. Take the action indicated in Table 30.
Table 30. Addressing audit problems

*FAILED
If the failed audit selected objects by priority and its timeframe for starting has not passed, the audit will automatically attempt to run again.
The audit failed for these possible reasons.
Reason 1: The rule called by the audit failed or ended abnormally.
• To run the rule for the audit again, select option 9 (Run rule). This will check all objects regardless of how the failed audit selected objects to audit.
• To check the job log, see “Checking the job log of an audit” on page 135.
Reason 2: The #FILDTA audit or the #MBRRCDCNT audit required replication processes that were not active.
1. From the command line, type WRKDG and press Enter.
   • If all processes for the data group are active, skip to Step 2.
   • If processes for the data group show a red I, L, or P in the Source and Target columns, use option 9 (Start DG).
2. When the data group is active, return to the Work with Audits display and use option 9 (Run rule) to run the audit. This will check all objects regardless of how the failed audit selected objects to audit.
3. If the audit fails again, check the job log using “Checking the job log of an audit” on page 135.

*DIFFNORCY
The comparison performed by the audit detected differences. No recovery actions were attempted because of a policy in effect when the audit ran. Either the Automatic audit recovery policy is disabled or the Action for running audits policy prevented recovery actions while the data group was inactive or had a replication process which exceeded its threshold.
If policy values were not changed since the audit ran, checking the current settings will indicate which policy was the cause. Use option 36 to check data group level policies and F16 to check installation level policies.
• If the Automatic audit recovery policy was disabled, the differences must be manually resolved.
• If the Action for running audits policy was the cause, either manually resolve the differences or correct any problems with the data group status. You may need to start the data group and wait for threshold conditions to clear. Then run the audit again.
To manually resolve differences do the following:
1. Type 7 (History) next to the audit with *DIFFNORCY status and press Enter.
2. The Work with Audit History display appears with the most recent run of the audit at the top of the list. Type 8 (Display difference details) next to an audit to see its results in the output file.
3. Check the Difference Indicator column. All differences shown for an audit with *DIFFNORCY status need to be manually resolved. For more information about the possible values, see “Interpreting audit results - supporting information” on page 299.
To have MIMIX always attempt to recover differences on subsequent audits, change the value of the automatic audit recovery policy.

*NOTRCVD
The comparison performed by the audit detected differences. Some of the differences were not automatically recovered. The remaining detected differences must be manually resolved.
Note: For audits using the #MBRRCDCNT rule, automatic recovery is not possible. Other audits, such as #FILDTA, may correct the detected differences.
Do the following:
1. Type 7 (History) next to the audit with *NOTRCVD status and press Enter.
2. The Work with Audit History display appears with the most recent run of the audit at the top of the list. Type 8 (Display difference details) next to an audit to see its results in the output file.
3. Check the Difference Indicator column. Any differences with values other than *RECOVERED must be manually resolved. For more information about the possible values, see “Interpreting audit results - supporting information” on page 299.

*NOTRUN
The audit was prevented from running by the Action for running audits policy. Either the data group was inactive or a replication process exceeded its threshold. This may be expected during periods of peak activity or when data group processes have been ended intentionally. However, if the audit is frequently not run due to this policy, action may be needed to resolve the cause of the problem.

For more information about the values displayed in the audit results, see “Interpreting audit results - supporting information” on page 299.
Checking the job log of an audit
An audit’s job log can provide more information about why an audit failed. If it still
exists, the job log is available on the system where the audit ran. Typically, this is the
management system.
You must display the notifications from an audit in order to view the job log. Do the
following:
1. From the Work with Audits display, type 7 (History) next to the audit and press
Enter.
2. The Work with Audit History display appears with the most recent run of the audit
at the top of the list.
3. Use option 12 (Display job) next to the audit you want and press Enter.
4. The Display Job menu opens. Select option 4 (Display spooled files). Then use
option 5 (Display) from the Display Job Spooled Files display.
5. Look for messages from the job log for the audit in question. Usually the most
recent messages are at the bottom of the display.
Message LVE3197 is issued when errors remain after an audit completed.
Message LVE3358 is issued when an audit failed. Check for the following
messages in the job log that indicate a communications problem (LVE3D5E,
LVE3D5F, or LVE3D60) or a problem with data group status (LVI3D5E,
LVI3D5F, or LVI3D60).
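If you already know the qualified job name from the notification, you can also go directly to the job's spooled files with the IBM Work with Job command instead of stepping through the Display Job menu. This is a sketch only; the job number, user, and name shown here are placeholders:
/* Work with the spooled files (including the job log) of a specific job */
WRKJOB JOB(123456/MIMIXOWN/JOBNAME) OPTION(*SPLF)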
Ending audits
Only active or queued audits can be ended. This includes audits with the following
statuses: Currently comparing (*CMPACT), Currently recovering (*RCYACT), or
Currently waiting to run (*QUEUED).
You must end active or queued audits from the system that originated the audit. You
can end active or queued audits from any view of the Work with Audits display. This
procedure uses the Status view.
To end an active or queued audit, do the following:
1. From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and
press Enter. Then use F10 as needed to access the Audit summary view.
2. Check the value shown in the Audit Status column. Press F1 (Help) for a
description of status values.
3. Type option 10 (End) next to the active or queued audit you want to end and press
Enter.
4. Audits in *CMPACT or *QUEUED status are set back to their previous status
values. Audits in *RCYACT status are set according to the completed comparison
result as well as the results of any completed recovery actions.
Displaying audit history
The Work with Audit History display lists the available history for completed runs of a
specific combination of audit rule and data group. Each item listed is a history of a
completed audit run, shown in reverse chronological order so that the completed audit
with the most recent start time is at the top of the list. Audits that are new or that have
an active status are not included in this list.
Do the following to access retained history for a specific audit and data group
combination:
1. From the MIMIX Intermediate Main Menu, type 6 (Work with audits) and press
Enter.
2. From the Work with Audits display, type 7 (History) next to the audit and data
group you want and press Enter.
The amount of history information available is determined by how frequently an audit
runs and the settings of the Audit history retention policy. Having retained audit history
enables you to look for trends across multiple runs of an audit that may be an
indication of a configuration problem or some other issue with an object. For example,
the Work with Audit History display makes it easy to notice that a particular audit of one
data group always has a similar number of recovered objects or always has
differences that cannot be recovered automatically.
The initial view (Figure 23) shows the final audit status and recovery phase statistics.
Figure 23. Work with Audit History - view of recovery results. For audit rule #FILATR and data group EMP AS01 AS02, this view lists each completed run with its compare start date and time, audit status (for example *NODIFF or *AUTORCVD), and object counts for Total Selected, Not Compared, Recovered, and Not Recovered. Options include 5=Display, 6=Print, 8=View difference details, 12=Display job, 14=Audited objects, and 46=Mark recovered.
F11 toggles between this view and additional views.
137
Displaying audit history
The summary results view (Figure 24) shows the total number of objects selected by
the audit and whether the objects selected were the result of a priority audit or a
scheduled audit.
Figure 24. Work with Audit History - view of summary results. This view lists each completed run with its compare start date and time, audit status, the Total Selected count, and the Objects Selected value (*ALL or *PTY).
The compare results view (Figure 25) shows the duration of the audit as well as
statistics for the compare phase of the audit.
Figure 25. Work with Audit History - view of compare results. This view lists each completed run with its compare start date and time, audit status, audit duration, and object counts for Compared, Not Compared, and Detected Not Equal.
When viewing the Work with Audit History display from the system on which the audit
request originated, you can use options to view the object difference details detected
by the audit (option 8), the job log for the audit (option 12), and a list of objects that
were audited (option 14).
Audits with no selected objects
On the Work with Audit History display, it is possible to see repeated audit runs that
have zero (0) objects selected during the time frame that prioritized audits are allowed
to run each day. Zero objects selected means that no objects matched the frequency criteria specified for selecting objects at the time the prioritized audit ran.
Consider this example of how prioritized audits operate. Audit #FILATR is set to run
priority audits using its shipped default values for priority auditing. This means the
audit will run approximately once per hour between 3 and 8 a.m. every day. Each
audit run will select the following:
• Any replicated objects that were not equal in their last audit.
• Any new replicated objects that had never been audited.
• Any replicated objects that changed in the past 24 hours.
• Any replicated objects that did not change since they were audited a week ago.
• Any replicated objects that did not change since their last audit a month (30 days) ago and have a history of repeated consecutive successful audits.
For the first run (between 3 and 4 a.m.) of a normal work day, it is likely the audit
selected objects in the new and changed in the past day categories, and may have
selected some objects in other categories as well. The second run is likely to have
selected fewer objects, and may have selected only objects that had differences from
the earlier run. If those differences were resolved, then the subsequent runs that day
are likely to have selected no objects because none were eligible. While such a daily
pattern may repeat, it is also subject to replication and other auditing activity within
your environment.
Working with audited objects
The Work with Audited Objects display shows a list of objects compared by one or
more audits. This information is available only on the originating system, and only for audits performed while the Audit history retention (AUDHST) policy in effect specified keeping details relevant to the type of audit and the audits have not exceeded the current policy's retention criteria.
The list of objects is sorted by severity of their final audit status (the status after
comparisons and any recovery actions complete), with the most severe status first.
Because this display lists audited object history, the #DGFE rule, which compares
configuration data, is not included.
When the objects listed are for only one audit, the display appears as shown in Figure
26. This layout is used when the display is invoked by option 14 (Audited objects) on
the Work with Audits display or the Work with Audit History display. Note that the Audit
start field is located at the top of the display in this case. If the selected audit is the
audit run with the latest start date, (*LAST) will also appear in the Audit start field.
Figure 26. Work with Audited Objects display for a single audit. For data group EMP AS01 AS02 and audit rule #FILATR, the Audit start field (for example 06/17/09 15:01:34 (*LAST)) appears at the top of the display. Each listed object shows its Audited Status (such as *NE, *EQ, or *RCVD), object type, and object name. Options include 5=Display, 6=Print, and 9=Object history; F18=Subset and F22=Display entire field are available.
When the list includes objects from multiple audits, the display appears as shown in
Figure 27, with the specific audit rule and start time displayed in columns. A > symbol
next to an object name indicates a long object path name exists which can be viewed
with F22.
File member information is not automatically displayed. However, you can use F18 to
change subsetting criteria to include members. When member information is
displayed, the name is in the format: library/file(member). Also, the information
displayed for file members may not be from the most recently performed audit.
Because members can be compared by several audits, the most recent run of each of
those audits is evaluated. The evaluated audit run with the most severe status is
displayed, even if it is not the most recently performed audit of the evaluated audit
runs. For all other objects, the information displayed is from the most recent audit run
that compared the object.
Figure 27. Work with Audited Objects display with all audits displayed. When Audit rule is *ALL, each listed object shows its Audited Status, object type, and object name, along with the audit rule and the date and time of the audit run that compared it (for example, #OBJATR, #IFSATR, or #FILATR runs on 12/11/09).
You can select option 5 to view the details of the audit in which the object was
compared, such as the audit compare and recovery timestamps, and option 9 to view
auditing history for a specific object.
Displaying audited objects from a specific audit run
Use this procedure to display the list of objects compared by a specific audit run.
For prioritized audits, not every object is audited in every audit run.
From Work with Audits display or the Work with Audit History display, do the following:
1. Ensure that you are on the system where the audit originated. The originating
system is included in the audit details, which you can view using option 5
(Display).
2. Type 14 (Audited objects) next to the audit run that you want and press Enter.
3. If necessary, press F18 (Subset) to specify criteria for filtering the list by object
type, name, or audited status.
Displaying a customized list of audited objects
Use this procedure to list all objects compared by a data group or to specify filtering
criteria such as object type, name, or audited status.
From Work with Audits display or the Work with Audit History display, do the following:
1. Ensure that you are on the system where the audit originated. The originating
system is included in the audit details, which you can view using option 5
(Display).
2. Press F14 (Audited objects). The Work with Audited Objects (WRKAUDOBJ)
command appears.
3. Specify the Data group definition for which you want to see audited objects.
4. Specify the value you want for Object type and press Enter.
5. Additional fields appear based on the value specified in Step 4. Specify values to
define the criteria for selecting the objects to be displayed.
Note: The value specified for Member (MBR) determines whether member-level
objects are selected for their object history. The members selected are not
automatically displayed in the list. To include any selected members, press
F10 (Additional parameters), then specify *YES for Include member
(INCMBR).
6. Press Enter to display the list of objects from the retained history details.
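For example, a filtered list could be requested directly from a command line rather than prompting the command. This is a sketch only; the installation library, data group name, and object type are placeholders, and the parameter keywords other than the command name itself are assumptions based on the prompts described in this procedure:
/* List audited *FILE objects for the data group from retained history details */
installation-library/WRKAUDOBJ DGDFN(EMP AS01 AS02) OBJTYPE(*FILE)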
Working with audited object history
The Work with Audited Obj. History display lists the available audit history for a single
object compared by the indicated audit rules within the indicated data group. This
capability provides the ability to check for trends for a specific object such as repeated
automatic recovery of a difference.
The audit history for an object is available only on the originating system, and only for audits performed while the Audit history retention (AUDHST) policy in effect specified keeping details relevant to the type of audit and the audits have not exceeded the current policy's retention criteria.
The list is sorted in reverse chronological order so that the audit history having the
most recent start date is at the top of the list.
When the displayed object history is for a file member, the member is represented as
object type *FILE with its name formatted as library/file(member). The Audit Rule
column appears in the list to identify which audit rule compared the member in the
audit run, as shown in Figure 28. When the audit history for any other object type is
displayed, there is only one possible audit rule so the Audit rule field is located at the
upper right of the display.
Figure 28. Work with Audited Obj. History display showing audit history for a file member. For object type *FILE named ABCLIB/PF1(MBR1) in data group EMP AS01 AS02, each entry shows the audit rule (for example #FILDTA, #MBRRCDCNT, or #FILATRMBR), the compare date, time, and status (*EQ or *NE), and, when applicable, the recovery date, time, and status (for example *RECOVERED). Options include 5=Display, 6=Print, and 8=View difference details.
From this display you can use option 5 to view details of the audit in which the object
was compared, such as its audit compare and recovery timestamps, and option 8 to
view object difference details that were detected by the audit.
Displaying the audit history for a specific object
Use this procedure to display the retained audit histories for a specific object.
From Work with Audits display or the Work with Audit History display, do the following:
1. Display a list of objects audited for a data group using either of the following
procedures:
• “Displaying audited objects from a specific audit run” on page 141
• “Displaying a customized list of audited objects” on page 141
2. From the Work with Audited Objects display, type 9 (Object history) next to the
object you want and press Enter.
Displaying audit compliance
The audit compliance view of the Work with Audits display (Figure 29) shows audit
compliance status in the Compliance column. F11 toggles between variations of audit
compliance views.
Note: If other audit problems exist, you may see a different view of the Work with
Audits display. Use F10 to access the Compliance view.
On the compliance view of the Work with Audits display, the list is initially sorted by
compliance status. To sort the list by scheduled time, use F17.
In addition to audit compliance status, the initial compliance view (Figure 29) shows
the timestamp of when the compare phase ended in the Compare End column.
Compliance is checked based on the date of the last completed compare. Compliance determines whether the date of the last compare completed by an audit is within the range set by policies. The Audit warning threshold policy and the Audit
action threshold policy define when to indicate that an audit is approaching or
exceeding that range.
For audits configured for scheduled object auditing or both scheduled and prioritized
object auditing, compliance status is based on the last run of a scheduled audit or a
user-invoked audit. For audits configured for only prioritized object auditing,
compliance status is based on the last run, which may have been a prioritized audit or
a user-invoked audit. A user-invoked audit or a scheduled audit checked all objects
that are configured for the data group and within the class of objects checked by the
audit whereas a prioritized audit may have checked only a subset of those objects.
Figure 29. Audit Compliance view - data group definition columns. This view of the Work with Audits display lists each audit with its Compliance status (for example *OK), the audit rule, the data group definition (name and systems), and the Compare End date and time. Options include 36=Change DG policies and 37=Change audit schedule in addition to those shown on the summary views. Function keys include F10=Schedule summary, F11=Next scheduled, and F17=Sort sched. time.
The additional view of audit compliance information (Figure 30) displays when the
scheduled audit run will occur. The scheduled date and time in this view do not apply
to prioritized audit runs.
Figure 30. Audit Compliance view 2 - next scheduled time columns. This view lists each audit with its Compliance status, audit rule, data group name, the next Scheduled Time date and time, and the Compare End date and time.
Note: Audit runtime status and compliance status values are prioritized and are
“bubbled up” to the next higher level in the user interface, which is the
installation. In a 5250 emulator, audit status is included in the summarized
replication status displayed on the Work with Application Groups display. The
Work with Data Groups display provides an indication of the number of audits
that require action or attention.
Determining whether auditing is within compliance
Regular auditing detects and often repairs problems in the replication environment.
Compliance with the best practice of regular auditing is determined for each individual
audit based on the date when the audit last completed its compare phase.
Audit compliance problems are identified by the following status values:
*ATTN - The audit is approaching an out-of-compliance state as determined by the Audit warning threshold policy. Attention is required to prevent the audit from becoming out of compliance.
*ACTREQ - The audit is out of compliance with the Audit action threshold policy. Action is required. Perform an audit of the data group.
An audit with a compliance problem must be run to resolve the problem.
Do the following to check for compliance problems:
1. Do one of the following to access the Compliance view of the Work with Audits display:
   • From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Compliance view.
   • Enter the command: installation-library/WRKAUD VIEW(*COMPLY) (an example follows this procedure)
2. Check the Compliance column for values of *ATTN and *ACTREQ.
3. To resolve a problem with audit compliance, the audit in question must be run and complete its compare phase.
   • To see when the scheduled run of the audit will occur, press F11. To see when both scheduled and prioritized audits will run, press F10 to access the Audit summary view, then use F11 to toggle between views.
   • To run the audit now, select option 9 (Run rule) and press Enter. This action will select all replicated objects associated with the class of the audit. For more information, see “Running an audit immediately” on page 131.
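For example, an operator signed on to a 5250 session could reach the Compliance view directly from a command line. The following is a sketch only: the installation library name MIMIXLIB is an assumption (substitute your own installation library), and only the WRKAUD command and its VIEW(*COMPLY) value are taken from this procedure. The comments are for explanation and would not be typed at an interactive command line.
  ADDLIBLE LIB(MIMIXLIB)   /* Add the MIMIX installation library to the library list (assumed name) */
  WRKAUD VIEW(*COMPLY)     /* Open the Work with Audits display directly in the Compliance view     */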
Displaying scheduling information for automatic audits
An audit can be configured to run by schedule, by priority, by schedule and priority, or
not at all. The schedule summary views of the Work with Audits display allow you to
see scheduling information for each audit.
Do the following to view when an audit can occur for a specific audit and data group
combination:
1. From the MIMIX Intermediate Main Menu, type 6 (Work with audits) and press
Enter.
2. The Work with Audits display appears, showing either the audit summary or
compliance summary view. Press F10 as needed to access the Schedule
summary view.
3. The initial view of the Schedule summary is displayed. Use F11 to toggle between
additional variations of audit schedule views.
   • The initial view (Figure 31) shows the date and time of the next scheduled audit run. You cannot view the exact time when the next prioritized audit will run.
   • To view current scheduled auditing settings, press F11 (Figure 32).
   • To view current priority auditing settings, press F11 twice (Figure 33). Prioritized audit runs are allowed to start every day only during the specified time range. Multiple runs of an audit may occur during that time.
The list is initially sorted by rule and data group name. To sort the list by scheduled time, use F17.
Figure 31. Audit Schedule Summary, view - next scheduled time.
[Work with Audits display, Schedule summary view (system AS01): for each audit rule of data group EMP (systems AS01 and AS02), shows the Frequency (*WEEKLY) and the next Scheduled Time date and time. Function keys include F10=Audit summary, F11=Schedule settings, F14=Audited objects, and F17=Sort sched. time.]
Figure 32. Audit Schedule Summary, view - schedule settings.
[Work with Audits display, schedule settings view (system AS01): for each audit rule of data group EMP, shows the Frequency (*WEEKLY), Date (*NONE), Weekday (SMTWTFS), Rel.Day, and scheduled Time. F11=Priority settings advances to the next view.]
Figure 33. Audit Schedule Summary, view - priority settings.
[Work with Audits display, priority settings view (system AS01): for each audit rule of data group EMP, shows the Start Range (After and Until times, for example 03:00 to 08:00) and the Priority Objects Selected columns (New, Chg, Unchg, No Diff) with values such as *DAILY, *WEEKLY, and *MONTHLY.]
CHAPTER 8
Working with system-level processes
MIMIX uses several processes that run at the system level to support the replication
environment and provide additional functionality. System-level processes include the
system manager, journal manager, target journal inspection, collector services, and if
needed, cluster services. These processes can be accessed from the Work with
Systems display (WRKSYS command). Typically, these processes are automatically
started and ended when MIMIX is started or ended. However, you may need to start
or end individual processes when resolving problems.
The following topics are included in this chapter to help you resolve problems with system-level processes:
• “Displaying status of system-level processes” on page 149 describes how to check for expected status values and resolve problems with system-level processes. This includes procedures for starting and ending managers, target journal inspection, and collector services.
• “Resolving *ACTREQ status for a system manager” on page 151 describes how to resolve a status of action required.
• “Checking for a system manager backlog” on page 151 describes how to check whether there is a backlog of unprocessed entries that require action.
• “Displaying status of target journal inspection” on page 155 describes how to display the status of a single inspection job on a system and how to resolve problems with its status.
• “Displaying results of target journal inspection” on page 156 describes where to find information about the objects identified by target journal inspection.
• “Identifying the last entry inspected on the target system” on page 158 describes how to determine the last entry in the target journal and the last entry processed by target journal inspection.
Displaying status of system-level processes
Status of processes that run at the system level can be viewed from the Work with
Systems display.
1. Do one of the following:
   • From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
   • From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. The first system definition in the list is
the local system. Figure 34 shows expected status values for most two-system
environments. For any other status values, continue with the next step.
Expected Status Values:
• System managers and journal managers have an expected status of *ACTIVE on all systems.
• For target journal inspection, the expected status is that systems that are currently the target for replication have a status of *ACTIVE and other systems have a status of *NOTTGT.
• For cluster services, most installations are not licensed for MIMIX® Global™ and have an expected status of *NONE. If a system participates in an IBM i cluster, the expected value is *ACTIVE. For more information about operation for MIMIX® Global™, see the MIMIX Operations with IBM i Clustering book.
Figure 34. Expected status on the Work with Systems display for a two-system environment.
[Work with Systems display with local system definition OSCAR (Cluster *NONE): system OSCAR (*MGT) shows the System manager, Journal manager, Journal Inspect., and Collector columns as *ACTIVE with Cluster *NONE; system HENRY (*NET) shows the System manager, Journal manager, and Collector columns as *ACTIVE, Journal Inspect. as *NOTTGT, and Cluster *NONE. Options include 4=Remove cluster node, 5=Display, 6=Print, 7=System manager status, 8=Work with data groups, 9=Start, 10=End, and 11=Jrn inspection status.]
3. If one or more processes are *INACTIVE, do the following:
   • Type a 9 (Start) next to the system you want and press Enter. The Start MIMIX Managers display appears. Any processes except cluster services that are not active on the system are preselected. (To start cluster services, MIMIX® Global™ users must specify *YES for the Start cluster services prompt.) Press Enter.
4. For any other status values on a system, do the following:
   • If one or more processes are *UNKNOWN, use the procedure in “Verifying all communications links” on page 282.
   • For a system manager status of *ACTREQ, use “Resolving *ACTREQ status for a system manager” on page 151.
   • To check for a system manager backlog, use “Checking for a system manager backlog” on page 151.
   • For target journal inspection status values other than *ACTIVE or *NOTTGT, see “Displaying status of target journal inspection” on page 155.
Resolving *ACTREQ status for a system manager
A system manager status of *ACTREQ indicates that at least one of the system
manager pairs in which the system is a participant has failed. The system manager
must be started. To start the system manager, type a 9 (Start) next to the system and
press Enter.
Checking for a system manager backlog
The Work with System Pair Status panel includes the count of unprocessed entries for
the source system job of the system manager process along with the timestamp of the
oldest unprocessed entry. A count of unprocessed entries means that a backlog
exists and action may be required.
A status of *INACTIVE indicates the system manager needs to be started. Type a 9
(Start) next to the system and press Enter.
A status of *ACTIVE with unprocessed entries indicates further action may be
required. Since this data is a snapshot of work currently being done, it is important to
refresh this panel (F5) to ensure data is up to date. Evaluate data for unprocessed
entries with a status of *ACTIVE as follows:
• If the status is *ACTIVE and there are a high number of unprocessed entries for your environment, or the timestamp is not changing when data is refreshed (F5), contact CustomerCare.
• If the status is *ACTIVE and there is a low number of unprocessed entries for your environment, refresh the data (F5) and check whether the timestamp is changing. If the timestamp changes, the entries are being processed.
Starting a system manager or a journal manager
To selectively start a system manager or journal manager for a system, do the following:
1. Do one of the following:
   • From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
   • From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. Type a 9 (Start) next to the system definition you want and press Enter.
3. The Start MIMIX Managers display appears. By default, any manager that is not running will be selected to start. Specify the value for the type of manager you want to start at the Manager prompt and press Enter.
Ending a system manager or a journal manager
To end a system manager or journal manager, do the following:
1. Do one of the following:
   • From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
   • From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears with a list of the system definitions
defined for the MIMIX installation. Type a 10 (End) next to the system definition
you want and press Enter.
3. The End MIMIX Managers display appears. Specify the value for the type of
manager you want to end at the Manager prompt and press Enter. The selected
managers are ended.
Starting collector services
To start collector services for a system, do the following:
1. Do one of the following:
   • From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
   • From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. Type a 9 (Start) next to the system
definition you want and press Enter.
3. The Start MIMIX Managers display appears. At the Collector services prompt,
verify the value is *YES and press Enter.
Ending collector services
To end collector services for a system, do the following:
1. Do one of the following:
   • From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
   • From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. Type a 10 (End) next to the system
definition you want and press Enter.
3. The End MIMIX Managers display appears. At the Collector services prompt, type
*YES and press Enter.
Starting target journal inspection processes
These instructions will start target journal inspection processes on a selected system.
If the system is the target system of one or more data groups whose journal
definitions are configured for target journal inspection, a journal inspection job is
started for the system journal and for each user journal on the system.
If the system is the target system for replication, an inspection job is started for the
system journal and for each user journal on the system that is identified within data
groups replicating to the system.
Target journal inspection processes start at the last sequence number in the currently
attached journal receiver in the following cases:
• When it is the first time a target journal inspection process is started
• When starting after being ended and the last processed receiver is no longer available
• When starting after enabling target journal inspection in a journal definition where it was previously disabled
When starting target journal inspection after it was previously ended, processing
begins with the next sequence number after the last processed sequence number.
To start target journal inspection processes for a system, do the following:
1. Do one of the following:
   • From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
   • From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. Type a 9 (Start) next to the system
definition you want and press Enter.
3. The Start MIMIX Managers display appears. At the Target journal inspection
prompt, verify the value is *YES and press Enter.
Ending target journal inspection processes
These instructions will end target journal inspection processes on a selected system.
If the system is the target system for replication, the inspection process for the system
journal is ended and all inspection processes are ended for the user journals
identified as the target journal in data groups replicating to the system.
To end target journal inspection processes for a system, do the following:
1. Do one of the following:
   • From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
   • From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. Type a 10 (End) next to the system
definition you want and press Enter.
3. The End MIMIX Managers display appears. At the Target journal inspection
prompt, verify the value is *YES and press Enter.
Displaying status of target journal inspection
Target journal inspection consists of a set of jobs that read journals on the target
system to check for people or processes other than MIMIX that have modified
replicated objects on the target system. Best practice is to allow target journal
inspection for all systems in your replication environment.
Each target journal inspection process runs on a system only when that system is the
target system for replication. The number of inspection processes depends on how
many journals are used by data groups replicating to that system. On a target system,
there is one inspection job for the system journal and one job for each target user
journal identified in data groups replicating to that system.
Because target journal inspection processes run at the system level, the best location to begin checking status is from the Work with Systems display.
1. Do one of the following:
   • From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
   • From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. The Journal Inspect. column shows the summarized status of all journal inspection processes on a system.
   • Expected values are either *ACTIVE or *NOTTGT.
   • For all other status values, type 11 (Jrn inspection status) next to the system you want and press Enter.
3. The Work with Journal Inspection Status display appears, listing the subset of
journal definitions for the selected system. The status displayed is for target
journal inspection for the journal associated with a journal definition.
Note: Journal definitions whose journals are not eligible for target journal
inspection are not displayed. This includes journal definitions that identify
the remote journal used in RJ configurations (whose names typically end
with @R) as well as journal definitions JRNMMX and MXCFGJRN which
are for internal use.
Table 31 identifies the status for the inspection job associated with a journal and
how to resolve problems.
Table 31. Status values for a single target journal inspection process.

*INACTIVE (inverse red) - Journal inspection is not active. Use option 9 (Start) to start all eligible target journal inspection processes on the system identified in the selected journal definition.

*UNKNOWN (inverse white) - The status of the process on the system cannot be determined, possibly because of an error or communications problem. Use the procedure in “Verifying all communications links” on page 282.

*ACTIVE (inverse blue) - Target journal inspection is active for the journal identified in the journal definition.

*NEWDG - Target journal inspection has not run because all enabled data groups that use the journal definition as a target journal have never been started. The inspection process will start when one or more of the data groups are started.

*NOTCFG - Either the journal definition does not allow target journal inspection or all enabled data groups that use the journal definition (user journal) prevent journaling on the target system. Target journal inspection is not performed for the journal. For instructions for configuring target journal inspection, see topics “Determining which data groups use a journal definition” and “Enabling target journal inspection” in the MIMIX Administrator Reference book.

*NOTTGT - The journal definition is not used as a target journal definition by any enabled data group. Target journal inspection is not performed for the journal. This is the expected status when the journal definition is properly configured for target journal inspection but the system is currently a source system for all data groups using this journal definition.
Displaying results of target journal inspection
Target journal inspection sends a warning notification for each user other than MIMIX
who changed objects on the target system since the inspection job started. Because
inspection jobs restart daily with other system level processes, a notification would
typically be sent once per day per user. The notification identifies only the first object
changed by the user.
Note: The MIMIX portal application for Vision Solutions Portal provides enhanced
capabilities for displaying target journal inspection results. Notifications from
target journal inspection processes are identified as originating from
TGTJRNINSP in the Notifications portlet on the Summary page. Actions
available for these notifications include displaying notification details as well
as displaying a list of the objects changed on the target node by the user
identified in the notification. Also, you can access a list of all objects changed
on the target node by all users from the Replicated Objects portlet on the
Analysis page.
Displaying details associated with target journal inspection notifications
This procedure displays notifications sent by target journal inspection and describes
how to display related information from a 5250 emulator.
To check for notifications for target journal inspection, do the following:
1. Do one of the following:
   • On the Work with Application Groups display, the Notifications field indicates whether any warning notifications exist. Press F15 (Notifications).
   • On the Work with Data Groups display, the third number in the Audits/Recov./Notif. field displays the number of new notifications. Press F8 (Recoveries), then press F10 (Work with Notifications).
   • From a command line, enter: WRKNFY.
2. The Work with Notifications display appears. Notifications from target journal inspection are identified by the name TGTJRNINSP in the Source column.
3. Type a 5 (Display) to view the notification details.
4. On the Display Notification Details display, check these fields:
   • The Originating system field identifies the system on which target journal inspection ran and sent the notification.
   • The Notification details field identifies the user or program that made the change, the first object changed, the location it was found in the inspected journal, and a command string to run to see journal entries generated by the user.
Note: If the text of the Notification details field is truncated, you can view the full
text of the message associated with the notification from the MIMIX
message log. Use “Displaying messages for TGTJRNINSP notifications”
on page 157.
5. Investigate why the identified user changed objects on the target system. Objects
may need to be repaired.
Displaying messages for TGTJRNINSP notifications
The text of notifications sent by target journal inspection varies slightly with the object type of the reported object. When a notification is sent, an associated message is sent to the MIMIX message log.
You can use the following commands to view the full text of notification messages. Use the name of the originating system (Step 4 in the previous procedure) as the value of the originating system (ORGSYS) parameter in these commands:
• For library-based objects, enter:
  WRKMSGLOG MSGID(LVE3902) PRC(TGTJRNINSP) ORGSYS(name)
• For IFS objects, enter:
  WRKMSGLOG MSGID(LVE3903) PRC(TGTJRNINSP) ORGSYS(name)
• For DLO objects, enter:
  WRKMSGLOG MSGID(LVE3904) PRC(TGTJRNINSP) ORGSYS(name)
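For example, to see the full message text for library-based objects that were reported from an originating system named HENRY (a sample system name used here only for illustration), you could enter:
  WRKMSGLOG MSGID(LVE3902) PRC(TGTJRNINSP) ORGSYS(HENRY)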
Identifying the last entry inspected on the target system
For each target journal inspection process, you can view details that identify the last
journal entry inspected and identify the last entry in the current journal receiver.
Do the following:
1. From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and
press Enter.
2. The Work with Systems display appears. Type 11 (Jrn inspection status) next to
the system you want and press Enter.
3. The Work with Journal Inspection Status display appears. Type 5 (Display) next to
the journal definition on the system you want and press Enter.
4. The Display Journal Inspection Status Details display appears.
   • The following fields identify the currently attached journal receiver and the last entry in the current receiver: Journal, Journal receiver, Last journal entry sequence, and Last journal entry time.
   • The Target journal inspection fields identify the last entry processed by target journal inspection.
CHAPTER 9
Working with notifications and recoveries
This topic describes what notifications and recoveries are and how to work with them.
This chapter includes the following topics:
• “What are notifications and recoveries” on page 159 defines terms used for discussing notifications and recoveries and identifies the sources that create them.
• “Displaying notifications” on page 160 identifies where notifications are viewed in the user interfaces and how to work with them.
• “Notifications for newly created objects” on page 163 describes the MIMIX® AutoNotify™ feature which can be used to monitor for newly created libraries, folders, or directories.
• “Displaying recoveries” on page 164 identifies where recoveries in progress are viewed in the user interfaces and how to work with them.
What are notifications and recoveries
A notification is an automatically generated report associated with an event that has already occurred. The severity of a notification is reflected in the overall status of the installation.
Notifications can be generated in a variety of ways:
• Target journal inspection processes generate notifications when users or programs other than MIMIX have changed objects on the target node.
• Rules that are not associated with the audits provided by MIMIX also generate notifications to indicate that rule processing either ended in error or, if requested, completed successfully.
• Shipped monitors, such as the MMNFYNEWE monitor for the MIMIX® AutoNotify™ feature, generate notifications.
• Custom automation may initiate user-generated notifications when user-defined events are detected. User-generated notifications can be set to indicate a failure, a warning, or a successful operation.
• Audits generate notifications as a secondary mechanism for reporting when the activities performed by an audit complete or end in error. These notifications are automatically marked as acknowledged. (The primary mechanism is to report errors through replication processes and the audit summary.) Policies provide considerable control over notifications generated by audits.
Because the manner in which notifications are generated can vary, it is important to
note that notifications can represent both real-time events as well as events that
occurred in the past but, due to scheduling, are being reported in the present. For
example, the ownership of a file is changed on the target system at 8:00 PM. If your
audit (CMPFILA) is scheduled to run at 1:00 AM, MIMIX will detect the change and
push a notification to the user interface when the audit completes. Previously,
detection of the change was contingent upon you viewing a report after the audit
completed and noticing the difference.
Recoveries - The term recovery is used in two ways. The most common use refers to the recovery action taken by audits or replication processes to correct a detected difference when automatic recovery policies are enabled. The second use refers to a
temporary report that provides details about a recovery action in progress. The report
is automatically created when the recovery action starts and is removed when it
completes. While it exists, the report identifies what originated the action and what is
being acted upon, and may include access to an associated output file (outfile) and
the job log for the associated job. The action which generated a report may also
generate a notification when the recovery action ends.
Displaying notifications
Do one of the following to check for notifications:
Note: Notifications from audits are automatically set to a status of acknowledged.
Audit status and results should be checked from the Work with Audits
(WRKAUD) display.
• If there are no audit problems in the installation, the MIMIX Availability Status display will indicate whether there are any notifications requiring attention or immediate action that are from sources other than audits. From the MIMIX Availability Status display, type a 5 (Display details) next to Audits and notifications and press Enter.
• Notifications from all sources are listed on the Work with Notifications display. To access the Work with Notifications display, enter the command WRKNFY. The list is sorted so that new notifications appear at the top. To see details for a notification, type a 5 (Display) next to the notification you want and press Enter.
• The Work with Data Groups display includes the number of new notifications that require action or attention. From the MIMIX Basic Main Menu, type 6 (Work with data groups) and press Enter. The Work with Data Groups display appears. The Audits/Recov./Notif. fields are located in the upper right corner.
What information is available for notifications
The following information is available for notifications listed on the Work with
Notifications display. The F11 key toggles between views of status, timestamp, and
text of the notification.
Additional details are available for each notification through the Display Notification
Details display.
Status - The Work with Notifications display lists notifications grouped by their status.
*NEW - New notifications have not been acknowledged or removed and their
status is reflected in higher level status.
*ACK - Acknowledged notifications are archived as viewed and their status is no
longer reflected in higher level status.
Severity - Identifies the severity level of the notification.
*ERROR - An error occurred that requires immediate action.
*WARNING - Investigation may be necessary. An operation completed but an
error may exist. For example, the MIMIX AutoNotify feature issues notifications
with this severity that identify newly created objects that are not identified for
replication.
*INFO - No user intervention is required.
Notification - Displays the notification text sent by audits, automatic recoveries,
target journal inspection, monitors, user-defined or MIMIX rules, or a user-generated
notification. To view the full text, use option 5 to display the notification details.
Data group - Identifies the data group associated with the notification. User-defined
notifications and notifications from monitors or user-defined rules may indicate that
there is no associated data group.
Note: On the Status view of the Work with Notifications display, the F7 key toggles
between the Source column and the Data Group column. The full three-part
name is available in the Timestamp view (F11).
Date - Indicates the date the notification was sent.
Time - Indicates the time the notification was sent.
Source - Identifies the process, program, or command that generated the notification. Names that begin with the character # are generated by automatic recovery actions for audits or database replication or by a MIMIX rule. Names that begin with the characters ## are generated by automatic recovery actions for object replication.
From System - Identifies the name of the system on which the notification was
generated. The name From System is used on the Timestamp view (F11) of the Work
with Notifications display. When you display the notification details from the 5250
emulator, this is called the Originating system.
Detailed information
When you display a notification, you see its description, status, severity, data group,
source, and sender as described above. You also have access to the following
information:
Details - When the source of the notification is a rule, this identifies the command that
was initiated by the rule. When the source of the notification is user-generated, this
indicates the notification detail text specified when the notification entry was added.
When the source of the notification is a monitor, this describes the events that
resulted in the notification.
Output File - If available, this identifies an associated output file. Output file
information associated with a notification is only available from the sender system.
For user-generated notifications, output file information is available only if it was
specified when the notification was added.
Job - If available, this identifies the job that generated the notification. Job information associated with a notification is only available from the sender system. For user-generated notifications, this information is available only if it was specified when the notification was added.
Options for working with notifications
Table 32 identifies the possible actions you can take for a notification. From the
Notifications window, the Actions list for each notification contains only the actions
possible for the selected notification.
Table 32. Options available for notifications

4=Remove - Deletes the notification. You are prompted to confirm this choice. For a notification generated by an audit or a MIMIX rule, the associated job and output files are also deleted. This must be performed from the system on which the notification originated.

5=Display - Displays available additional information associated with the notification. For notifications generated by rules, this includes the details of the rule that generated the notification, including the substitution variables for the command the rule initiated.

6=Print - Prints the information associated with the notification.

8=View results - When the information is available, this provides the Name and Library of the output file (outfile) associated with the notification. This option is only available from the system on which the notification originated.1

12=Display job - Displays the job log for the job which generated the notification, if it is available. This option is only available from the system on which the notification originated.

46=Acknowledge - Sets the selected notification status to *ACK (Acknowledged).

47=Mark as new - Sets the selected notification status to *NEW (New).

1. MIMIX manages an output file associated with a notification from an automatic recovery action or a MIMIX rule when the output file exists in a specific library. The format of the library name for such an output file is MIMIX-installation-library_0.
Notifications for newly created objects
The MIMIX® AutoNotify™ feature can be used to monitor for newly created libraries,
folders, or directories. The AutoNotify feature uses a shipped journal monitor called
MMNFYNEWE to monitor for new objects in an installation that are not already
included or excluded for replication by a data group. The AutoNotify feature monitors
the security audit journal (QAUDJRN), and when new objects are detected, issues a
warning notification.
The MMNFYNEWE monitor is shipped in a disabled state. In order to use this feature,
the MMNFYNEWE monitor must be enabled on the source system within your MIMIX
environment. Once enabled, this monitor will automatically start with the master
monitor.
Notifications will be sent when newly created objects meet the following conditions:
• The installation must have a data group configured whose source system is
the system the monitor is running on.
• The journal entry must be a create object (T-CO) or object management
change (T-OM).
• If the journal entry is a create object (T-CO), then the type must be new (N).
• The journal entry must be for a library, folder, or directory.
• If the journal entry is for a library, it cannot be a MIMIX generated library
since MIMIX generated libraries are not replicated by MIMIX.
• If the journal entry is for a directory, it cannot be the /LAKEVIEWTECH
directory, or any directory under /LAKEVIEWTECH.
• If the journal entry is for a directory, it must be a directory that is supported
for replication by MIMIX.
• The object is not already known (included or excluded) in the installation.
Notifications can be viewed from the Work with Notifications (WRKNFY) display. The
notification message will indicate required actions.
Displaying recoveries
Active recoveries are an indication of problems detected and being corrected by
MIMIX AutoGuard. Before certain activity, such as ending MIMIX, it is important that
no recoveries are in progress in the installation. You can check for recoveries from
either user interface.
You can see how many recoveries are in progress from the MIMIX Availability Status
display or the Work with Data Groups display. The Work with Recoveries display lists
recoveries and provides options for working with held recoveries associated with an
audit or a MIMIX rule.
To see a count of recoveries in progress, do one of the following:
• To access the MIMIX Availability Status display, enter the command WRKMMXSTS. The Recoveries field in the upper right corner of the display shows the number of active recoveries in progress for the installation.
• To access the Work with Data Groups display, use option 5 (Display details) next to the Replication area.
Figure 35 shows the Audits/Recov./Notif. fields in the upper right corner of the Work
with Data Groups display. The first number is the total number of audits that require
action to correct a problem or require your attention to prevent a situation from
becoming a problem. The second number indicates the number of active recoveries,
including those resulting from audits. The third number indicates the number of new
notifications that require action or attention. If more than 999 items exist in any field,
the field will display +++. A consistently high number of recoveries suggests that there
may be configuration issues with one or more data groups.
To select a recovery to view or work with a held recovery, do the following:
1. To access the Work with Recoveries display, do one of the following:
   • From the Work with Audits display, use option 8 (Recoveries) to see a list of recoveries associated with an audit.
   • From the Work with Data Groups display, use F8 to see all recoveries.
   • On a command line, enter the command WRKRCY.
2. To see details for a recovery, type a 5 (Display) next to the recovery you want and
press Enter.
Figure 35. Work with Data Groups display showing recoveries in progress.
[Work with Data Groups display on system CHICAGO: the Audits/Recov./Notif. fields in the upper right corner show 001 / 002 / 003. Data groups TESTDG34 and TESTDG43 replicate from source LONDON to target CHICAGO with all source and target processes active (A) and no database or object errors. Function keys include F7=Audits, F8=Recoveries, and F9=Automatic refresh.]
What information is available for recoveries
The following information is available for recoveries listed on the Work with
Recoveries display. The F11 key toggles between views of status, timestamp, and text
of the recoveries. Additional details are available for each recovery through the
Display Recovery Details display.
Each recovery provides a brief description of the recovery process taking place as
well as its current status.
Status - Shows the status of the recovery action.
*ACTIVE - The job associated with the recovery is active.
*ENDING - The job associated with the recovery is ending.
*HELD - The job associated with the recovery is held. A recovery whose source is
a replication process cannot be held.
Data group - Identifies the data group associated with the recovery.
Note: On the Status view of the Work with Recoveries display, the F7 key toggles
between the Source column and the Data Group column. The full three-part
name is available in the Timestamp view (F11).
Date - Indicates the date the recovery process started.
Time - Indicates the time the recovery process started.
Source - Identifies the process, program, or command that generated the recovery.
Names that begin with the character # are generated by automatic recovery actions
for audits or database replication or by a MIMIX rule. Names that begin with the
characters ## are generated by automatic recovery actions for object replication.
Sender or From System - Identifies the system from which the recovery originated.
Detailed information
When you display a recovery, you see its description, status, data group, source, and
sender as described above. You also have access to the following information.
Details - When the source of the recovery is a rule, this identifies the command run by
the rule in an attempt to recover from the detected error.
Output File - If available, this identifies an associated output file that lists the detected
errors the recovery is attempting to correct. Output file information associated with a
recovery is only available from the sender system.
Job - If available, this identifies the job that is performing the recovery action. Job
information associated with a recovery is only available from the sender system.
Options for working with recoveries
Table 33 identifies the possible actions you can take for a recovery. From the
Recoveries window, the Actions list for each recovery contains only the actions
possible for the selected recovery.
Table 33. Options available for recoveries (WRKRCY)

4=Remove - Removes the specified recovery, if it is not held or active. A confirmation panel is displayed after pressing Enter. Use this option to remove orphaned recoveries whose associated recovery job ended. This option is only available from the system on which the recovery job ran.

5=Display - Displays available additional information associated with the recovery.

6=Print - Prints the information associated with the recovery.

8=View progress - Displays a filtered view of the output file associated with the recovery. MIMIX updates the output file while the recovery is in progress, identifying the detected errors it is attempting to correct and marking corrected errors as being recovered. This option is only available from the system on which the recovery job is running.

10=End job - Ends an active recovery job. This action is valid for recoveries with names that begin with # and is only available from the system on which the recovery job is running.

12=Display job - Displays the job log for the recovery job in progress. This option is only available from the system on which the recovery job is running.

13=Hold job - Places an active recovery job on hold. This action is valid for recoveries with names that begin with # and is only available from the system on which the recovery job is running.

14=Release job - Releases a held recovery job. This action is valid for recoveries with names that begin with # and is only available from the system on which the recovery job is held.
Orphaned recoveries
There are times when recoveries exist but are no longer associated with a job. The
following conditions could cause recoveries to become orphaned:
• An unplanned switch has occurred
• The MIMIX subsystem was ended unexpectedly
• A recovery job was ended unexpectedly
When automatic audit recovery is enabled, orphaned recoveries are converted to
error notifications during system cleanup. If the orphaned recovery is older than the
cleanup time specified in the system definition, it is deleted.
When automatic database recovery or automatic object recovery is enabled,
orphaned recoveries are deleted, when possible.
Because recoveries are displayed on both systems, but jobs associated with them are
only accessible from the originating system, you need to verify that the recovery is
orphaned before removing it.
Determining whether a recovery is orphaned
Do the following to determine whether a recovery is orphaned:
1. From a command line, type WRKRCY and press Enter.
2. Press F11 to display the Timestamp view. This view allows you to see the From
System column which lists the system from which the recovery originated.
3. Ensure you are operating from the originating system. Then type a 12 next to the
recovery.
4. Do one of the following:
   • If an error message is displayed indicating that the job associated with the recovery is not found, follow the steps in “Removing an orphaned recovery” on page 168.
   • When the Display Job display appears, type a 10 in the Selection field and press Enter. The status of the job is displayed. If the job associated with the recovery is no longer valid, follow the steps in “Removing an orphaned recovery” on page 168.
Removing an orphaned recovery
These procedures assume that you have already confirmed that the recovery is
orphaned using the procedures in “Determining whether a recovery is orphaned” on
page 167.
Do the following to remove an orphaned recovery:
1. From the originating system, type WRKRCY on the command line and press Enter.
2. After you have ensured that the recovery is orphaned, type a 4 next to the
orphaned recovery you wish to remove and press Enter.
3. Press Enter to confirm your request to remove the recovery.
CHAPTER 10
Starting and ending replication
MIMIX uses a number of processes to perform replication. These processes, along with a number of supporting processes, must be active to enable MIMIX to function.
These pairs of commands will start and end replication:
• The Start MIMIX (STRMMX) and End MIMIX (ENDMMX) commands will start or stop replication processes as well as all supporting processes for the products in a MIMIX installation library in a single operation. These commands are the preferred method for starting and ending MIMIX.
• The Start Application Group (STRAG) and End Application Group (ENDAG) commands will start or stop replication processes in environments configured with application groups. Each command calls a default procedure with steps to perform its operations and can be customized.
• The Start Data Group (STRDG) and End Data Group (ENDDG) commands will start or stop data group replication processes. These commands are the basis for controlling replication processes and are invoked programmatically by the previously identified commands.
This chapter provides information about and procedures for using each set of commands. The following topics are included:
• “Before starting replication” on page 171 applies to all methods of starting replication.
• “Commands for starting replication” on page 171 describes the STRMMX, STRAG, and STRDG commands and considerations for their use.
• “What occurs when a data group is started” on page 174 describes what the STRDG command does in addition to starting replication, choices for specifying a journal starting point, and options for clearing pending and error entries.
• “Starting MIMIX” on page 179 provides a procedure for using the STRMMX command.
• “Starting an application group” on page 180 provides a procedure for using the STRAG command.
• “Starting selected data group processes” on page 181 provides a procedure for using the STRDG command and identifies when the start request should include clearing pending entries.
• “Starting replication when open commit cycles exist” on page 183 describes when MIMIX cannot start replication due to open commit cycles and how to resolve them and start replication.
• “Before ending replication” on page 184 applies to all methods of ending replication.
• “Commands for ending replication” on page 184 describes the ENDMMX, ENDAG, and ENDDG commands and considerations for their use, such as when to perform a controlled end or when to end the RJ link.
• “What occurs when a data group is ended” on page 190 describes the behavior of the ENDDG command.
• “Ending MIMIX” on page 179 provides procedures for using the ENDMMX command and describes when you may also need to end the MIMIX subsystem.
• “Ending an application group” on page 194 provides a procedure for using the ENDAG command.
• “Ending a data group in a controlled manner” on page 195 provides procedures for preparing to end, ending, and confirming that the end completed without problems.
• “Ending selected data group processes” on page 198 provides a procedure for using the ENDDG command.
• “What replication processes are started by the STRDG command” on page 199 describes which replication processes are started with each possible value of the Start processes (PRC) parameter. Both data groups configured for remote journaling and data groups configured for MIMIX source-send processing are addressed.
• “What replication processes are ended by the ENDDG command” on page 203 describes what replication processes are ended with each possible value for the End Options (PRC) parameter. Both data groups configured for remote journaling and data groups configured for MIMIX source-send processing are addressed.
Before starting replication
Consider the following:
• Before starting replication, the database files and objects to be replicated by a data group must be synchronized between the systems defined to the data group. For more information about performing the initial synchronization, see the MIMIX Administrator Reference book.
• If you are using the MIMIX for MQ function, you must use the procedures in the MIMIX for IBM WebSphere MQ book for initial synchronization and initial start of data groups that replicate data for IBM WebSphere MQ.
• Data groups that are in a disabled state are not started. Only data groups that have been enabled can be started.
Commands for starting replication
These commands start replication processes. The significant differences between
these commands are:
Start MIMIX (STRMMX) – The STRMMX command will start all MIMIX processes
in a MIMIX installation, including those used for replication, in a single operation
regardless of how replication is configured. This is the preferred method of
starting MIMIX. Optionally, this command can be used to start all MIMIX
processes on the local system only.
Start Application Group (STRAG) – The STRAG command will start replication
processes for data groups that are part of an application group. This is the
preferred method of starting replication in application groups. The command
invokes a procedure which performs the operations to start replication for the
participating data groups.
Start Data Group (STRDG) – The STRDG command will start replication
processes for a data group and the remote journal link, if necessary. This
command is the basis for all other methods of starting replication. Optionally, this
command can specify a starting point in the journals, clear any pending or error
entries, set object auditing levels, and start a subset of the replication processes.
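The relationship between these commands can be illustrated with the following sketch, ordered from broadest to narrowest scope. The application group name ACCOUNTING is hypothetical, and the DGDFN keyword and three-part data group name shown for STRDG are assumptions based on examples elsewhere in this book; prompt each command (F4) in your installation to confirm its actual parameters and defaults.
  STRMMX                        /* Start all MIMIX processes in the installation (preferred method)          */
  STRAG AGDFN(ACCOUNTING)       /* Start replication for one application group using its default procedure   */
  STRDG DGDFN(EMP AS01 AS02)    /* Start replication for one data group (keyword and name are assumptions)   */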
What is started with the STRMMX command
The STRMMX command is shipped with default values that will start all MIMIX
processes on all systems in the installation. Optionally, the command can be used to
start MIMIX processes on only the local system. Processes are started in the following
order:
MIMIX managers and services - All jobs for the system managers, journal
managers, target journal inspection, and collector services are started on the
specified systems. If you are using MIMIX with IBM i clustering, Cluster Services
are started for all specified systems that are configured for clustering.
Data groups - For enabled data groups, starts the replication processes, remote
journal links, and automatic recovery processes on the specified systems. Each
data group starts from the journal receiver in use when the data group ended and
with the sequence number following the last sequence number processed.
Master monitor - Starts the master monitor on each of the specified systems.
Monitors - On each of the specified systems, the master monitor starts monitors
that are not disabled and which are configured to start with the master monitor.
Application groups - If all systems are specified, all application groups and any
associated data resource groups are started. If IBM i clustering is used, default
processing will start the IBM application CRG.
Note: The STRMMX command does not start promoter group activity. Start promoter
group activity using procedures in the Using MIMIX Promoter book.
STRMMX and ENDMMX messages
Once you have run the STRMMX or ENDMMX command, one of the following
messages is displayed:
• Completion LVI0902 – This message indicates that all MIMIX products were started or ended successfully.
• Escape LVE0902 – This message indicates one or more MIMIX products failed to start or end. (A sketch of monitoring for this message from automation follows this list.)
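For example, custom automation that starts MIMIX from a CL program might monitor for the escape message. The following is a minimal sketch, not a supplied program: the MIMIXLIB library name is an assumption, and only the STRMMX command and the LVE0902 message ID are taken from this book.
PGM
  MIMIXLIB/STRMMX                     /* Start all MIMIX processes; library name is an assumption */
  MONMSG MSGID(LVE0902) EXEC(DO)      /* Escape LVE0902: one or more products failed to start     */
    SNDPGMMSG MSG('STRMMX reported a failure; review MIMIX status before continuing.')
  ENDDO
ENDPGM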
What is started by the default START procedure for an application group
When an application group is created, a default procedure named START is created
for it from a shipped default procedure. The Start Application Group (STRAG)
command automatically uses the application group’s default START procedure unless
you specify a different procedure.
Steps in the shipped default START procedure are described in the MIMIX
Administrator Reference book.
Choices when starting or ending an application group
For the purpose of describing their use, the Start Application Group (STRAG) and End
Application Group (ENDAG) commands are quite similar. This topic describes their
behavior for application groups that do not participate in a cluster controlled by the
IBM i operating system (*NONCLU application groups).
What is the scope of the request? The following parameters identify the scope of
the requested operation:
Application group definition (AGDFN) - Specifies the requested application group.
You can either specify a name or the value *ALL.
Resource groups (TYPE) - Specifies the types of resource groups to be
processed for the requested application group.
Data resource group entry (DTARSCGRP) - Specifies the data resource groups to
include in the request. The default is *ALL or you can specify a name. This
parameter is ignored when TYPE is *ALL or *APP.
What is the requested behavior? The following parameters, when available, define
the expected behavior:
Current node roles (ROLE) - Only available on the STRAG command, this
parameter is ignored for non-cluster application groups.
What procedure will be used? The following parameters identify the procedure to
use and its starting point:
Begin at step (STEP) - Specifies where the request will start within the specified
procedure. This parameter is described in detail below.
Procedure (PROC) - Specifies the name of the procedure to run to perform the
requested operation when starting from its first step. The value *DFT will use the
procedure designated as the default for the application group. The value
*LASTRUN uses the same procedure used for the previous run of the command.
You can also specify the name of a procedure that is valid for the specified
application group and type of request.
Where should the procedure begin? The value specified for the Begin at step
(STEP) parameter on the request to run the procedure determines the step at which
the procedure will start. The status of the last run of the procedure determines which
values are valid.
The default value, *FIRST, will start the specified procedure at its first step. This value
can be used when the procedure has never been run, when its previous run
completed (*COMPLETED or *COMPERR), or when a user acknowledged the status
of its previous run which failed, was canceled, or completed with errors
(*ACKFAILED, *ACKCANCEL, or *ACKERR respectively).
Other values are for resolving problems with a failed or canceled procedure. When a
procedure fails or is canceled, subsequent attempts to run the same procedure will
fail until user action is taken. You will need to determine the best course of action for
your environment based on the implications of the canceled or failed steps and any
steps which completed.
The value *RESUME will start the last run of the procedure beginning with the step at
which it failed, the step that was canceled in response to an error, or the step
following where the procedure was canceled. The value *RESUME may be
appropriate after you have investigated and resolved the problem which caused the
procedure to end. Optionally, if the problem cannot be resolved and you want to
resume the procedure anyway, you can override the attributes of a step before
resuming the procedure.
The value *OVERRIDE will override the status of all runs of the specified procedure
that did not complete. The *FAILED or *CANCELED status of these runs is
changed to acknowledged (*ACKFAILED or *ACKCANCEL) and a new run of the
procedure begins at the first step.
For more information about starting a procedure with the step at which it failed, see
“Resuming a procedure” on page 91.
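For example, after investigating a failed run of an application group's START procedure, you might resume it from the failing step with a request like the following. The application group name is hypothetical, and this is only a sketch of how the PROC and STEP parameters combine; prompt the command to confirm the values valid for your environment.

    STRAG AGDFN(PAYROLL) PROC(*LASTRUN) STEP(*RESUME)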
What occurs when a data group is started
The Start Data Group (STRDG) command will start the replication processes for the
specified data group.
The STRDG command can be used interactively or programmatically. Default values for
the command are used when it is invoked by the STRMMX command or by the
STRAG command running the default START procedure.
When a STRDG request is processed, MIMIX may take a few minutes while it does
the following for each specified data group:
• Determines whether the RJ link is active and whether all required system managers and journal managers on each system are started. If necessary, the managers and the remote journal function defined by the RJ link are started.
• Determines the starting point for replication (database, object, or both, as configured).
• Locates the starting point in the appropriate journal receiver. This will be the starting point for send processes.
• If necessary, changes the object audit level of existing objects identified for replication. This occurs when starting following a switch or a configuration change to any data group object, IFS, or DLO entry. This ensures that all replicated objects identified by all entries of each entry type are set with an object audit level suitable for replication. The processing order for data group entries can affect the auditing value of IFS objects. For examples and for information about manually specifying the audit level of objects, see the MIMIX Administrator Reference book.
• Submits the appropriate start requests for the processes specified on the start request.
• Makes configuration changes for the data group become effective. If a configuration change affects the set of objects to be replicated, the start request also automatically deploys the configuration changes to an internal list used by other functions. This may cause the start request to take longer.
• Attempts to recover any existing access path maintenance1 errors for the data group, if the Access path maintenance (APMNT) policy is enabled.
• If specified on the start request, clears all pending entries for apply processes and clears all error entries identified in replication processing for the data group. There are times when it is necessary to clear pending entries, error entries, or both, to establish a new synchronization point for the data group.
Starting a data group may take longer if the remote journal function is operating in
catchup mode.
1. The access path maintenance function is available on installations running MIMIX 7.1.15.00 or
higher. Access path maintenance replaces the parallel access path maintenance function available on installations running earlier software levels. On earlier software levels, a start data group
request creates and activates the monitors used by the parallel access path maintenance function if the parallel access path maintenance (PRLAPMNT) policy is enabled.
Journal starting point identified on the STRDG request
On the STRDG command, you can optionally specify the point at which to start
replication in the journal receivers. The parameters for database and object journal
receivers and sequence numbers provide this capability. You may need to use these
parameters when starting data groups for the first time.
• For user journal replication, the IBM i remote journal function controls where processing starts in the source journal receiver. The values specified for the Database journal receiver (DBJRNRCV) and Database large sequence number (DBSEQNBR2) identify the starting location for the database reader process and the database apply process.
• For system journal replication, the values specified for Object journal receiver (OBJJRNRCV) and Object large sequence number (OBJSEQNBR2) identify the starting location for the object send process and the object apply process.
Note: The parameters Database sequence number (DBSEQNBR) and Object
sequence number (OBJSEQNBR) continue to be valid for journal
definitions which specify *MAXOPT2 for the Receiver size option
(RCVSIZEOPT) and for values that do not exceed 10 digits. To ensure
continued compatibility, the use of parameters DBSEQNBR2 and
OBJSEQNBR2 is recommended.
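For example, when starting a data group for the first time you might identify the database starting point explicitly. The data group name, journal receiver name, and sequence number shown below are hypothetical; this is only a sketch of how the parameters combine, so use the values appropriate for your environment.

    STRDG DGDFN(INVENTORY SYSTEMA SYSTEMB) PRC(*ALL) DBJRNRCV(RCV0001234) DBSEQNBR2(1)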
Journal starting point when the object send process is shared
When starting data groups that share an object send process, the first data group to
start will start the shared job at that data group’s starting point in the system journal
(QAUDJRN). As additional data groups start, each recognizes that the shared object
send job is active. The object send job determines whether the starting point for that
data group is earlier or later than the sequence number being read. If the data group’s
starting point is later, replication will begin when the shared job reaches the data
group's starting point. If the data group’s starting point is earlier, the shared job
completes its current block of entries, then returns to the earliest point for any of the
data groups being started. The shared job reads the earlier entries and routes the
transactions to the data group being started. When the shared job reaches the last
entry it read at the time of the STRDG request, it resumes routing transactions to all
active data groups using the shared job.
If the starting data group has a significant object send backlog, the other data groups
sharing the job will not receive transactions to replicate while the backlog for the
starting data group is being addressed. Therefore, when a significant backlog exists, it
is recommended that you change the data group configuration to use a dedicated job
(*DGDFN for object send prefix), start the data group, and allow it to catch up to the
current location of the shared job. Then end the data group, change its configuration
to use the desired shared job, and restart the data group.
Clear pending and clear error processing
The Clear pending and Clear error prompts on the STRDG command provide
flexibility when starting a data group by allowing you to optionally reset error status
conditions on data group file entries and discard pending journal entries that are
stored in the journal log space. Clear pending resets the starting point for all data
group file entries and object entries. Clear error clears the hold log spaces.
When clearing pending entries, you can optionally specify which system to use for
determining database file network relationships when distributing files among
database apply sessions. The System for DB file relations (DBRSYS) prompt
identifies which system is used to assign data group file entries to apply sessions
when the start request specifies to clear pending entries in all apply sessions.
Table 34 shows the processing that occurs based on the selection made for the Clear
pending (CLRPND) and Clear error (CLRERR) prompts. The Clear pending and Clear
error prompts work independently. For example, when CLRPND(*NO) is selected, no
clear pending processing occurs.
Table 34. CLRPND and CLRERR processing

CLRPND(*NO) CLRERR(*NO)
Data groups start with regular processing:
• Data group file entry status remains unchanged.
• Hold logs remain unchanged.

CLRPND(*NO) CLRERR(*CLRPND)
The value selected for the CLRPND parameter is used for CLRERR. Same processing as CLRPND(*NO) CLRERR(*NO).

CLRPND(*NO) CLRERR(*YES)
• Data group file entries in *HLDERR, *HLDRGZ, *HLDRNM, *HLDPRM, and *HLDRLTD status are cleared.
• Tracking entries in *HLDERR status are cleared.
• Hold log space is deleted.
See File entry states. See Log spaces.

CLRPND(*YES) CLRERR(*NO)
Note: CLRPND(*YES) will not start a data group when there are open commit cycles on files defined to the data group.
• Data group file entries in *HLDRGZ, *HLDRNM, and *HLDPRM status are cleared and reset to active.
• Data group tracking entries in *HLDRNM are cleared and reset to active.
• Data group file entries and tracking entries in *HLDERR status remain unchanged.
• If there is a requested status at the time of starting, it is cleared.
• Journal, hold, tracking entry hold, and apply history log spaces are deleted.
• The apply session to which data group file entries are assigned may change.
See File entry apply session assignment. See Single apply session processing. See Log spaces.

CLRPND(*YES) CLRERR(*YES)
Note: CLRPND(*YES) will not start a data group when there are open commit cycles on files defined to the data group.
• Data group file entries in *HLDERR, *HLDRGZ, *HLDRNM, *HLDPRM, and *HLDRLTD status are cleared and reset to active.
• Data group file entries in *HLDRTY status remain unchanged.
• Data group object activity entries in any failed or active status are changed to CC (Completed by clear request).
• Tracking entries in *HLDERR and *HLDRNM status are cleared and reset to active.
• Tracking entries are primed if any configuration changes occurred for data group object entries or data group IFS entries.
• If there is a requested status at the time of starting, it is cleared.
• Journal, hold, and apply history log spaces are deleted.
• The apply session to which data group file entries are assigned may change.
See File entry states. See File entry apply session assignment. See Single apply session processing. See Log spaces.

CLRPND(*YES) CLRERR(*CLRPND)
Note: CLRPND(*YES) will not start a data group when there are open commit cycles on files defined to the data group.
The value selected for the CLRPND parameter is used for CLRERR. Same processing as CLRPND(*YES) CLRERR(*YES).
See File entry states. See File entry apply session assignment. See Single apply session processing. See Log spaces.

Notes referenced in Table 34:

File entry states: Files in specific states will not reset to active when you specify *YES on the Clear Error prompt. If you have set data group file entries to any of these states, the following process exception applies:
Note: The only states that can be set using the Set Data Group File Entry (SETDGFE) command are *HLD, *RLSWAIT, *ACTIVE, and *HLDIGN. All other states are the result of internal processing.
• *HLD - Journal entries cached before *YES is specified are discarded. If *ALL or *ALLSRC is specified on the Start processes prompt, all subsequent entries from the specified starting point will be cached again.
• *RLSWAIT - Journal entries are discarded as they wait for the synchronization point to arrive in the journal stream. This occurs regardless of the value specified for Clear Error or Clear Pending.
• *HLDIGN - Journal entries are discarded until the file status is changed to something else.
• *HLDSYNC - Journal entries are ignored since an external process is actively synchronizing the file. When that event completes normally, the file is set to *RLSWAIT.

File entry apply session assignment: Clear pending processing attempts to load balance the data group file entries among the defined apply sessions. If the requested apply session in the data group file entry definition is *ANY, or if it is *DGDFT and the requested apply session for the data group definition is *ANY, then the apply session to which the data group file entry is assigned may be changed when processing occurs. For data groups configured to replicate through the user journal, the requested apply session may be ignored to ensure that related files are handled by the same apply session.
The value specified for System for DB file relations (DBRSYS) determines the system used to determine database file relationships while assigning files to apply sessions. This parameter is evaluated only when the start request specifies to clear pending entries for all database apply sessions. The default value, *TGT, uses the target system to determine the file relationships.

Single apply session processing: In most situations, you will perform clear pending processing on all apply sessions belonging to a data group by specifying *ALL or *DBALL on the Start processes (PRC) prompt. MIMIX also supports the ability to perform clear pending processing on a single apply session, which is useful for recovery purposes in certain error situations. The System for DB file relations (DBRSYS) parameter is ignored when the start request specifies a specific apply session. To perform clear pending processing on a single apply session, specify PRC(*DBAPY) and the specific apply session (APYSSN).

Log spaces: Because they have not been applied, journal entries that exist in the journal log space are considered pending. Journal entries that exist in the hold log space, however, are considered in error. The Clear pending and Clear error prompts affect which log spaces are deleted (and recreated) when a data group is started.
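As described in the Single apply session processing note above, clear pending processing can be limited to one apply session for recovery purposes. A request for that scenario might look like the following; the data group name and apply session identifier are hypothetical, and this is only a sketch of how the parameters combine.

    STRDG DGDFN(INVENTORY SYSTEMA SYSTEMB) PRC(*DBAPY) APYSSN(B) CLRPND(*YES)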
Starting MIMIX
To start all MIMIX products within an installation library, do the following:
1. If you are starting MIMIX for the first time or starting MIMIX after a system IPL, do
the following:
a. Use the command WRKSBSJOB SBS(MIMIXSBS) to verify that the MIMIX
subsystem is running. If the MIMIXSBS subsystem is not already active, start the
subsystem using the STRSBS SBSD(MIMIXQGPL/MIMIXSBS) command.
b. If MIMIX uses TCP/IP for system communication, the TCP/IP servers must be
running. If TCP/IP is not already active, start TCP/IP using the port number
defined in the transfer definitions and the procedures described in “Starting the
TCP/IP server” on page 260.
2. Do one of the following:
• From the MIMIX Basic Main Menu, select option 2 (Start MIMIX) and press Enter.
• From a command line, type STRMMX and press Enter.
3. The Start MIMIX (STRMMX) display appears. Accept the default value for the
System definition prompt and press Enter.
4. If you see a confirmation display, press Enter to start MIMIX.
Starting an application group
For an application group, a procedure for only one operation (start, end, or switch)
can run at a time. For information about parameters and shipped procedures, see
“What is started by the default START procedure for an application group” on
page 172 and “Choices when starting or ending an application group” on page 172.
To start an application group, do the following:
1. From the Work with Application Groups display, type 9 (Start) next to the
application group you want and press F4 (Prompt).
2. Verify that the values you want are specified for Resource groups and Data
resource group entry.
3. If you are starting after addressing problems with the previous start request,
specify the value you want for Begin at step. Be certain that you understand the
effect the value you specify will have on your environment.
4. Press Enter.
5. The Procedure prompt appears. Do one of the following:
• To use the default start procedure, press Enter.
• To use a different start procedure for the application group, specify its name. Then press Enter.
Starting selected data group processes
This procedure can be used to do any of the following:
• Start all or selected processes for a data group, or start a specific database apply process
• Specify a starting point for journal receivers when starting a data group
• Clear pending and error entries when starting a data group
Data groups that are in an application group: The preferred method of starting
data groups that are part of an application group is to use the Start Application Group
(STRAG) command. Beginning with service pack 7.1.06.00, the default behavior of
the STRDG command helps to enforce this best practice when necessary by not
allowing the command to run when the data group is participating in a resource group
with three or more nodes. (A data resource group provides the association between
one or more data groups and an application group.) The STRDG request will run
when the data group is participating in a resource group with two nodes. In earlier
software levels, default behavior does not allow a start request when the data group is
part of an application group.
In application group environments with three or more nodes, it is particularly important
to treat all members of an application group as one entity. For example, a
configuration change that is made effective by starting and ending a single data group
would not be propagated to the other data groups in the same resource group.
However, the same change would be propagated to the other data groups if it is made
effective by ending and starting the parent application group.
When to clear pending entries and entries in error: Table 35 identifies when it is
necessary to clear pending entries for apply processes and clear logs of entries
indicating files in error to establish a new synchronization point when starting a data
group. The reason for starting the data group determines whether you need to clear
only pending entries for transactions waiting to be applied, clear only errors, or both.
Before clearing pending entries, determine if there are any file entries on hold. These
are the transactions that will be lost by clearing pending entries.
When clearing pending entries, most environments can accept the default value for
the System for DB file relations prompt. If necessary, you can specify a value when
directed to by your MIMIX administrator.
Table 35. When to clear pending entries and entries in error when starting a data group

Clear pending entries (specify *YES for the Clear pending prompt) if starting the data group in any of these conditions:
• After enabling a previously disabled data group
• After changing the Number of DB apply sessions (NBRDBAPY) parameter on the data group definition
• After synchronizing database files and objects between two systems
  Note: This assumes that you have synchronized the objects and database files and have changed the journal receivers using TYPE(*ALL) on the CHGDGRCV command.

Clear pending entries and entries in error (specify *YES for the Clear pending prompt, and *CLRPND or *YES for the Clear error prompt) if starting the data group in this condition:
• After switching the direction of the data group, when starting replication on the system that now becomes the source system
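For example, when starting replication in the new direction after a switch, a request like the following clears both pending entries and entries in error. The data group name is hypothetical and the request is shown only as a sketch of the Clear pending and Clear error parameters.

    STRDG DGDFN(INVENTORY SYSTEMA SYSTEMB) CLRPND(*YES) CLRERR(*YES)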
For additional information about the STRDG command, refer to the following topics:
• “What occurs when a data group is started” on page 174
• “What replication processes are started by the STRDG command” on page 199
To start a data group, do the following:
1. From the Work with Data Groups display, type a 9 (Start DG) next to the data
group that you want to start and press Enter.
The Start Data Group (STRDG) display appears.
2. At the Start processes prompt, specify the value for the processes you want to
start. If you are starting the data group for the first time, specify *ALL. To see a list
of values, press F4 (Prompt).
3. Press Enter.
4. Additional prompts appear. For most situations, you should accept the default
values. If necessary, specify the following:
• At the Database journal receiver and Database large sequence number prompts, identify where the database reader and apply processes begin.
• At the Object journal receiver and Object large sequence number prompts, identify where the object send and apply processes begin.
• If you are starting the data group for any of the reasons listed in Table 35, specify the indicated values for that reason in the Clear pending and Clear error prompts.
• If you are submitting this command for batch processing, you should specify *NO for the Show confirmation screen prompt.
5. To start the data group, press Enter.
Starting replication when open commit cycles exist
Open commit cycles may be present when a data group ends, or if a system event or
failure occurred.
In most conditions, an open commit cycle present at the time that a data group ended
will not prevent a request to start replication from running. However, MIMIX will
prevent the data group from starting when either of these conditions exists:
• When the start request specifies to clear pending entries. Certain procedures may require a clear pending start. Message LVE387F is issued with reason code AP.
• When the commit mode specified for the database apply process changed. Changing the commit mode is not a common occurrence. Message LVEC0B3 is issued.
When these conditions exist, the open commit cycles must be resolved.
Checking for open commit cycles
Do the following to check for open commit cycles:
1. From the MIMIX Basic Main Menu, type a 6 (Work with data groups) and press
Enter.
2. The Work with Data Group display appears. Type an 8 (Display status) next to the
data group you ended and press Enter.
3. Press F8 (Database) to view the Data Group Detail Status display.
4. For each apply session listed, check the value shown in the Open Commit column
at the right side of the display. If the value is *YES, open commit cycles exist for
the data group.
Resolving open commit cycles
This procedure assumes that the data group is ended and that you have confirmed
the presence of open commit cycles.
1. Start the data group, specifying *NO for the Clear pending prompt.
2. You must take action to resolve the open commit cycles, such as ending or
quiescing the application or closing the commit cycle. MIMIX will process the open
commit cycles when they are resolved.
3. Perform a controlled end of the data group.
4. When the data group is ended, check for open commit cycles again.
You may need to repeat this procedure until all open commit cycles have been
resolved.
Before ending replication
Consider the following:
• If the next time you start the data groups requires that you clear pending entries, or if you will be performing a switch, you should verify that no activity is still in progress before you perform these activities. Use the command WRKDGACTE STATUS(*ACTIVE) to ensure that all activity entries have completed.
• Data groups that are in a disabled state are not ended. Only data groups that have been enabled and have been started can be ended.
Commands for ending replication
These commands end replication processes. The significant differences between
these commands are:
• End MIMIX (ENDMMX) - The ENDMMX command will end all MIMIX processes in a MIMIX installation, including those used for replication, in a single operation. Optionally, this command can be used to end all MIMIX processes on the local system only.
• End Application Group (ENDAG) - The ENDAG command will end replication processes for data groups that are part of an application group. This is the preferred method of ending replication in application groups. The command invokes a procedure which performs the operations to end replication for the participating data groups.
• End Data Group (ENDDG) - The ENDDG command will end the specified replication processes for the data group either immediately or in a controlled manner. This command is the basis for all other methods of ending replication, and is also called by commands that perform switch operations. Optionally, this command can end a subset of replication processes or a selected database apply process, specify a wait time and end option for controlled ends, and end the remote journal link.
Command choice by reason for ending replication
Table 36 lists common reasons for ending MIMIX activity and the appropriate
command to use. Depending on why you are ending replication, you may need to
choose values other than the defaults.
Table 36. Choosing the appropriate command to end replication

• Ending communications for any reason: Use ENDMMX.
• Performing a full save and restore of data that is defined to MIMIX: Use ENDMMX.
• Performing a save from the source system: Use ENDAG or ENDDG. When application groups are used, use the ENDAG command with its default END procedure; see “What is ended by the default END procedure for an application group” on page 189. For ENDDG, specify *ALL for the Process (PRC) parameter; see “What replication processes are ended by the ENDDG command” on page 203. The save request may not be able to save all the files or objects if they are opened or locked by MIMIX.
• Performing a save from the target system: If using step programs and procedures, run ENDTGT, or use ENDDG PRC(*ALLTGT). See “Ending all or selected processes” on page 187. You may be able to end only selected processes on the target system; see “Ending selected data group processes” on page 198. The save request may not be able to save all the files or objects if they are opened or locked by MIMIX.
• Preparing to update MIMIX software: Use ENDMMX. See controlled end information in “Ending immediately or controlled” on page 186.
• Performing an IPL of either system: Use ENDMMX. Also end the RJ link.
• Upgrading the operating system release on either system: Use ENDMMX. Also end the RJ link.
• Performing a switch in preparation for performing maintenance on either system: Let your switching mechanism end replication (the switch procedure for the application group, MIMIX Switch Assistant, or MIMIX Model Switch Framework).
• Ending only a selected replication process: Use ENDDG. See “Ending selected data group processes” on page 198.
• Changing configuration, such as adding or changing data group entries: Use ENDAG or ENDDG. When application groups are used, use the ENDAG command. The changes are not available to active replication processes until the data group processes are ended and restarted.
Additional considerations when ending replication
The following questions will help you determine additional options you may need
when ending replication. All methods of ending replication can accomplish these
activities, but in some, the action is not default or may require additional
programming.
• Do processes need to end in a controlled manner or can they be ended immediately? Both commands support these options. For more information, see “Ending immediately or controlled” on page 186.
• Do you need to end only a subset of the replication processes? Only ENDDG supports ending selected processes. For more information, see “Ending all or selected processes” on page 187.
• Does the RJ link also need to end? For data groups that use remote journaling, you may also choose whether to end the RJ link. In most cases, the RJ link can remain active. For more information, see “When to end the RJ link” on page 188.
Ending immediately or controlled
Both ENDMMX and ENDDG commands provide the ability to choose whether
replication processes end immediately or in a controlled manner through the End
process (ENDOPT) parameter.
For the ENDAG command, the specified end procedure determines whether
replication processes end immediately or in a controlled manner. If the procedure
specifies a controlled end, the procedure also determines wait time and time out
options.
When you perform an immediate end, the processes end independently of each
other. For example, it is possible for the apply process to end before the send or
receive process. Each replication process verifies that its processing is at a point that
will permit ending, then ends. The amount of time it takes for an immediate end varies
depending on the delay values set for each manager and what each process is doing
at the time. An immediate end does not ensure that all journal entries generated are
sent to or applied on the target system.
If an incomplete IFS or object tracking entry for a data group is being processed
during an immediate end, the entire entry may not be applied. When the data group is
restarted, the entire incomplete entry is rewritten to ensure the integrity of the object.
When you perform a controlled end, MIMIX creates either a journal entry or log
space entry. This entry proceeds through the replication path. The date and time of
the entry are compared to the date and time of when the process being considered
was started. If the entry is earlier than the process start time, the end request is
ignored. If the entry is later than when the process being considered was started, the
process is ended.
A controlled end ensures that processes end in order and that each process
completes any queued or in-progress transactions before the next process is
permitted to end. This ensures that you have a known point in each journal at which
you can restart replication.
If any processes have a backlog of entries, it may take some time for the entry
created by the request to be processed through the replication path. Any entries that
precede the entry requesting to end are processed first.
A data group that is ended in a controlled manner is prepared for a more effective and
safer start when the start request specifies to clear pending entries. The existence of
commit cycles implies that there is application activity on the source system that
should not be interrupted; replication should be allowed to continue through the end of
the commit cycle. It is preferable to ensure that commit cycles are resolved or
removed before ending a data group. There are conditions in which a data group will
not start if open commit cycles exist. For more information, see “Starting replication
when open commit cycles exist” on page 183.
If the request to perform a controlled end also includes ending the RJ link, the RJ link
is ended after all requested processes end.
Either type of end request may be ignored if the request is submitted just before the
time that MIMIX jobs are restarted daily. For more information about restarting jobs,
see ‘Configuring restart times for MIMIX jobs’ in the MIMIX Administrator Reference
book.
Controlling how long to wait for a controlled end to complete
On the ENDMMX or ENDDG command, when you request a controlled end you can
determine how long to wait for all specified data group processes to end. The Wait
time (seconds) (WAIT) parameter specifies how long to wait for all of the specified
data group processes to end. MIMIX will attempt to resolve all pending activity entries
before ending the data groups. If a numeric value was specified, and the selected
processes do not end within the specified time, the action specified for the Timeout
option (TIMOUTOPT) will occur.
The WAIT parameter also supports special values of *SBMRQS and *NOMAX. When
these values are used, the TIMOUTOPT parameter is ignored.
Note: If *ALL is specified for any part of the data group definition, the Wait time value
must be *SBMRQS (submit request).
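For example, a controlled end that waits up to five minutes before forcing an immediate end might be requested as follows. The data group name is hypothetical, and the request is shown only as a sketch of how the ENDOPT, WAIT, and TIMOUTOPT parameters combine.

    ENDDG DGDFN(INVENTORY SYSTEMA SYSTEMB) ENDOPT(*CNTRLD) WAIT(300) TIMOUTOPT(*ENDIMMED)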
Ending all or selected processes
MIMIX determines which data group replication processes to end based on the
command specified and options on the command.
The ENDMMX command ends all replication processes for all data groups on the
systems specified on the end request.
The default END procedure for the ENDAG command uses the default settings of the
ENDDG command. MIMIX also ships an ENDTGT procedure that, when specified on
the ENDAG command, will end only processes on the target system.
Only the ENDDG command supports the ability to end selected replication processes
through its Process (PRC) parameter. The default value is to end all replication
processes for the specified data groups. The configuration of each data group
determines which processes end with each possible value for the PRC parameter. If
you choose to use this parameter, be sure that you understand what processes will
end. See “What replication processes are ended by the ENDDG command” on
page 203.
When to end the RJ link
The RJ link remains active unless you change the value of the End remote journaling
(ENDRJLNK) parameter on the ENDMMX command or the ENDDG command.
The RJ link can normally remain active unless you have a need to prevent data from
being sent to the target system. Some situations where you need to end the RJ link
include:
• Following a switch, to prevent data from returning to the system on which it originated (round-tripping), and to reduce communications and DASD usage
• Before performing an IPL on either the source system or target system
• Before upgrading the IBM i release on either the source system or the target system
• Before performing a hardware upgrade
The default END procedure for the ENDAG command uses the default values for the
ENDDG command. MIMIX also ships a step program, MXENDRJLNK, that can be
added into the END procedure if necessary.
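For example, before an IPL of either system you might end a data group and its RJ link with a request like the following; the data group name is hypothetical and the request is only a sketch.

    ENDDG DGDFN(INVENTORY SYSTEMA SYSTEMB) ENDOPT(*CNTRLD) ENDRJLNK(*YES)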
What is ended by the ENDMMX command
The ENDMMX command will end all MIMIX processes needed for replication on the
specified systems in the installation. If you are using application groups, the
application group is not specifically ended, and the associated end procedure will not
be run. Any processes for user applications or IBM cluster resource groups must be
ended separately. When you use this command, the following occurs:
Data groups - The end process specified is used to end all enabled data groups
and their supporting processes, including automatic recovery, on the specified
systems. This includes data groups associated with data resource groups. Default
values end data groups in a controlled manner.
Remote journal links - If you selected to end remote journaling, all remote journal
links associated with the specified systems are ended.
MIMIX managers and services - Ends the system managers, journal managers,
target journal inspection, and collector services on the specified systems.
Monitors - Ends all individual monitors currently active in the installation library on
the specified systems.
Master monitor - Ends the master monitor on each of the specified systems.
MIMIX Promoter - Ends promoter group activity on the specified systems.
Audits and Recoveries - All queued audits, all audits in progress, and all
recoveries in progress that are associated with the specified systems are ended.
This includes jobs with locks on the installation library. Queued audits are set to
*NOTRUN and audits in comparison phase are set to *FAILED. Audits in recovery
phase reflect their state of processing at the time of the end request, which may
be *NOTRCVD.
Note: Cluster services is not ended when MIMIX managers end because cluster
services may be necessary for other applications.
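For example, a controlled end of all MIMIX processes that leaves the remote journal links active might be requested as follows. These values match the defaults described in this topic and are shown explicitly only as a sketch.

    ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*NO)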
What is ended by the default END procedure for an application group
When an application group is created, a default procedure named END is created for
it from a shipped default procedure. The End Application Group (ENDAG) command
automatically uses the application group’s default END procedure unless you specify
a different procedure.
Steps in the shipped default END procedure, as well as steps in additional shipped
procedures that end application groups, are described in the MIMIX Administrator
Reference book.
What occurs when a data group is ended
The End Data Group (ENDDG) command will end replication processes for the
specified data group.
The ENDDG command can be used interactively or programmatically. This command is
invoked by the ENDMMX command and by the ENDAG command running the default
END procedure, using values other than default for some parameters.
When an ENDDG request is processed, MIMIX may take a few minutes while it does
the following for each specified data group:
• Determines which data group replication processes to end based on the value you specify for the Process (PRC) parameter. The default value ends all MIMIX replication processes.
• When ending data groups that use a shared object send job, the job is ended by the last data group to end.
• When ending data groups that perform access path maintenance1, the database apply process signals the access path maintenance job and then ends. The access path maintenance job uses additional jobs, if needed, to change the access path maintenance attribute to immediate on all files that MIMIX had previously changed to delayed. Any files that could not be changed are identified as having an access path maintenance error before the maintenance jobs end.
• Ends the specified replication processes in the manner specified for the End process (ENDOPT) parameter. The command defaults to processing the end request immediately (*IMMED). When invoked by the ENDMMX command, the default value specified on ENDMMX is *CNTRLD, which takes precedence. When invoked by a procedure specified on the ENDAG command, the procedure determines whether ENDDG is passed parameter values or uses the command defaults.
• Uses the specified Wait time and Timeout options if a controlled end is requested.
• If requested, ends the RJ link. The RJ link is not automatically ended. In most cases, the default value *NO for the End remote journaling (ENDRJLNK) parameter is appropriate. Keeping the RJ link active allows database changes to continue to be sent to the target system even though the data group is not active.
• If you have used the MIMIX CDP feature to set a recovery point in a data group and then end the data group, the recovery point will be cleared. When the data group is started again, the apply processes will process any available transactions, including those which may have had corruptions. (Recovery points are set with the Set DG Recovery Point (SETDGRCYP) command.) If a recovery window is configured for the data group, its configured duration is not affected by requests to end or start the data group.
• On installations running software earlier than 7.1.15.00, if the parallel access path maintenance function has been enabled, the End parallel AP maintenance (PRLAPMNT) parameter1 determines whether MIMIX will end the monitors used by this function when the data group ends. The default value, *DFT, will end the monitors when the value specified for Processes (PRC) includes database processes that run on the target system (*ALL, *ALLTGT, *DBALL, *DBTGT, or *DBAPY) and the value *ALL is specified for the Apply session (APYSSN) parameter.
1. The access path maintenance function is available on installations running MIMIX 7.1.15.00 or higher and is the replacement for the parallel access path maintenance function in earlier software levels.
The ENDDG command does not end the system manager, journal manager, or other
processes that run at the node level. To end those processes, either use the
ENDMMX command or use the End MIMIX Managers (ENDMMXMGR) command
after replication processes have ended.
1. This parameter is not available on installations running MIMIX 7.1.15.00 or higher.
Ending MIMIX
For most configurations, it is recommended that you end MIMIX products from the
management system, which is usually the backup system. If your installation is
configured so that the backup system is a network system, you should end MIMIX
from the network system.
Notes:
• If you are ending MIMIX for a software upgrade or to install a service pack, use the procedures in the software’s ReadMe document.
• The ENDMMX command cannot run when application groups are configured and there are any active, failed, or canceled procedures.
To end MIMIX, use the following procedures:
1. Use one of the following procedures:
• “Ending with default values” on page 192
• “Ending by prompting the ENDMMX command” on page 192
2. Complete any needed follow-up actions using the information and procedures in
“After you end MIMIX products” on page 193.
Ending with default values
Use this procedure to end all MIMIX products in an installation library.
1. From the MIMIX Basic Main Menu, select option 3 (End MIMIX) and press Enter.
You will see a confirmation display.
2. From the confirmation display, you can press F1 (Help) to see a description of the
default values that will be used. To end MIMIX, press Enter.
Ending by prompting the ENDMMX command
To end all MIMIX processes for the specified systems within an installation library, do
the following:
1. From a command line, type ENDMMX and press F4 (Prompt).
2. The End MIMIX display appears. At the End process prompt, specify *CNTRLD
for a controlled end or *IMMED for an immediate end. This parameter applies to
the application group (ENDAG) and data group (ENDDG) processes only.
Note: When ENDMMX ends data groups, it waits for each data group to end
before attempting to end the next MIMIX product.
3. At the End remote journaling prompt, specify whether you want to end remote
journaling.
Note: If you specify *YES, all data groups using the remote journal link in the
installation library will be affected. If other data groups are using the same
remote journal link, you should specify *NO.
4. If you specified *CNTRLD in Step 2, ensure that the values for the Wait time
(seconds) and Timeout option prompts are what you want for the controlled end.
5. At the System definition prompt, indicate the scope of the request by specifying
either *ALL or *LOCAL. This determines the systems on which to end MIMIX
processes.
6. To end MIMIX processes, press Enter.
After you end MIMIX products
Some pending transactions may not be handled before the end process completes.
You may need to ensure that all activity entries are complete before you issue
additional commands. Examples of scenarios where it is important to check whether
all pending transactions are completed include:
• Switching a data group (SWTDG command)
• Starting a data group with clear pending entries (STRDG CLRPND(*YES))
To check for active entries, use the command WRKDGACTE STATUS(*ACTIVE).
When to also end the MIMIX subsystem - You will also need to end the MIMIX
subsystem when you need to IPL the system, when upgrading MIMIX software, and
when installing a MIMIX software service pack. The MIMIX subsystem must be ended
from the 5250 emulator. To end the subsystem, do the following:
1. If you use MIMIX Availability Manager to monitor earlier releases of MIMIX, do the
following:
a. Ensure that all users have logged out of MIMIX Availability Manager.
b. From the 5250 emulator, enter LAKEVIEW/ENDMMXAM.
2. Enter the command WRKSBS. The Work with Subsystems display appears.
3. Type an 8 (Work with subsystem jobs) next to subsystem MIMIXSBS and press
Enter.
4. End any remaining jobs in a controlled manner. Type a 4 (End) next to the job and
press F4 (Prompt). The How to end (OPTION) parameter should have a value of
*CNTRLD. Press Enter. If you see a confirmation display, press Enter to continue.
5. Press F12 (Cancel) to return to the Work with Subsystems display.
6. Type a 4 (End subsystem) next to subsystem MIMIXSBS and press Enter.
Ending an application group
For an application group, a procedure for only one operation (start, end, or switch)
can run at a time. For information about parameters and shipped procedures, see
“What is ended by the default END procedure for an application group” on page 189
and “Choices when starting or ending an application group” on page 172.
To end an application group, do the following:
1. From the Work with Application Groups display, type 10 (End) next to the
application group you want and press F4 (Prompt).
2. Verify that the values you want are specified for Resource groups and Data
resource group entry.
3. If you are starting the procedure after addressing problems with the previous end
request, specify the value you want for Begin at step. Be certain that you
understand the effect the value you specify will have on your environment.
4. Press Enter.
5. The Procedure prompt appears. Do one of the following:
•
To use the default end procedure, press Enter.
•
To use a different end procedure for the application group, specify its name.
Then press Enter.
Ending a data group in a controlled manner
The following procedures describe how to check for errors before requesting a
controlled end of a data group, how to perform the controlled end request, and how to
confirm that the end completed. Held files must be released and the apply process
must complete operations for journal entries stored in log spaces before you end data
group activity.
Data groups that are in an application group: The preferred method of ending data
groups that are part of an application group is to use the End Application Group
(ENDAG) command.
Preparing for a controlled end of a data group
It is good practice to ensure that errors are resolved before requesting a controlled
end of a data group.
Do the following:
1. From the Work with Data Groups display, type an 8 (Display status) next to the
data group you want to end and press Enter.
2. The Data Group Status display appears. In the upper right of the display, you
should see either one or both of the following fields. A non-zero value in these
fields will not prevent the end request from completing.
• Database errors identifies the number of items replicated through the user (database) journal that have a status of *HLDERR. This number should be 0 before you end the data group.
• Object in error/active identifies two key statistics associated with objects replicated through the system journal. The first number identifies the number of objects that have a status of *FAILED and the second number identifies the number of objects with active (pending) activity entries. Both numbers should be 0 before you end the data group.
Note: Only information for the type of information replicated by the data group
appears on the status displays. For example, if the data group does not
contain database files, you will only see fields for object information.
3. For data groups which replicate from the user journal, you also need to check for
any files that are held for other reasons. Press F8 (Database). The Held for other
reasons field in the upper right of the Data Group Database Status display should
also be 0 before you end the data group.
A non-zero value may or may not prevent the end request from completing. For
more information, see “Working with files needing attention (replication and
access path errors)” on page 210.
Performing the controlled end
1. From the Work with Data Groups display, type a 10 (End DG) next to the data
group you want to end and press Enter.
2. The End Data Group (ENDDG) display appears. Specify *CNTRLD for the End
processes prompt.
3. If the data group uses remote journaling, verify that the value of the End remote
journaling prompt is what you want.
4. Because you specified *CNTRLD in Step 2, you can also use the Wait Time
(WAIT) parameter to specify how long MIMIX should try to end the selected
processes in a controlled manner. Use F1 (Help) to see additional information
about the possible options.
• Specify *SBMRQS to submit a request to end the data groups. The appropriate actions are issued to end the specified processes and control is returned to the caller immediately. When you specify this value, the TIMOUTOPT parameter (Step 5) is ignored.
• Specify *NOMAX. When you specify this value, MIMIX will wait until all specified MIMIX processes are ended.
• Specify a numeric value (number-of-seconds). MIMIX waits the specified time for a controlled end to complete before using the option specified in the TIMOUTOPT parameter.
5. If you specified a numeric value for the WAIT parameter in Step 4, you can also
use the Timeout Option (TIMOUTOPT) parameter. You can specify what action
you want the ENDDG command to perform if the time specified in the WAIT
parameter is reached:
• The current process should quit and return control to the caller (*QUIT).
• A new request should be issued to end all processes immediately (*ENDIMMED). When this value is specified, pending activity entries may still exist after the data group processes are ended.
• An inquiry message should be sent to the operator notifying of a possible error condition (*NOTIFY). If you specify this value, the command must be run from the target system.
6. Press Enter to process the command.
Confirming the end request completed without problems
After you request a controlled end of a data group, the Work with Data Group display
appears. Do the following:
1. From the Work with Data Group display, type an 8 (Display status) next
to the data group you ended and press Enter.
2. The Data Group Status display appears. In the Target Statistics section near the
middle of the display, the Unprocessed Entry Count column should be blank for
any database apply processes and any object apply processes. If unprocessed
entries exist when you end the data group and perform a switch, you may lose
these entries when the data group is started following the switch.
Note: To ensure that you are aware of any possible pending or delayed activity
entries, enter the WRKDGACTE STATUS(*ACTIVE) command. Any
activities that are still in progress will be listed. Ensure that all activities are
completed.
3. Ensure that there are no open commit cycles. The next attempt to start the data
group will fail if open commit cycles exist and either the start request specified to
clear pending entries (CLRPND(*YES)) or the commit mode specified in the data
group definition changed. (Certain processes, such as performing a hardware
upgrade with a disk image change, converting to MIMIX Dynamic Apply, or
enabling a disabled data group, require a clear pending start.) To verify commit
cycles, do the following:
a. Press F8 (Database) to view the Data Group Detail Status display.
b. For each apply session listed, verify that the value shown in the Open Commit
column at the right side of the display is *NO.
c. If open commit cycles exist, restart the data group. You must take action to
resolve the open commit cycles, such as ending or quiescing the application or
closing the commit cycle. Then repeat the controlled end again.
Ending selected data group processes
This procedure can be used to end all or selected processes for a data group, or end
a specific database apply process.
Data groups that are in an application group: The preferred method of ending data
groups that are part of an application group is to use the End Application Group
(ENDAG) command. Beginning with service pack 7.1.06.00, the default behavior of
the ENDDG command helps to enforce this best practice when necessary by not
allowing the command to run when the data group is participating in a resource group
with three or more nodes. (A data resource group provides the association between
one or more data groups and an application group.) The ENDDG request will run
when the data group is participating in a resource group with two nodes. In earlier
software levels, default behavior does not allow an end request when the data group is
part of an application group.
In application group environments with three or more nodes, it is particularly important
to treat all members of an application group as one entity. For example, a
configuration change that is made effective by starting and ending a single data group
would not be propagated to the other data groups in the same resource group.
However, the same change would be propagated to the other data groups if it is made
effective by ending and starting the parent application group.
For additional information about the ENDDG command, refer to the following topics:
• “What occurs when a data group is ended” on page 190
• “What replication processes are ended by the ENDDG command” on page 203
To selectively end processes for a data group, do the following:
1. From the Work with Data Groups display, type a 10 (End DG) next to the data
group that you want to end and press Enter.
2. The End Data Group (ENDDG) display appears. At the Process prompt, specify
the value for the processes you want to end. To see a list of values, press F4
(Prompt).
3. At the End process prompt, specify the value you want.
4. If the data group uses remote journaling, verify that the value of the End remote
journaling prompt is what you want.
5. If you want to end only a selected apply session, press F10 (Additional
parameters). Then specify the value for the session you want to end at the Apply
session prompt.
6. To end the selected processes, press Enter.
What replication processes are started by the STRDG command
MIMIX determines how each data group is configured and starts the appropriate replication processes based on the value you specify for
the Start processes (PRC) parameter. Default configuration values create data groups that use MIMIX Remote Journal support (MIMIX RJ
support) for database replication and source-send technology for object replication.
Table 37 identifies the processes that are started when MIMIX RJ support is used for database replication for each of the possible values
on the PRC parameter. An RJ link identifies the IBM i remote journal function, which transfers data to the target system. On the target
system, the data is processed by the MIMIX database reader (DBRDR) before the database apply process (DBAPY) completes
replication.
For data groups that use MIMIX RJ support, it is standard practice to leave the RJ link active when the data groups are ended. If the RJ
link is not already active when starting data groups, MIMIX starts the RJ link when the value specified for the PRC parameter includes
database source system processes or all processes. The RJ Link column in Table 37 shows the result of each process when the RJ link is
not active, while the Notes column identifies behavior that may not be anticipated when the RJ link is already active.
Table 37. Processes started by data groups configured for MIMIX Remote Journal support. This assumes that all replication processes are inactive when the STRDG request is made. Source processes: RJ link (DB replication); OBJSND, OBJRTV, CNRSND, STSRCV (object replication). Target processes: DBRDR, DBAPY (DB replication); OBJRCV, CNRRCV, STSSND, OBJAPY (object replication).

Value for PRC | Notes | RJ Link(1) | OBJSND | OBJRTV | CNRSND | STSRCV | DBRDR | DBAPY(2) | OBJRCV | CNRRCV | STSSND | OBJAPY
*ALL | E | Starts(1) | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts
*ALLSRC | A, E | Starts(1) | Starts | Starts | Starts | Starts | Inactive | Inactive | Starts | Starts | Starts | Inactive
*ALLTGT | A, B | Inactive(1) | Inactive | Inactive | Inactive | Inactive | Starts | Starts | Inactive | Inactive | Inactive | Starts
*DBALL | A, E | Starts(1) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3) | Starts | Starts | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3)
*OBJALL | A, C | Inactive(1) | Starts | Starts | Starts | Starts | Inactive(4) | Inactive(4) | Starts | Starts | Starts | Starts
*DBSRC | A, C, E | Starts(1) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive | Inactive | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3)
*DBTGT | A, B | Inactive(1) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3) | Starts | Starts | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3)
*OBJSRC | A, C | Inactive(1) | Starts | Starts | Starts | Starts | Inactive(4) | Inactive(4) | Starts | Starts | Starts | Inactive
*OBJTGT | A, C | Inactive(1) | Inactive | Inactive | Inactive | Inactive | Inactive(4) | Inactive(4) | Inactive | Inactive | Inactive | Starts
*DBRDR | A, D | Inactive(1) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3) | Starts | Inactive | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3)
*DBAPY | A, C | Inactive(1) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(4) | Starts(4) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3)

Notes:
A. Data groups which use cooperative processing should have both database and object processes started to prevent objects and data on the target system from becoming not fully synchronized.
B. When the RJ link is already active, database replication becomes operational.
C. When the RJ link is already active, database journal entries continue to transfer to the target system over the RJ link.
D. When the RJ link is already active, database journal entries continue to transfer to the target system over the RJ link, where they will be processed by the DBRDR.
E. If data group data area entries are configured, the data area polling process also starts when values which start database source processes are selected.

1. This column shows the effect of the specified value on the RJ link when the RJ link is not active. See the Notes for the effect of values when the RJ link is already active, which is default behavior.
2. If the access path maintenance (APMNT) policy has been enabled at the installation or data group level, an access path maintenance job is also started. Access path maintenance is available on installations running 7.1.15.00 or higher.
3. These object replication processes are not available in data groups configured for database-only replication.
4. These database replication processes are not available in data groups configured for object-only replication.
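As a point of reference for the table, a start request made from a command line might look like the following sketch. The data group name MYDG and the DGDFN keyword with its three-part name are assumptions used only for illustration; PRC is the Start processes parameter described above, so prompt STRDG with F4 to confirm the parameters on your installation.

  /* Start all replication processes for data group MYDG (the *ALL row above). */
  STRDG DGDFN(MYDG SYSTEM1 SYSTEM2) PRC(*ALL)
  /* Start only the target system processes (the *ALLTGT row above).           */
  STRDG DGDFN(MYDG SYSTEM1 SYSTEM2) PRC(*ALLTGT)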
Optionally, data groups can use source-send technology instead of remote journaling for database replication. Data groups created on
earlier levels of MIMIX may still be configured this way.
Table 38 identifies the processes that are started by each value for Start processes when source-send technology is used for database
replication. The MIMIX database send (DBSND) process and database receive (DBRCV) process replace the IBM i remote journal
function and the DBRDR process, respectively.
Table 38. Processes started by data groups configured for Source Send replication. This assumes that all replication processes are inactive when the STRDG request is made. Source processes: DBSND (DB replication); OBJSND, OBJRTV, CNRSND, STSRCV (object replication). Target processes: DBRCV, DBAPY (DB replication); OBJRCV, CNRRCV, STSSND, OBJAPY (object replication).

Value for PRC | Notes | DBSND(1) | OBJSND | OBJRTV | CNRSND | STSRCV | DBRCV | DBAPY(2) | OBJRCV | CNRRCV | STSSND | OBJAPY
*ALL | — | Starts(1) | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts
*ALLSRC | A | Starts(1) | Starts | Starts | Starts | Starts | Starts | Inactive | Starts | Starts | Starts | Inactive
*ALLTGT | A | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive | Starts | Inactive | Inactive | Inactive | Starts
*DBALL | A | Starts(1) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3) | Starts | Starts | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3)
*OBJALL | A | Inactive(4) | Starts | Starts | Starts | Starts | Inactive(4) | Inactive(4) | Starts | Starts | Starts | Starts
*DBSRC | A | Starts(1) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3) | Starts | Inactive | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3)
*DBTGT | A | Inactive | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive | Starts | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3)
*OBJSRC | A | Inactive(4) | Starts | Starts | Starts | Starts | Inactive(4) | Inactive(4) | Starts | Starts | Starts | Inactive
*OBJTGT | A | Inactive(4) | Inactive | Inactive | Inactive | Inactive | Inactive(4) | Inactive(4) | Inactive | Inactive | Inactive | Starts
*DBRDR(5) | — | — | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3) | — | — | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3)
*DBAPY | A | Inactive(4) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(4) | Starts(4) | Inactive(3) | Inactive(3) | Inactive(3) | Inactive(3)

Notes:
A. Data groups which use cooperative processing should have both database and object processes started to prevent objects and data on the target system from becoming not fully synchronized.

1. When the database send (DBSND) process starts, the data area polling process also starts.
2. If the access path maintenance (APMNT) policy has been enabled at the installation or data group level, an access path maintenance job is also started. Access path maintenance is available on installations running 7.1.15.00 or higher.
3. These object replication processes are not available in data groups configured for database-only replication.
4. These database replication processes are not available in data groups configured for object-only replication.
5. The database reader (*DBRDR) process is not used by data groups configured for source-send replication.
What replication processes are ended by the ENDDG command
MIMIX determines how each data group is configured and ends the appropriate replication processes based on the value you specify for
the Process (PRC) parameter. Default configuration values create data groups that use MIMIX Remote Journal support (MIMIX RJ
support) for database replication and source-send technology for object replication.
Table 39 identifies the processes that are ended by each value for PRC when MIMIX RJ support is used for database replication. An RJ
link identifies the IBM i remote journal function, which transfers data to the target system. On the target system, the data is processed by
the MIMIX database reader (DBRDR) before the database apply process (DBAPY) completes replication.
The communications defined by the RJ link remain active and are not affected by any value for PRC. In most cases, leaving the RJ link
active is preferable. If necessary, you can end the RJ link by changing the value of the End remote journaling (ENDRJLNK) parameter. “When to
end the RJ link” on page 188 describes when you need to end the RJ link.
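When you do need to end the RJ link, the request might look like the following sketch. The data group name MYDG and the DGDFN keyword with its three-part name are assumptions; PRC and ENDRJLNK are the parameters described above, and the values shown should be confirmed by prompting ENDDG with F4.

  /* End all replication processes for MYDG and also end the RJ link.        */
  /* ENDRJLNK(*YES) overrides the usual practice of leaving the link active. */
  ENDDG DGDFN(MYDG SYSTEM1 SYSTEM2) PRC(*ALL) ENDRJLNK(*YES)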
Table 39. Processes ended by data groups configured for MIMIX Remote Journal support. This assumes that all replication processes are active when the ENDDG request is made and that the request does not specify to end the RJ link. Source processes: RJ link (DB replication); OBJSND, OBJRTV, CNRSND, STSRCV (object replication). Target processes: DBRDR, DBAPY (DB replication); OBJRCV, CNRRCV, STSSND, OBJAPY (object replication).

Value for PRC | Notes | RJ Link(1) | OBJSND | OBJRTV | CNRSND | STSRCV | DBRDR | DBAPY(2) | OBJRCV | CNRRCV | STSSND | OBJAPY
*ALL | E | Active(1) | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends
*ALLSRC | A, E | Active(1) | Ends | Ends | Ends | Ends | Active | Active | Ends | Ends | Ends | Active
*ALLTGT | — | Active(1) | Active | Active | Active | Active | Ends | Ends | Active | Active | Active | Ends
*DBALL | B, E | Active(1) | Active(3) | Active(3) | Active(3) | Active(3) | Ends | Ends | Active(3) | Active(3) | Active(3) | Active(3)
*OBJALL | A, B | Active(1) | Ends | Ends | Ends | Ends | Active(4) | Active(4) | Ends | Ends | Ends | Ends
*DBSRC | A, B, E | Active(1) | Active(3) | Active(3) | Active(3) | Active(3) | Active | Active | Active(3) | Active(3) | Active(3) | Active(3)
*DBTGT | B | Active(1) | Active(3) | Active(3) | Active(3) | Active(3) | Ends | Ends | Active(3) | Active(3) | Active(3) | Active(3)
*OBJSRC | A, B | Active(1) | Ends | Ends | Ends | Ends | Active(4) | Active(4) | Ends | Ends | Ends | Active
*OBJTGT | A, B | Active(1) | Active | Active | Active | Active | Active(4) | Active(4) | Active | Active | Active | Ends
*DBRDR | B, C | Active(1) | Active(3) | Active(3) | Active(3) | Active(3) | Ends | Active | Active(3) | Active(3) | Active(3) | Active(3)
*DBAPY | B, D | Active(1) | Active(3) | Active(3) | Active(3) | Active(3) | Active | Ends | Active(3) | Active(3) | Active(3) | Active(3)

Notes:
A. Has no effect on database-only replication. New database journal entries continue to transfer to the target system over the RJ link, where they will be processed.
B. Data groups that use cooperative processing may be affected by the result of this value. Ending database processes while object processes remain active may result in object activity entries being placed on hold. Similarly, ending object processes while database processes remain active may result in files being placed on hold due to error.
C. New database journal entries continue to transfer to the target system over the RJ link. Existing entries stored in the log space on the target system before the end request was processed will be applied.
D. New database journal entries continue to transfer to the target system over the RJ link, where they will be processed by the DBRDR.
E. The data area polling process ends when values which end database source processes are specified.

1. The RJ link is not ended by the End options (PRC) parameter. New database journal entries continue to transfer to the target system over the RJ link. See the Notes column for additional details.
2. On installations running 7.1.15.00 or higher, if access path maintenance is enabled, the database apply process signals the access path maintenance job and then ends. The access path maintenance job uses additional jobs, if needed, to change the access path maintenance attribute to immediate on all files that MIMIX had previously changed to delayed. Any files that could not be changed are identified as having an access path maintenance error before the maintenance jobs end. On installations running software earlier than 7.1.15.00, if the parallel access path maintenance function is enabled, the associated monitors are also ended when the ENDDG command specifies *DFT for End parallel AP maintenance (PRLAPMNT) and *ALL for the Apply session (APYSSN). When *YES is specified for PRLAPMNT, the function is always ended regardless of the values specified for PRC or APYSSN.
3. These object replication processes are not available in data groups configured for database-only replication.
4. These database replication processes are not available in data groups configured for object-only replication.
Optionally, data groups can use source-send technology instead of remote journaling for database replication. Data groups created on
earlier levels of MIMIX may still be configured this way. Table 40 identifies the processes that are ended by each value for End options
when source-send technology is used for database replication. The MIMIX database send (DBSND) process and database receive
(DBRCV) process replace the IBM i remote journal function and the DBRDR process, respectively.
Table 40. Processes ended by data groups configured for Source Send replication. This assumes that all replication processes are active when the ENDDG request is made. Source processes: DBSND (DB replication); OBJSND, OBJRTV, CNRSND, STSRCV (object replication). Target processes: DBRCV, DBAPY (DB replication); OBJRCV, CNRRCV, STSSND, OBJAPY (object replication).

Value for PRC | Notes | DBSND(1) | OBJSND | OBJRTV | CNRSND | STSRCV | DBRCV | DBAPY(2) | OBJRCV | CNRRCV | STSSND | OBJAPY
*ALL | — | Ends(1) | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends
*ALLSRC | — | Ends(1) | Ends | Ends | Ends | Ends | Ends | Active | Ends | Ends | Ends | Active
*ALLTGT | — | Active | Active | Active | Active | Active | Active | Ends | Active | Active | Active | Ends
*DBALL | A | Ends(1) | Active(3) | Active(2) | Active(2) | Active(2) | Ends | Ends | Active(2) | Active(2) | Active(2) | Active(2)
*OBJALL | A | Active(4) | Ends | Ends | Ends | Ends | Active(3) | Active(3) | Ends | Ends | Ends | Ends
*DBSRC | A | Ends(1) | Active(2) | Active(2) | Active(2) | Active(2) | Ends | Active | Active(2) | Active(2) | Active(2) | Active(2)
*DBTGT | A | Active | Active(2) | Active(2) | Active(2) | Active(2) | Active | Ends | Active(2) | Active(2) | Active(2) | Active(2)
*OBJSRC | A | Active(3) | Ends | Ends | Ends | Ends | Active(3) | Active(3) | Ends | Ends | Ends | Active
*OBJTGT | A | Active(3) | Active | Active | Active | Active | Active(3) | Active(3) | Active | Active | Active | Ends
*DBRDR(5) | — | — | Active(2) | Active(2) | Active(2) | Active(2) | — | — | Active(2) | Active(2) | Active(2) | Active(2)
*DBAPY | A | Active(3) | Active(2) | Active(2) | Active(2) | Active(2) | Active(3) | Ends(3) | Active(2) | Active(2) | Active(2) | Active(2)

Notes:
A. Data groups that use cooperative processing may be affected by the result of this value. Ending database processes while object processes remain active may result in object activity entries being placed on hold. Similarly, ending object processes while database processes remain active may result in files being placed on hold due to error.

1. When the database send (DBSND) process ends, the data area polling process also ends.
2. On installations running 7.1.15.00 or higher, if access path maintenance is enabled, the database apply process signals the access path maintenance job and then ends. The access path maintenance job uses additional jobs, if needed, to change the access path maintenance attribute to immediate on all files that MIMIX had previously changed to delayed. Any files that could not be changed are identified as having an access path maintenance error before the maintenance jobs end. On installations running software earlier than 7.1.15.00, if the parallel access path maintenance function is enabled, the associated monitors are also ended when the ENDDG command specifies *DFT for End parallel AP maintenance (PRLAPMNT) and *ALL for the Apply session (APYSSN). When *YES is specified for PRLAPMNT, the function is always ended regardless of the values specified for PRC or APYSSN.
3. These object replication processes are not available in data groups configured for database-only replication.
4. These database replication processes are not available in data groups configured for object-only replication.
5. The database reader (*DBRDR) process is not used by data groups configured for source-send replication.
CHAPTER 11    Resolving common replication problems
Occasionally, a journaled transaction for a file or object may fail to replicate. User
intervention is required to correct the problem. This chapter provides procedures to
help you resolve problems that can occur during replication processing.
The following topics are included in this chapter:
• “Working with message queues” on page 208 describes how to use the MIMIX primary and secondary message queues from a 5250 emulator.
• “Working with the message log” on page 209 describes how to access the MIMIX message log from either user interface.
• “Working with user journal replication errors” on page 210 includes topics for how to resolve a file that is held due to an error. It also includes topics about options for placing a file on hold and releasing held files.
• “Working with tracking entries” on page 219 describes how to use tracking entries to resolve replication errors for IFS objects, data areas, or data queues that are replicated cooperatively with the user journal. It also includes topics about options for placing a tracking entry on hold and releasing held tracking entries.
• “Working with objects in error” on page 224 describes how to resolve objects in error by working with the data group activities used for system journal replication. This topic includes information about how to retry failed activity entries and how to determine whether MIMIX is automatically attempting to retry an activity.
• “Removing data group activity history entries” on page 229 describes how to manually remove completed entries for system journal replication activity. This may be necessary if you need to conserve disk space.
Working with message queues
You can access the MIMIX primary and secondary message queues to display
messages or manage the list of messages.
Do the following to access a MIMIX message queue:
1. Type the command DSPMMXMSGQ and press F4 (Prompt).
2. Specify either *PRI or *SEC to access the message queue you want and press
Enter.
3. The Display MIMIX Message Queue display appears listing all of the current
messages. To view all of the information for a message, place the cursor on the
message you want and press Enter.
You can also use the function keys on this display to perform several message-related
tasks. Refer to the help text (F1 key) for information about these function keys.
Note: The MIMIX primary and secondary message queues are defined for each
system definition. You can control the severity and type of messages to be
sent to each message queue through parameters on the system definition.
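For reference, a command-line request for the primary queue might look like the sketch below. The MSGQ keyword is an assumption used only for illustration; as described in step 1, prompting DSPMMXMSGQ with F4 shows the actual parameter used to choose *PRI or *SEC.

  /* Display the MIMIX primary message queue. The MSGQ keyword is assumed;  */
  /* press F4 on DSPMMXMSGQ to confirm the real parameter name.             */
  DSPMMXMSGQ MSGQ(*PRI)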
Working with the message log
The MIMIX message log provides a common location for you to see all messages
related to MIMIX products. A consolidated list of messages for all systems in the
installation library is available on the management system.
Note: The target system only shows messages that occurred on the target system.
LVI messages are informational messages and LVE messages are error or diagnostic
messages. CPF messages are generated by an underlying operating function and
may be passed up to the MIMIX product.
Do the following to access the MIMIX message log:
1. Do one of the following to access the message log display:
• From the MIMIX Basic Main Menu, select option 13 (Work with messages) and press Enter.
• From the MIMIX Intermediate Main Menu, select option 3 (Work with messages) and press Enter.
2. The Work with Message Log appears with a list of the current messages. The
initial view shows the message ID and text.
3. Press F11 to see additional views showing the message type, severity, the
product and process from which it originated, whether it is associated with a group
(for MIMIX, a data group), and the system on which it originated.
4. You can subset the messages shown on the display. A variety of subsetting
options are available that allow you to manage the message log more efficiently.
5. To work with a message, type the number of the option you want and press Enter.
The following options are available:
• 4=Remove - Use this option if you want to delete a message. When you select this option, a confirmation display appears. Verify that you want to delete the messages shown and press Enter. The message is deleted only from the local system.
• 5=Display message - Use this option to view the full text of the first level message and gain access to the second level text.
• 6=Print - Use this option to print the information for the message.
• 8=Display details - Use this option to display details for a message log entry including its from and to program information, job information, group information, product, process, originating system, and call stack information.
• 9=Related messages - Use this option to display a list of messages that relate to the selected message. Related messages include a summary and any detail messages immediately preceding it. This can be helpful when you have a large message log list and you want to show the messages for a certain job.
• 12=Display job - If job information exists on the system, you can use this option to access job information for a message log entry. The Work with Jobs display appears from which you can select options for displaying specific information about the job.
Working with user journal replication errors
MIMIX reports user journal replication errors for files as status on the associated data
group file entry. This status is also reported at the data group level in a consolidated
form.
File replication problems are categorized as follows:
Held due to error - If a journal transaction is not replicated successfully, the file
entry is placed in *HLDERR status. This indicates a problem that must be
resolved.
Held for other reasons - File entries can also be placed in a variety of other held
statuses by user action or by MIMIX. Generally, these statuses are also
considered problems; some are transitional conditions that resolve automatically
while others require user action. To determine if there are files on hold for other
reasons, use the procedure in “Working with the detailed status of data groups” on
page 105.
For information about resolving problems with IFS objects and library-based objects
that are replicated by user journal, see “Working with tracking entries” on page 219.
Working with files needing attention (replication and access path errors)
The DB Errors column on the Work with Data Groups display identifies the number of
errors for user journal replication. Specifically, this column identifies the sum of the
number of database files, IFS, *DTAARA, and *DTAQ objects on hold due to errors
(*HLDERR) plus the number of LF and PF files that have access path maintenance1
failures for a data group. Data group file entries and tracking entries should not be left
in *HLDERR state for any extended time. Access path maintenance errors occur
when MIMIX could not change a file’s access path maintenance attribute back to
immediate.
To access a list of files in error for a data group, do the following:
1. From the MIMIX Basic Main Menu select option 6 (Work with data groups) and
press Enter.
2. The Work with Data Groups display appears. Type 12 (Files needing attention)
next to the data group that has errors identified in the DB Errors column and
press Enter.
3. The Work with DG File Entries display appears with a list of file entries for the data
group that have replication errors, access path maintenance2 errors, or both. Do
the following:
a. The initial view shows the current replication status of file entries. Any entry
with a status of *HLD, *HLDERR, *HLDIGN or *HLDRLTD indicates that action
is required. Use Table 41 to identify choices based on the file entry status.
1. Errors for the access path maintenance function are included on installations running MIMIX
7.1.15.00 or higher.
2. Access path maintenance errors can only be reported on data group file entries in installations
running MIMIX 7.1.15.00 or higher.
Note: MIMIX retains log spaces for file entries with these statuses so that the
journal entries that are being held can be released and applied to the
target system. File entries should not be left in these states for an
extended period.
b. Use Table 41 to identify choices based on the file entry status and Table 42 to
identify available options from this display.
c. If necessary, take action to prevent the error from happening again. Refer to
the following topics:
• “Correcting file-level errors” on page 216
• “Correcting record-level errors” on page 217
4. Press F10 as needed on the Work with DG File entries display until you see the
access path maintenance view. The AP Maint. Status column identifies any AP
maintenance errors for a file with the value *FAILED and failures for logical files
associated with a file as *FAILEDLF.
Immediate action may not be necessary because MIMIX will attempt to retry
access path maintenance when the data group ends and when it is restarted. To
attempt an immediate retry, use option 40 (Retry AP maintenance).
Table 41. Possible actions based on replication status of a file entry

In all cases, evaluate the cause of the problem before taking any action.

*ACTIVE: Unless an error has occurred, no action is necessary. Entries in the user journal for the file are replicated and applied. If necessary, any of the options to hold journal entries can be used.
*HLD: User action is required to release the file entry (option 26) so that held journal entries from the user journal can be applied to the target system.
*HLDERR: User action is required. Attempt to resolve the error by synchronizing the file (option 16). Note: Transactions and hold logs are discarded for file entries with a status of *HLDERR and an error code of IG. Such a file must be synchronized.
*HLDIGN: User action is required to either synchronize the file (option 16) or to change the configuration if you no longer want to replicate the file. Journal entries for the file are discarded. Replication is not occurring and the file may not be synchronized. Depending on the circumstances, Release may also be an option.
*HLDRGZ, *HLDRNM, *HLDPRM, *HLDSYNC: These are transitional states that should resolve to *ACTIVE. If these statuses persist, check the journaling status for the entry. MIMIX retains log spaces for the held journal entries for the duration of these temporary hold requests.
*HLDRTY: The file entry is held because an entry could not be applied due to a condition which required waiting on some other condition (such as in use). After a short delay, the database apply job will automatically attempt to process this entry again. The preferred action is to allow MIMIX to periodically retry the file entry. By default, the database apply job will automatically attempt to process the entry every 5 minutes for up to 1 hour. Manually releasing the file entry will cause MIMIX to attempt to process the entry immediately.
*HLDRLTD: User action is required for a file in the same network. View the related files (option 35). A file that is related due to a dependency, such as a constraint or a materialized query table, is held. Resolving the problem for the related held file will resolve this status.
*RLSWAIT: The file is waiting to be released by the DB apply process and will be changed to *ACTIVE. If the status does not change to *ACTIVE, check the journaling status. If this status persists, you may need to synchronize (option 16).
*CMPACT, *CMPRLS, *CMPRPR: These are transitional states that should resolve automatically. The file entry represents a member that is being processed cooperatively between the CMPFILDTA command and the database apply process.
Table 42. Options for working with file entries from the Work with DG File Entries display

9=Start journaling: See “Starting journaling for physical files” on page 235.
10=End journaling: See “Ending journaling for physical files” on page 236.
11=Verify journaling: See “Verifying journaling for physical files” on page 237.
16=Sync DG file entry: See topic ‘Synchronizing database files’ in the MIMIX Administrator Reference book.
20=Work with file error entries: See topic “Working with journal transactions for files in error” on page 213.
23=Hold file: See topic “Placing a file on hold” on page 214.
24=Ignore file: See topic “Ignoring a held file” on page 214.
25=Release wait: See topic “Releasing a held file at a synchronization point” on page 215.
26=Release: See topic “Releasing a held file” on page 215.
27=Release clear: See topic “Releasing a held file and clearing entries” on page 216.
31=Repair member data: Available for entries with a status of *HLDERR that identify a member. See topic ‘Comparing and repairing file data - members on hold (*HLDERR)’ in the MIMIX Administrator Reference book.
35=Work with related files: Displays file entries that are related to the selected file by constraints or by other dependencies such as materialized query tables.
40=Retry AP maintenance: Retries access path maintenance operations on the target system for the selected file. This option is only valid on data group file entries that have an access path maintenance status of *FAILED or *FAILEDLF.
Working with journal transactions for files in error
When resolving problems for a file that is in *HLDERR state, a MIMIX administrator
may find it useful to examine the journal entries that are being held by MIMIX.
Although you can determine why a file is in error from either the source or target
system, to view the actual journal entries, you must be on the target system. If you
attempt to view the journal entries from the source system, MIMIX will indicate that
you are on the incorrect system to view the information.
Do the following:
1. From the subsetted list of files in error for a data group on the Work with DG File
Entries display, type 20 (Work with file error entries) next to the file entry you want
and press Enter.
2. The Work with DG FE on Hold display appears. A variety of information about the
transaction appears on the display.
Note: The values shown in the Sequence number column may be truncated if
the journal supports *MAXOPT3 for the receiver size and the journal
sequence number value exceeds the available display field. When
truncation is necessary, the most significant digits (left-most) are omitted.
Truncated journal sequence numbers are prefixed by '>'. The First journal
sequence number field displays the full sequence number of the first item
displayed in the list.
a. Locate the transaction that caused the file to be placed on hold. Use the
Position to field to position the list to a specific sequence number.
b. Select the option (Table 43) you want to use on the journal transaction:
Table 43. Options available from the Work with DG FE on Hold display

2=Change: You can change the contents or characteristics of the journal entry. Use this option with caution. Any changes can affect the validity of data in the journal entry.
4=Delete: You can delete the journal entry.
5=Display: You can display details for the specified journal entry associated with the data group file entry in question.
9=Immediate apply: You can immediately apply a transaction that has caused a file to go on hold. The entry you selected is immediately applied to the file outside of the apply process. If the apply is successful, the error/hold entry that was applied is removed from the error/hold log. However, if the apply fails, a message is issued and the entry remains in the error/hold log. This process does not release the file; it only applies the selected entry.
Placing a file on hold
Use this procedure to hold any journal entries for a file identified by a data group file
entry. Avoid leaving a file entry on hold for any extended period.
File entries with a status of *ACTIVE, *HLDRGZ, *HLDRNM, *HLDPRM, *HLDSYNC,
*HLDRLTD, and *RLSWAIT can be placed on hold.
The request changes the file entry status to *HLD. Any journal entries for the
associated file are replicated but not applied. If the file is being processed by an active
apply session, suspending of the update process can take a short time to complete.
You will receive a message when the file is held. MIMIX retains log spaces containing
any replicated journal entries in anticipation that the file entry will be released. When
the file is released, the accumulated journal entries will be applied. The *HLD status
remains until additional action is taken.
Do the following:
1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and
press Enter.
2. The Work with Data Groups display appears. Type 17 (File entries) next to the
data group you want and press Enter.
3. The Work with DG File Entries display appears. Type 23 (Hold file) next to the
entry you want and press Enter.
Ignoring a held file
Use this procedure to ignore any journal entries for a file identified by a data group
file entry. The request changes the file entry status to *HLDIGN. Any journal entries
for the associated file, including any hold logs, are discarded. The *HLDIGN status
remains until additional action is taken.
Note: Be certain that you want to use the ignore feature. Any ignored transactions
cannot be retrieved. You must replace the object on the target system with a
current version from the source system.
If a file has been on hold for a long time or you expect that it will be, the amount of
storage used by the error/hold log space can be quite large. If you anticipate that you
will need to save and restore the file or replace it for any other reason, it may be best
to just ignore all current transactions.
Do the following:
1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and
press Enter.
2. The Work with Data Groups display appears. Type 17 (File entries) next to the
data group you want and press Enter.
3. The Work with DG File Entries display appears. Type 24 (Ignore file) next to the
entry you want and press Enter.
The status of the file is changed to *HLDIGN. The file entry is ignored. Journal
entries for the file entry, including any hold logs, are discarded.
Releasing a held file at a synchronization point
Use this procedure to wait for a synchronization point to release any held journal
entries for a file identified by a data group file entry, then resume replication.
The request changes the file entry status to *RLSWAIT. Any journal entries for the
associated file are discarded until a File member saved (F-MS) journal entry or a Start
of save of a physical file member using the save-while-active function (F-SS) journal
entry is encountered. This is the synchronization point. The file entry status is then changed
to *ACTIVE and all journal entries that were held after the synchronization point are
applied.
If the F-MS or F-SS journal entry is not in the log space, the file entry remains in
*RLSWAIT status. If you are unsure as to how many save requests might accumulate
for an object, you can synchronize the file associated with the file entry. The entry
status will become *ACTIVE.
To wait for a synchronization point before releasing a held file, do the following:
1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and
press Enter.
2. The Work with Data Groups display appears. Type 17 (File entries) next to the
data group you want and press Enter.
3. The Work with DG File Entries display appears. Type 25 (Release wait) next to
the entry you want and press Enter.
Releasing a held file
Use this procedure to immediately release any held journal entries for a file identified by
a data group file entry with a status of *HLD and resume replication.
The request changes the file entry status to *ACTIVE. Any held journal entries for the
associated file are applied. Normal replication of the file resumes.
While a file is being released, the appropriate apply session suspends its operations
on other files. This allows the released file to catch up to the current level of
processing. If a file or member has been on hold for a long time, this can be lengthy.
Do the following to immediately release a held file or file member:
1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and
press Enter.
2. The Work with Data Groups display appears. Type 17 (File entries) next to the
data group you want and press Enter.
3. The Work with DG File Entries display appears. Type 26 (Release) next to the
entry you want and press Enter.
Releasing a held file and clearing entries
Use this procedure to clear any held journal entries for a file identified by a data group
file entry, then resume replication.
The request changes the file entry status to *ACTIVE. Any held journal entries for the
associated file are discarded. Journal entries received after the file entry status
became *ACTIVE are applied, resuming normal replication.
If a file entry is on hold and its associated file has been synchronized in such a way
that the held entries already exist in the restored file, this procedure will ensure that
those entries are not re-applied. This procedure will not work if the file is being
actively updated on the source system.
Do the following to release a held file and clear any journal entries that were
replicated but not applied:
1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and
press Enter.
2. The Work with Data Groups display appears. Type 17 (File entries) next to the
data group you want and press Enter.
3. The Work with DG File Entries display appears. Type 27 (Release clear) next to
the entry you want and press Enter.
Correcting file-level errors
Typically, file-level errors can be categorized as one of the following:
• A problem with the configuration of files defined for replication.
• A discrepancy in the file descriptions between the management and network systems.
• An operational error.
This topic identifies the most common file-level errors and measures that you can
take to prevent the problem from recurring. See also “Correcting record-level errors”
on page 217.
Once you diagnose and correct a file-level error, the problem rarely manifests itself
again. Some of the most common file-level errors are:
• Authority: The MIMIXOWN user profile defined in the MIMIX job description does not have authority to perform a function on the target system. You can prevent this problem by ensuring that the MIMIXOWN user profile has all object authority (*ALLOBJ); one way to do this is shown in the example after this list. This guarantees that the user profile has all the necessary authority to run IBM i commands and has the ability to access the library and files on the management system. Refer to the Using License Manager book for more information about the MIMIXOWN user profile and authority.
• Object existence or corruption: MIMIX cannot run a function against a file on
the target system because the file or a supporting object (such as logical files)
does not exist or has become damaged. System security is the only way to
prevent an object from being accidentally deleted from the target system. Make
sure that only the correct personnel have the ability to remove objects from the
target system where replicated data is applied. Also, ensure that application
programs do not delete files on the target system when there are no apply
sessions running.
• MIMIX subsystem ended: If the MIMIX subsystem is ended in an immediate
mode while MIMIX processes are still active, files may be placed in a “Held”
status. This is a result of MIMIX being unable to complete a transaction normally.
After MIMIX is restarted, you only need to release the affected files.
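As referenced in the Authority item above, one way to give the MIMIXOWN profile all object authority is with the IBM i Change User Profile command. This is a sketch only; because SPCAUT replaces the profile’s entire list of special authorities, include any other special authorities the profile already needs, and follow the guidance in the Using License Manager book.

  /* Grant all object special authority to the MIMIXOWN user profile.      */
  /* Caution: SPCAUT() replaces the current special authority list, so add */
  /* any other special authorities this profile requires.                  */
  CHGUSRPRF USRPRF(MIMIXOWN) SPCAUT(*ALLOBJ)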
Correcting record-level errors
Record-level errors occur when MIMIX updates or attempts to update a file and the
feedback from the update process indicates a discrepancy between the files on the
management and network system. Record-level errors can usually be traced back to
problems with one of the following:
• The system.
• Unique application environments, such as System 36 code running in native IBM i.
• Operational errors.
This section describes the most common record-level errors.
Record written in error: MIMIX DB Replicator was able to write the record on the target system; however, it wrote to the wrong relative record number. In most situations, the IBM i database function writes a new record to the end of a file. MIMIX did so, but it did not match the relative record number of the sending system. Usually this error occurs when transactions (journal entries) are skipped on the send system. Common reasons why records are written in error include the following:
• Journaling was ended: When journaling is ended, transaction images are not
being collected. If users update the files while journaling is not running, no journal
entries are created and MIMIX DB Replicator has no way of replicating the
missing transactions. The best way to prevent this error is to restrict the use of the
Start Journaling Physical File (STRJRNPF) and End Journaling Physical File
(ENDJRNPF) commands.
• User journal replication was restarted at the wrong point: When you change
the starting point of replication for a data group, it is imperative that transactions
are not skipped.
• Apply session restarted after a system failure: This is caused when the target
system experiences a hard failure. MIMIX always updates its user spaces with the
last updated and sent information. When a system fails, some information may not
be forced to disk storage. The data group definition parameter for database apply
processing determines how frequently to force data to disk storage. When the
apply sessions are restarted, MIMIX may attempt to rewrite records to the target
system database.
• Unable to write/update a record: This error is caused when MIMIX cannot
access a record in a file. This is usually caused when there are problems with the
logical files associated with the file or when the record does not exist. The best
way to prevent this error is to make sure that replication is started in the correct
position. This error can also be due to one of the problems listed in topic
“Correcting file-level errors” on page 216.
• Unable to delete a record: This is caused when MIMIX is trying to delete a
record that does not exist or has a corrupted logical file associated with the
physical file. This error can also be due to one of the problems listed in topic
“Correcting file-level errors” on page 216.
Working with tracking entries
Tracking entries identify library-based objects (data areas and data queues) and IFS
objects configured for cooperative processing (advanced journaling).
You can access the following displays to work with tracking entries in any status:
• Work with DG IFS Trk. Entries display (WRKDGIFSTE command)
• Work with DG Obj. Trk. Entries display (WRKDGOBJTE command)
These displays provide access for viewing status and working with common problems
that can occur while replicating objects identified by IFS and object tracking entries.
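These displays can also be reached directly from a command line with the commands named above. The following sketch assumes the data group is identified by a DGDFN parameter with a hypothetical three-part name; prompt either command with F4 to confirm the parameters on your installation.

  /* List the IFS tracking entries for data group MYDG.                      */
  WRKDGIFSTE DGDFN(MYDG SYSTEM1 SYSTEM2)
  /* List the object tracking entries (data areas and data queues) for MYDG. */
  WRKDGOBJTE DGDFN(MYDG SYSTEM1 SYSTEM2)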
Held tracking entries: Status for the replicated objects is reported on the associated
tracking entries. If a journal transaction is not replicated successfully, the tracking
entry is placed in *HLDERR status. This indicates a problem that must be resolved.
Tracking entries can also be placed in *HLD or *HLDIGN status by user action. These
statuses are reported as ‘held for other reasons’ and also require user action.
When a tracking entry has a status of *HLD or *HLDERR, MIMIX retains log spaces
so that journal entries that are being held can be released and applied to the target
system. Tracking entries should not be left in these states for an extended period.
Additional information: To determine if a data group has any IFS objects, data
areas, or data queues configured for advanced journaling, see “Determining if non-file
objects are configured for user journal replication” on page 271.
When working with tracking entries, especially for IFS objects, you should be aware of
the information provided in “Displaying long object names” on page 262.
Accessing the appropriate tracking entry display
To access IFS tracking entry or object tracking entry displays for a data group, do the
following:
1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and
press Enter.
2. The Work with Data Groups display appears.
3. Next to the data group you want, type the number for the option you want and
press Enter. Table 44 shows the options for tracking entries.
Table 44. Tracking entry options on the Work with Data Groups display

50=IFS trk entries: Lists all IFS tracking entries for the selected data group on the Work with DG IFS Trk. Entries display.
51=IFS trk entries not active: Lists IFS tracking entries for the selected data group with inactive status values (*HLD, *HLDERR, *HLDIGN, *HLDRNM, and *RLSWAIT) on the Work with DG IFS Trk. Entries display.
52=Obj trk entries: Lists all object tracking entries for the selected data group on the Work with DG Obj. Trk. Entries display.
53=Obj trk entries not active: Lists object tracking entries for the selected data group with inactive status values (*HLD, *HLDERR, *HLDIGN, and *RLSWAIT) on the Work with DG Obj. Trk. Entries display.
4. The tracking entry display you selected appears. Significant capability is available
for addressing common replication problems and journaling problems. Do the
following:
a. Use F10 to toggle between views showing status, journaling status, and the
database apply session in use.
b. Any entry with a status of *HLD, *HLDERR or *HLDIGN indicates that action is
required. The identified object remains in this state until action is taken.
Statuses of *HLD and *HLDERR result in journal entries being held but not
applied. Use Table 45 to identify choices based on the tracking entry status.
c. Use options identified in Table 46 to address journaling problems or replication
problems.
Table 45. Possible actions based on replication status of a tracking entry

In all cases, evaluate the cause of the problem before taking any action.

*ACTIVE: Unless an error has occurred, no action is necessary. Entries in the user journal for the IFS object are replicated and applied. If necessary, any of the options to hold journal entries can be used.
*HLD: User action is required to release the entry (option 26) so that held journal entries from the user journal can be applied to the target system.
*HLDERR: User action is required. Attempt to resolve the error by synchronizing the file (option 16).
*HLDIGN: User action is required to either synchronize the object (option 16) or to change the configuration if you no longer want to replicate the object. Journal entries for the object are discarded. Replication is not occurring and the object may not be synchronized. Depending on the circumstances, Release may also be an option.
*HLDRNM: This is a transitional state for IFS tracking entries that should resolve to *ACTIVE. If this status persists, check the journaling status for the entry. Object tracking entries cannot have this status.
*RLSWAIT: If the status does not change to *ACTIVE, you may need to synchronize (option 16).
Table 46. Options for working with tracking entries

4=Remove: See “Removing a tracking entry” on page 223.
5=Display: Identifies an object, its replication status, journaling status, and the database apply session used.
6=Print: Creates a spooled file which can be printed.
9=Start journaling: See “Starting journaling for IFS objects” on page 238 and “Starting journaling for data areas and data queues” on page 241.
10=End journaling: See “Ending journaling for IFS objects” on page 239 and “Ending journaling for data areas and data queues” on page 242.
11=Verify journaling: See “Verifying journaling for IFS objects” on page 240 and “Verifying journaling for data areas and data queues” on page 243.
16=Synchronize: Synchronizes the contents, attributes, and authorities of the object represented by the tracking entry between the source and target systems. For more information, see topic ‘Synchronizing tracking entries’ in the MIMIX Administrator Reference book.
23=Hold: See “Holding journal entries associated with a tracking entry” on page 221.
24=Ignore: See “Ignoring journal entries associated with a tracking entry” on page 222.
25=Release wait: See “Waiting to synchronize and release held journal entries for a tracking entry” on page 222.
26=Release: See “Releasing held journal entries for a tracking entry” on page 223.
27=Release clear: See “Releasing and clearing held journal entries for a tracking entry” on page 223.
Holding journal entries associated with a tracking entry
Use this procedure to hold any journal entries for an object identified by a tracking
entry. Avoid leaving a tracking entry on hold for any extended period.
The request changes the tracking entry status to *HLD. Any journal entries for the
associated IFS object, data area, or data queue are replicated but not applied. MIMIX
retains log spaces containing any replicated journal entries in anticipation that the
tracking entry will be released. When the tracking entry is released, the accumulated
journal entries will be applied. The *HLD status remains until additional action is
taken.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the
appropriate tracking entry display” on page 219.
2. Type 23 (Hold) next to the tracking entry for the object you want and press Enter.
Ignoring journal entries associated with a tracking entry
Use this procedure to ignore any journal entries for an object identified by a tracking
entry. The request changes the tracking entry status to *HLDIGN. Any journal entries
for the associated IFS object, data area, or data queue, including any hold logs, are
discarded. The *HLDIGN status remains until additional action is taken.
Note: Be certain that you want to use the ignore feature. Any ignored transactions
cannot be retrieved. You must replace the object on the target system with a
current version from the source system.
If a tracking entry has been on hold for a long time or you expect that it will be, the
amount of storage used by the error/hold log space can be quite large. If you
anticipate that you will need to save and restore the object or replace it for any other
reason, it may be best to just ignore all current transactions.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the
appropriate tracking entry display” on page 219.
2. Type 24 (Ignore) next to the tracking entry for the object you want and press
Enter.
Waiting to synchronize and release held journal entries for a tracking
entry
Use this procedure to wait for a synchronization point to release any held journal
entries for an object identified by a tracking entry, then resume replication.
The request changes the tracking entry status to *RLSWAIT. Any journal entries for
the associated IFS object, data area, or data queue are discarded until an object
saved journal entry is encountered. This is the synchronization point. The tracking
entry status is then changed to *ACTIVE and all journal entries that were held after
the synchronization point are applied.
If the object saved journal entry is not in the log space, the tracking entry remains in
*RLSWAIT status. If you are unsure as to how many save requests might accumulate
for an object, you can synchronize the object associated with the tracking entry. The
tracking entry status will become *ACTIVE.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the
appropriate tracking entry display” on page 219.
2. Type 25 (Release wait) next to the tracking entry for the object you want and
press Enter.
Releasing held journal entries for a tracking entry
Use this procedure to immediately release any held journal entries for an object
identified by a tracking entry with a status of *HLD or *HLDERR and resume
replication.
The request changes the tracking entry status to *ACTIVE. Any held journal entries
for the associated IFS object, data area, or data queue are applied. Normal replication
of the object resumes.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the
appropriate tracking entry display” on page 219.
2. Type 26 (Release) next to the tracking entry for the object you want and press
Enter.
Releasing and clearing held journal entries for a tracking entry
Use this procedure to clear any held journal entries for an object identified by a
tracking entry, then resume replication.
The request changes the tracking entry status to *ACTIVE. Any held journal entries
for the associated IFS object, data area, or data queue are discarded. Journal entries
received after the tracking entry status became *ACTIVE are applied, resuming
normal replication.
If a tracking entry is on hold and its associated object has been synchronized in such
a way that the held entries already exist in the restored object, this procedure will
ensure that those entries are not re-applied.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the
appropriate tracking entry display” on page 219.
2. Type 27 (Release clear) next to the tracking entry for the object you want and
press Enter.
Removing a tracking entry
Use this procedure to remove a duplicate tracking entry for an IFS object, data area,
or data queue. A tracking entry with a status of *HLDERR cannot be removed.
Note: Do not use this procedure to prevent user journal replication of an object
represented by a tracking entry. If you need to exclude the object from
replication or have it replicated through the system journal instead of the user
journal, change or create the appropriate data group IFS entry or object entry.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the
appropriate tracking entry display” on page 219.
2. Type 4 (Remove) next to the tracking entry you want to remove and press Enter.
3. You will see a confirmation display. To remove the tracking entry, press Enter.
Working with objects in error
Use this topic to work with replication errors for objects replicated through the system
journal.
To access a list of objects in error for a data group, do the following:
1. From the MIMIX Basic Main Menu, select option 6 (Work with Data Groups) and
press Enter.
2. The Work with Data Groups display appears. Type 13 (Objects in error) next to
the data group you want that has a value shown in the Obj Errors column and
press Enter.
3. The Work with Data Group Activity display appears with a list of the objects in
error for the data group you selected. You can do any of the following:
• Use F10 (Error view) to see the reason why the object is in error.
• Use F11 to change between views for objects, DLOs, IFS objects, and spooled
files.
• Use the options identified in Table 47 to resolve the errors. Type the number of
the option you want next to the object and press Enter.
Table 47. Options on the Work with Data Group Activity display for working with objects in error.
4=Remove
Use this option to remove an entry with a *COMPLETED or
*FAILED status from the list. For entries with *FAILED status, this
option removes only the failed entry. Prompting is available for
extended capability. You may need to take action to synchronize
the object associated with the entry.
Note: If an entry with a status of *FAILED has related entries in
*DELAYED status, you can remove both the failed and the
delayed entries in one operation by using option 14 (Remove
related).
For more information, see “Removing data group activity history
entries” on page 229.
7=Display message
Use this option to display any error message that is associated
with the entry.
8=Retry
Use this option to retry the data group activity. MIMIX changes the
entry status to pending and attempts the failed operation again.
Note: It is possible to schedule the request for a time when the retry is
more likely to be successful. For more information about retrying
failed entries, see “Retrying data group activity entries” on
page 227.
12=Work with entries
Use this option to access the Work with DG Activity Entries
display. From the display you can display additional information
about replicated journal transactions for the object, including the
journal entry type and access type (if available), as well as see
whether the object is undergoing delay retry processing. You can
also take options to display related entries, view error messages
for a failure, and synchronize the object. For more information,
see “Using the Work with DG Activity Entries display” on
page 225.
14=Remove related
Use this option to remove an entry with a status of *FAILED and
any related entries that have a status of *DELAYED. You may
need to take action to synchronize the object associated with the
entry.
Using the Work with DG Activity Entries display
From the Work with DG Activity Entries display, you can display information about and
take actions on activity entries for a replicated object. To access the display, select
option 12 (Work with entries) from the Work with Data Group Activity display.
Table 48 lists the available options.
Table 48. Options available on the Work with DG Activity Entries display.
4=Remove
Use this option to remove an individual entry with a
*COMPLETED or *FAILED status from the list. For entries with
*FAILED status, this option removes only the failed entry. You may
need to take action to synchronize the object associated with the
entry.
Note: No prompting is available when using this option from this display.
To prompt for additional capability, use the option to remove from
the Work with Data Group Activity display. For more information,
see “Removing data group activity history entries” on page 229.
5=Display
Use this option to display details about the individual entry. The
information available about the object includes whether the object
is undergoing delay retry processing, as well as journal entry
information, including access type information for T-SF, T-YC, and
T-ZC journal entry types. For more information, see “Determining
whether an activity entry is in a delay/retry cycle” on page 228.
6=Print
Use this option to print the entry.
7=Display message
Use this option to display the error message associated with the
processing failure for the entry.
8=Retry
Use this option to retry the data group activity entry. MIMIX
changes the entry status to pending and attempts the failed
operation again as soon as possible.
9=Display related
Displays entries related to the specified object. For example, use
this option to see entries associated with a move or rename
operation for the object.
12=Display job
Displays the job that was processing the object when the error
occurred, if the job information still exists and is on this system.
16=Synchronize
Use this option to synchronize objects defined to MIMIX for
system journal replication (objects that are not configured for
cooperative processing). Activity entries with *ACTIVE or
*COMPLETED status can be synchronized, as well as entries
with a *FAILED status and with the following journal types: T-CO,
T-CP, T-OR, T-SE, T-ZC (see notes), T-YC, and T-SF (see notes).
A confirmation display allows you to confirm your choices before
the request is processed. Entries are placed in a ‘pending
synchronization’ status. When the data group is active, the
contents of the object, its attributes, and its authorities are
synchronized between the source and target systems. The status
of the activity entry is set to ‘completed by synchronization.’
Notes:
• To synchronize files defined for cooperative processing, use
the Synchronize DG File Entry (SYNCDGFE) command.
• Spooled files (T-SF journal entries) with the following access
types can be synchronized: C = spooled file created; U =
spooled file changed.
• Changed objects (T-ZC journal entries) with the following
access types can be synchronized: 1 (Add); 7 (Change); 25
(Initialize); 29 (Merge); 30 (Open); 34 (Receive); 36
(Reorganize); 50 (Set); and 51 (Send).
Retrying data group activity entries
Data group activity entries that did not successfully complete replication have a status
of *FAILED. These failed data group activity entries are also called error entries. You
can request to retry processing for these activity entries.
Activity entries with a status of *ACTIVE can also be retried in some circumstances.
For example, you may want to retry an entry that is delayed but which has no
preceding pending activity entry. Or, you may want to retry a pending entry that is
undergoing processing in a delay retry cycle.
The retry request places the activity entry in the queue for processing by the system
journal replication process where the failure or delay occurred. Activity entries with a
status of *FAILED or *DELAYED are set to *PENDING until they are processed.
Retrying a failed data group activity entry
You can manually request that MIMIX retry processing for a data group activity entry
that has a status of *FAILED. The retry can be requested from either the Work with
Data Group Activity display or from the Work with DG Activity Entries display.
Note: Only the Work with Data Group Activity display supports the ability to schedule
the retry request for a time in the future when the request is more likely to be
successful.
To retry failed (error) activity entries, do the following:
1. From the Work with Data Groups display, type a 13 (Objects in error) next to the
data group you want that has values shown in the Obj Errors column and press
Enter.
2. The Work with Data Group Activity display appears with a list of the objects in
error for the data group selected. Type an 8 (Retry) next to the entry you want and
do one of the following:
• To submit the retry request for immediate processing, press Enter. Then skip to
Step 4.
• To schedule the retry request for a time at which it is more likely to be
successful, press F4 (Prompt).
3. On the Retry DG Activity Entries (RTYDGACTE) display, specify a value for the
Time of day to retry prompt. Then press Enter.
You can specify a specific time within 24 hours. The scheduled time is based on
the time on the system from which the request is submitted regardless of the
system on which the activity to retry occurs. When you submit a retry request for a
scheduled time, MIMIX will make the entry active and will wait until the specified
time before retrying the request. The scheduled time is the earliest the request will
be processed. Be sure to consider any time zone differences between systems as
you determine a scheduled time. For additional information and examples, press
F1 (Help).
4. The Confirm Retry of DG Activity display appears. Press Enter.
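If you prefer the command line, the retry can also be requested directly with the
RTYDGACTE command. The following is a minimal sketch only; the data group name is
an example, and the keyword names shown (DGDFN for the data group and TIMRTY for
the time of day to retry) are assumptions rather than confirmed keywords, so prompt
the command with F4 to verify the parameters available on your release.

/* Retry failed entries for an example data group at 11:30 PM local time.  */
/* DGDFN and TIMRTY are assumed keyword names, shown for illustration only; */
/* press F4 on RTYDGACTE to confirm the actual prompts.                      */
RTYDGACTE DGDFN(MYDGDFN) TIMRTY('23:30')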
If failed activity entries occur frequently, consider using the third delay retry cycle.
When the Automatic object recovery policy is enabled, a third retry cycle is performed
using the settings in effect from the Number of third delay/retries and Third retry
interval (min.) policies. These policies can be set for the installation or for a specific
data group.
Determining whether an activity entry is in a delay/retry cycle
This procedure allows you to check the status of an activity entry to determine
whether MIMIX is attempting automatic delay retry cycles for the object.
1. From the Work with Data Groups display, type a 14 (Active objects) next to the
data group you want and press Enter.
2. The Work with Data Group Activity display appears with a list of the objects that
are actively being replicated.
3. Type a 12 (Work with Entries) next to the list entry for the object you want and
press Enter. The Work with DG Activity Entries display appears with the list of
activity entries for the object you selected.
4. To view additional details for an entry, type a 5 (Display) next to the activity entry
you want and press Enter. The Display DG Activity Details display appears.
5. Check the value listed in the Waiting for retry field.
The value *YES is displayed when the activity entry is undergoing automatic
delay/retry processing. Delayed or failed activity entries and pending activity
entries that are not in a delay retry cycle will always have a value of *NO.
6. When the value of the Waiting for retry field is *YES, the Delay/Retry Processing
Information fields are also available and provide the following information:
• The Retries attempted field identifies the number of times that MIMIX has
attempted to process the activity entry.
• The Retries remaining field identifies the remaining number of times that
MIMIX can automatically attempt to retry the activity entry. MIMIX uses only as
many of the remaining retry attempts as necessary to achieve a successful
attempt.
• The Delay interval (seconds) field identifies the number of seconds between
the previous attempt and the next retry attempt.
• The Timestamp of next attempt field identifies the approximate date and time
that MIMIX will make the next attempt to process the activity entry. If object
replication processes are busy processing other entries, there may be a delay
between this time and when processing of this entry is actually attempted. The
value *PENDING indicates that the time of the next attempt has passed and
processing for the entry is waiting while other entries are being processed.
This field is displayed only on the system of the process that is in delay/retry.
Removing data group activity history entries
MIMIX maintains a history of successfully completed distribution requests to provide a
record of all object, DLO, and IFS replication activity completed by system journal
replication processes. While MIMIX efficiently uses disk space and removes
completed requests according to the value specified in the Keep data group history
parameter of the system definition, you may occasionally need to manually remove
completed activity entries. One reason to manually remove completed entries may be
to conserve disk space, while another may be to clean up entries for an object that
has been removed from replication as a result of a configuration change.
Note: Your business policies and procedures may require that you archive
completed activity entries to tape before you delete them.
To remove completed activity entries, do the following:
1. From the Work with Data Groups display, type 28 (Completed objects) next to the
data group you want and press Enter. The Work with Data Group Activity display
appears with a list of objects with completed entries.
2. Type a 4 (Remove) next to the entry you want and do one of the following:
• To remove all available completed entries for the selected object, press Enter.
Then continue with Step 4.
• To change the selection criteria to include entries for additional objects or to
limit the entries based on a time range, press F4 (Prompt). The Remove DG
Activity Entries (RMVDGACTE) display appears.
3. To change the selection criteria, do the following as needed:
• To remove a subset of completed entries for the selected object based on the
timestamp of the replicated journal entries, specify values for the Starting date
and time and Ending date and time prompts.
• To expand the set of objects for which completed entries will be removed,
change the values of the following prompts as needed:
– For an expanded set of object types, use the Object type prompt.
– For a library-based object, use the Object and Library prompts.
– For a DLO, use the Document and Folder prompts.
– For an IFS object, use the IFS object prompt.
– For a spooled file, use the Spooled file name, Output queue, and Library
prompts.
4. A confirmation display appears. Press Enter.
CHAPTER 12
Starting, ending, and verifying journaling
This chapter describes procedures for starting and ending journaling. Journaling must
be active on all files, IFS objects, data areas and data queues that you want to
replicate through a user journal. Normally, journaling is started during configuration.
However, there are times when you may need to start or end journaling on items
identified to a data group.
The topics in this chapter include:
• “What objects need to be journaled” on page 231 describes, for supported
configuration scenarios, what types of objects must have journaling started before
replication can occur. It also describes when journaling is started implicitly, as well
as the authority requirements necessary for user profiles that create the objects to
be journaled when they are created.
• “MIMIX commands for starting journaling” on page 233 identifies the MIMIX
commands available for starting journaling and describes the checking performed
by the commands.
• “Journaling for physical files” on page 235 includes procedures for displaying
journaling status, starting journaling, ending journaling, and verifying journaling for
physical files identified by data group file entries.
• “Journaling for IFS objects” on page 238 includes procedures for displaying
journaling status, starting journaling, ending journaling, and verifying journaling for
IFS objects replicated cooperatively (advanced journaling). IFS tracking entries
are used in these procedures.
• “Journaling for data areas and data queues” on page 241 includes procedures for
displaying journaling status, starting journaling, ending journaling, and verifying
journaling for data area and data queue objects replicated cooperatively
(advanced journaling). Object tracking entries are used in these procedures.
What objects need to be journaled
A data group can be configured in a variety of ways that involve a user journal in the
replication of files, data areas, data queues and IFS objects. Journaling must be
started for any object to be replicated through a user journal or to be replicated by
cooperative processing between a user journal and the system journal.
Requirements for system journal replication - System journal replication
processes use a special journal, the security audit (QAUDJRN) journal. Events are
logged in this journal to create a security audit trail. When data group object entries,
IFS entries, and DLO entries are configured, each entry specifies an object auditing
value that determines the type of activity on the objects to be logged in the journal.
Object auditing is automatically set for all objects defined to a data group when the
data group is first started, or any time a change is made to the object entries, IFS
entries, or DLO entries for the data group. Because security auditing logs the object
changes in the system journal, no special action is needed.
Requirements for user journal replication - User journal replication processes
require that the journaling be started for the objects identified by data group file
entries. Both MIMIX Dynamic Apply and legacy cooperative processing use data
group file entries and therefore require journaling to be started. Configurations that
include advanced journaling for replication of data areas, data queues, or IFS objects
also require that journaling be started on the associated object tracking entries and
IFS tracking entries, respectively. Starting journaling ensures that changes to the
objects are recorded in the user journal, and are therefore available for MIMIX to
replicate.
During initial configuration, the configuration checklists direct you when to start
journaling for objects identified by data group file entries, IFS tracking entries, and
object tracking entries. The MIMIX commands STRJRNFE, STRJRNIFSE, and
STRJRNOBJE simplify the process of starting journaling. For more information about
these commands, see “MIMIX commands for starting journaling” on page 233.
Although MIMIX commands for starting journaling are preferred, you can also use the
IBM commands (STRJRNPF, STRJRN, STRJRNOBJ) to start journaling if you have
the appropriate authority.
Requirements for implicit starting of journaling - Journaling can be automatically
started for newly created database files, data areas, data queues, or IFS objects
when certain requirements are met.
The user ID creating the new objects must have the required authority to start
journaling and the following requirements must be met:
• IFS objects - A new IFS object is automatically journaled if the directory in which it
is created is journaled as a result of a request that permitted journaling inheritance
for new objects. Typically, if MIMIX started journaling on the parent directory,
inheritance is permitted. If you manually start journaling on the parent directory
using the IBM command STRJRN, specify INHERIT(*YES). This will allow IFS
objects created within the journaled directory to inherit the journal options and
journal state of the parent directory. (A command sketch follows this list.)
• Database files created by SQL statements - A new file created by a CREATE
TABLE statement is automatically journaled if the library in which it is created
contains a journal named QSQJRN.
• New *FILE, *DTAARA, *DTAQ objects - The default value (*DFT) for the Journal at
creation (JRNATCRT) parameter in the data group definition enables MIMIX to
support both release-specific techniques that the operating system uses to
automatically start journaling for physical files, data areas, and data queues when
they are created.
– On systems running IBM i 6.1 or higher releases, MIMIX uses the support
provided by the IBM i command Start Journal Library (STRJRNLIB).
Customers are advised not to re-create the QDFTJRN data area on systems
running IBM i 6.1 or higher.
– On systems running IBM i 5.4, MIMIX uses the QDFTJRN data area for journal
at creation. The operating system will automatically journal a new object if it is
created in a library that contains a QDFTJRN data area and the data area has
enabled automatic journaling for the object type.
When configuration requirements are met, MIMIX will either start library journaling
or create the QDFTJRN data area for the appropriate libraries as well as enable
automatic journaling for the configured cooperatively processed object types.
When journal at creation configuration requirements are met, all new objects of
that type are journaled, not just those which are eligible for replication.
When the data group is started, MIMIX evaluates all data group object entries for
each object type. (Entries for *FILE objects are only evaluated when the data
group specifies COOPJRN(*USRJRN).) Entries properly configured to allow
cooperative processing of the object type determine whether MIMIX will enforce
library journaling or create the QDFTJRN data area. MIMIX uses the data group
entry with the most specific match to the object type and library that also specifies
*ALL for its System 1 object (OBJ1) and Attribute (OBJATR).
Note: MIMIX prevents library journaling from starting or the QDFTJRN data area
from being created in the following libraries: QSYS*, QRECOVERY,
QRCY*, QUSR*, QSPL*, QRPL*, QRCL*, QGPL, QTEMP, and SYSIB*.
For example, if MIMIX finds only the following data group object entries for library
MYLIB, it would use the first entry when determining whether to enforce library
journaling or create the QDFTJRN data area because it is the most specific entry
that also meets the OBJ1(*ALL) and OBJATR(*ALL) requirements. The second
entry is not considered in the determination because its OBJ1 and OBJATR
values do not meet these requirements.
LIB1(MYLIB) OBJ1(*ALL) OBJTYPE(*FILE) OBJATR(*ALL) COOPDB(*YES)
PRCTYPE(*INCLD)
LIB1(MYLIB) OBJ1(MYAPP) OBJTYPE(*FILE) OBJATR(DSPF) COOPDB(*YES)
PRCTYPE(*INCLD)
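The following IBM i command sketches illustrate the implicit journaling scenarios
described in the list above. They are examples under assumptions: the directory,
library, journal, and receiver names are placeholders, and you should verify command
defaults on your release before using them.

/* IFS: start journaling on a parent directory so that new objects created */
/* within it inherit the journal options and journal state (INHERIT(*YES)). */
STRJRN OBJ(('/myapp/data')) JRN('/QSYS.LIB/MYLIB.LIB/MYJRN.JRN') +
  INHERIT(*YES)

/* SQL: a file created by CREATE TABLE in MYLIB is journaled automatically  */
/* if MYLIB contains a journal named QSQJRN.                                 */
CRTJRNRCV JRNRCV(MYLIB/QSQJRNRCV)
CRTJRN JRN(MYLIB/QSQJRN) JRNRCV(MYLIB/QSQJRNRCV)

/* IBM i 6.1 and later: library-level journal-at-creation support. MIMIX     */
/* normally issues this support itself when the configuration requirements   */
/* are met, so this is shown only to illustrate the operating system command. */
STRJRNLIB LIB(MYLIB) JRN(MYLIB/QSQJRN)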
Authority requirements for starting journaling
Normal MIMIX processes run under the MIMIXOWN user profile, which ships with
*ALLOBJ special authority. Therefore, it is not necessary for other users to account
for journaling authority requirements when using MIMIX commands (STRJRNFE,
STRJRNIFSE, STRJRNOBJE) to start journaling.
When the MIMIX journal managers are started, or when the Build Journaling
Environment (BLDJRNENV) command is used, MIMIX checks the public authority
(*PUBLIC) for the journal. If necessary, MIMIX changes public authority so the user ID
in use has the appropriate authority to start journaling.
Authority requirements must be met to enable the automatic journaling of newly
created objects and if you use IBM commands to start journaling instead of MIMIX
commands.
• If you create database files, data areas, or data queues for which you expect
automatic journaling at creation, the user ID creating these objects must have the
required authority to start journaling.
• If you use the IBM commands (STRJRNPF, STRJRN, STRJRNOBJ) to start
journaling, the user ID that performs the start journaling request must have the
appropriate authority.
For journaling to be successfully started on an object, one of the following authority
requirements must be satisfied:
• The user profile of the user attempting to start journaling for an object must have
*ALLOBJ special authority.
• The user profile of the user attempting to start journaling for an object must have
explicit *ALL object authority for the journal to which the object is to be journaled.
• Public authority (*PUBLIC) must have *OBJALTER, *OBJMGT, and *OBJOPR
object authorities for the journal to which the object is to be journaled.
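As a sketch of the last requirement above, the needed public authorities could be
granted to the journal with the Grant Object Authority (GRTOBJAUT) command; the
library and journal names here are examples only.

/* Grant *PUBLIC the object authorities required to start journaling to      */
/* journal MYJRN in library MYLIB.                                            */
GRTOBJAUT OBJ(MYLIB/MYJRN) OBJTYPE(*JRN) USER(*PUBLIC) +
  AUT(*OBJALTER *OBJMGT *OBJOPR)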
MIMIX commands for starting journaling
Before you use any of the MIMIX commands for starting journaling, the data group file
entries, IFS tracking entries, or object tracking entries associated with the command’s
object class must be loaded.
The MIMIX commands for starting journaling are:
• Start Journal Entry (STRJRNFE) - This command starts journaling for files
identified by data group file entries.
• Start Journaling IFS Entries (STRJRNIFSE) - This command starts journaling of
IFS objects configured for advanced journaling. Data group IFS entries must be
configured and IFS tracking entries loaded (LODDGIFSTE command) before
running the STRJRNIFSE command to start journaling.
• Start Journaling Obj Entries (STRJRNOBJE) - This command starts journaling of
data area and data queue objects configured for advanced journaling. Data group
object entries must be configured and object tracking entries loaded
(LODDGOBJTE command) before running the STRJRNOBJE command to start
journaling. A command-line sketch of this sequence follows this list.
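As a rough command-line sketch of the load-then-start sequence for advanced
journaling, the commands might be run as shown below. The data group name is an
example and the DGDFN keyword is an assumption; prompt each command with F4 to
confirm its actual parameters.

/* IFS objects: load the IFS tracking entries, then start journaling for     */
/* the objects they identify. DGDFN is an assumed keyword name.               */
LODDGIFSTE DGDFN(MYDGDFN)
STRJRNIFSE DGDFN(MYDGDFN)

/* Data areas and data queues follow the same pattern.                        */
LODDGOBJTE DGDFN(MYDGDFN)
STRJRNOBJE DGDFN(MYDGDFN)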
If you attempt to start journaling for a data group file entry, IFS tracking entry, or object
tracking entry and the files or objects associated with the entry are already journaled,
MIMIX checks that the physical file, IFS object, data area, or data queue is journaled
to the journal associated with the data group. If the file or object is journaled to the
correct journal, the journaling status of the data group file entry, IFS tracking entry, or
object tracking entry is changed to *YES. If the file or object is not journaled to the
correct journal or the attempt to start journaling fails, an error occurs and the journaling
status is changed to *NO.
Journaling for physical files
Data group file entries identify physical files to be replicated. When data group file
entries are added to a configuration, they may have an initial status of *ACTIVE.
However, the physical files which they identify may not be journaled. In order for
replication to occur, journaling must be started for the files on the source system.
This topic includes procedures to display journaling status, and to start, end, or verify
journaling for physical files.
Displaying journaling status for physical files
Use this procedure to display journaling status for physical files identified by data
group file entries. Do the following:
1. From the MIMIX Intermediate Main Menu, type 1 and press Enter to access the
Work with Data Groups display.
2. On the Work with Data Groups display, type 17 (File entries) next to the data
group you want and press Enter.
3. The Work with DG File Entries display appears. The initial view shows the current
and requested status of the data group file entry. Press F10 (Journaled view).
At the right side of the display, the Journaled System 1 and System 2 columns
indicate whether the physical file associated with the file entry is journaled on
each system.
Note: Logical files will have a status of *NA. Data group file entries exist for
logical files only in data groups configured for MIMIX Dynamic Apply.
Starting journaling for physical files
Use this procedure to start journaling for physical files identified by data group file
entries. In order for replication to occur, journaling must be started for the file on the
source system.
This procedure invokes the Start Journal Entry (STRJRNFE) command. The
command can also be entered from a command line.
Do the following:
1. Access the journaled view of the Work with DG File Entries display as described
in “Displaying journaling status for physical files” on page 235.
2. From the Work with DG File Entries display, type a 9 (Start journaling) next to the
file entries you want. Then do one of the following:
• To start journaling using the command defaults, press Enter.
• To modify command defaults, press F4 (Prompt), then continue with the next
step.
3. The Start Journal Entry (STRJRNFE) display appears. The Data group definition
prompts and the System 1 file prompts identify your selection. Accept these
values or specify the values you want.
4. Specify the value you want for the Start journaling on system prompt. Press F4 to
see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data
group is configured for journaling on the target system (JRNTGT) and starts or
prevents journaling from starting as required.
5. If you want to use batch processing, specify *YES for the Submit to batch prompt.
6. To start journaling for the physical file associated with the selected data group,
press Enter.
The system returns a message to confirm the operation was successful.
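Because the Start Journal Entry (STRJRNFE) command can also be entered from a
command line, a minimal sketch follows. The data group name is an example and the
DGDFN keyword is an assumption; press F4 on the command to prompt the System 1
file, Start journaling on system, and Submit to batch values described above.

STRJRNFE DGDFN(MYDGDFN)   /* DGDFN is an assumed keyword; prompt with F4 to  */
                          /* supply the file and system selections.          */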
Ending journaling for physical files
Use this procedure to end journaling for a physical file associated with a data group
file entry. Once journaling for a file is ended, any changes to that file are not captured
and are not replicated. You may need to end journaling if a file no longer needs to be
replicated, to prepare for upgrading MIMIX software, or to correct an error.
This procedure invokes the End Journaling File Entry (ENDJRNFE) command. The
command can also be entered from a command line.
To end journaling, do the following:
1. Access the journaled view of the Work with DG File Entries display as described
in “Displaying journaling status for physical files” on page 235.
2. From the Work with DG File Entries display, type a 10 (End journaling) next to the
file entry you want and do one of the following:
Note: MIMIX cannot end journaling on a file that is journaled to the wrong
journal, such as a file that is journaled to a journal that does not match the
journal definition for that data group. If you want to end journaling outside
of MIMIX, use the ENDJRNPF command (a sketch follows this procedure).
• To end journaling using command defaults, press Enter. Journaling is ended.
• To modify additional prompts for the command, press F4 (Prompt) and
continue with the next step.
3. The End Journal File Entry (ENDJRNFE) display appears. If you want to end
journaling for all files in the library, specify *ALL at the System 1 file prompt.
4. Specify the value you want for the End journaling on system prompt. Press F4 to
see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data
group is configured for journaling on the target system (JRNTGT) and ends or
prevents journaling from ending as required.
5. If you want to use batch processing, specify *YES for the Submit to batch prompt.
6. To end journaling, press Enter.
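For a file that must be handled outside of MIMIX because it is journaled to the wrong
journal, the IBM i End Journal Physical File (ENDJRNPF) command can be used
directly, as in this sketch; the library and file names are examples only.

/* End journaling for an example physical file using the IBM i command       */
/* rather than MIMIX.                                                          */
ENDJRNPF FILE(MYLIB/MYFILE)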
Verifying journaling for physical files
Use this procedure to verify if a physical file defined by a data group file entry is
journaled correctly. This procedure invokes the Verify Journaling File Entry
(VFYJRNFE) command to determine whether the file is journaled and whether it is
journaled to the journal defined in the journal definition. When these conditions are
met, the journal status on the Work with DG File Entries display is set to *YES. The
command can also be entered from a command line.
To verify journaling for a physical file, do the following:
1. Access the journaled view of the Work with DG File Entries display as described
in “Displaying journaling status for physical files” on page 235.
2. From the Work with DG File Entries display, type an 11 (Verify journaling) next to
the file entry you want and do one of the following:
• To verify journaling using command defaults, press Enter.
• To modify additional prompts for the command, press F4 (Prompt) and
continue with the next step.
3. The Verify Journaling File Entry (VFYJRNFE) display appears. The Data group
definition prompts and the System 1 file prompts identify your selection. Accept
these values or specify the values you want.
4. Specify the value you want for the Verify journaling on system prompt. When
*DGDFN is specified, MIMIX considers whether the data group is configured for
journaling on the target system (JRNTGT) when determining where to verify
journaling.
5. If you want to use batch processing, specify *YES for the Submit to batch prompt.
6. Press Enter.
Journaling for IFS objects
IFS tracking entries are loaded for a data group after the data group IFS entries have
been configured for replication through the user journal (advanced journaling).
However, loading IFS tracking entries does not automatically start journaling on the
IFS objects they identify. In order for replication to occur, journaling must be started on
the source system for the IFS objects identified by IFS tracking entries.
This topic includes procedures to display journaling status, and to start, end, or verify
journaling for IFS objects identified for replication through the user journal.
You should be aware of the information in “Considerations for working with long IFS
path names” on page 262.
Displaying journaling status for IFS objects
Use this procedure to display journaling status for IFS objects identified by IFS
tracking entries. Do the following:
1. From the MIMIX Intermediate Main Menu, type 1 and press Enter to access the
Work with Data Groups display.
2. On the Work with Data Groups display, type 50 (IFS trk entries) next to the data
group you want and press Enter.
3. The Work with DG IFS Trk. Entries display appears. The initial view shows the
object type and status at the right of the display. Press F10 (Journaled view).
At the right side of the display, the Journaled System 1 and System 2 columns
indicate whether the IFS object identified by the tracking entry is journaled on each
system.
Starting journaling for IFS objects
Use this procedure to start journaling for IFS objects identified by IFS tracking entries.
This procedure invokes the Start Journaling IFS Entries (STRJRNIFSE) command.
The command can also be entered from a command line.
To start journaling for IFS objects, do the following:
1. If you have not already done so, load the IFS tracking entries for the data group.
For more information see the MIMIX Administrator Reference book.
2. Access the journaled view of the Work with DG IFS Trk. Entries display as
described in “Displaying journaling status for IFS objects” on page 238.
3. From the Work with DG IFS Trk. Entries display, type a 9 (Start journaling) next to
the IFS tracking entries you want. Then do one of the following:
• To start journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next
step.
4. The Start Journaling IFS Entries (STRJRNIFSE) display appears. The Data group
definition and IFS objects prompts identify the IFS object associated with the
tracking entry you selected. You cannot change the values shown for the IFS
objects prompts (see Note 1 at the end of this topic).
5. Specify the value you want for the Start journaling on system prompt. Press F4 to
see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data
group is configured for journaling on the target system (JRNTGT) and starts or
prevents journaling from starting as required.
6. To use batch processing, specify *YES for the Submit to batch prompt and press
Enter. Additional prompts for Job description and Job name appear. Either accept
the default values or specify other values.
7. The System 1 file identifier and System 2 file identifier prompts identify the file
identifier (FID) of the IFS object on each system. You cannot change the values
(see Note 2 at the end of this topic).
8. To start journaling on the IFS objects specified, press Enter.
Ending journaling for IFS objects
Use this procedure to end journaling for IFS objects identified by IFS tracking entries.
This procedure invokes the End Journaling IFS Entries (ENDJRNIFSE) command.
The command can also be entered from a command line.
To end journaling for IFS objects, do the following:
1. Access the journaled view of the Work with DG IFS Trk. Entries display as
described in “Displaying journaling status for IFS objects” on page 238.
2. From the Work with DG IFS Trk. Entries display, type a 10 (End journaling) next to
the IFS tracking entries you want. Then do one of the following:
• To end journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next
step.
3. The End Journaling IFS Entries (ENDJRNIFSE) display appears. The Data group
definition and IFS objects prompts identify the IFS object associated with the
tracking entry you selected. You cannot change the values shown for the IFS
objects prompts (see Note 1 at the end of this topic).
4. Specify the value you want for the End journaling on system prompt. Press F4 to
see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data
group is configured for journaling on the target system (JRNTGT) and ends or
prevents journaling from ending as required.
5. To use batch processing, specify *YES for the Submit to batch prompt and press
Enter. Additional prompts for Job description and Job name appear. Either accept
the default values or specify other values.
6. The System 1 file identifier and System 2 file identifier identify the file identifier
(FID) of the IFS object on each system. You cannot change the values shown
(see Note 2 at the end of this topic).
7. To end journaling on the IFS objects specified, press Enter.
Verifying journaling for IFS objects
Use this procedure to verify if an IFS object identified by an IFS tracking entry is
journaled correctly. This procedure invokes the Verify Journaling IFS Entries
(VFYJRNIFSE) command to determine whether the IFS object is journaled, whether it
is journaled to the journal defined in the data group definition, and whether it is
journaled with the attributes defined in the data group definition. The command can
also be entered from a command line.
To verify journaling for IFS objects, do the following:
1. Access the journaled view of the Work with DG IFS Trk. Entries display as
described in “Displaying journaling status for IFS objects” on page 238.
2. From the Work with DG IFS Trk. Entries display, type an 11 (Verify journaling) next
to the IFS tracking entries you want. Then do one of the following:
• To verify journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next
step.
3. The Verify Journaling IFS Entries (VFYJRNIFSE) display appears. The Data
group definition and IFS objects prompts identify the IFS object associated with
the tracking entry you selected. You cannot change the values shown for the IFS
objects prompts (see Note 1 at the end of this topic).
4. Specify the value you want for the Verify journaling on system prompt. Press F4 to
see a list of valid values.
When *DGDFN is specified, MIMIX considers whether the data group is
configured for journaling on the target system (JRNTGT) and verifies journaling on
the appropriate systems as required.
5. To use batch processing, specify *YES for the Submit to batch prompt and press
Enter. Additional prompts for Job description and Job name appear. Either accept
the default values or specify other values.
6. The System 1 file identifier and System 2 file identifier identify the file identifier
(FID) of the IFS object on each system. You cannot change the values shown
(see Note 2 at the end of this topic).
7. To verify journaling on the IFS objects specified, press Enter.
Notes:
1. When the command is invoked from a command line, you can change the values
specified for the IFS objects prompts. Also, you can specify as many as 300 object
selectors by using the + for more values prompt.
2. When the command is invoked from a command line, use F10 to see the FID
prompts. Then you can optionally specify the unique FID for the IFS object on either
system. The FID values can be used alone or in combination with the IFS object path
name. For more information, see “Using file identifiers (FIDs) for IFS objects” on
page 273.
Journaling for data areas and data queues
Object tracking entries are loaded for a data group after the data group object entries
have been configured for replication through the user journal (advanced journaling).
However, loading object tracking entries does not automatically start journaling on the
objects they identify. In order for replication to occur, journaling must be started on the
source system for the objects identified by object tracking entries.
This topic includes procedures to display journaling status, and to start, end, or verify
journaling for data areas and data queues identified for replication through the user
journal.
Displaying journaling status for data areas and data queues
Use this procedure to display journaling status for data areas and data queues
identified by object tracking entries. Do the following:
1. From the MIMIX Intermediate Main Menu, type 1 and press Enter to access the
Work with Data Groups display.
2. On the Work with Data Groups display, type 52 (Obj trk entries) next to the data
group you want and press Enter.
3. The Work with DG Obj. Trk. Entries display appears. The initial view shows the
object type and status at the right of the display. Press F10 (Journaled view).
At the right side of the display, the Journaled System 1 and System 2 columns
indicate whether the object identified by the tracking entry is journaled on each system.
Starting journaling for data areas and data queues
Use this procedure to start journaling for data areas and data queues identified by
object tracking entries.
This procedure invokes the Start Journaling Obj Entries (STRJRNOBJE) command.
The command can also be entered from a command line.
To start journaling for data areas and data queues, do the following:
1. If you have not already done so, load the object tracking entries for the data
group. For more information see the MIMIX Administrator Reference book.
2. Access the journaled view of the Work with DG Obj. Trk. Entries display as
described in “Displaying journaling status for data areas and data queues” on
page 241.
3. From the Work with DG Obj. Trk. Entries display, type a 9 (Start journaling) next to
the object tracking entries you want. Then do one of the following:
• To start journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next
step.
4. The Start Journaling Obj Entries (STRJRNOBJE) display appears. The Data
group definition and Objects prompts identify the object associated with the
tracking entry you selected. Although you can change the values shown for these
prompts, it is not recommended unless the command was invoked from a
command line.
5. Specify the value you want for the Start journaling on system prompt. Press F4 to
see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data
group is configured for journaling on the target system (JRNTGT) and starts or
prevents journaling from starting as required.
6. To use batch processing, specify *YES for the Submit to batch prompt and press
Enter. Additional prompts for Job description and Job name appear. Either accept
the default values or specify other values.
7. To start journaling on the objects specified, press Enter.
Ending journaling for data areas and data queues
Use this procedure to end journaling for data areas and data queues identified by
object tracking entries.
This procedure invokes the End Journaling Obj Entries (ENDJRNOBJE) command.
The command can also be entered from a command line.
To end journaling for data areas and data queues, do the following:
1. Access the journaled view of the Work with DG Obj. Trk. Entries display as
described in “Displaying journaling status for data areas and data queues” on
page 241.
2. From the Work with DG Obj. Trk. Entries display, type a 10 (End journaling) next
to the object tracking entries you want. Then do one of the following:
• To end journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next
step.
3. The End Journaling Obj Entries (ENDJRNOBJE) display appears. The Data group
definition and Objects prompts identify the object associated with the tracking
entry you selected. Although you can change the values shown for these prompts,
it is not recommended unless the command was invoked from a command line.
4. Specify the value you want for the End journaling on system prompt. Press F4 to
see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data
group is configured for journaling on the target system (JRNTGT) and ends or
prevents journaling from ending as required.
5. To use batch processing, specify *YES for the Submit to batch prompt and press
Enter. Additional prompts for Job description and Job name appear. Either accept
the default values or specify other values.
6. To end journaling on the objects specified, press Enter.
Verifying journaling for data areas and data queues
Use this procedure to verify if an object identified by an object tracking entry is
journaled correctly. This procedure invokes the Verify Journaling Obj Entries
(VFYJRNOBJE) command to determine whether the object is journaled, whether it is
journaled to the journal defined in the data group definition, and whether it is journaled
with the attributes defined in the data group definition. The command can also be
entered from a command line.
To verify journaling for objects, do the following:
1. Access the journaled view of the Work with DG Obj. Trk. Entries display as
described in “Displaying journaling status for data areas and data queues” on
page 241.
2. From the Work with DG Obj. Trk. Entries display, type an 11 (Verify journaling) next
to the object tracking entries you want. Then do one of the following:
• To verify journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next
step.
3. The Verify Journaling Obj Entries (VFYJRNOBJE) display appears. The Data
group definition and Objects prompts identify the object associated with the
tracking entry you selected. Although you can change the values shown for these
prompts, it is not recommended unless the command was invoked from a
command line.
4. Specify the value you want for the Verify journaling on system prompt. Press F4 to
see a list of valid values.
When *DGDFN is specified, MIMIX considers whether the data group is
configured for journaling on the target system (JRNTGT) and verifies journaling on
the appropriate systems as required.
5. To use batch processing, specify *YES for the Submit to batch prompt and press
Enter. Additional prompts for Job description and Job name appear. Either accept
the default values or specify other values.
6. To verify journaling on the objects specified, press Enter.
CHAPTER 13
Switching
Switching temporarily reverses the roles of the systems. The original
source system (production) becomes the temporary target system, and the original
target system (backup) becomes the temporary source system. When the scenario
that required you to switch directions is resolved, you typically switch again to return
the systems to their original roles.
This chapter provides information and procedures to support switching. The following
topics are included:
• “About switching” on page 244 provides information about switching with MIMIX,
including best practices and reasons why a switch should be performed. Subtopics
describe:
– What is a planned switch and requirements for a planned switch
– What is an unplanned switch and actions to be completed after the failed
source system is recovered
– The role of procedures for switching environments that use application groups
– The role of MIMIX Model Switch Framework for switching environments that do
not use application groups
• “Switching an application group” on page 250 describes how to run a procedure to
switch an application group.
• “Switching a data group-only environment” on page 251 describes how to switch
from a 5250 emulator.
• “Determining when the last switch was performed” on page 253 describes how to
check the Last switch field, which indicates the switch compliance status and
provides the date when the last switch was performed.
• “Problems checking switch compliance” on page 254 describes problems that can
occur with data for the Last switch field.
• “Performing a data group switch” on page 255 describes how to switch a single
data group using the SWTDG command.
• “Switch Data Group (SWTDG) command” on page 257 provides background
information about the SWTDG command, which is used in all switch interfaces.
About switching
Replication environments rarely remain static. Therefore, best practice is to perform
regular switches to ensure that you are prepared should you need to perform one
during an emergency.
MIMIX supports two methods for switching the direction in which replication occurs for
a data group. These methods are known as a planned switch and an unplanned
switch.
You may need to perform a switch for any of the following reasons:
• The production system becomes unavailable due to an unplanned outage. A
switch in this scenario is unplanned.
• You need to perform hardware or software maintenance on the production
system. Typically, you can schedule this in advance so the switch is planned.
• You need to test your recovery plan. This activity is also a planned switch.
Historically, the concept of switching consists of three phases: switch to the backup
system, synchronize the systems when the production system is ready to use, and
switch back to the production system. This round-trip view of switching assumes your
goal is to return to your original production system as quickly as possible. However,
this view overlooks the fact that some customers may let an extended amount of time
pass between phase one and the other phases, or may even view a switch as a
one-way trip. MIMIX supports both conceptual views of switching.
Switching data groups is only a part of performing a switch. MIMIX provides robust
support for customizing switching activity to include all the needs of your environment.
Best practice for switching includes performing regular switches. Best practice also
includes performing all audits with the audit level set at level 30 immediately prior to a
planned switch to the backup system and before switching back to the production
system. Best practice for performing the switch in an environment that uses application
groups is to use option 4 (Switch all application groups) from the MIMIX Basic Main
Menu. Best practice for performing a switch in an environment using only data groups
is to use option 5 (Start or complete switch using Switch Asst.) from the MIMIX Basic
Main Menu.
Planned switch
You can start a planned switch from either system. In a planned switch, MIMIX
initiates a controlled shutdown of the data group. Both systems and the
communications between them must be active.
Before you start a planned switch of a data group, you should ensure that the
following actions have been completed. Your enterprise may have additional
requirements.
• Perform a full set of audits with the audit level policy set to level 30. Running the
#FILDTA audit at this audit level checks 100 percent of file member data for the
data group for synchronization between source and target systems and is strongly
recommended.
• Shut down any applications that use database files or objects defined to the data
group. If any users or other non-MIMIX processes remain active while the switch
is being performed, the data can become unsynchronized between systems and
orphaned data may result.
• Ensure that there are no jobs other than MIMIX currently active on the source
system. This may require ending all interactive and batch subsystems other than
MIMIX and ending communications.
• Users should be prevented from accessing either system until after the switch is
complete and the data group is restarted.
• If you use user journal replication processes, you should address any files, IFS
tracking entries, or object tracking entries in error for your critical database files. If
you use system journal replication processes, you should address any object
errors.
You are not required to run journal analysis after a planned switch. MIMIX retains
information about where activity ended so that when you restart the data group, it is
started at the correct point.
When the data group is started, the temporary target system (the production system)
is now being updated with user changes that are being replicated from the temporary
source system (the backup system). Do not allow users onto the production system
until after the production system is caught up with these transactions and you run the
switch process again to revert to the normal roles.
Unplanned switch
In an unplanned switch, the source system is assumed to be unavailable. An
unplanned switch is generally required when the source system fails and, in order to
continue normal operations, you must switch users to a backup system. (Typically
MIMIX is configured so that the target for replication is your backup system.)
You must run an unplanned switch from the target system. MIMIX performs a
controlled shutdown of replication processes on the target system. The controlled
shutdown allows all apply processing to catch up before the apply processes are
ended.
There are default (*DFT) values for several parameters on the SWTDG command that
allow the switch operation to continue without intervention from the user. See
“Planned switch” on page 245 for additional details about these default values.
In an unplanned switch of a data group that uses remote journaling, the default
behavior is to end the RJ link.
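As a rough sketch, an unplanned switch of a single data group could be requested from
the target system as shown below, taking the shipped defaults for the remaining
parameters. The data group name is an example and the DGDFN keyword is an
assumption; see “Switch Data Group (SWTDG) command” on page 257 for the
command details.

SWTDG DGDFN(MYDGDFN)   /* DGDFN is an assumed keyword; prompt with F4 and     */
                       /* review the *DFT values before running the switch.   */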
Once the failed source system is recovered, the following actions should be
completed:
• You should perform journal analysis on that system before restarting the data
group or user applications. Journal analysis helps identify any possible loss of
data that may have occurred when the source system failed. Journal analysis
relies on status information on the source system about the last entry that was
applied. This information will be cleared when the data group is restarted.
• Communication between the systems must be active before you restart the data
group. The switch process is complete when you restart the data group. When
the data group is restarted, MIMIX notifies the source system that it is now the
temporary target system.
• New transactions are created on the temporary source system (the backup
system) while the production system (the temporary target system) is unavailable
for replication. After you have completed journal analysis, you can send these
new transactions to the production system to synchronize the databases. Once
the databases are synchronized, you must run the switch process again to revert
to the normal roles before allowing users onto the production system.
When the data group is started after a switch, any pending transactions are cleared.
The journal receiver is already changed by the switch process and the new journal
receiver and first sequence number are used.
Switching application group environments with procedures
Application groups can only be switched using procedures. Procedures and steps are
a highly customizable means of performing operations for application groups. Each
application group has a set of default procedures that include procedures for
performing pre-check activity for switching and switching. Each operation is
performed by a procedure that consists of a sequence of steps and multiple jobs.
Each step calls a predetermined step program to perform a specific sub-task of the
larger operation.
The following paragraphs describe the behavior of the switch (SWTAG) command for
application groups that do not participate in a cluster controlled by the IBM i operating
system (*NONCLU application groups).
What is the scope of the request? The following parameters identify the scope of
the requested operation:
Application group definition (AGDFN) - Specifies the requested application group.
You can either specify a name or the value *ALL.
Resource groups (TYPE) - Specifies the types of resource groups to be
processed for the requested application group.
Data resource group entry (DTARSCGRP) - Specifies the data resource groups to
include in the request. The default is *ALL or you can specify a name. This
parameter is ignored when TYPE is *ALL or *APP.
What is the requested switch behavior? The following parameters on the SWTAG
command define the expected behavior:
Switch type (SWTTYP) - This specifies the reason the application group is being
switched. The procedure called to perform the switch and the actions performed
during the switch differ based on whether the current primary node (data source)
is available at the start of the switch procedure. The default value, *PLANNED,
indicates that the primary node is still available and the switch is being performed
for normal business processes (such as to perform maintenance on the current
source system or as part of a standard switch procedure). The value
*UNPLANNED indicates that the switch is an unplanned activity and the data
source system may not be available.
Node roles (ROLE) - This specifies which set of node roles will determine the
node that becomes the new primary node as a result of the switch. The default
value *CURRENT uses the current order of node roles. If the application group
participates in a cluster, the current roles defined within the CRGs will be used. If
*CONFIG is specified, the configured primary node will become the new primary
node and the new role of other nodes in the recovery domain will be determined
from their current roles. If you specify a name of a node within the recovery
domain for the application group, the node will be made the new primary node and
the new role of other nodes in the recovery domain will be determined from their
current roles.
What procedure will be used? The following parameters identify the procedure to
use and its starting point:
Begin at step (STEP) - Specifies where the request will start within the specified
procedure. This parameter is described in detail below.
Procedure (PROC) - Specifies the name of the procedure to run to perform the
requested operation when starting from its first step. The value *DFT will use the
procedure designated as the default for the application group. The value
*LASTRUN uses the same procedure used for the previous run of the command.
You can also specify the name of a procedure that is valid for the specified
application group and type of request.
Where should the procedure begin? The value specified for the Begin at step
(STEP) parameter on the request to run the procedure determines the step at which
the procedure will start. The status of the last run of the procedure determines which
values are valid.
The default value, *FIRST, will start the specified procedure at its first step. This value
can be used when the procedure has never been run, when its previous run
completed (*COMPLETED or *COMPERR), or when a user acknowledged the status
of its previous run which failed, was canceled, or completed with errors
(*ACKFAILED, *ACKCANCEL, or *ACKERR respectively).
Other values are for resolving problems with a failed or canceled procedure. When a
procedure fails or is canceled, subsequent attempts to run the same procedure will
fail until user action is taken. You will need to determine the best course of action for
your environment based on the implications of the canceled or failed steps and any
steps which completed.
The value *RESUME will start the last run of the procedure beginning with the step at
which it failed, the step that was canceled in response to an error, or the step
following where the procedure was canceled. The value *RESUME may be
appropriate after you have investigated and resolved the problem which caused the
procedure to end. Optionally, if the problem cannot be resolved and you want to
resume the procedure anyway, you can override the attributes of a step before
resuming the procedure.
The value *OVERRIDE will override the status of all runs of the specified procedure
that did not complete. The *FAILED or *CANCELED status of these procedures are
changed to acknowledged (*ACKFAILED or *ACKCANCEL) and a new run of the
procedure begins at the first step.
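As an illustration, the following command sketch requests a planned switch of a hypothetical application group named ACCOUNTS using the parameters described above. The application group name is a placeholder; prompt the SWTAG command (F4) to verify the values that are appropriate for your environment before running it.
             /* Planned switch using the default procedure, the current  */
             /* node roles, and starting at the first step. DTARSCGRP is */
             /* ignored because TYPE(*ALL) is specified.                 */
             SWTAG AGDFN(ACCOUNTS) TYPE(*ALL) SWTTYP(*PLANNED) +
                   ROLE(*CURRENT) PROC(*DFT) STEP(*FIRST)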
For more information about starting a procedure with the step at which it failed, see
“Resuming a procedure” on page 91.
For more information about customizing procedures, see the MIMIX Administrator
Reference book.
Switching data group environments with MIMIX Model Switch Framework
Note: MIMIX Model Switch Framework does not support switching application
groups. Only data groups that are not associated with application groups
should be switched with MIMIX Model Switch Framework.
MIMIX provides a customized implementation of MIMIX Model Switch Framework to
perform a switch. MIMIX Model Switch Framework is ideally suited for customizing a
switching solution that detects the need for an unplanned switch, switches the
direction of data group replication, and switches users to the backup system.
Typically, if you have a Runbook, it will direct you when to use your MIMIX Model
Switch Framework implementation for both planned and unplanned switches.
The MIMIX Model Switch Framework calls the Switch Data Group (SWTDG)
command. The SWTDG command only switches the direction in which replication
occurs for a single data group; it does not switch users or any other facets of your
normal operating environment to the backup system. However, MIMIX Model Switch
Framework can be configured to address these additional facets of your environment
for multiple data groups. If you choose to use the SWTDG command either by
invoking it from a command line or by using the options for switching on the Work with
Data Groups display, you must take action to switch users to the backup system and
address other requirements for operating there.
The switching options from the MIMIX Basic Main Menu are implementations of MIMIX
Model Switch Framework. The implementation is identified within policies.
Instructions for switching using MIMIX Model Switch Framework are described in
“Switching a data group-only environment” on page 251.
For additional information see the chapter “Using the MIMIX Model Switch
Framework” in the Using MIMIX Monitor book.
Switching an application group
For an application group, a procedure for only one operation (start, end, or switch)
can run at a time. For details about parameters and behavior of the SWTAG
command, see “Switching application group environments with procedures” on
page 247.
To switch an application group, do the following:
1. From the Work with Application Groups display, type 15 (Switch) next to the
application group you want and press Enter.
The Switch Application Group (SWTAG) display appears.
2. Verify that the values you want are specified for Resource groups and Data
resource group entry.
3. Specify the type of switch to perform at the Switch type prompt.
4. Verify that the default value *CURRENT for the Node roles prompt is valid for the
switch you need to perform. If necessary, specify a different value.
5. If you are starting the procedure after addressing problems with the previous
switch request, specify the value you want for Begin at step. Be certain that you
understand the effect the value you specify will have on your environment.
6. Press Enter.
7. The Procedure prompt appears. Do one of the following:
•
To use the default switch procedure for the specified switch type, press Enter.
•
To use a different switch procedure for the application group, specify its name.
Then press Enter.
8. A switch confirmation panel appears. To perform the switch, press F16.
Switching a data group-only environment
In environments that do not use application groups, option 5 (Start or complete switch
using Switch Asst.) on the MIMIX Basic Main Menu is designed to simplify switching
by using a default MIMIX Model Switch Framework implementation. When you use
this option, MIMIX keeps track of which phase of the switch process you are in. You
will see a confirmation display that is appropriate for each phase. Each phase will
prompt the Run Switch Framework command (RUNSWTFWK) with your default
switch framework and appropriate values for the phase.
To change the default switch framework to a different implementation, see “Policies
for switching with model switch framework” on page 48.
Switching to the backup system
This procedure switches operations to the backup system.
Before using this procedure, consult your runbook for any additional procedures that
must be performed when switching to the backup system.
1. If this is a planned switch, Vision Solutions strongly recommends that you perform
a full set of audits with the audit level policy set to level 30. Running the #FILDTA
audit at this audit level checks 100 percent of file member data for the data group
for synchronization between source and target systems.
2. Shut down all active applications that are reading or updating replicated objects
from the production and backup systems.
Do the following from the backup system:
3. Ensure that all transactions have been applied to the backup system by doing the
following:
a. Select option 6 (Work with data groups) from the MIMIX Basic Main Menu and
press Enter.
b. For each data group, select option 8 (Display status) and ensure that the
Unprocessed entry counts for both database and object apply have no values.
4. From the MIMIX Basic Main Menu, select option 5 (Start or complete switch using
Switch Asst.).
5. You will see the Confirm Switch to Backup confirmation display. Press F16 to
confirm your choice to switch MIMIX and specify switching options.
6. The Run Switch Framework (RUNSWTFWK) command appears. The default
Switch framework and the value *BCKUP for the Switch framework process are
preselected and cannot be changed. Do the following:
a. You must specify the type of switch to perform, *PLANNED or *UNPLANNED,
at the Switch type prompt.
b. You can change values for other parameters as needed.
c. To start the switch, press Enter.
7. Consult your runbook to determine if any additional steps are needed.
After you complete this phase of the switch you must wait until the original production
system is available again. Then perform the steps in “Synchronizing data and starting
MIMIX on the original production system” on page 252.
Synchronizing data and starting MIMIX on the original production system
This procedure synchronizes data and starts replication from the backup system to
the original production system. Synchronizing the data ensures that the data on both
systems is equivalent before replication is started.
Before using this procedure, consult your runbook for any additional procedures that
must be performed when synchronizing and starting replication from the backup
system to the original production system.
Do the following from the backup system:
1. Ensure the original production system is available again.
2. From the MIMIX Basic Main Menu, select option 5 (Start or complete switch using
Switch Asst.).
3. You will see the Confirm Synchronize and Start confirmation display. Press F16 to
confirm your choice and specify switching options.
4. The Run Switch Framework (RUNSWTFWK) command appears. The default
Switch framework and the value *SYNC for the Switch framework process are
preselected and cannot be changed. Do the following:
a. Optionally, you can change the value of the Set object auditing level prompt.
b. To synchronize and start, press Enter.
5. Once replication has caught up, Vision Solutions strongly recommends that you
perform a full set of audits with the audit level policy set to level 30. Running the
#FILDTA audit at this audit level checks 100 percent of file member data for the
data group for synchronization between source and target systems.
6. Consult your runbook to determine if any additional steps are needed.
When you are ready to switch back to the original production system, use “Switching
to the production system” on page 252.
Switching to the production system
This procedure returns operations to the original production system.
Before using this procedure, consult your runbook for any additional procedures that
must be performed when switching to the production system.
1. Shut down all active applications that are reading or updating replicated objects
from the production and backup systems.
Do the following from the original production system:
2. Ensure that all transactions have been applied by doing the following:
a. Select option 6 (Work with data groups) from the MIMIX Basic Main Menu and
press Enter.
b. For each data group, select option 8 (Display status) and ensure that the
Unprocessed entry counts for both database and object apply have no values.
3. From the MIMIX Basic Main Menu, select option 5 (Start or complete switch using
Switch Asst.).
4. You will see the Confirm Switch to Production confirmation display. Press F16 to
confirm your choice to switch MIMIX and specify switching options.
5. The Run Switch Framework (RUNSWTFWK) command appears. The default
Switch framework and the value *PROD for the Switch framework process are
preselected and cannot be changed. Do the following:
a. You can change values for other parameters as needed.
b. To start the switch, press Enter.
6. Consult your runbook to determine if any additional steps are needed.
Determining when the last switch was performed
Replication environments rarely remain static. Therefore, best practice is to perform
regular switches to ensure that you are prepared should you need to perform one
during an emergency.
The Last switch field indicates compliance with best practices. The status of the field
is highlighted to indicate the following:
Yellow - The number of days since the last switch is at the limit of what is
considered to be best practice. This threshold is determined by the Switch
warning threshold policy.
Red - The number of days since the last switch is beyond what is considered to be
best practice. This threshold is determined by the Switch action threshold policy.
Checking the last switch date
A 5250 emulator session provides information on the last switch date for an
installation from the Last switch field on the MIMIX Availability Status display. This
field is only displayed when a value is specified for the Default model switch
framework policy. The date indicates when the last completed switch was performed
using the switch framework specified in the policy.
To check the last switch date from a 5250 emulator, do the following:
1. Access the MIMIX Basic Main Menu. See “Accessing the MIMIX Main Menu” on
page 24.
2. From the MIMIX Basic Main Menu, select option 10 (Availability status) and press
Enter. The MIMIX Availability Status display appears. The last switch date is
located in the upper right corner of the display.
Problems checking switch compliance
The Last switch field indicates the switch compliance status and provides the date
when the last switch was performed. This field is displayed correctly when certain
requirements have been met. The following problems can occur:
•
Approaching or out of compliance - The status of the field is highlighted to indicate
the number of days since the last switch is at the limit of what is considered to be
best practice. Schedule and perform a switch to resolve this problem.
•
No Last switch field - This field is only displayed when there is a value specified
for the Default model switch framework policy. The date indicates when the last
completed switch was performed using the switch framework specified in the
policy. Specify the name of the model switch framework you use for switching in
policies. See “Policies for switching with model switch framework” on page 48.
Performing a data group switch
Performing a data group switch changes the direction of replication for a data group
through the Switch Data Group (SWTDG) command. Only replication for the selected
data group is switched. You may want to perform a data group switch if you are having
problems with an application that only affects a specific data group or if you need to
manually load balance because of heavily used applications.
Note: You cannot switch a disabled data group. For more information, see “Disabling
and enabling data groups” on page 269.
To perform a data group switch, do the following:
1. If you will be performing a planned switch, do the following:
a. Shut down any applications that have database files or objects defined to the
data group.
b. Ensure that you have addressed any critical database files that are held due to
error or held for other reasons.
c. Ensure there are no pending object activity entries by entering: WRKDGACTE
STATUS(*ACTIVE)
2. From the Work with Data Groups display, type the option for the type of switch you
want next to the data group you want to switch and press Enter.
•
Use option 15 for a planned switch
•
Use option 16 for an unplanned switch
3. Some of the parameter values that you may want to consider when the Switch
Data Group display appears are:
•
If you specified Switch type of *PLANNED and have specified a number for the
Wait time (seconds) parameter, you can specify a value for the Timeout Option
parameter to specify what action you want the SWTDG command to perform if
the time specified in the Wait time (seconds) parameter is exceeded. When
you are performing a planned switch you may want to specify the number of
seconds to wait before all the active data group processes end. If you specify
*NOMAX the switch process will wait until all data group processes are ended.
This could delay the switch process.
•
You can use the Conditions that end switch parameter to specify the types of
errors that you want to end the switch operation. To ensure that the most
comprehensive checking options are used, choose *ALL. For a planned
switch, the default value, *DFT, is the same as *ALL. For an unplanned switch,
*DFT will prevent the switch only when database apply backlogs exist.
•
Verify that the value for the Start journaling on new source prompt is what you
want. If necessary, change the value.
4. After the confirmation screen, press F16 to continue.
5. Press Enter. Messages appear indicating the status of the switch request. When
you see a message indicating that the switch is complete, users can begin
processing as usual on the temporary source system.
6. If you performed an unplanned switch, perform journal analysis on the original
source system as soon as it is available, to determine if any transactions were
missed. Use topic “Performing journal analysis” on page 295.
7. Start the data group, clearing pending entries, using the procedure in “Starting
selected data group processes” on page 181. This starts replication in the new
temporary direction.
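For example, assuming a hypothetical data group definition named INVENTORY between systems SYSTEMA and SYSTEMB (all three names are placeholders), replication in the new direction could be restarted from a command line as sketched below; prompt the command (F4) to confirm the values for your environment.
             /* Restart replication after the switch, clearing pending   */
             /* entries that remain from before the switch.              */
             STRDG DGDFN(INVENTORY SYSTEMA SYSTEMB) CLRPND(*YES)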
Switch Data Group (SWTDG) command
The Switch Data Group (SWTDG) command provides the following parameters to
control how you want your switch operation handled:
•
The Wait time (seconds) parameter (WAIT) is used to specify the number of
seconds to wait for all of the active data group processes to end. The function of
the default value *DFT is different for planned switches than it is for unplanned
switches. For a planned switch, the value *DFT is equivalent to the value
*NOMAX. For an unplanned switch, the value *DFT is set to wait 300 seconds (5
minutes) for all of the active data group processes to end.
•
If you specify a value for the WAIT parameter you can use the Timeout option
parameter (TIMOUTOPT) to specify what action to take when the wait time you
specified is reached. The function of the default value *DFT is different for planned
switches than it is for unplanned switches. For a planned switch, the value *DFT is
equivalent to the value *QUIT. When the value specified for the WAIT parameter is
reached, the current process quits and returns control to the caller. For an
unplanned switch, the value *DFT is equivalent to the value *NOTIFY. When the
value specified for the WAIT parameter is reached, an inquiry message is sent to
notify the operator of a possible error condition.
•
The Conditions that end switch (ENDSWT) parameter is used to specify which
conditions should end the switch process. The function of the default value *DFT
is different for planned switches than it is for unplanned switches.
– For a planned switch, the value *DFT is equivalent to the value *ALL. The
value *ALL provides the most comprehensive checking for conditions that are
not compatible with best practices for switching. Additionally, the value *ALL
ensures that your programs will automatically include any future ENDSWT
parameter values that may be added to maintain a conservative approach to
the switching operation.
– For an unplanned switch, the value *DFT ends the process if there are any
backlogs for the database apply process. However, backlogs on other user
journal processes are not checked and switch processing is not ended even
though conditions may exist which are not compatible with best practices for
switching and may result in the loss of data.
•
The Start journaling on new source (STRJRNSRC) parameter is used to specify
whether you want to start journaling for the data group on the new source system.
•
The End journaling on new target (ENDJRNTGT) parameter is used to specify
whether you want to end journaling of the data group on the new target system.
•
The End remote journaling (ENDRJLNK) parameter is used in a planned switch of
a data group that uses remote journaling. This parameter specifies whether you
want to end remote journaling for the data group. The default behavior is to leave
the RJ link running. You need to consider whether to keep the RJ link active after
a planned switch of a data group. For more information, see “When to end the RJ
link” on page 188.
•
The Change user journal receiver (CHGUSRRCV) parameter is used to specify
whether or not you want MIMIX to create and attach a new user (database) journal
receiver during the switch operation. If you have applications that are dependent
on the receiver name for recovery purposes, it is recommended that you choose
CHGUSRRCV(*NO) to prevent a new journal receiver from being created during a
data group switch.
•
The Change system journal receiver (CHGSYSRCV) parameter is used to specify
whether or not you want MIMIX to create and attach a new journal receiver to the
system (audit) journal (QAUDJRN) during the switch operation. If you have
applications that are dependent on the receiver name for recovery purposes, it is
recommended that you choose CHGSYSRCV(*NO) to prevent a new journal
receiver from being created during a data group switch.
•
The End if database errors (ENDDBERR) parameter has been obsoleted by the
Conditions that end switch (ENDSWT) parameter. Previously, the ENDDBERR
parameter was used to specify whether to switch the data group when data
replication errors exist. Use the ENDSWT parameter and specify *DBERR to
produce the equivalent of ENDDBERR(*YES), or *NONE to produce the
equivalent of ENDDBERR(*NO).
•
The Confirm (CONFIRM) parameter is used to specify if a confirmation panel is
displayed. The default is *NO (the confirmation panel is not displayed). Note that
options for switching on the Work with Data Groups display call the SWTDG
command with *YES specified so that the confirmation panel is automatically
displayed and the user must press F16 to continue.
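Putting these parameters together, a planned switch of a hypothetical data group might look like the following sketch. The three-part data group name is a placeholder and the SWTTYP keyword shown for the switch type is an assumption; prompt the SWTDG command (F4) to verify keywords and defaults before running it.
             /* Planned switch: wait up to 600 seconds for active data   */
             /* group processes to end, use the most comprehensive       */
             /* end-switch checking, leave the RJ link running, and keep */
             /* the existing user journal receiver.                      */
             SWTDG DGDFN(INVENTORY SYSTEMA SYSTEMB) SWTTYP(*PLANNED) +
                   WAIT(600) TIMOUTOPT(*QUIT) ENDSWT(*ALL) +
                   ENDRJLNK(*NO) CHGUSRRCV(*NO)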
CHAPTER 14
Less common operations
This chapter describes how to perform infrequently used operations that help keep
your MIMIX environment running. The following topics are included:
•
“Starting the TCP/IP server” on page 260 contains the procedure for starting the
TCP/IP server.
•
“Ending the TCP/IP server” on page 261 contains the procedure for ending the
TCP/IP server.
•
“Working with objects” on page 262 contains tips for working with long object and
IFS path names.
•
“Viewing status for active file operations” on page 263 describes how to check
status when replicating database files that you are reorganizing or copying with
MIMIX Promoter.
•
“Displaying a remote journal link” on page 264 describes how to display
information about the link between a source journal definition and a target journal
definition.
•
“Displaying status of a remote journal link” on page 265 includes procedures for
determining whether a data group uses remote journaling and for checking the
status of a remote journal link.
•
“Identifying data groups that use an RJ link” on page 267 includes the procedure
to determine which data groups use a remote journal link.
•
“Identifying journal definitions used with RJ” on page 268 describes how to
determine whether a journal definition is defined to one or more remote journal
links.
•
“Disabling and enabling data groups” on page 269 describes when it can be
beneficial to disable and enable data groups. Procedures for these processes are
included in this topic.
•
“Determining if non-file objects are configured for user journal replication” on
page 271 provides procedures for determining whether IFS objects, data areas,
and data queues are configured to be cooperatively processed
through the user journal.
•
“Using file identifiers (FIDs) for IFS objects” on page 273 describes file identifiers
(FIDs) which are used by commands to uniquely identify the correct IFS tracking
entries to process.
•
“Operating a remote journal link independently” on page 274 describes how to
configure, start, and end a remote journal link without defining data to be
replicated by MIMIX processes.
Starting the TCP/IP server
Use this procedure if you need to manually start the TCP/IP server.
Once the TCP communication connections have been defined in a transfer definition,
the TCP server must be started on each of the systems identified by the transfer
definition.
You can also start the TCP/IP server automatically through an autostart job entry.
Either you can change the transfer definition to allow MIMIX to create and manage
the autostart job entry for the TCP/IP server, or you can add your own autostart job
entry. MIMIX only manages entries for the server when they are created by transfer
definitions.
When configuring a new installation, transfer definitions and MIMIX-added autostart
job entries do not exist on other systems until after the first time the MIMIX managers
are started. Therefore, during initial configuration you may need to manually start the
TCP server on the other systems using the STRSVR command.
Note: Use the host name and port number (or port alias) defined in the transfer
definition for the system on which you are running this command.
Do the following on the system on which you want to start the TCP server:
1. From the MIMIX Intermediate Main Menu, select option 13 (Utilities menu) and
press Enter.
2. The Utilities Menu appears. Select option 51 (Start TCP server) and press Enter.
3. The Start Lakeview TCP Server display appears. At the Host name or address
prompt, specify the host name or address for the local system as defined in the
transfer definition.
4. At the Port number or alias prompt, specify the port number or alias as defined in
the transfer definition for the local system.
Note: If you specify an alias, you must have an entry in the service table on this
system that equates the alias to the port number.
5. Press Enter.
6. Verify that the server job is running under the MIMIX subsystem on that system.
You can use the Work with Active Jobs (WRKACTJOB) command to look for a job
under the MIMIXSBS subsystem with a function of PGM-LVSERVER.
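A command-line equivalent of these steps is sketched below. The HOST and PORT keywords, the host name, and the port number shown are assumptions for illustration only; prompt the STRSVR command (F4) and use the values defined in your transfer definition.
             /* Start the MIMIX TCP server for the local system, using   */
             /* the host name and port from the transfer definition.     */
             /* Keywords and values shown here are assumptions.          */
             STRSVR HOST('SYSTEMA') PORT(50410)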
Ending the TCP/IP server
To end the TCP server, do the following on both systems defined by the transfer
definition. One example of why you might end the TCP server is when you are
preparing to upgrade the MIMIX products in a product library.
Note: Use the host name and port number (or port alias) defined in the transfer
definition for the system on which you are running this command.
To end the TCP server on a system, do the following:
1. From the MIMIX Intermediate Main Menu, select option 13 (Utilities menu) and
press Enter.
2. The Utilities Menu appears. Select option 52 (End TCP server) and press Enter.
3. The End Lakeview TCP Server display appears. At the Host name or address
prompt, specify the host name for the local system as specified in the transfer
definition.
4. At the Port number or alias prompt, verify that the value shown is what you want. If
necessary change the value.
Note: If the configuration uses port aliases, specify the alias for the local system.
Otherwise, specify the port number for the local system.
5. Press Enter.
Working with objects
When working with objects, these tips may be helpful.
Displaying long object names
The names of some IFS entries cannot be fully displayed in the limited space on a
"Work with" display. These entries are shown with a ‘>’ character in the right-most
column of the Object field.
You can display long object names from the following displays:
•
Work with Data Group IFS Entries display
•
Work with Data Group Activity
•
Work with Data Group Activity Entries
To display the entire object name from any of these displays, position the cursor on an
entry which indicates a long name and press F22 (Display entire field).
Considerations for working with long IFS path names
MIMIX currently replicates IFS path names of up to 512 characters. However, any MIMIX
command that takes an IFS path name as input may be susceptible to a 506
character limit. This character limit may be reduced even further if the IFS path name
contains embedded apostrophes ('). In this case, the supported IFS path name length
is reduced by four characters for every apostrophe the path name contains.
For information about IFS path name naming conventions, refer to the IBM book,
Integrated File System Introduction V5R4.
Displaying data group spooled file information
If spooled files are created as a result of MIMIX replication, you can access the
spooled file and the associated data group entry from the Work with Data Group
Activity display.
To access the spooled file information, do the following:
1. From the MIMIX Basic Main Menu, select option 6 (Work with data groups) and
press Enter.
2. The Work with Data Groups display appears. Select option 14 (Active objects) for
the data group you want to view and press Enter. The Work with Data Group
Activity display appears.
3. From this display, press F16 (Spooled Files) to access the Display Data Group
Spooled Files display. This display lists all of the current spooled files and shows
the mapping of their names between the source and target systems.
Viewing status for active file operations
If you are replicating database files that you are reorganizing or copying with MIMIX
Promoter, you can check on the status of these operations. Do the following:
1. From the MIMIX Basic Main Menu, use F21 (Assistance level) to access the
intermediate menu.
2. From the MIMIX Intermediate Main Menu, select option 13 (Utilities menu) and
press Enter.
3. From the MIMIX Utilities Menu, select option 63 (Work with copy status) and press
Enter.
4. The Work with Copy Status display appears. From this display you can track the
status of active copy or reorganize operations, including the replication of physical
file data as specified by METHOD(*DATA) on the Synchronize Data Group File
Entry (SYNCDGFE) command.
Note: You can only see status for the system on which you are working.
Displaying a remote journal link
To display information about the link between a source journal definition and a target
journal definition, do the following:
1. From the Work with RJ Links display, type a 5 (Display) next to the entry you want
and press Enter.
2. The Display Remote Journal Link (DSPRJLNK) display appears, showing the
current values defined for the link.
Displaying status of a remote journal link
To check the status of a remote journal link, do the following:
1. Type the command WRKRJLNK and press Enter.
2. The Work with RJ Links display appears with a list of defined links.
The Dlvry column indicates the configured value for how the IBM i remote journal function
sends the journal entries from the source journal to the target journal. The possible
values for delivery are asynchronous (*ASYNC) and synchronous (*SYNC).
*ASYNC - Journal entries are replicated asynchronously, independent of the
applications that create the journal entries. The applications continue processing
while an independent system task delivers the journal entries. If a failure occurs on
the source system, journal entries on the source system may become trapped
because they have not been delivered to the target system.
*SYNC - Journal entries are replicated synchronously. The applications do not
continue processing until after the journal entries are sent to the target journal. If a
failure occurs on the source system, the target system contains the journal entries
that have been generated by the applications.
The State column represents the composite view of the state of the remote journal
link. Because the RJ link has both a source and a target component, the state shown is
that of the component which has the most severe state. Table 49 shows the possible
states of an RJ link, listed in order from most severe to least severe.
Table 49. Possible states for RJ links, shown in order starting with most severe.

The following states are considered to be inactive:

*UNKNOWN - Neither journal defined to the remote journal link resides on the local system, so the state of the link cannot be checked.
*NOTAVAIL - The ASP where the journal is located is varied off.
*NOTBUILT - The remote journal link is defined to MIMIX but one of the associated journal environments has not been built.
*SRCNOTBLT - The remote journal link is defined to MIMIX but the associated source journal environment has not been built.
*TGTNOTBLT - The remote journal link is defined to MIMIX but the associated target journal environment has not been built.
*FAILED - The remote journal cannot receive journal entries from the source journal due to an error condition.
*CTLINACT - The remote journal link is processing a request for a controlled end.
*INACTIVE - The remote journal link is not active.

The following states are considered to be active:

*INACTPEND - An active remote journal link is in the process of becoming inactive. For asynchronous delivery, this is a transient state that will resolve automatically. For synchronous delivery, one system is inactive while the other system is inactive with pending unconfirmed entries.
*SYNCPEND - An active remote journal link is connected using synchronous delivery and is running in catch-up mode. The state will become *SYNC when catch-up mode ends.
*ASYNCPEND - An active remote journal link is connected using asynchronous delivery and is running in catch-up mode. The state will become *ASYNC when catch-up mode ends.
*SYNC - An active remote journal link is connected using synchronous delivery mode.
*ASYNC - An active remote journal link is connected using asynchronous delivery mode.
Identifying data groups that use an RJ link
Use this procedure to determine which data groups use a remote journal link before
you end a remote journal link or remove a remote journaling environment.
1. Enter the command WRKRJLNK and press Enter.
2. Make a note of the name indicated in the Source Jrn Def column for the RJ Link
you want.
3. From the command line, type WRKDGDFN and press Enter.
4. For all data groups listed on the Work with DG Definitions display, check the
Journal Definition column for the name of the source journal definition you
recorded in Step 2.
•
If you do not find the name from Step 2, the RJ link is not used by any data
group. The RJ link can be safely ended or can have its remote journaling
environment removed without affecting existing data groups.
•
If you find the name from Step 2 associated with any data groups, those data
groups may be adversely affected if you end the RJ link. A request to remove
the remote journaling environment removes configuration elements and
system objects that need to be created again before the data group can be
used. Continue with the next step.
5. Press F10 (View RJ links). Consider the following and contact your MIMIX
administrator before taking action that will end the RJ link or remove the remote
journaling environment.
•
When *NO appears in the Use RJ Link column, the data group will not be
affected by a request to end the RJ link or to end the remote journaling
environment.
Note: If you allow applications other than MIMIX to use the RJ link, they will be
affected if you end the RJ link or remove the remote journaling
environment.
•
When *YES appears in the Use RJ Link column, the data group may be
affected by a request to end the RJ link. If you use the procedure for ending a
remote journal link independently in topic “Ending a remote journal link
independently” on page 274, ensure that any data groups that use the RJ link
are inactive before ending the RJ link.
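The check can be summarized from a command line as follows; the comments indicate what to look for on each display:
             WRKRJLNK   /* Note the Source Jrn Def name for the RJ link  */
             WRKDGDFN   /* Look for that name in the Journal Definition  */
                        /* column, then press F10 (View RJ links) and    */
                        /* check the Use RJ Link column                  */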
Identifying journal definitions used with RJ
To see whether a journal definition is defined to one or more remote journal links, do
the following:
1. From the MIMIX Basic Main Menu, select option 11 (Configuration menu) and
press Enter.
2. The MIMIX Configuration menu appears. Select option 3 (Work with journal
definitions) and press Enter.
3. The Work with Journal Definitions display appears. The RJ Link column indicates
whether or not the journal definition is used by a remote journal link. A blank value
indicates the journal definition is not associated with a remote journal link.
Values that indicate the definition is used by a remote journal link are as follows:
*SOURCE - The journal definition is a source journal definition in a remote journal
link.
*TARGET - The journal definition is the target journal definition in a remote journal
environment.
*BOTH - The journal definition is the source journal definition for one remote
journal link and is also a target journal definition for another remote journal link in
a cascading environment.
*NONE - The journal definition is not used with the MIMIX RJ support.
4. To see the remote journal links associated with a journal definition, type 12 (Work
with RJ Links) and press Enter.
Disabling and enabling data groups
MIMIX supports the concept of disabled data groups in a replication environment. The
ability to disable a data group, and enable it later as desired, can be beneficial in a
variety of configuration scenarios.
The ability to disable a data group is particularly helpful in advanced cluster scenarios,
where inactive data groups may be a necessary component of the replication
environment. Because these data groups are inactive as part of the design, the user
does not need to be notified when the data groups are in error.
Disabling a data group is also useful in non-cluster situations. If you create a data
group for testing purposes, for example, you no longer have to delete the data group
in order to clean up your environment when testing is complete. Instead, you can
simply disable the data group until it is needed again. This provides the benefit of
retaining your object, file, IFS, and DLO entries while the data group is not needed.
Additionally, the journal manager does not retain journal receivers that have not been
processed by a disabled data group, which allows you to save storage space on your
system.
With support for disabled data groups, you also avoid having to start each data group
individually when an installation has data groups configured to replicate in different
directions. Let us assume you have two sets of data groups: one set configured to
replicate from System A to System B, and another set configured to replicate from
System B to System A. To start only those data groups replicating from System A to
System B, it was previously necessary to start them individually in order to prevent
those replicating from System B to System A from starting as well. Now you can
disable the data groups you do not want to start and simply start the remaining data
groups using the Start MIMIX (STRMMX) command.
Customers with many systems and data groups across varying time zones may find
support for disabled data groups useful when performing upgrades. Disabling data
groups allows you to stagger upgrades, causing minimal impact to your replication
environment. In this situation, you install a new installation and copy the configuration
data from the old installation using the Copy Configuration Data (CPYCFGDTA)
command. Over a convenient period of time, you can end and disable each data
group on the old (original) installation, then enable and start each data group on the
new installation. Once all data groups in the old installation are disabled and all data
groups in the new installation are enabled, the old installation can be deleted.
Disabling a data group is initiated by a user and places the data group in a state of
*DISABLE. An enabled data group can be active or inactive. The Change Data Group (CHGDG) command
can be used to change the state of a data group.
Only inactive data groups and data groups that do not have processes suspended at
a recovery point can be disabled. To make a data group inactive, you must end the
data group. The request to end the data group will clear any recovery point.
Disabled data groups are indicated by a status of -D (in green) on the Work with Data
Groups (WRKDG) display. You can optionally not display disabled data groups by
specifying a different value on the STATE parameter of the WRKDG command. Once
a data group that is not part of an application group is disabled, it cannot be started,
ended, or switched.
Note: If the data group is part of an application group, the Switch Application Group
(SWTAG) procedure may change its state so that it gets enabled and
switched. In this case, if you do not want the data group to be switched,
change the Allow to be switched (ALWSWT) parameter to *NO in the Data
Group Definition (DGDFN).
When a disabled data group is enabled, any pending entries must be cleared when
the data group is started. Specify CLRPND(*YES) on the Start Data Group command.
Procedures for disabling and enabling data groups
The Change Data Group (CHGDG) command allows you to disable or enable a data
group by changing its state. This command requires that the system manager is
active and communication with the remote system is active.
To disable or enable an individual data group, do the following:
1. On a command line, type CHGDG and press Enter. The Change Data Group
display appears.
2. At the Data group definition prompts, fill in the values you want or press F4 for a
valid list.
3. At the State prompt, do one of the following:
•
To keep the state of the data group the same, specify the default, *SAME.
•
To change the state of an active data group, you must first end the data group
by running the End Data Group (ENDDG) command. See “Ending selected
data group processes” on page 198. To disable an enabled data group, specify
*DISABLE. When the state of the data group is changed to disabled, the status
of the data group changes from *INACTIVE to *DISABLED.
•
To enable a disabled data group, specify *ENABLE. When the state of the data
group is changed to enabled, the status of the data group changes from
*DISABLED to *INACTIVE.
4. Press Enter to confirm your changes.
Note: To start an enabled data group, you must specify *YES for the Clear pending
entries prompt on the Start Data Group (STRDG) command.
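For example, the following sketch disables a hypothetical test data group and later enables and restarts it. The three-part data group name is a placeholder; prompt each command (F4) to verify its parameters for your environment.
             /* End the data group, then change its state to disabled.   */
             ENDDG DGDFN(TESTDG SYSTEMA SYSTEMB)
             CHGDG DGDFN(TESTDG SYSTEMA SYSTEMB) STATE(*DISABLE)

             /* Later, enable the data group and restart it, clearing    */
             /* pending entries as required for an enabled data group.   */
             CHGDG DGDFN(TESTDG SYSTEMA SYSTEMB) STATE(*ENABLE)
             STRDG DGDFN(TESTDG SYSTEMA SYSTEMB) CLRPND(*YES)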
Determining if non-file objects are configured for user
journal replication
MIMIX can take advantage of IBM i journaling functions that provide change-level
details in journal entries in a user journal for object types other than files (*FILE).
When properly configured, MIMIX can cooperatively process IFS stream files, data
areas, and data queues between system journal and user journal replication
processes. This enables changes to data or attributes to be replicated through the
user journal instead of replicating the entire object through the system journal every
time a change occurs.
Determining how IFS objects are configured
In order for IFS objects to be replicated from the user journal, one or more data group
IFS entries must be configured to process cooperatively with the user journal. Also,
IFS tracking entries must exist for the object identified by the data group IFS entries.
To determine if a data group has any IFS objects that are configured for user journal
replication and has any corresponding IFS tracking entries, do the following:
1. From the MIMIX Basic Main Menu select option 6 (Work with data groups) and
press Enter.
2. The Work with Data Groups display appears. Type 22 (IFS entries) next to the
data group you want and press Enter.
The Work with DG IFS Entries display appears, showing the IFS entries
configured for the data group.
3. Press F10 twice to access the CPD view.
4. The values shown in the Coop with DB column indicate how objects identified by
the data group IFS entries will be replicated.
•
Entries with the value *YES are configured for user journal replication.
Continue with the next step to ensure that IFS tracking entries exist for the IFS
objects. Replication cannot occur without tracking entries.
•
Entries with the value *NO are configured for system journal replication.
To view additional information for a data group IFS entry, type 5 (Display) next to
the entry and press Enter.
5. Press F12 (Cancel) to return to the Work with Data Groups display. Then type 50
(IFS trk entries) next to the data group you want and press Enter.
6. The Work with DG IFS Trk. Entries display appears with a list of tracking entries
for the IFS objects identified for replication by the data group. If there are no
tracking entries listed but Step 4 indicates that properly configured data group IFS
entries exist, the tracking entries must be loaded. For more information about
loading tracking entries, see the MIMIX Administrator Reference book.
Determining how data areas or data queues are configured
In order for data area and data queue objects to be replicated from the user journal,
one or more data group object entries must be configured to process cooperatively
with the user journal. Also, object tracking entries must exist for the object identified
by the data group object entries.
To determine if a data group has any data area or data queue objects that are
configured for user journal replication and has any corresponding object tracking
entries, do the following:
1. From the MIMIX Basic Main Menu select option 6 (Work with data groups) and
press Enter.
2. The Work with Data Groups display appears. Type 20 (Object entries) next to the
data group you want and press Enter.
The Work with DG Object Entries display appears, showing the object entries
configured for the data group.
3. For each entry in the list, do the following:
a. Type a 5 (Display) next to the entry and press Enter.
b. The object entry must have the following values specified in the fields
indicated:
• The Object type field must be *ALL, *DTAARA, or *DTAQ
• The Cooperate with database field must be *YES
• The Cooperating object types field must specify *DTAARA to replicate data
areas and *DTAQ to replicate data queues.
4. Press F12 (Cancel) to return to the Work with Data Groups display. Then type 52
(Obj trk entries) next to the data group you want and press Enter.
5. The Work with DG Obj. Trk. Entries display appears with a list of tracking entries
for the data area and data queue objects identified for replication by the data
group. If there are no tracking entries listed but Step 3 indicates that properly
configured data group object entries exist, the tracking entries must be loaded.
For more information about loading tracking entries, see the MIMIX Administrator
Reference book.
Using file identifiers (FIDs) for IFS objects
Commands used for user journal replication of IFS objects use file identifiers (FIDs) to
uniquely identify the correct IFS tracking entries to process. The System 1 file
identifier and System 2 file identifier prompts ensure that IFS tracking entries are
accurately identified during processing. These prompts can be used alone or in
combination with the System 1 object prompt.
These prompts enable the following combinations:
•
Processing by object path: A value is specified for the System 1 object prompt
and no value is specified for the System 1 file identifier or System 2 file identifier
prompts.
When processing by object path, a tracking entry is required for all commands
with the exception of the SYNCIFS command. If no tracking entry exists, the
command cannot continue processing. If a tracking entry exists, a query is
performed using the specified object path name.
•
Processing by object path and FIDs: A value is specified for the System 1
object prompt and a value is specified for either or both of the System 1 file
identifier or System 2 file identifier prompts.
When processing by object path and FIDs, a tracking entry is required for all
commands. If no tracking entry exists, the command cannot continue processing.
If a tracking entry exists, a query is performed using the specified FID values. If
the specified object path name does not match the object path name in the
tracking entry, the command cannot continue processing.
•
Processing by FIDs: A value is specified for either or both of the System 1 file
identifier or System 2 file identifier prompts and, with the exception of the
SYNCIFS command, no value is specified for the System 1 object prompt. In the
case of SYNCIFS, the default value *ALL is specified for the System 1 object
prompt.
When processing by FIDs, a tracking entry is required for all commands. If no
tracking entry exists, the command cannot continue processing. If a tracking entry
exists, a query is performed using the specified FID values.
Operating a remote journal link independently
You can configure, start, and end a remote journal link without defining data to be
replicated by MIMIX processes. For example, you might have a need to use remote
journals without performing data replication. The Start Remote Journal Link
(STRRJLNK) and End Remote Journal Link (ENDRJLNK) commands provide this
capability.
Note: These commands should only be used by personnel with experience using the
IBM i remote journal function.
For most needs, the RJ link support that is integrated in the commands which start
and end replication processes (STRMMX, STRDG, ENDMMX, and ENDDG) is sufficient.
Starting a remote journal link independently
To start a remote journal link separately from other MIMIX processes, do the following:
1. To access the Work with Journal Links display, type the command WRKRJLNK and
press Enter.
2. From the Work with Remote Journal Links display, type a 9 (Start) next to the link
in the list that you want to start and press Enter.
3. The Start Remote Journal Link (STRRJLNK) display appears. Specify the value
you want for the Starting journal receiver prompt.
4. To start remote journaling for the specified link, press Enter.
Ending a remote journal link independently
Default values for this command will perform an immediate end for the specified link.
Be aware that the actions taken by the ENDOPT parameter on this command are
different from the actions taken when you perform an immediate or controlled end of a
MIMIX data group. For more information about the differences between this command
and the End Data Group (ENDDG) command, see the MIMIX Reference book.
For the following situations, an immediate end is always performed (the value
specified for the ENDOPT parameter is ignored):
•
The remote journal function is running in synchronous mode
(DELIVERY(*SYNC)).
•
The remote journal function is performing catch-up processing.
To end a remote journal link separately from other MIMIX processes, do the following:
1. To access the Work with Journal Links display, type the command WRKRJLNK and
press Enter.
2. From the Work with Remote Journal Links display, type a 10 (End) next to the link
in the list that you want to end.
3. Do one of the following:
•
To perform an immediate end from the source system, press Enter. This
completes the procedure for an immediate end.
•
To perform a controlled end or to end from the target system, press F4
(Prompt), then continue with the next step.
4. The End Remote Journal Link (ENDRJLNK) display appears. Press F10
(Additional parameters).
5. To perform a controlled end, specify *CNTRLD at the End remote journal link
prompt. If you need to end from the target system, specify *TGT at the End RJ link
on system prompt. To process the request, press Enter.
CHAPTER 15
Troubleshooting - where to start
Occasionally, a situation may occur that requires user intervention. This section
provides information to help you troubleshoot problems that can occur in a MIMIX
environment.
You can also consult our website at www.mimix.com for the latest information and
updates for MIMIX products.
The following topics are included in this chapter:
•
“Gathering information before reporting a problem” on page 278 describes the
information you should gather before you report a problem. A procedure is
included to help you gather this information.
•
“Reducing contention between MIMIX and user applications” on page 279
describes a processing timing issue that may be resolved by specifying an Object
retrieval delay value on the commands for creating or changing data group
entries.
•
“Data groups cannot be ended” on page 280 describes possible causes for a data
group that is taking too long to end.
•
“Verifying a communications link for system definitions” on page 281 describes
the process to verify that the communications link defined for each system
definition is operational.
•
“Verifying the communications link for a data group” on page 282 includes a
process to use before synchronizing data to ensure that the communications link
for the data group is active.
•
“Checking file entry configuration manually” on page 283 includes the process for
checking that correct data group file entries exist with respect to the data group
object entries. This process uses the Check DG File Entries (CHKDGFE)
command.
•
“Data groups cannot be started” on page 285 describes some common reasons
why a data group may not be starting.
•
“Cannot start or end an RJ link” on page 286 describes possible reasons that can
prevent you from starting or ending an RJ link. This topic includes a procedure for
removing unconfirmed entries to free an RJ link.
•
“RJ link active but data not transferring” on page 287 describes why an RJ link
may not be transferring data and how to resolve this problem.
•
“Errors using target journal defined by RJ link” on page 288 describes why errors
when using a target journal defined by an RJ link can occur and how to resolve
them.
•
“Verifying data group file entries” on page 289 includes a procedure for verifying
data group file entries using the Verify Data Group File Entries (VFYDGFE)
command.
•
“Verifying data group data area entries” on page 289 includes a procedure for
verifying data group data area entries using the Verify Data Group Data Area
Entries (VFYDGDAE) command. Data area entries are only used when data
areas are replicated by the data area poller process, which is not preferred.
•
“Verifying key attributes” on page 289 includes a procedure for verifying key
attributes using the VFYKEYATR (Verify Key Attributes) command.
•
“Working with data group timestamps” on page 291 describes timestamps and
includes information for creating, deleting, displaying, and printing them.
•
“Removing journaled changes” on page 294 describes the configuration
conditions that must be met before using the Remove Journaled Changes
(RMVJRNCHG) command.
•
“Performing journal analysis” on page 295 describes and includes the procedure
for performing journal analysis of the source system.
Gathering information before reporting a problem
Before you report a problem, you should gather the following information:
• The MIMIX product, library, installed version, and IBM i operating system level on the system you are using. To determine this information, follow the procedure "Obtaining MIMIX and IBM i information from your system" on page 278.
• The message ID number for any error messages associated with the problem. If you receive error messages, record the message number, any replacement text (such as "Process X failed for file Y"), and the to and from program information, if available. Since many messages have similar text, this information is much more helpful to us and enables us to handle your call more efficiently.
• The specific operation you were attempting to perform when the error condition occurred. It is important that we understand what you were trying to do when you encountered the problem. Try to write down the specific sequence of events that led up to the error condition, such as the commands entered, the display you were working from, or the program that was running.
Obtaining MIMIX and IBM i information from your system
To obtain the necessary MIMIX and IBM i information before reporting a problem, do
the following:
1. Do one of the following to access the Lakeview Technology Installed Products display:
   • If you are configured for a MIMIX replication environment, select option 31 (Product management menu). Then select option 2 (Work with products).
   • From a command line, enter LAKEVIEW/WRKPRD
2. Next to the product you want, type a 6 (About version) and press Enter. The About
pop-up appears, showing the Product, Library, Installed version, and the OS/400
level on this system.
3. Press F9 (Fixes) to see the Work with Installed Fixes display. From this display
you can determine the latest level of the MIMIX cumulative fix package that is
installed.
Note: You should know the version and release level (VnRnMn) of the IBM i
operating system that is on each system with which you are working. Use
the process above on each system.
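If you only need the operating system level, it can also be obtained from a command line with a standard IBM i command; this is offered as a general example rather than a MIMIX-specific step:
DSPSFWRSC
The resulting list of software resources shows the release level of the operating system and the other licensed programs installed on the system.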
Reducing contention between MIMIX and user applications
If your applications are failing in an unexpected manner, it may be caused by MIMIX
locking your objects for object retrieval processing while your applications are trying to
access the object. This is a processing timing issue and can be significantly reduced,
or eliminated, by specifying an appropriate delay value for the Object retrieval delay
element under the Object processing (OBJPRC) parameter on the change or create
data group definition commands.
Although you can specify this value at the data group level, you can override the data
group value at the object level by specifying an Object retrieval delay value on the
commands for creating or changing data group entries.
For more information see “Selecting an object retrieval delay” in the MIMIX
Administrator Reference book.
You should use care when choosing the object retrieval delay. A long delay may
impact the ability of MIMIX system journal replication processes to move data from a
system in a timely manner. Too short a delay may allow MIMIX to retrieve an object
before an application is finished with it. You should make the value large enough to
reduce or eliminate contention between MIMIX and applications, but small enough to
allow MIMIX to maintain a suitable high availability environment.
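As a minimal sketch, the data group level delay could be set as follows; the data group and system names (MYDG PROD BACKUP) are placeholders, and CHGDGDFN is assumed to be the change command for data group definitions in your installation:
CHGDGDFN DGDFN(MYDG PROD BACKUP)
Prompt the command with F4, locate the Object processing (OBJPRC) parameter, and set the Object retrieval delay element to a value appropriate for your applications.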
Data groups cannot be ended
A controlled end for a data group may take some time if there is a backlog of files to
process or if there are a number of errors that MIMIX is attempting to resolve before
ending.
If you think that a data group is taking too long to end, check the following for possible causes. (Example commands follow this list.)
• Check to see how many transactions are backlogged for the apply process. Use option 8 (Display status) on the Work with Data Groups display to access the detailed status. A number in the Unprocessed Entry Count column indicates a backlog. Use F7 and F8 to see additional information.
• Determine which replication process is not ending. Use the command WRKSBSJOB SBS(MIMIXSBS) to see the jobs in the MIMIXSBS subsystem. Look for jobs for replication processes that have not changed to a status of END. For example, abc_OBJRTV, where abc is a 3-character prefix.
• Check the QSYSOPR message queue to see if there is a message that requires a reply.
• Use the WRKDGACTE STATUS(*ACTIVE) command to ensure all data group activity entries are completed. If a controlled end was issued, all activity entries must be processed before the object processes are ended.
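For reference, the checks in the list above can be driven from a command line as shown below; DSPMSG is a standard IBM i command, and the MIMIX commands are the ones already named in the list:
WRKSBSJOB SBS(MIMIXSBS)
DSPMSG QSYSOPR
WRKDGACTE STATUS(*ACTIVE)
Use the first command to look for replication jobs that have not ended, the second to check for messages waiting for a reply, and the third to verify that all activity entries have completed.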
Verifying a communications link for system definitions
Do the following to verify that the communications link defined for each system
definition is operational:
1. From the MIMIX Basic Main Menu, type an 11 (Configuration menu) and press
Enter.
2. From the MIMIX Configuration Menu, type a 1 (Work with system definitions) and
press Enter.
3. From the Work with System Definitions display, type an 11 (Verify
communications link) next to the system definition you want and press Enter. You
should see a message indicating the link has been verified.
Note: If the system manager is not active, this process only verifies that communications to the remote system are successful. You will also see a message in the job log indicating that "communications link failed after 1 request." This indicates that the remote system could not return communications to the local system.
4. Repeat this procedure for all system definitions. If the communications link
defined for a system definition uses SNA protocol, do not check the link from the
local system.
Note: If your transfer definition uses the *TCP communications protocol, then
MIMIX uses the Verify Communications Link command to validate the
information that has been specified for the Relational database (RDB)
parameter. MIMIX also uses VFYCMNLNK to verify that the System 1 and
System 2 relational database names exist and are available on each
system.
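The verification can also be run from a command line; because the command requires specific system names that option 11 would otherwise supply for you, prompting it is the simplest approach:
VFYCMNLNK
Press F4 and specify the values for the systems or transfer definition you want to check.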
Verifying the communications link for a data group
Before you synchronize data between systems, ensure that the communications link
for the data group is active. This procedure verifies the primary transfer definition
used by the data group. If your configuration requires multiple data groups, be sure to
check communications for each data group definition.
Do the following:
1. From the MIMIX Basic Main Menu, type an 11 (Configuration menu) and press
Enter.
2. From the MIMIX Configuration Menu, type a 4 (Work with data group definitions)
and press Enter.
3. From the Work with Data Group Definitions display, type an 11 (Verify
communications link) next to the data group you want and press F4.
4. The Verify Communications Link display appears. Ensure that the values shown
for the prompts are what you want.
5. To start the check, press Enter.
6. You should see a message "VFYCMNLNK command completed successfully."
If your data group definition specifies a secondary transfer definition, use the following
procedure to check all communications links.
Verifying all communications links
The Verify Communications Link (VFYCMNLNK) command requires specific system
names to verify communications between systems. When the command is called from
option 11 on the Work with System Definitions display or option 11 on the Work with
Data Groups display, MIMIX identifies the specific system names.
For transfer definitions using TCP protocol: MIMIX uses the Verify
Communications Link (VFYCMNLNK) command to validate the values specified for
the Relational database (RDB) parameter. MIMIX also uses VFYCMNLNK to verify
that the System 1 and System 2 relational database names exist and are available on
each system.
When the command is called from option 11 on the Work with Transfer Definitions
display or when entered from a command line, you will receive an error message if
the transfer definition specifies the value *ANY for either system 1 or system 2.
1. From the Work with Transfer Definitions display, type an 11 (Verify
communications link) next to all transfer definitions and press Enter.
2. The Verify Communications Link display appears. If you are checking a Transfer
definition with the value of *ALL, you need to specify a value for the System 1 or
System 2 prompt. Ensure that the values shown for the prompts are what you
want and then press Enter.
You will see the Verify Communications Link display for each transfer definition
you selected.
3. You should see a message "VFYCMNLNK command completed successfully."
Checking file entry configuration manually
The Check DG File Entries (CHKDGFE) command provides a means to detect
whether the correct data group file entries exist with respect to the data group object
entries configured for a specified data group in your MIMIX configuration. When file
entries and object entries are not properly matched, your replication results can be
affected.
Note: The preferred method of checking is to use MIMIX AutoGuard to automatically
schedule the #DGFE audit, which calls the CHKDGFE command and can
automatically correct detected problems. For additional information, see
“Interpreting results for configuration data - #DGFE audit” on page 300.
To check your file entry configuration manually, do the following:
1. On a command line, type CHKDGFE and press Enter. The Check Data Group File
Entries (CHKDGFE) command appears.
2. At the Data group definition prompts, select *ALL to check all data groups or
specify the three-part name of the data group.
3. At the Options prompt, you can specify that the command be run with special
options. The default, *NONE, uses no special options. If you do not want an error
to be reported if a file specified in a data group file entry does not exist, specify
*NOFILECHK.
4. At the Output prompt, specify where the output from the command should be
sent—to print, to an outfile, or to both. See Step 6.
5. At the User data prompt, you can assign your own 10-character name to the
spooled file or choose not to assign a name to the spooled file. The default, *CMD,
uses the CHKDGFE command name to identify the spooled file.
6. At the File to receive output prompts, you can direct the output of the command to
the name and library of a specific database file. If the database file does not exist,
it will be created in the specified library with the name MXCDGFE.
7. At the Output member options prompts, you can direct the output of the command
to the name of a specific database file member. You can also specify how to
handle new records if the member already exists. Do the following:
a. At the Member to receive output prompt, accept the default *FIRST to direct
the output to the first member in the file. If it does not exist, a new member is
created with the name of the file specified in Step 6. Otherwise, specify a
member name.
b. At the Replace or add records prompt, accept the default *REPLACE if you
want to clear the existing records in the file member before adding new
records. To add new records to the end of existing records in the file member,
specify *ADD.
8. At the Submit to batch prompt, do one of the following:
   • If you do not want to submit the job for batch processing, specify *NO and press Enter to check data group file entries.
   • To submit the job for batch processing, accept *YES. Press Enter and continue with the next step.
9. At the Job description prompts, specify the name and library of the job description
used to submit the batch request. Accept MXAUDIT to submit the request using
the default job description, MXAUDIT.
10. At the Job name prompt, accept *CMD to use the command name to identify the
job or specify a simple name.
11. To start the data group file entry check, press Enter.
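As a sketch, the same check can be run in a single request from a command line; the keyword names below mirror the prompts described above but are assumptions, so prompt the command with F4 to confirm them on your installation:
CHKDGFE DGDFN(*ALL) OPTION(*NONE) OUTPUT(*PRINT)
This checks all data groups with no special options and sends the results to a spooled file.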
Data groups cannot be started
Common reasons why a data group cannot be started include the following:
• The communications link between systems defined to the data group is not active. Use the procedure "Verifying a communications link for system definitions" on page 281.
• The journaling environment for the data group has not been built. Verify that the journaling environment defined in the journal definition exists. If necessary, use the appropriate procedure in the MIMIX Administrator Reference book.
• The journal receiver has been deleted from the system. You can use the WRKJRNA command to determine whether the journal receiver exists on the source system, as shown in the example after this list.
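For example, to confirm that the journal and its receiver still exist on the source system, you can use the standard IBM i command below; the journal and library names are placeholders taken from your journal definition:
WRKJRNA JRN(JRNLIB/JOURNAL)
The resulting display shows the currently attached receiver, which confirms whether the journaling environment is usable.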
Cannot start or end an RJ link
In normal operations, unconfirmed entries are automatically handled by the RJ link
monitors. In the event of a switch, the unconfirmed entries are processed, ensuring
that you have the latest updates to your data.
However, there is a scenario where you may end up with a backlog of unconfirmed
entries that can prevent you from starting or ending an RJ link. This problem can
occur when all of the following are true:
• The data group is not switchable or you do not want to switch it.
• A link failure on an RJ link that is configured for synchronous delivery leaves unconfirmed entries.
• The RJ link monitors are not active, either because you are not using them or they failed as a result of a bad link.
To recover from this situation, you should run the Verify Communications Link (VFYCMNLNK) command to assist you in determining what may be wrong and why the RJ link will not start.
If you are using an independent ASP, check the transfer definition to ensure the
correct database name has been specified.
You also need to end the remote journal link from the target system. Ending the link
from the target system is a restriction of the IBM remote journal function.
Removing unconfirmed entries to free an RJ link
Note: You should never remove unconfirmed entries from a switchable data group
unless you are directed to by your MIMIX administrator or a CustomerCare
representative.
If you need to remove a backlog of unconfirmed entries, do the following:
1. Use the WRKRJLNK command to display the status of the RJ link. The status
shown on the Work with RJ Links display is the status of the link on the system
where you entered the command. (This system is identified at the upper right
corner of the display.) An RJ link with unconfirmed entries will have a state of
*INACTPEND.
Note: You may need to access this display from the other system defined by the
RJ link.
2. Ending the remote journal link on the system with unconfirmed entries will cause
them to be deleted. Do the following:
a. Type 10 (End) next to the link and press F4 (Prompt).
b. The End Remote Journal Link (ENDRJLNK) display appears. The default values on this command end the link from the source system. If there are unconfirmed entries on the target system, press F10 (Additional parameters).
Then specify *TGT at the End RJ link on system prompt.
c. To process the request, press Enter.
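The end request can also be entered from a command line; because the link identification and additional parameters vary by installation, prompting is recommended:
ENDRJLNK
Press F4, identify the RJ link, then press F10 (Additional parameters) and specify *TGT for the End RJ link on system prompt when the unconfirmed entries are on the target system.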
RJ link active but data not transferring
Following an initial program load (IPL), the RJ link may appear to be active when data
cannot actually flow from the source system to the target system journal receiver. This
is an operating system restriction. MIMIX does not receive notification of a failure.
To recover, end the RJ link and restart it following an IPL. This can be included in automation programs.
Errors using target journal defined by RJ link
If you receive errors when using a target journal defined by an RJ link, you may need
to change the journal definition and journaling environment. This situation is caused
when the target journal definition is created as a result of adding an RJ link based on
a source journal definition which specified QSYSOPR as the threshold message
queue.
If you receive errors when using the target journal, do the following:
1. On the Work with Journal Definitions display, locate the target journal definition
that is identified by the errors.
2. Type a 5 (Display) next to the target journal definition and press Enter.
3. Page down to see the value of the Threshold message queue.
   • If the value is QSYSOPR, press F12 and continue with the next step.
   • For any other value, the cause of the problem needs further isolation beyond this procedure.
4. Type a 2 (Change) next to the target journal definition and press Enter.
5. Press F9 (All parameters), then page down to locate the Threshold message
queue and Library prompts.
6. Change the Threshold message queue prompt to *JRNDFN and the Library
prompt to *JRNLIB, or to other acceptable values.
7. To accept the change, press Enter.
Verifying data group file entries
The Verify Data Group File Entries (VFYDGFE) command allows you to verify files
from a specific library by verifying the current state of the file on the system identified
in the data group as the source of data.
This procedure generates a report in a spooled file named MXVFYDGFE. The information in the report includes whether each member for the specified search criteria is defined to MIMIX, the journal and library to which it is journaled, whether it uses after-image journaling or before- and after-image journaling, and the apply session used. This information can help you verify that you have all the files you need from a library properly defined to MIMIX DB Replicator.
To verify data group file entries, do the following:
1. On a command line, type VFYDGFE (Verify Data Group File Entries). The Verify
DG File Entries display appears.
2. Specify the name of the data group at the Data group definition prompt.
3. At the System 1 file and Library prompts, specify the value you want and the
library in which the files are located.
4. If you want to create a spooled file that can be printed, specify *PRINT at the
Output prompt. Then press Enter.
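A command-line sketch follows; the keyword names are assumptions that mirror the prompts above (data group definition, System 1 file, Library, and Output), so prompt the command with F4 to confirm them before use:
VFYDGFE DGDFN(MYDG PROD BACKUP) FILE1(*ALL) LIB1(MYLIB) OUTPUT(*PRINT)
This would verify all files in the placeholder library MYLIB for the data group and produce the MXVFYDGFE spooled file.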
Verifying data group data area entries
The Verify Data Group Data Area Entries (VFYDGDAE) command allows you to verify
the data areas in a specific library defined to a data group definition. The audit report
determines the data source for the data group and retrieves the appropriate
information.
This procedure generates a report in a spooled file named MXVFYDAE. The
information in the report includes whether each data area for the specified search
criteria is defined to MIMIX and the length of each data area. This information can
help you verify that you have all the data areas you need from a library defined to
MIMIX DB Replicator.
To verify data group data area entries, do the following:
1. On a command line, type VFYDGDAE (Verify Data Group Data Area Entries). The
Verify DG Data Area Entries (VFYDGDAE) display appears.
2. Specify the name of the data group at the Data group definition prompt.
3. At the System 1 data area and Library prompts, specify the value you want and
the library in which the data areas are located and press Enter.
Verifying key attributes
Before you configure for keyed replication, verify that the file or files for which you want to use keyed replication are actually eligible.
Do the following to verify that the attributes of a file are appropriate for keyed
replication:
1. On a command line, type VFYKEYATR (Verify Key Attributes). The Verify Key
Attributes display appears.
2. Do one of the following:
   • To verify a file in a library, specify a file name and a library.
   • To verify all files in a library, specify *ALL and a library.
   • To verify files associated with the file entries for a data group, specify *MIMIXDFN for the File prompt and press Enter. Prompts for the Data group definition appear. Specify the name of the data group that you want to check.
3. Press Enter.
4. A spooled file is created that indicates whether you can use keyed replication for
the files in the library or data group you specified. Display the spooled file
(WRKSPLF command) or use your standard process for printing. You can use
keyed replication for the file if *BOTH appears in the Replication Type Allowed
column. If a value appears in the Replication Type Defined column, the file is
already defined to the data group with the replication type shown.
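For example, to check every file in a library from a command line (the library name is a placeholder and the keyword names are assumptions, so prompt with F4 to confirm them):
VFYKEYATR FILE(*ALL) LIB(MYLIB)
Then display the resulting spooled file with WRKSPLF and look for *BOTH in the Replication Type Allowed column.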
Working with data group timestamps
Timestamps allow you to view the performance of the database send, receive, and
apply processes for a data group to identify potential problem areas, such as a slow
send process, inadequate communications capacity, or excessive overhead on the
target system. Although they can assist you in identifying problem areas, timestamps
are not intended as an accurate means of calculating the performance of MIMIX.
A timestamp is a single record that is passed between all replication processes. The
timestamp originates on the source system as a journal entry, is sent to the target
system, and then processed by the associated apply session. The timestamp record
is updated with the date and time at each of the following areas during the replication
process:
• Created - Date and time the journal entry is created
• Sent - Date and time when the journal entry is sent to the target system
• Received - Date and time when the journal entry is received
• Applied - Date and time when the journal entry is applied
Note: For data groups that use remote journaling, the created and sent timestamps
will be set to the same value. The received timestamp will be set to the time
when the record was read on the target system by the database reader
process.
After all four timestamps have been added, the journal entry is converted and placed
into a file for viewing or printing. You can view timestamps only from the management
system. The system manager must be active to return the timestamps to the
management system.
Automatically creating timestamps
The data group definition includes a parameter for automatically creating timestamps.
MIMIX automatically creates a timestamp after the number of journal entries specified
in the Timestamp interval (TSPITV) has passed. The timestamp entry created is
placed at the end of all current entries in the journal receiver. You specify this value
when you create or change a data group definition. You can change this value at any
time.
Note: Data groups configured for remote journaling will not automatically generate
timestamps. To generate timestamps in this case, refer to “Creating
timestamps for remote journaling processing” on page 292.
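As a sketch, the interval for a data group that does not use remote journaling could be set as follows; the names are placeholders and CHGDGDFN is assumed to be the change command for data group definitions in your installation:
CHGDGDFN DGDFN(MYDG PROD BACKUP) TSPITV(20000)
This requests an automatic timestamp after every 20,000 journal entries, which is the default interval.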
Creating additional timestamps
Note: By using the Create Data Group Timestamps (CRTDGTSP) command in a
batch job, you can use timestamps to monitor performance at critical times in
your daily processing.
To create one or more timestamps, do the following:
1. From the Work with Data Groups display, type 41 (Timestamps) next to the data
group you want and press Enter.
2. The Work with DG Timestamps display appears. Type a 1 (Create) next to the
blank line at the top of the display and press Enter.
3. The Create Data Group Timestamps display appears. Specify the name of the
data group and the number of timestamps you want to create and press Enter.
Note: You should generate multiple timestamps to receive a more accurate view
of replication process performance.
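The same timestamps can be created from a command line or from a scheduled batch job; the data group name is a placeholder, and the keyword for the number of stamps should be confirmed by prompting with F4:
CRTDGTSP DGDFN(MYDG PROD BACKUP)
Prompt the command and specify the number of timestamps to create, for example 5, to obtain a more representative sample.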
Creating timestamps for remote journaling processing
If you need to generate timestamps to monitor replication performance, you can set
up automation to create them for remote journaling (RJ) data groups that you wish to
monitor.
In this procedure, you will create an interval monitor using the Create Monitor Object
(CRTMONOBJ) command. This is accomplished by specifying *CMD for the interface exit program on the monitor object and then specifying Create Data Group Timestamps (CRTDGTSP) as the command to run. You can also run CRTDGTSP manually or
schedule a job to run the command in batch. For more information, see “Creating an
interval monitor” in the MIMIX Monitor book.
Do the following to create an interval monitor:
1. From the Work with Monitors display, type a 1 (Create) in the Opt column next to
the blank line at the top of the list and press Enter.
2. The Create Monitor Object (CRTMONOBJ) display appears. Do the following:
a. At the Monitor prompt, provide a unique name for the monitor.
b. At the Event class prompt, specify *INTERVAL.
c. At the Interface exit program prompt, specify *CMD.
d. At the Time interval (sec.) prompt, specify how often the interval monitor
should run and press Enter. By default, this monitor runs every 15 seconds.
Use your data group time stamp interval (default is every 20,000 entries) to
estimate how many entries you process a day. From there, determine how
often you need to run the monitor in order to provide an adequate sample.
3. The Add Monitor Information (ADDMONINF) display appears. Do the following:
a. At the Command prompt, type CRTDGTSP.
b. At the Library prompt, type the name of your installation library and press F4
(Prompt).
4. The Create Data Group Timestamps (CRTDGTSP) display appears. Do the
following:
a. At the Data group definition prompts, specify the name of the RJ data group.
b. At the Number of stamps to create prompt, specify the number of timestamps
you want to create and press Enter.
5. From the Work with Monitors display, type a 9 (Start) next to the interval monitor
you created. This allows you to start generating timestamps. For information
about viewing timestamps, see “Displaying or printing timestamps” on page 293.
Repeat this procedure for each RJ data group for which you want to generate
timestamps.
Deleting timestamps
You can delete all timestamps or you can select a group of one or more timestamps to
delete.
To delete timestamps for a data group, do the following:
1. From the Work with Data Groups display, type 41 (Timestamps) next to the data
group you want and press Enter.
2. The Work with DG Timestamps display appears. Type a 4 (Delete) next to the
timestamps you want to delete and press Enter.
3. A confirmation screen appears. Press Enter.
To selectively delete a range of timestamps, do the following:
1. Type the command DLTDGTSP and press F4 (Prompt).
2. The Delete Data Group Timestamps display appears. Specify values you want for
the Data group definition prompt.
3. Specify the values you want for the Starting date and time prompt and for the
Ending date and time prompt, then press Enter.
Displaying or printing timestamps
To display or print data group timestamps, do the following:
1. From the Work with Data Groups display, type 41 (Timestamps) next to the data
group you want and press Enter.
2. The Work with DG Timestamps display appears. Do one of the following:
   • To display the timestamp information, type a 5 (Display) next to the data group you want.
   • To print the timestamp information, type a 6 (Print) next to the data group you want.
3. Press Enter.
4. If you selected to display, the Display Data Group Timestamps display appears. If
you selected to print, a spooled file is created that you can print using your
standard printing procedures.
Removing journaled changes
If the necessary environment is available, MIMIX can support use of the Remove Journaled Changes (RMVJRNCHG) command by simulating the remove journaled changes process on the backup system.
Note: This is a long running procedure and will affect your existing journal changes. Ensure that performing this procedure is appropriate for your environment.
In order to use remove journaled changes support, you must meet the following criteria:
• You must be configured for both before and after image journaling. This can be defined as a default file entry option at the data group level or it can be defined for individual data group file entries.
• You must be configured with *SEND as the value of the Before images element of the DB journal entry processing (DBJRNPRC) parameter of the data group definition. This permits the database apply process to roll back certain types of journal entries.
• If you have large objects (LOBs), *YES must be the value for the Use remote journal link (RJLNK) parameter of the data group definition.
• The target system (where replicated changes are applied) must have the log spaces that contain the original transactions. To ensure that the appropriate log spaces are retained, you can do one of the following:
  – Calculate how many log spaces need to be retained using the log space size and the size and number of the receivers containing the appropriate journal transactions. Then, set elements of the database apply processing (DBAPYPRC) parameter in the data group definition.
  – Use the Hold Data Group Log (HLDDGLOG) command to place a hold on the delete operation of all log spaces for all apply sessions defined to the specified data group. The log spaces are held until a request to release them with the Release Data Group Log (RLSDGLOG) command is received. (A command-line sketch appears at the end of this topic.)
If you are changing an existing data group to have these values, you must end and
restart the data group before you are able to use the RMVJRNCHG command.
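A minimal sketch of the hold and release requests referenced above follows; the data group name is a placeholder and the keyword is assumed to follow the usual DGDFN convention, so prompt with F4 to confirm it:
HLDDGLOG DGDFN(MYDG PROD BACKUP)
RLSDGLOG DGDFN(MYDG PROD BACKUP)
Hold the log spaces before the transactions you may need to roll back, and release them once the remove journaled changes processing is complete.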
Performing journal analysis
When a source system fails before MIMIX has sent all journal entries to the target
system, unprocessed transactions occur. Unprocessed transactions can also occur if
journal entries are in the communications buffer being sent to the target system when
the sending system fails.
Following an unplanned switch, unprocessed transactions on the original source
system must be addressed in order to prevent data loss before synchronizing data
and starting data groups.
The journal analysis process finds any missing transactions that were not sent to the
target system when the source system went down and an unplanned switch to the
backup was performed. Once unprocessed transactions are located, users must
analyze the journal entries and take appropriate actions to resolve them.
The time at which to perform journal analysis is when the original source system has
been brought back up and before performing the synchronization phase of the switch
(which synchronizes data and starts data groups). Analyze all data groups that were
not disabled at the time of the unplanned switch.
Note: The journal analysis tool is limited to database files replicated from a user
journal. The tool does not identify unprocessed transactions for data areas,
data queues, or IFS objects replicated through a user journal, or database files
configured for replication from the system journal.
From the original source system, do the following:
1. Ensure the following are started:
a. The port communications jobs (PORTxxxxx)
b. The MIMIX system managers using STRMMXMGR SYSDFN(*ALL)
MGR(*SYS) TGTJRNINSP(*NO)
IMPORTANT! Only the system managers should be started at this time. Do not
start journal managers. Also, do not start data groups at this time! Doing so will
delete the data required to perform the journal analysis.
2. From the Work with Data Groups display on the original source system, enter 43
(Journal analysis) next to the data group to be analyzed.
The Journal Analysis of Files display appears.
3. Check the list area for a pop-up window and do the following:
   • If a pop-up window with the message "Journal analysis information not collected" is displayed, press Enter to collect journal analysis information, then go to Step 5.
   • If you do not see a pop-up window, information about files from a previous run of journal analysis exists; go to Step 4.
4. If you did not see a pop-up window in Step 3 and you need to clear data from a
previous run of journal analysis and collect new information, do the following:
a. If you want to keep information from a previous run, make a copy of file
DM6500P located in the installation library.
b. Press F9 (Update display) to clear the screen and collect the new information.
A pop-up confirmation window with the following message is displayed:
“WARNING! The journal analysis journal entries file will be cleared!”
c. Press Enter to submit the update request.
5. The request to collect journal analysis information is submitted by job RTVFILANZ
using the job description MIMIXQGPL/MIMIXDFT. When the job completes,
“LVI3855 Retrieval of affected files for journal analysis completed normally”
appears in the message log. Press F5 (Refresh) to see the collected information.
It may take a short time to collect the information.
6. Retrieve journal entries. The journal entries for the files identified on the display
must be retrieved before you can use options to display or print statistics (5 and 6)
or display journal entries (11). Do one of the following:
   • Press F14 (Retrieve all entries). A pop-up window stating "Confirm retrieval of ALL analysis journal entries" appears. Press Enter. (The retrieved information is placed in an internal file. This does not produce a spool file.)
   • If there are a large number of files listed on the display, you may want to retrieve entries for only a selected file at a time. Type option 9 (Retrieve journal entries) next to the file for which to retrieve journal entries and press Enter. The retrieved journal entries are placed in a spool file named MXJEANZL. The message "LVI3856 Retrieval of journal entries for journal analysis completed normally" appears in the message log.
7. Review the collected information using the following:
   • Use option 11 (Display journal entries) to view the entries for each file.
   • Use F21 (Print list) to print all entries for a file.
   • You can use options 5 (Display statistics) and 6 (Print statistics) to see the statistical breakdown of journal entries for a selected file member identified by journal analysis. The statistics include the number of adds, deletes, and updates, along with the related file transactions and dates of the first and last journal entries.
Figure 36 shows an example of the information displayed by option 11 for one
journal entry.
Figure 36. Sample of one journal entry

  Data group definition . :  <DGDFN>  <SYS1>  <SYS2>
  Journal definition  . . :  <JRNDFN>  <SYSDFN>

  File identification
    File  . . . . . . . . :  <FILE>
      Library . . . . . . :  <LIB>
    Member  . . . . . . . :  <MBR>

  Record-level information
    Delete record

  Journal header information
    Journal code  . . . . :  R
    Journal type  . . . . :  DL
    Generated date  . . . :  9/08/09
    Generated time  . . . :  10:36:31
    Job name  . . . . . . :  <JOB NAME>
    User name . . . . . . :  <USER>
    Job number  . . . . . :  <JOB NBR>
    Program name  . . . . :  <PROGRAM>

  Journal header information (continued)
    Record length . . . . :  607
    Record number . . . . :  838
    Operation indicator . :  0
    Commit cycle ID . . . :  0

  Journal identification
    Journal name  . . . . :  <JOURNAL>
      Library . . . . . . :  <JRNLIB>

  Receiver identification
    Receiver name . . . . :  <RCVR>
      Library . . . . . . :  <RCVRLIB>
    Sequence number . . . :  <JOURNAL SEQUENCE>
8. Determine what action you need to take for each unprocessed entry. For example:
   • You may need to run the original job again on the current source system to reproduce the entries.
   • If a file has already been updated on the current source system (manually or otherwise), you may need to merge data from both files. If this is the case, do not synchronize the files.
   • If there are write changes (R-PT entries), these changes should be made on the current source system before running the synchronization phase of the switch or starting data groups in order to maintain Relative Record Number consistency within the file. If this is done after the data group has been started, the relative record numbers could become unsynchronized between the two systems.
Note: It is the customer’s responsibility to fix the files.
Removing journal analysis entries for a selected file
You can use option 4 (Remove journal entries) to remove all journal analysis journal
entries for a selected file member. A confirmation display appears to confirm your
choices. When you continue with the confirmation, the journal entries for the selected
file member are immediately removed from the journal analysis information that is
displayed. It does not delete any other information contained in other MIMIX files.
APPENDIX A    Interpreting audit results - supporting information
Audits use commands that compare and synchronize data. The results of the audits
are placed in output files associated with the commands. The following topics provide
supporting information for interpreting data returned in the output files.
• "When the difference is "not found"" on page 302 provides additional considerations for interpreting a result of not found in priority audits.
• "Interpreting results for configuration data - #DGFE audit" on page 300 describes the #DGFE audit which verifies the configuration data defined to your configuration using the Check Data Group File Entries (CHKDGFE) command.
• "Interpreting results of audits for record counts and file data" on page 303 describes the audits and commands that compare file data or record counts.
• "Interpreting results of audits that compare attributes" on page 306 describes the Compare Attributes commands and their results.
Interpreting results for configuration data - #DGFE audit
The #DGFE audit verifies the configuration data that is defined for replication in your
configuration. This audit invokes the Check Data Group File Entries (CHKDGFE)
command for the audit’s comparison phase. The CHKDGFE command collects data
on the source system and generates a report in a spooled file or an outfile.
The report is available on the system where the command ran. The values in the
Result column of the report indicate detected problems and the result of any
attempted automatic recovery actions. Table 50 shows the possible Result values and
describes the action to take to resolve any reported problems.
Table 50. CHKDGFE - possible results and actions for resolving errors

*NODGFE - No file entry exists.
   Recovery: Create the DGFE or change the DGOBJE to COOPDB(*NO).
   Note: Changing the object entry affects all objects using the object entry. If you do not want all objects changed to this value, copy the existing DGOBJE to a new, specific DGOBJE with the appropriate COOPDB value.
*EXTRADGFE - An extra file entry exists.
   Recovery: Delete the DGFE or change the DGOBJE to COOPDB(*YES).
   Note: Changing the object entry affects all objects using the object entry. If you do not want all objects changed to this value, copy the existing DGOBJE to a new, specific DGOBJE with the appropriate COOPDB value.
*NOFILE - No file exists for the existing file entry.
   Recovery: Delete the DGFE, re-create the missing file, or restore the missing file.
*NOMBR - No file member exists for the existing file entry.
   Recovery: Delete the DGFE for the member or add the member to the file.
*RCYFAILED - Automatic audit recovery actions were attempted but failed to correct the detected error.
   Recovery: Run the audit again.
*RECOVERED - Recovered by automatic recovery actions.
   Recovery: No action is needed.
*UA - File entries are in transition and cannot be compared.
   Recovery: Run the audit again.
The Option column of the report provides supplemental information about the comparison. Possible values are:
*NONE - No options were specified on the comparison request.
*NOFILECHK - The comparison request included an option that prevented an error from being reported when a file specified in a data group file entry does not exist.
*DGFESYNC - The data group file entry was not synchronized between the source and target systems. This may have been resolved by automatic recovery actions for the audit.
One possible reason why actual configuration data in your environment may not
match what is defined to your configuration is that a file was deleted but the
associated data group file entries were left intact. Another reason is that a data group
file entry was specified with a member name, but a member is no longer defined to
that file. If you use the automatic scheduling and automatic audit recovery functions of
MIMIX AutoGuard, these configuration problems can be automatically detected and
recovered for you. Table 51 provides examples of when various configuration errors
might occur.
Table 51. CHKDGFE - possible error conditions

Result        File exists   Member exists   DGFE exists   DGOBJE exists
*NODGFE       Yes           Yes             No            COOPDB(*YES)
*EXTRADGFE    Yes           Yes             Yes           COOPDB(*NO)
*NOFILE       No            No              Yes           Exclude
*NOMBR        Yes           No              Yes           No entry
When the difference is "not found"
For audits that compare replicated data, a difference indicating the object was not
found requires additional explanation. This difference can be returned for these
audits:
• For the #FILDTA and #MBRRCDCNT audits, a value of *NF1 or *NF2 for the difference indicator (DIFIND) indicates the object was not found on one of the systems in the data group. The 1 and 2 in these values refer to the system as identified in the three-part name of the data group.
• For the #FILATR, #FILATRMBR, #IFSATR, #OBJATR, and #DLOATR audits, a not found condition is indicated by a value of *NOTFOUND in either the system 1 indicator (SYS1IND) or system 2 indicator (SYS2IND) fields. Typically, the DIFIND field result is *NE.
Audits can report not found conditions for objects that have been deleted from the
source system. A not found condition is reported when a delete transaction is in
progress for an object eligible for selection when the audit runs. This is more likely to
occur when there are replication errors or backlogs, and when policy settings do not
prevent audits from comparing when a data group is inactive or in a threshold
condition.
A scheduled audit will not identify a not found condition for an object that does not
exist on either system because it selects existing objects based on whether they are
configured for replication by the data group. This is true regardless of whether the
audit is automatically submitted or run immediately.
Because a priority audit selects already replicated objects, it will not audit objects for
which a create transaction is in progress.
Prioritized audits will not identify a not found condition when the object is not found on
the target system because prioritized auditing selects objects based on the replicated
objects database. Only objects that have been replicated to the target system are
identified in the database.
Priority audits can be more likely to report not found conditions when replication errors
or backlogs exist.
Interpreting results of audits for record counts and file data
The audits and commands that compare file data or record counts are as follows:
• #FILDTA audit or Compare File Data (CMPFILDTA) command
• #MBRRCDCNT audit or Compare Record Count (CMPRCDCNT) command
Each record in the output files for these audits or commands identifies a file member
that has been compared and indicates whether a difference was detected for that
member.
What differences were detected by #FILDTA
The Difference Indicator (DIFIND) field identifies the result of the comparison. Table 52 identifies values for the Compare File Data command that can appear in this field.
Table 52. Possible values for Compare File Data (CMPFILDTA) output file field Difference Indicator (DIFIND)

*APY - The database apply (DBAPY) job encountered a problem processing a U-MX journal entry for this member.
*CMT - Commit cycle activity on the source system prevents active processing from comparing records or record counts in the selected member.
*CO - Unable to process selected member. Cannot open file.
*CO (LOB) - Unable to process selected member containing a large object (LOB). The file or the MIMIX-created SQL view cannot be opened.
*DT - Unable to process selected member. The file uses an unsupported data type.
*EQ - Data matches. No differences were detected within the data compared. Global difference indicator.
*EQ (DATE) - Member excluded from comparison because it was not changed or restored after the timestamp specified for the CHGDATE parameter.
*EQ (OMIT) - No difference was detected. However, fields with unsupported types were omitted.
*FF - The file feature is not supported for comparison. Examples of file features include materialized query tables.
*FMC - Matching entry not found in database apply table.
*FMT - Unable to process selected member. File formats differ between source and target files. Either the record length or the null capability is different.
*HLD - Indicates that a member is held or an inactive state was detected.
*IOERR - Unable to complete processing on selected member. Messages preceding LVE0101 may be helpful.
*NE - Indicates a difference was detected.
*NF1 - Member not found on system 1.
*NF2 - Member not found on system 2.
*REP - The file member is being processed for repair by another job running the Compare File Data (CMPFILDTA) command.
*SJ - The source file is not journaled, or is journaled to the wrong journal.
*SP - Unable to process selected member. See messages preceding message LVE3D42 in job log.
*SYNC - The file or member is being processed by the Synchronize DG File Entry (SYNCDGFE) command.
*UE - Unable to process selected member. Reason unknown. Messages preceding message LVE3D42 in job log may be helpful.
*UN - Indicates that the member's synchronization status is unknown.
See “When the difference is “not found”” on page 302 for additional information.
What differences were detected by #MBRRCDCNT
Table 53 identifies values for the Compare Record Count command that can appear
in the Difference Indicator (DIFIND) field.
Table 53. Possible values for Compare Record Count (CMPRCDCNT) output file field Difference Indicator (DIFIND)

*APY - The database apply (DBAPY) job encountered a problem processing a U-MX journal entry for this member.
*CMT - Commit cycle activity on the source system prevents active processing from comparing records or record counts in the selected member.
*EC - The attribute compared is equal to configuration.
*EQ - Record counts match. No difference was detected within the record counts compared. Global difference indicator.
*FF - The file feature is not supported for comparison. Examples of file features include materialized query tables.
*FMC - Matching entry not found in database apply table.
*HLD - Indicates that a member is held or an inactive state was detected.
*LCK - Lock prevented access to member.
*NE - Indicates a difference was detected.
*NF1 - Member not found on system 1.
*NF2 - Member not found on system 2.
*SJ - The source file is not journaled, or is journaled to the wrong journal.
*UE - Unable to process selected member. Reason unknown. Messages preceding LVE3D42 in job log may be helpful.
*UN - Indicates that the member's synchronization status is unknown.
See “When the difference is “not found”” on page 302 for additional information.
Interpreting results of audits that compare attributes
Each audit that compares attributes does so by calling a Compare Attributes command and places the results in an output file. Each row in an output file for a
Compare Attributes command can contain either a summary record format or a
detailed record format. Each summary row identifies a compared object and includes
a prioritized object-level summary of whether differences were detected. Each detail
row identifies a specific attribute compared for an object and the comparison results.
For example, an authorization list can contain a variable number of entries. When
comparing authorization lists, the CMPOBJA command will first determine if both lists
have the same number of entries. If the same number of entries exist, it will then
determine whether both lists contain the same entries. If differences in the number of
entries are found or if the entries within the authorization list are not equal, the report
will indicate that differences are detected. The report will not provide the list of
entries—it will only indicate that they are not equal in terms of count or content.
You can see the full set of fields in the output file by viewing it from a 5250 emulator.
What attribute differences were detected
The Difference Indicator (DIFIND) field identifies the result of the comparison. Table
54 identifies values that can appear in this field. Not all values may be valid for every
Compare command.
When the output file is viewed from a 5250 emulator, the summary row is the first
record for each compared object and is indicated by an asterisk (*) in the Compared
Attribute (CMPATR) field. The summary row’s Difference Indicator value is the
prioritized summary of the status of all attributes checked for the object. When
included, detail rows appear below the summary row for the object compared and
show the actual result for the attributes compared.
The Priority column in Table 54 indicates the order of precedence MIMIX uses when determining the prioritized summary value for the compared object.
Table 54. Possible values for output file field Difference Indicator (DIFIND)

*EC - Summary record priority 5. The values are based on the MIMIX configuration settings. The actual values may or may not be equal.
*EQ - Summary record priority 5. Record counts match. No differences were detected. Global difference indicator.
*NA - Summary record priority 5. The values are not compared. The actual values may or may not be equal.
*NC - Summary record priority 3. The values are not equal based on the MIMIX configuration settings. The actual values may or may not be equal.
*NE - Summary record priority 2. Indicates differences were detected.
*NS - Summary record priority 5. Indicates that the attribute is not supported on one of the systems. Will not cause a global not equal condition.
*RCYSBM - Indicates that MIMIX AutoGuard submitted an automatic audit recovery action that must be processed through the user journal replication processes. The database apply (DBAPY) will attempt the recovery and send an *ERROR or *INFO notification to indicate the outcome of the recovery attempt.
*RCYFAILED - Used to indicate that automatic recovery attempts via MIMIX AutoGuard failed to recover the detected difference.
*RECOVERED - Summary record priority 1. Indicates that recovery for this object was successful.
*SJ - Summary record priority 1. Unable to process selected member. The source file is not journaled.
*SP - Summary record priority 1. Unable to process selected member. See messages preceding message LVE3D42 in job log.
*UA - Summary record priority 2. Object status is unknown due to object activity. If an object difference is found and the comparison has a value specified on the Maximum replication lag prompt, the difference is seen as unknown due to object activity. This status is only displayed in the summary record.
   Note: The Maximum replication lag prompt is only valid when a data group is specified on the command.
*UN - Summary record priority 4. Indicates that the object's synchronization status is unknown.

Notes:
1. The Compare Attribute commands are: Compare File Attributes (CMPFILA), Compare Object Attributes (CMPOBJA), Compare IFS Attributes (CMPIFSA), and Compare DLO Attributes (CMPDLOA). Not all values may be possible for every Compare command.
2. Priorities are used to determine the value shown in output files for Compare Attribute commands.
For most attributes, when the outfile is viewed from a 5250 emulator and a detailed row contains blanks in either the System 1 Indicator or System 2 Indicator field, MIMIX determines the value of the Difference Indicator field according to Table 55. For example, if the System 1 Indicator is *NOTFOUND and the System 2 Indicator is blank (Object found), the resultant Difference Indicator is *NE.
Table 55. Difference Indicator values that are derived from System Indicator values

                             System 1 Indicator
System 2 Indicator           Object found     *NOTCMPD     *NOTFOUND    *NOTSPT      *RTVFAILED   *DAMAGED
                             (blank value)
Object found (blank value)   *EQ / *NE /      *NA          *NE          *NS          *UN          *NE
                             *UA / *EC / *NC
*NOTCMPD                     *NA              *NA          *NE          *NS          *UN          *NE
*NOTFOUND                    *NE / *UA        *NE / *UA    *EQ          *NE / *UA    *NE / *UA    *NE
*NOTSPT                      *NS              *NS          *NE          *NS          *UN          *NE
*RTVFAILED                   *UN              *UN          *NE          *UN          *UN          *NE
*DAMAGED                     *NE              *NE          *NE          *NE          *NE          *NE
When viewed through Vision Solutions Portal, data group directionality is
automatically resolved so that differences are viewed as Source and Target instead of
System1 and System2.
For a small number of specific attributes, the comparison is more complex. The
results returned vary according to parameters specified on the compare request and
MIMIX configuration values. For more information about comparison results for
journal status and other journal attributes, auxiliary storage pool ID (*ASP), user
profile status (*USRPRFSTS), and user profile password (*PRFPWDIND), see the MIMIX Administrator Reference book.
Where was the difference detected
The System 1 Indicator (SYS1IND) and System 2 (SYS2IND) fields show the status
of the attribute on each system as determined by the compare request. Table 56
identifies the possible values. These fields are available in both summary and detail
rows in the output file.
Table 56. Possible values for output file fields SYS1IND and SYS2IND

<blank> - Summary record priority 5. No special conditions exist for this object.
*DAMAGED - Summary record priority 3. Object damaged condition.
*MBRNOTFND - Summary record priority 2. Member not found.
*NOTCMPD - Summary record priority N/A (not used in determining the priority of summary level records). Attribute not compared. Due to MIMIX configuration settings, this attribute cannot be compared.
*NOTFOUND - Summary record priority 1. Object not found.
*NOTSPT - Summary record priority N/A (not used in determining the priority of summary level records). Attribute not supported. Not all attributes are supported on all IBM i releases. This is the value that is used to indicate an unsupported attribute has been specified.
*RTVFAILED - Summary record priority 4. Unable to retrieve the attributes of the object. Reason for failure may be a lock condition.

Note: The priority indicates the order of precedence MIMIX uses when setting the system indicator fields in the summary record.
For comparisons which include a data group, the Data Source (DTASRC) field
identifies which system is configured as the source for replication.
What attributes were compared
In each detailed row, the Compared Attribute (CMPATR) field identifies a compared
attribute. For more information about identifying attributes that can be compared by
each command and the possible values returned, see the MIMIX Administrator
Reference book.
See also "Attributes compared and expected results - #FILATR, #FILATRMBR audits" on page 677.
APPENDIX B    IBM Power™ Systems operations that affect MIMIX
The following topics describe how to protect the integrity of your MIMIX environment
when you perform operations such as IPLs and IBM i operating system upgrades.
Only basic procedures for a standard one-to-one MIMIX installation are covered. If
you are operating in a complex environment—if you have cluster, SAP R/3, IBM
WebSphere MQ, or other application considerations, for example—contact your
Certified MIMIX Consultant. Ultimately, you must tailor these procedures to suit the
needs of your particular environment.
These topics describe MIMIX-specific steps only. Refer to the user manuals that
correspond to any additional applications installed in your environment. For
instructions on performing IBM Power™ Systems operations, consult your IBM
manuals or the IBM Information Center at
http://publib.boulder.ibm.com/pubs/html/as400/infocenter.html.
The following topics are included:
• "MIMIX procedures when performing an initial program load (IPL)" on page 310 includes the MIMIX-specific steps for performing an initial program load (IPL) to help ensure the integrity of your MIMIX environment is not compromised.
• "MIMIX procedures when performing an operating system upgrade" on page 311 describes when and how to perform recommended MIMIX-specific steps while performing a standard upgrade of IBM i.
• "MIMIX procedures when upgrading hardware without a disk image change" on page 318 describes MIMIX prerequisites and procedures for performing a hardware upgrade without a disk image change.
• "MIMIX procedures when performing a hardware upgrade with a disk image change" on page 321 describes prerequisites for saving and restoring MIMIX software when upgrading from one system to another.
• "Handling MIMIX during a system restore" on page 325 includes prerequisites for restoring MIMIX software within a MIMIX system pair, to one system from a save of the other system when an environment meets the conditions specified.
MIMIX procedures when performing an initial program load (IPL)
An initial program load (IPL) loads the operating system and prepares the system for
user operations. Performing the recommended MIMIX-specific steps can help ensure
that objects are not damaged during the IPL and that the integrity of your MIMIX
environment is not compromised.
Notes:
• This procedure describes an IPL performed under normal circumstances. It does not address IPL considerations for system switching environments.
• Before beginning this procedure, review your startup procedures to determine whether subsystems will start after the IPL. This startup program is defined in the QSTRUPPGM system value.
To perform an IPL in a MIMIX environment, do the following:
1. End MIMIX from either the source or target system. The End MIMIX (ENDMMX)
command attempts to end the MIMIX processes for the installation, including the
MIMIX managers and the data groups. Data groups can be ended in an
immediate (*IMMED) or controlled (*CNTRLD) manner.
ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)
Note: For more information about the ENDMMX command, see “Commands for
ending replication” on page 184.
2. Ensure that all MIMIX jobs are ended before performing this step. End the
MIMIX subsystems on both the source and target system. On each system, type
the following on a command line and press Enter:
ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)
3. Perform the IPL.
4. If your subsystems do not start during the startup procedures defined in the
QSTRUPPGM system value, start the MIMIX subsystems on both the source and
target systems. On each system, type the following on a command line and press
Enter:
STRSBS SBSD(MIMIXQGPL/MIMIXSBS)
5. Verify that the communications links start, using the Verify Communications Link
(VFYCMNLNK) command.
Note: For more information about the VFYCMNLNK command, see “Verifying a
communications link for system definitions” on page 281.
6. Start MIMIX from either the source or target system. The Start MIMIX (STRMMX)
command starts the MIMIX processes for the installation, including the MIMIX
managers and the data groups.
Note: For more information about the STRMMX command, see “Starting MIMIX”
on page 179.
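For quick reference, the sequence above reduces to the following sketch. The PWRDWNSYS parameters shown are an assumption about how the IPL is initiated in your environment and should be adjusted to your own operating procedures; the remaining commands are the ones used in the steps above.

ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)                 /* Step 1: end MIMIX processes                          */
ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)                   /* Step 2: end the MIMIX subsystem on both systems      */
PWRDWNSYS OPTION(*CNTRLD) DELAY(600) RESTART(*YES)    /* Step 3: power down and IPL (assumed method)          */
STRSBS SBSD(MIMIXQGPL/MIMIXSBS)                       /* Step 4: start the MIMIX subsystem if QSTRUPPGM does not */
VFYCMNLNK                                             /* Step 5: verify communications links (prompt with F4 for parameters) */
STRMMX                                                /* Step 6: start MIMIX                                  */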
MIMIX procedures when performing an operating system
upgrade
This topic describes when and how to perform recommended MIMIX-specific steps
while performing a standard upgrade of the IBM i operating system (slip-install, where
the IBM i release is upgraded without a restore of the user libraries). Performing these
recommended steps can help ensure that MIMIX products start properly once the
operating system upgrade is complete.
Table 57 indicates which procedures are needed for different upgrade scenarios. Use
these instructions in conjunction with the instructions provided by IBM for upgrading
from one IBM i release to another IBM i release.
Table 57.  IBM i operating system upgrade scenarios and recommended processes for handling MIMIX during the upgrade

To upgrade the backup system only:
1. Perform the preparation steps described in “Prerequisites for performing an OS upgrade on either system” on page 312.
2. Follow the procedure in “MIMIX-specific steps for an OS upgrade on the backup system” on page 313.

To upgrade the production system only:
1. Perform the preparation steps described in “Prerequisites for performing an OS upgrade on either system” on page 312.
2. Perform one of the following procedures:
   • If you need to maintain user access to production applications during the upgrade, perform a planned switch as described in “MIMIX-specific steps for an OS upgrade on the production system with switching” on page 315. Your production operations will temporarily run on the backup system.
   • If you have more flexibility with scheduling downtime, you can perform the upgrade without switching as described in “MIMIX-specific steps for an OS upgrade on the production system without switching” on page 316.

To upgrade both the backup and production systems:
1. Perform the preparation steps described in “Prerequisites for performing an OS upgrade on either system” on page 312.
2. Upgrade the backup system first, following “MIMIX-specific steps for an OS upgrade on the backup system” on page 313. By doing this first, you can ensure that the backup system supports all the capabilities of the production system, and you can work through problems or custom operations before affecting your production environment.
3. Once you have verified that the backup system is upgraded and operating as desired, perform one of the following procedures to upgrade IBM i on the production system:
   • If you need to maintain user access to production applications during the upgrade, perform a planned switch as described in “MIMIX-specific steps for an OS upgrade on the production system with switching” on page 315. Your production operations will temporarily run on the backup system.
   • If you have more flexibility with scheduling downtime, you can perform the upgrade without switching as described in “MIMIX-specific steps for an OS upgrade on the production system without switching” on page 316.
Prerequisites for performing an OS upgrade on either system
Before you start an upgrade of the IBM i operating system on either system, do the
following:
1. Access Support information on the web as you perform the following steps to
ensure that the system is ready to upgrade:
a. Check the compatibility of the operating systems on the production and backup
systems, ensuring that the systems will meet the requirements of a MIMIX-supported
environment once the IBM i operating system upgrade has occurred.
b. Ensure the recommended IBM i PTFs have been applied according to
your IBM i version.
c. Ensure the recommended MIMIX service packs have been applied according
to your MIMIX version. Review the Read Me document that corresponds to the
MIMIX service pack, and check the website for relevant Technical Alerts and
FAQs.
2. Review your startup procedures to understand how your environment is
configured to start after an IPL. This startup program is defined in the
QSTRUPPGM system value. An IBM i upgrade may include rebuilding access
paths, converting formats, or performing other operations that must be complete
before MIMIX or other applications are started. The upgrade may not complete
successfully if your QSTRUPPGM procedures start MIMIX or other applications
during an IPL. Ensure that these processes are disabled before continuing with
the IBM i upgrade.
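A quick way to review your startup program before the upgrade is to display the system value and then examine the program it names; this is only a minimal sketch, and it assumes the startup program is a CL program that you can view and edit.

DSPSYSVAL SYSVAL(QSTRUPPGM)

/* Hypothetical fragment of a startup program named in QSTRUPPGM.       */
/* Temporarily bypass or comment out statements like the following so   */
/* that MIMIX is not started while upgrade conversions are still running. */
/* STRSBS SBSD(MIMIXQGPL/MIMIXSBS)                                       */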
MIMIX-specific steps for an OS upgrade on the backup system
Use this procedure to upgrade the operating system on the backup system.
Notes:
• If you plan to upgrade both the production and backup systems during the same scheduled maintenance period, upgrade the backup system first.
• In the following steps, the terms production and backup always refer to the original roles of the systems before upgrading the operating system on either system. The icons at the beginning of some steps show the state of the systems and replication as a result of the action in the step. The arrow in the icon indicates the direction and state of replication for a classic production-to-backup environment.
• MIMIX Model Switch Framework commands, such as RUNSWTFWK, are typically run from the backup system.
To perform an operating system upgrade of the backup system in a MIMIX
environment, do the following:
1. Ensure that you have completed any prerequisite tasks for your upgrade scenario.
See Table 57 for a list of required tasks for different upgrade scenarios.
2. End all user applications, user interfaces, and operations actively running on the
backup system. Be sure to address the following:
• Disarm any monitors, such as MIMIX Monitor, robot jobs, or other job schedulers.
• Make sure all users are off the system.
Note: For more information, refer to your Runbook, Using MIMIX Monitor, and
your applications’ user manuals.
3. End the data groups from either system using the command:
ENDDG DGDFN(*ALL) ENDOPT(*CNTRLD)
Note: For more information about ending data groups see “Commands for
ending replication” on page 184.
4. Wait until the status of each data group becomes inactive (red) by monitoring the
status on the Work with Data Groups (WRKDG) display.
Note: For more information about the WRKDG display, see “The Work with Data
Groups display” on page 99.
5. If you have applications that use commitment control, ensure there are no open
commit cycles. For more information, see “Checking for open commit cycles” on
page 183.
a. If an open commit cycle exists, restart the data group and repeat Step 3,
Step 4, and Step 5 until there is no open commit cycle for any apply session.
6. Use the following command to end other MIMIX products in the installation library,
end the MIMIX managers, and end the RJ link:
ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)
7. Use the following command on the production and backup system to end the
MIMIX subsystems:
ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)
8. Complete the operating system upgrade. Allow any upgrade conversions and
access path rebuilds to complete before continuing with the next step.
Note: During the IBM i upgrade, make sure you perform a system save on the
system being upgraded. This step will provide you with a backup of
existing data.
9. Ensure the names of the journal receivers match the journal definitions:
a. From the backup system, specify the command:
installation-name/WRKJRNDFN JRNDFN(QAUDJRN *LOCAL)
b. Next to the JRNDFN(QAUDJRN *LOCAL) journal definition, specify 14 (Build)
and press F4. Type *JRNDFN for the Source for values parameter and press
Enter.
10. Start the MIMIX subsystems on both the production and backup systems using
the following command from each system:
STRSBS SBSD(MIMIXQGPL/MIMIXSBS)
11. Perform a normal start of the data groups from either system using the STRMMX
command. This step also starts the MIMIX managers.
12. Perform your normal process for validating the IBM i release upgrade.
Notes:
• At your convenience, schedule a switch to verify that your applications function on the new operating system on the backup system.
• After the IBM i upgrade, you may receive object errors for program object types (*PGM) if your source IBM i version is higher than your target IBM i version. MIMIX is unable to save/restore *PGM objects in this case. To avoid these errors, compile the *PGM objects on the source system using the Create Program (CRTPGM) command. On the Target release prompt, specify the IBM i version of your target system.
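As a sketch of the workaround described in the last note above, the following recreates a program for a specific target release. The library, program, and module names are placeholders, and the release shown is only an example; on the Target release (TGTRLS) prompt, specify the IBM i release actually installed on your target system.

CRTPGM PGM(MYLIB/MYPGM) MODULE(MYLIB/MYPGM) TGTRLS(V7R1M0)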
MIMIX-specific steps for an OS upgrade on the production system with
switching
Use this procedure if you need to maintain user access to production applications
during the production system upgrade. This procedure temporarily switches
production activity to the backup system before the upgrade and switches back to
normal operations after the production system upgrade is complete.
Notes:
• In the following steps, the terms production and backup always refer to the original roles of the systems before upgrading the operating system on either system. The icons at the beginning of some steps show the state of the systems and replication as a result of the action in the step. The arrow in the icon indicates the direction and state of replication for a classic production-to-backup environment.
• MIMIX Model Switch Framework commands, such as RUNSWTFWK, are typically run from the backup system. You can find more information about using the RUNSWTFWK command in the Using MIMIX Monitor book.
To perform an operating system upgrade of the production system in a MIMIX
environment while maintaining availability, do the following:
1. Ensure that you have completed any prerequisite tasks for your upgrade scenario.
See Table 57 for a list of required tasks for different upgrade scenarios.
2. Use the procedures in your Runbook to perform a planned switch to the
backup system.
Note: Do not perform the synchronize phase of your switch procedures.
If you do not have a Runbook, you need to follow your processes for the following:
• End all user applications, user interfaces, and operations actively running on the production system. Disarm any monitors, robot jobs, or other job schedulers and make sure all users are off the system.
• Resolve any errors in MIMIX and perform a controlled end of the data groups.
• Perform a planned switch to the backup system.
3. End MIMIX products in the installation library and end the RJ link using the
command:
ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)
Note: For more information about the ENDMMX command, see “Commands for
ending replication” on page 184.
4. From the production system, use the following command to end the MIMIX
subsystems:
ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)
5. Start applications on the backup system and allow users to access their
applications from the backup system.
6. On the production system, complete the operating system upgrade. Allow any
upgrade conversions and access path rebuilds to complete before continuing with
the next step.
Note: During the IBM i upgrade, make sure you perform a system save on the
system being upgraded. This step will provide you with a backup of
existing data.
7. Ensure the names of the journal receivers match the journal definitions:
a. From the original production system, specify the command:
installation-name/WRKJRNDFN JRNDFN(QAUDJRN *LOCAL)
b. Next to the JRNDFN(QAUDJRN *LOCAL) journal definition, specify 14 (Build)
and press F4. Type *JRNDFN for the Source for values parameter and press
Enter.
8. Follow your Runbook procedures to perform a synchronization. If you do not
have a Runbook, you need to follow your processes for the following:
• Starting MIMIX subsystems
• Starting data groups
9. Follow your Runbook procedures to perform a planned switch back to the
production system and start replication. If you do not have a Runbook, you
need to follow your processes to switch replication so that you return to your
normal replication environment.
10. Perform your normal process for validating the IBM i release upgrade.
Note: After the IBM i upgrade, you may receive object errors for program object
types (*PGM) if your source IBM i version is higher than your target IBM i
version. MIMIX is unable to save/restore *PGM objects in this case. To avoid
these errors, compile the *PGM objects on the source system using the
Create Program (CRTPGM) command. On the Target release prompt, specify
the IBM i version of your target system.
MIMIX-specific steps for an OS upgrade on the production system without switching
Use this procedure if you have more flexibility with scheduling downtime and can
perform the upgrade without switching.
Notes:
• In the following steps, the terms production and backup always refer to the original roles of the systems before upgrading the operating system on either system. The icons at the beginning of some steps show the state of the systems and replication as a result of the action in the step. The arrow in the icon indicates the direction and state of replication for a classic production-to-backup environment.
• MIMIX Model Switch Framework commands, such as RUNSWTFWK, are typically run from the backup system. You can find more information about using the RUNSWTFWK command in the Using MIMIX Monitor book.
To perform an operating system upgrade of the production system in a MIMIX
environment without switching, do the following:
1. Ensure that you have completed any prerequisite tasks for your upgrade scenario.
See Table 57 for a list of required tasks for different upgrade scenarios.
2. End all user applications, user interfaces, and operations actively running on the
production system. Be sure to address the following:
• Disarm any monitors, such as MIMIX Monitor, robot jobs, or other job schedulers.
• Make sure all users are off the system.
Note: For more information, refer to your Runbook, Using MIMIX Monitor, and
your applications’ user manuals.
3. End the data groups from either system using the command:
ENDDG DGDFN(*ALL) ENDOPT(*CNTRLD)
Note: For more information about ending data groups see “Commands for
ending replication” on page 184.
4. Wait until the status of each data group becomes inactive (red) by monitoring the
status on the Work with Data Groups (WRKDG) display.
Note: For more information about the WRKDG display, see “The Work with Data
Groups display” on page 99.
5. If you have applications that use commitment control, ensure there are no open
commit cycles. For more information, see “Checking for open commit cycles” on
page 183.
a. If an open commit cycle exists, restart the data group and repeat Step 3, Step 4,
and Step 5 until there is no open commit cycle for any apply session.
6. Use the following command to end other MIMIX products in the installation library,
end the MIMIX managers, and end the RJ link:
ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)
7. End the MIMIX subsystems on the production system and on the backup system.
On each system, type the following on a command line and press Enter:
ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)
8. Complete the operating system upgrade. Allow any upgrade conversions and
access path rebuilds to complete before continuing with the next step.
Note: During the IBM i upgrade, make sure you perform a system save on the
system being upgraded. This step will provide you with a backup of
existing data.
9. Start the MIMIX subsystems on the production system and the backup system as
you would during the synchronization phase of a switch. From each system, type
the following on a command line and press Enter:
STRSBS SBSD(MIMIXQGPL/MIMIXSBS)
10. Ensure the names of the journal receivers match the journal definitions:
a. From the production system, specify the command:
installation-name/WRKJRNDFN JRNDFN(QAUDJRN *LOCAL)
b. Next to the JRNDFN(QAUDJRN *LOCAL) journal definition, specify 14 (Build)
and press F4. Type *JRNDFN for the Source for values parameter and press
Enter.
c. Record the newly attached journal receiver name by placing the cursor on the
posted message and pressing F1 or Help.
11. Using the information you gathered in Step 10, start each data group as follows
(this step also starts the MIMIX managers; a command-line sketch of this start appears after this procedure):
a. From the WRKDG display, type a 9 (Start DG) next to the data group and
press Enter. The Start Data Group display appears.
b. At the Object journal receiver prompt, specify the receiver name recorded in
Step 10c.
c. At the Object large sequence number prompt, specify *FIRST.
d. At the Clear pending prompt, specify *YES.
12. Start any applications that you disabled prior to completing the IBM i upgrade
according to your Runbook instructions. These applications are normally started
in the program defined in the QSTRUPPGM system value. Allow users back on
the production system.
Note: After the IBM i upgrade, you may receive object errors for program object
types (*PGM) if your source IBM i version is higher than your target IBM i
version. MIMIX is unable to save/restore *PGM objects in this case. To avoid
these errors, compile the *PGM objects on the source system using the
Create Program (CRTPGM) command. On the Target release prompt, specify
the IBM i version of your target system.
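The start requested through the WRKDG display in Step 11 can also be requested from a command line. The following is only a sketch based on the STRDG parameters shown elsewhere in this appendix; the data group and receiver names are placeholders, and it assumes that parameters not shown can be left at their defaults.

STRDG DGDFN(data-group-name) OBJJRNRCV(receiver-recorded-in-step-10c) OBJSEQNBR2(*FIRST) CLRPND(*YES)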
MIMIX procedures when upgrading hardware without a
disk image change
This topic describes MIMIX prerequisites and procedures for a hardware upgrade that
does not change the disk image but does change the model, feature, or serial number,
and therefore requires a new license key. Performing these steps helps ensure that
MIMIX products start properly once the hardware upgrade is complete.
Considerations for performing a hardware system upgrade without a disk
image change
Before you start a hardware upgrade on either system, consider the following:
• Ensure the new system is compatible with and meets the requirements for a MIMIX-supported environment. For more information, see the Supported Environments Matrix in the Technical Documents section of Support Central.
• Apply the latest MIMIX fixes on both systems. The fixes are available by product in the Downloads section of Support Central.
• Obtain new MIMIX product license keys. These codes are required for products when a model, feature, or serial number changes (see the sketch after this list for one way to record the current values). For more information, see “Working with license keys” in the License and Availability Manager book.
• Determine whether a planned switch is required prior to the hardware upgrade. For example, a switch would be necessary if the source system is being upgraded and users need to continue working while the upgrade takes place. To perform a switch, follow the steps in your runbook. For more information, see “Switching” on page 244.
• Determine if the transfer definitions need to be changed. For example, transfer definitions would need to be changed if the IP addresses or host names change. For more information, see “Configuring transfer definitions” in the MIMIX Administrator Reference book.
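When requesting new license keys, it can help to record the current model, processor feature, and serial number before the upgrade. This is a minimal sketch using standard IBM i system values; it is general IBM i usage, not a MIMIX-specific step.

DSPSYSVAL SYSVAL(QMODEL)     /* System model number     */
DSPSYSVAL SYSVAL(QPRCFEAT)   /* Processor feature code  */
DSPSYSVAL SYSVAL(QSRLNBR)    /* System serial number    */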
MIMIX-specific steps for a hardware upgrade without a disk image
change
Use this procedure to restart your MIMIX installation when updating your hardware. If
you have special considerations, contact your Certified MIMIX Consultant for
assistance. Before you begin, ensure that the items in “Considerations for performing a
hardware system upgrade without a disk image change” on page 318 have been
reviewed and completed where applicable.
Hardware upgrade without a disk image change - preliminary steps
To perform this portion of the upgrade process, do the following on the system prior to
the upgrade:
1. Ensure MIMIX is operating normally before performing the upgrade. There should
be no files or objects in error and all transactions should be caught up. See
“Resolving common replication problems” on page 207 for more information about
resolving problems.
2. Ensure users are logged off the system and perform a controlled shutdown of all
MIMIX data groups. For more information, see “Ending a data group in a
controlled manner” on page 195.
3. End all MIMIX products. For more information, see “Ending MIMIX” on page 179.
4. Print the status information for each data group by doing the following:
a. From the Work with Data Groups display, type 8 (Display detail status) next to
each data group.
b. Press Enter.
c. Press F7 for object status and print the display. Keep the information for later
use.
d. Press F8 for database status and print the display. Keep the information for
later use.
5. Optional step: Save the MIMIX software by doing a full system save or by saving
all MIMIX installation libraries:
• LAKEVIEW
• MIMIXQGPL
• MIMIX-installation-library
• MIMIX-installation-library_0
• MIMIX-installation-library_1
• /LakeviewTech (directory tree)
6. Optional step: If upgrading the source system and performing a switch, follow the
steps in your runbook. For more information, see “Switching” on page 244.
Hardware upgrade without a disk image change - subsequent steps
To perform this portion of the upgrade process, do the following on the system after
the upgrade has been completed:
1. Optional step: Update any transfer definitions that require changes. For more
information, see “Configuring transfer definitions” in the MIMIX Administrator
Reference book.
2. Enter the new product license key on the system. Do the following:
a. From the MIMIX main menu select option 31 (Product Management Menu).
The License Manager Main Menu appears.
b. Select option 1 (Update license key). The Update License Keys (UPDLICKEY)
command appears. Follow the instructions displayed for obtaining license
keys. For more information, see “Obtaining license keys using UPDLICKEY
command” in the License and Availability Manager book.
3. Confirm that communications work between the new system and other systems in
the MIMIX environment. For more information, see “Verifying a communications
link for system definitions” on page 281.
4. Optional step: Perform a data group switch by following the steps in your
runbook, then skip to Step 6. See “Considerations for performing a hardware
system upgrade without a disk image change” on page 318 to determine if a
switch is required.
5. Use the Start Data Group (STRDG) command to start all data groups. For more
information, see “Starting and ending replication” on page 169.
6. Run your MIMIX audits to verify the systems are synchronized. See “Running an
audit immediately” on page 131 for more information about running audits.
MIMIX procedures when performing a hardware upgrade
with a disk image change
When a hardware upgrade is being performed on a system, MIMIX software may
need to be saved from the system being replaced and then restored to the system
that is its replacement. The saved MIMIX information must be restored on a system
that performs the same role within MIMIX operations. For example, if the network
system is being replaced, MIMIX software must be saved from the network system
and restored on the new network system. A network system cannot be restored to a
new management system.
This topic describes steps to consider prior to saving and restoring MIMIX software
when upgrading from one system to another. Performing these steps can ensure that
MIMIX products start properly once the hardware upgrade is complete.
IMPORTANT! To ensure the integrity of your data, contact your Certified MIMIX
Consultant for assistance performing a hardware upgrade.
Considerations for performing a hardware system upgrade with a disk
image change
Before you start a hardware upgrade on either system, consider the following:
• Contact your Certified MIMIX Consultant prior to performing the upgrade for instructions that may be specific to your environment.
• Ensure the new system is compatible with and meets the requirements for a MIMIX-supported environment. For more information, see the Supported Environments Matrix in the Technical Documents section of Support Central.
• Apply the latest MIMIX fixes on both systems. The fixes are available by product in the Downloads section of Support Central.
• Obtain new MIMIX product license keys. These codes are required for products when a model, feature, or serial number changes. For more information, see “Working with license keys” in the License and Availability Manager book.
• Determine whether a planned switch is required prior to the hardware upgrade. For example, a switch would be necessary if the source system is being upgraded and users need to continue working while the upgrade takes place. To perform a switch, follow the steps in your runbook. For more information, see “Switching” on page 244.
• Determine if the transfer definitions need to be changed. For example, transfer definitions would need to be changed if the IP addresses or host names change. For more information, see “Configuring transfer definitions” in the MIMIX Administrator Reference book.
• Copy all automation for MIMIX to the new machine, including exit programs.
• Transfer any modifications of programs such as QSTARTUP to the new system. Modifications may be needed to start the MIMIX subsystem after an IPL. Refer to your Runbook for an overview of the required automation changes that need to be performed on the system.
MIMIX-specific steps for a hardware upgrade with a disk image change
Use this procedure to save and restore your MIMIX installation when updating your
hardware with a disk image change. If you have special considerations, contact your
Certified MIMIX Consultant for assistance. Before you begin, ensure that the items in
“Considerations for performing a hardware system upgrade with a disk image
change” on page 321 have been reviewed and completed where applicable.
Hardware upgrade with a disk image change - preliminary steps
To perform the save portion of the upgrade process, do the following on the old
system prior to the upgrade:
1. Ensure MIMIX is operating normally before performing the upgrade. There should
be no files or objects in error and all transactions should be caught up. See
“Resolving common replication problems” on page 207 for more information about
resolving problems.
2. Ensure users are logged off the system and all applications have ended. Perform
a controlled shutdown of all MIMIX data groups. For more information, see
“Ending a data group in a controlled manner” on page 195.
3. Optional step: Perform a data group switch by following the steps in your
runbook. See “Considerations for performing a hardware system upgrade with a
disk image change” on page 321 to determine if a switch is required.
4. End all MIMIX products. For more information, see “Ending MIMIX” on page 179.
5. Ensure there are no open commit cycles. For more information, see “Checking for
open commit cycles” on page 183.
a. If open commit cycles exist, restart the data group and repeat Step 4 to end all
MIMIX products.
6. Print the status information for each data group by doing the following:
a. From the Work with Data Groups display, type 8 (Display detail status) next to
each data group.
b. Press Enter.
c. Press F7 for object status and print the display. Keep the information for later
use.
d. Press F8 for database status and print the display. Keep the information for
later use.
7. Print the list of system values. Type the following on a command line and press
Enter: WRKSYSVAL SYSVAL(*ALL) OUTPUT(*PRINT)
8. Save the MIMIX software from the old system by doing a full system save or by
saving all MIMIX installation libraries:
• LAKEVIEW
• MIMIXQGPL
• MIMIX-installation-library
• MIMIX-installation-library_0
• MIMIX-installation-library_1
• /LakeviewTech (directory tree)
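As a sketch of this save using standard IBM i save commands, assuming a tape device named TAP01 and an installation library named MIMIX (both placeholders; substitute your own device and installation library names):

SAVLIB LIB(LAKEVIEW MIMIXQGPL MIMIX MIMIX_0 MIMIX_1) DEV(TAP01)
SAV DEV('/QSYS.LIB/TAP01.DEVD') OBJ(('/LakeviewTech'))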
Hardware upgrade with a disk image change - subsequent steps
To perform this portion of the upgrade process, do the following after you have
upgraded and restored all user data, including all MIMIX libraries:
Note: To ensure that journaling is properly started, restore journals and journal
receivers before restoring user data.
1. Ensure the following system values are set the same way on the new system as
they were on the old system: QAUDCTL, QAUDLVL, QALWOBJRST,
QALWUSRDMN, and QLIBLCKLVL. (A command-line sketch of this check appears after this procedure.)
2. On a command line, type LAKEVIEW/UPDINSPRD and press Enter.
3. Enter the new product license key on the system. Do the following:
a. From the MIMIX main menu select option 31 (Product Management Menu).
The License Manager Main Menu appears.
b. Select option 1 (Update license key). The Update License Key (UPDLICKEY)
command appears. Follow the instructions displayed for obtaining license
keys. For more information, see “Obtaining license keys using UPDLICKEY
command” in the License and Availability Manager book.
4. On a command line, type CALL MXXPREG and press Enter to register the MIMIX
exit points in the system registry.
5. Update any transfer definitions that require changes. For more information, see
“Considerations for performing a hardware system upgrade with a disk image
change” on page 321.
6. Confirm that communications work between the new system and other systems in
the MIMIX environment. For more information, see “Verifying a communications
link for system definitions” on page 281.
7. Ensure that all MIMIX automation, including MIMIX exit programs, is available and
configured on the new system.
8. Make any necessary modifications to the QSTARTUP program; it may need to
be modified to start the MIMIX subsystem after an IPL. For more information, see
“Considerations for performing a hardware system upgrade with a disk image
change” on page 321.
9. Start the MIMIX subsystem with the following command: STRSBS
SBSD(MIMIXQGPL/MIMIXSBS).
10. Optional step: Perform a data group switch by following the steps in your
runbook, then skip to Step 13. See “Considerations for performing a hardware
system upgrade with a disk image change” on page 321 to determine if a switch is
required.
11. Start the system manager with the following command: STRMMXMGR
SYSDFN(*ALL) MGR(*SYS).
12. Start MIMIX with the following:
If the source system was upgraded
a. On the source system, type WRKJRNDFN JRNDFN(*ALL *LOCAL) on a
command line, and press Enter.
b. Press F10 to verify the Receiver Prefix, Library, and all other parameters
(option 5) are correct. Make any necessary changes from the MIMIX
management system before continuing.
c. For each journal definition that has an RJ Link parameter value of *SRC or
*NONE do the following:
• Type option 14 and press F4=PROMPT.
• Type *JRNDFN for the Source for values parameter and press Enter.
• Record the newly attached journal receiver name by placing the cursor on
the posted message and pressing F1 or Help.
d. For each data group, run the following Verify Journaling File Entry
(VFYJRNFE) command to ensure that the file entries for that data group are
journaled to the correct journal and that the journal options are the same as
those configured for the data group: VFYJRNFE DGDFN(DGNAME)
FILE1(*ALL).
e. Start the data groups with a clear pending start from the receivers recorded in
Step c of this procedure: STRDG DGDFN(data-group-name)
DBJRNRCV(user-journal-receiver) DBSEQNBR2(*FIRST)
OBJJRNRCV(security-journal-receiver) OBJSEQNBR2(*FIRST)
CLRPND(*YES)
f. Delete any old receivers with different library or prefix names.
g. User and application activity can be resumed on the system.
If the target system was upgraded
a. On the target system, type WRKJRNDFN JRNDFN(*ALL *LOCAL) on a
command line, and press Enter.
b. Press F10 to verify the Receiver Prefix, Library, and all other parameters
(option 5) are correct. Make any necessary changes from the MIMIX
management system before continuing.
c. Type option 14 for each journal definition that has an RJ Link parameter
value of *SRC or *NONE. Do not press Enter.
d. On the command line, type JRNVAL(*JRNDFN) and press Enter to build a new
journal receiver for the journal definitions.
e. Type WRKJRNDFN JRNDFN(*ALL *LOCAL) RJLNK(*TGT) on a command
line, and press Enter.
f. For each journal definition listed, do the following:
• Type option 17 (Work with jrn attributes) and press Enter.
• Type option 15 (Work with receiver directory) and press Enter.
• Type option 4 (Delete) for all receivers in the list. If message CPA7025 is
issued, reply with an “I”.
g. For each data group, run the following Verify Journaling File Entry
(VFYJRNFE) command to ensure that the file entries for that data group are
journaled to the correct journal and that the journal options are the same as
those configured for the data group: VFYJRNFE DGDFN(DGNAME)
FILE1(*ALL).
h. Use this Start Data Group (STRDG) command to start all data groups with the
information collected in Step 6 of “Hardware upgrade with a disk image change
- preliminary steps” on page 322:
STRDG DGDFN(data-group-name) DBJRNRCV(last-processed-data-base-journal-receiver)
DBSEQNBR2(last-processed-data-base-sequence-number)
OBJJRNRCV(last-processed-object-journal-receiver)
OBJSEQNBR2(last-processed-object-sequence-number) CLRPND(*YES)
i. Delete any old receivers with different library or prefix names.
13. Run your MIMIX audits to verify the systems are synchronized. See “Running an
audit immediately” on page 131 for more information about running audits.
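A sketch of the system value check in Step 1 of the preceding procedure, using standard IBM i commands: display each value on the new system and compare it against the printed list captured from the old system, changing any value that differs with the Change System Value (CHGSYSVAL) command.

DSPSYSVAL SYSVAL(QAUDCTL)
DSPSYSVAL SYSVAL(QAUDLVL)
DSPSYSVAL SYSVAL(QALWOBJRST)
DSPSYSVAL SYSVAL(QALWUSRDMN)
DSPSYSVAL SYSVAL(QLIBLCKLVL)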
Handling MIMIX during a system restore
Occasionally, an entire system may need to be restored because of a system failure,
such as a processor or operating system (OS) failure. This topic includes prerequisites
for restoring MIMIX software within a MIMIX system pair (two systems using the same
MIMIX installation) to one system from a save of the other system. A system restore
may need to be performed when the following conditions exist:
• The original production system, including the Licensed Internal Code and the OS, has been recovered from the backup system by tape.
• The installed IBM i release level is the same on each system (a command sketch for checking this appears below).
IMPORTANT! To ensure the integrity of your data, contact your Certified MIMIX
Consultant for assistance performing a system restore.
For information about MIMIX-supported environments, see the Supported
Environments Matrix in the Technical Documents section of Support Central.
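One way to confirm that the installed release level is the same on each system (the second condition above) is to display the PTF status of the operating system product on each system and compare the release shown. This is a general IBM i technique, not a MIMIX-specific step; 5770SS1 is the product ID for IBM i at release 7.1, so use the product ID for your installed release.

DSPPTF LICPGM(5770SS1)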
Prerequisites for performing a restore of MIMIX
Before you restore MIMIX on a system, consider the following steps which help
ensure that MIMIX products start properly once the restore is complete:
• Contact your Certified MIMIX Consultant prior to performing the restore for instructions that may be specific to your environment.
• Locate your MIMIX product license keys. These codes may be required after the restore. For more information, see “Working with license keys” in the Using License Manager book.
Index
Symbols
*ATTN
application group 65
managers for node 67
monitors 62
replication 69
*CANCEL, step status of 88
*CANCELED, procedure status of 82, 91
*FAILED activity entry 224, 227
*FAILED status
procedure 82, 91
step 88
*HLD
file entry 210
tracking entry 219
*HLDERR
file entry 210
tracking entry 219
*HLDRLTD file entry 210
*INACTIVE
application group 66
node managers 67
replication 69
*MSGW status
procedure 81
step 87
*UNKNOWN 66
A
accessing
MIMIX Availability Status display 93
MIMIX Main Menu 24
activity entries, object
confirm delay/retry cycle 228
failed, resolving 224
remove history 229
retrying 227
additional resources 13
application group
resolving reported problems 64
status of 60
application group definition 17, 20
application node status 65
applications, reducing contention with 279
audit
#DGFE considerations 127
#DLOATR considerations 127
#FILDTA considerations 127
#IFSATR considerations 127
#MBRRCDCNT considerations 127
after a configuration change 126
authority level to run 23
automatic starting of 124
before switching 126
best practice 23, 126, 145
bi-directional environment considerations 27
change history retention criteria 43
changing schedule 41
compare phase 123
comparison levels 53
compliance 144
compliance threshold 52, 53
definition of 17
differences, resolving 133
displaying compliance status 145
displaying history 137
displaying runtime status 129
displaying schedule 147
displaying time of next scheduled run 147
displaying when automatic audits run 147
ending 136
history 137
job log 135
last performed 145
last successful run 144
no objects selected 139
objects compared 139
policies which affect 36
policies, runtime behavior 36
policies, submitting automatically 37
prevent from running 45
priority selection example 139
priority, default settings of 37
problems reported in installation 99
recovery phase 123
results 133
results recommendations 127
retain history of 54
rule name 39
running immediately 131
schedule 147
schedule, changing 41
scheduled, default settings of 37
status from 5250 emulator 96
status, compliance 144
status, runtime 129
summary 129
three or more node considerations 27
when not to audit 28
audit history
change criteria 43
audit level
best practice 53
changing before switch 43
audit results 133
#DGFE rule 300
#FILDTA rule 303
#MBRRCDCNT rule 303
interpreting, attribute comparisons 306
interpreting, file data comparisons 303
resolving problems 133, 300
troubleshooting 135
auditing level, object
set when starting a data group 174
used for replication 231
authority level
for product access 23
AutoGuard, MIMIX 17
automatic error recovery
replication, policies for 32
system journal replication 34
user journal replication 33
automatic recovery
audits 50
concept 17
system journal replication 50
user journal replication 50
AutoNotify feature, MIMIX 163
Availability Status display, MIMIX 93
B
backlog
starting shared object send job 175
system manager 151
backlog, identifying a 115
backup node sequence
changing 71
examples of changing 73
verifying 70
backup system 18
best practice
audit frequency 145
audit level 53, 126
audit level before switch 43, 53, 126
audit threshold 52, 53
switch frequency 244, 253
switch threshold 56
switching 245
best practices
auditing 126
bi-directional environment policy considerations 27
C
cancel
procedure 92
clear error entries
processing 175
when to 181
clear pending entries
check for open commits 183
open commit cycle prevents 183
processing 175
resolving open commits before 183
when to 181
cluster services 21
cold start, replacement for 175
collector services 21
ending 153
starting 152
status 149
collector services status 67
command, by name
Work with Audit History 137
commands, by mnemonic
CHGDG 270
CHGPROCSTS 89
CHKDGFE 283, 300
CNLPROC 92
CRTDGTSP 291
DLTDGTSP 293
DSPDGSTS 105
DSPDGTSP 293
DSPMMXMSGQ 208
DSPRJLNK 264
ENDAG 169
ENDDG 169, 184, 190
ENDJRNFE 236
ENDJRNIFSE 239
ENDJRNOBJE 242
ENDJRNPF 236
ENDMMX 169, 184, 192
ENDRJLNK 274
ENDSVR 261
HLDDGLOG 294
MIMIX 24
RLSDGLOG 294
RUNPROC 90
STRAG 169, 171
STRDG 169, 171, 174
STRJRNFE 235
STRJRNIFSE 238
STRJRNOBJE 241
STRMMX 169, 171, 179
STRRJLNK 274
STRSVR 260
SWTDG 255, 257
VFYCMNLNK 281, 282
VFYJRNFE 237
VFYJRNIFSE 240
VFYJRNOBJE 243
VFYKEYATR 289
WRKAG 60
WRKAUDHST 137
WRKAUDOBJ 139
WRKAUDOBJH 142
WRKCPYSTS 263
WRKDG 99
WRKDGACT 224
WRKDGACTE 225
WRKDGFE 210
WRKDGIFSTE 219
WRKDGOBJTE 219
WRKDGTSP 291
WRKDTARGE 68
WRKMMXSTS 93, 164
WRKMSGLOG 209
WRKNFY 160
WRKNODE 66
WRKPROCSTS 78
WRKRJLNK 265, 267
WRKSTEPSTS 83
commands, by name
Cancel Procedure 92
Change Data Group 270
Change Procedure Status 89
Check Data Group File Entries 283, 300
Create Data Group Timestamps 291
Delete DG Timestamps 293
Display Data Group Status 105
Display Data Group Timestamps 293
Display MIMIX Message Queue 208
Display RJ Link 264
End Application Group 169
End Data Group 169, 184, 190
End Journal Physical File 236
End Journaling File Entry 236
End Journaling IFS Entries 239
End Journaling Obj Entries 242
End Lakeview TCP Server 261
End MIMIX 169, 184, 192
End RJ Link 274
Hold Data Group Log 294
MIMIX 24
MIMIX Availability Status 93
Release Data Group Log 294
Run Procedure 90
Start Application Group 169, 171
Start Data Group 169, 171, 174
Start Journaling File Entry 235
Start Journaling IFS Entries 238
Start Journaling Obj Entries 241
Start Lakeview TCP Server 260
Start MIMIX 169, 171, 179
Start RJ Link 274
Switch Data Group 255, 257
Verify Communications Link 281, 282
Verify Journaling File Entry 237
Verify Journaling IFS Entries 240
Verify Journaling Obj Entries 243
Verify Key Attributes 289
Work with Application Groups 60
Work with Audited Obj. History 142
Work with Audited Objects 139
Work with Copy Status 263
Work with Data Group Activity 224
Work with Data Groups 99
Work with Data Rsc. Grp. Ent. 68
Work with DG Activity Entries 225
Work with DG File Entries 210
Work with DG IFS Tracking Ent. 219
Work with DG Obj Tracking Ent. 219
Work with DG Timestamps 291
Work with Message Log 209
Work with MIMIX Availability Status 164
Work with Node Entries 66
Work with Notifications 160
Work with Procedure Status 78
Work with RJ Links 265, 267
Work with Step Status 83
commit cycles
effect on audit comparison 303, 304
commit cycles, open
checking for 183
checking for after a controlled end 196
preventing problems with 187
preventing STRDG request 183
commit mode change
prevents starting with open commits 183
communications
ending TCP sever 261
starting TCP sever 260
compare phase 123
compliance
audit 144
concept 125
switch 253
switch, policies for 49
compliance status
switch 253
concepts
auditing 122
MIMIX 17
configuration
audit after changing 126
determining data areas and data queues 272
determining, IFS objects 271
results of #DGFE audit after changing 300
configuration changes deployed 174
contacting Vision Solutions 14
contention with applications, reducing 279
controlled end
confirm end 196
description 186
procedure 195
wait time 187
cooperative processing 20
copying active files 263
correcting
file-level errors 216
record-level errors 217
CustomerCare 14
D
data areas and data queues
determining configuration of 272
holding user journal entries for 221
resolving problems 220
tracking entries 219
verifying journaling 243
data group 17
backlogs 115
controlled vs. immediate end 186
definition 20
determining if RJ link used 267
disabling 270
enabling 270
ending considerations 190
ending controlled 195
ending immediately 198
ending selected processes 198
indication of disabled state 269
recovery point cleared 190
starting selected processes 181
state, disabled or enabled 269
status from 5250 emulator 95
status, database view 112
status, detailed 105
status, merged view 106
status, object view 110
status, summary 99
switching 249, 255
timestamps 291
when to exclude from auditing 28
data group entry
description 20
data resource group 68
replication status summary 68
database apply (DBAPY) status 113
database apply cache policy 51
database error recovery, automatic 33
definition
application group 20
data group 20
journal 20
remote journal (RJ) link 20
system 20
transfer 20
definitions
application group 17
delay/retry cycle, confirm object in a 228
differences, resolving audit 133
disabled data group 269
displaying
data group spooled file information 262
data group status details 105
long IFS object names 262
RJ link 264
RJ link status 265
status 93
documents, MIMIX 11
E
ending
audit 136
collector services 153
MIMIX managers 152
MIMIXSBS subsystem 193
system and journal managers 152
target journal inspection 154
TCP server 261
ending data group
clears recovery point 190
considerations when ending 190
controlled end 195
controlled end wait time 187
controlled vs. immediate 186
how to confirm end 196
immediate end 198
processes 187
processes, effect of 203
processes, specifying selected 198
when to end RJ link 188
ending journaling
data areas and data queues 242
files 236
IFS objects 239
IFS tracking entry 239
object tracking entry 242
ending MIMIX 192
controlled vs. immediate 186
end subsystem, when to also 193
follow up after 193
included processes 188
using default values 192
using specified values 192
when to end RJ link 188
ending replication 169
choices 184
controlled vs. immediate 186
ending RJ link
independently from data group 274
when to end 188
errors
file level 216
record level 217
system journal replicated objects 224
target journal of RJ link 288
user journal replicated files 210
user journal replicated objects 219
example
priority audit object selection 139
examples
changing backup node sequence 73
F
file
file-level errors 216
hold journal entries 214
new 231
not journaled 102
record-level errors 217
replicated 210
file identifiers (FIDs) 273
file in error
examine held journal entries 213
resolving 210
file on hold
release and apply held entries 215
release and clear entries 216
release at synchronization point 215
H
hardware upgrade
MIMIX-specific steps 319
no disk image change 318
prerequisites 318
with a disk image change 321
held error (*HLDERR)
file entry 210
preferred action for entry 211, 220
tracking entry 219
history
audited object 142
completed audits 137
displaying audit 137
history log, removing completed entries 229
history of, retaining
procedures 31
hold (*HLD)
preferred action for held entry 211, 220
put file entry on hold 214
put tracking entry on hold 221
release a held file entry 215
release a held tracking entry 223
hold ignore (*HLDIGN)
preferred action for ignored entry 211, 220
put file entry on hold ignore 214
put tracking entry on hold ignore 222
hold related (*HLDRLTD) 211
hot backup 15
I
i5/OS upgrade 311
IFS objects
determining configuration 271
file IDs (FIDs) 273
hold user journal entries for 221
path names 262
resolving problems 220
tracking entries for 219
verifying journaling 240
immediate end
description 186
incomplete tracking entry 186
information and additional resources 13
inspection
target journal 22
installation, status of
from 5250 emulator 93
IPL 310
J
job log
for audit 135
jobs
used by procedures 77
used by procedures, status of 83
journal 19
inspection on target system 22
journal at create
requirements 231
requirements and restrictions 232
journal cache or state
resolving problems 103, 119
status 117
journal definition 20
defined to RJ Link 268
journal entry
description 19
unconfirmed 286
journal manager 21
ending 152
resolving problems 149
starting 152
status 149
journal receiver 19
journaling 19
cannot end 236
data group problem with 101
ending for data areas and data queues 242
ending for IFS objects 239
ending for physical files 236
implicitly started 231
requirements for starting 231
starting for data areas and data queues 241
starting for IFS objects 238
starting for physical files 235
starting, ending, and verifying 230
verifying for data areas and data queues 243
verifying for IFS objects 240
verifying for physical files 237
journaling status
data areas and data queues 241
files 235
IFS objects 238
K
keyed replication
verifying file attributes 289
L
last audit performed 144
last switch performed 253
log space 22
long IFS path names 262
M
management system 19
manager
journal 21
system 21
manager status 67
menu
MIMIX Main 24
message queue, primary and secondary 208
messages
ENDMMX 172
STRMMX 172
MIMIX AutoGuard 17
MIMIX CDP feature
exclude from audit 28
recovery point cleared 190
MIMIX installation 17
MIMIX managers
checking for a backlog 151
ending 152
resolving problems 149, 151
starting 152
MIMIX Model Switch Framework 22, 249
policy default 56
MIMIX rules 122
MIMIX subsystem (MIMIXSBS)
starting 179
when to end 193
MIMIX Switch Assistant 22
setting default switch framework 48
setting switch compliance policies 49
MMNFYNEWE monitor 163
monitor for newly created objects 163
monitors
nodes where needed 62
status of 61
N
names, displaying long 262
network system 19
new hardware upgrade 321
MIMIX-specific steps 322
prerequisites 321
new objects
IFS object journal at create requirements 231
journal at create selection criteria 232
newly created objects, notification of 163
node entries 66
node status
application group 65
data resource group 68
nodes, policy considerations for multiple 27
notification status 63
notifications
definition 18, 159
displaying 160, 164
new problems in installation 99
severity level 125, 161
status 160
O
object
audited history 142
object auditing
concept 19
setting level with STRDG 174
used for replication 231
object error recovery, automatic 34
object send process
considerations for starting a shared 175
objects
audited object list 139
configuration of non-file 271
displaying long IFS names 262
displaying objects in error 108
displaying objects with active entries 108
in error, resolving 224
new 231
reducing contention 279
tracking entries for data areas and data
queues 219
open commit cycles
audit results 303, 304
prevent problems with 187
resolving before starting replication 183
shown in status 196
when starting a data group 183
operations
common, where to start 98
less common 259
orphaned recoveries 167
output file fields
Difference Indicator 303, 306
System 1 Indicator field 308
System 2 Indicator field 308
P
path names, IFS 262
planned switch 245
policies 18
audit, automatically submitting 37
audit, runtime behavior of 36
changing values 29
for auditing 36
for replication 32
for switching 48
installation-level only 31
introduction 26
multi-node and bi-directional environment
considerations 27
policy
action for running audits 54
audit action threshold 53
audit history retention 54
audit level 53
audit notify on success 50
audit rule 50
audit schedule 57
audit warning threshold 52
automatic audit recovery 50
automatic database recovery 50
automatic object recovery 50
CMPRCDCNT commit threshold 56
data group definition 50
database apply cache 51
default model switch framework 56
independent ASP library ratio 56
journaling attribute difference action 51
maximum rule runtime 52
notification severity 50
object only on target 51
prioritized audit in effect 147
procedure history retention 57
run rule on system 54
switch action threshold 56
switch warning threshold 56
synchronize threshold size 55
system journal recovery success 50
third delay retry interval 56
third delay retry interval, number of 55
user journal apply threshold 51
user journal recovery success 50
PPRC replication status 65, 68
problems
reporting a problem 278
troubleshoot 276
problems, journaling
data areas and data queues 241
files 235
IFS objects 238
problems, resolving
audit results 133, 300
common errors 207
common system level errors 149
data group cannot end 280
data group cannot start 285
files in error 210
files not journaled 102
journal cache or state 103, 119
objects in error 224
open commits when starting data group 183
RJ link cannot end 286
RJ link cannot start 286
switch compliance 254
system level processes 149
tracking entries 219
procedure
acknowledging failed or canceled 89
begin at step 90, 173, 248
canceling 92
defined 22
displaying status 78
history retention 57
how to run 90
last run of all 78
multiple jobs 77
multiple jobs, status of 83
overriding step attributes 91
resolve problems 80
resuming canceled or failed 91
run type *USER 90
run type other than *USER 90
status 80
status history of a 79
step status 83
procedure history
change criteria 31
procedures 77
change history retention criteria 31
history retention 31
processes
system level 149
production system 18
publications, IBM 13
Q
QDFTJRN data area
restrictions 232
role in processing new objects 232
QSTRUPPGM system value 311, 313
R
recommendations
auditing 126
before planned switch 245
checking audit results 127
policies in bi-directional environment 27
policies in three or more node environment 27
starting shared object send 175
recoveries
active in installation 99
definition 18, 160
detected database errors 33
displaying details 164
occurring in installation 164
orphaned 167
orphaned, identifying 167
orphaned, removing 168
recovery domain
changing backup sequence 71
verifying sequence 70
recovery phase 123
recovery point
cleared by ENDDG 190
release (*RLS)
held file entry 215
held tracking entry 223
release clear (*RLSCLR)
file entry 216
tracking entry 223
release wait (*RLSWAIT)
file entry 215
tracking entry 222
remote journal
i5/OS function 19
remote journal (RJ) link 20
remote journal environment
processes ended by ENDDG 203
processes started by STRDG 199
unconfirmed journal entry 286
removing
activity history entries 229
duplicate tracking entries 223
unconfirmed entries 286
reorganizing, active files 263
replication
automatic error recovery 32
backlogs, identifying 115
before starting 171
commands for ending 184
commands for starting 171
direction of 18
ending 169
policies that affect 32
resolve replication errors 207
starting 169
status from 5250 emulator 95
status summary 65, 68
supported paths 15
switching 244
system journal 15, 21
user journal 15, 21
replication path 21
replication, problems
troubleshoot 276
where to start 207
requirements
audits 126
journal at create 231
journaling 231
resolving problems
application group 64
application group *ATTN status 65
application group other problem status values 66
common replication errors 207
data resource group status 68
node entry status 66
system level jobs 149
troubleshooting 276
resource group, data 68
status 68
restore MIMIX
prerequisites 325
restrictions
journal at create 232
QDFTJRN data area 232
retry objects in error 227
retrying, data group activity entries 227
RJ link 20
displaying 264
ending independently 274
errors for target journal of 288
identifying data groups that use 267
journal definitions by an 268
operating without a data group 274
removing unconfirmed entries 286
status 265
when to end 188
rule
#DGFE 39
#DLOATR 39
#FILATR 39
#FILATRMBR 39
#FILDTA 39
#IFSATR 39
#OBJATR 39
rules
MIMIX 122
rules, MIMIX
descriptions 39
run
procedure 90
running
audits immediately 131
S
schedule
automatically submitted audits 37
changing audit 41
scheduler
auditing 124
servers
ending TCP 261
starting TCP 260
service
cluster 21
status collector 21
services
collector, ending 153
collector, starting 152
status from 5250 emulator 96
severity level, notification 161
source system 18
spooled files, displaying MIMIX-created 262
standby journaling
IBM i5/OS option 42 117
overview 117
starting
collector services 152
MIMIX managers 152
procedure at step 90, 173, 248
RJ link independently 274
system and journal managers 152
target journal inspection 153
TCP server 260
starting data group
at specified journal location 175
deploy configuration 174
prevented by open commit cycles 183
procedure 181
processes, effect of 199
set object auditing level 174
when to clear entries 181
starting journaling
data areas and data queues 241
file entry 235
files 235
IFS objects 238
IFS tracking entry 238
object tracking entry 241
starting MIMIX
included processes 171
procedure 179
starting replication 169
before 171
choices 171
status 60
active file operations 263
application group 60
audit compliance 145
audits 129
audits (runtime) 96
checking from 5250 emulator 93
collector services 67
data group detail 105
data group summary 95
database apply (DBAPY) 113
installation summary 93
journal cache or state 117
journaling data areas and data queues 241
journaling files 235
journaling IFS objects 238
journaling tracking entries 238, 241
monitors 61
node entries 66
notification 160
notification new in installation 96
notifications 63
procedures 78
recoveries active in installation 164
replication 95
replication, application group level 65
replication, data resource group level 68
replication, logical 68
replication, PPRC 68
RJ link 265
services 96
steps in a procedure 83
switch compliance 253
switching 104
system and journal managers 67
system-level processes 149
target journal inspection processes 155
Work with Data Groups display 99
step
begin procedure at 90, 173, 248
defined 22
resolve problems 85
status 83, 85
subsystem, MIMIXSBS
ended 217
ending 193
starting 179
Switch Assistant, MIMIX 22
switch framework
disable policy when not used 49
specify a default 48
switching 244
application group 250
best practice 23, 244, 245, 253
change audit level before 43, 53
compliance 253
conditions that end 257
description, planned switch 245
description, unplanned switch 246
journal analysis after unplanned switch 295
last switch field 253
phases of a 245
policies for 48
problems checking compliance 254
reasons for 245
setting switch compliance policies 49
setting switch framework policy 48
switch framework vs. SWTDG command 249
SWTDG command details 257
unplanned, actions to complete an 246
using option 6 on MIMIX Basic Main Menu 251
using STRDG command 255
synchronize
file entry 211
objects, system journal replicated 226
tracking entries 221
system definition 20
system journal replication 15, 21
detailed status 110
errors automatically recovered 34
journaling requirements 231
system level processes 149
system manager 21
backlog 151
ending 152
resolving problems 149, 151
starting 152
status 149
system roles
management or network 19
production or backup 18
source or target 18
T
target journal inspection 22
last entry inspected 158
results 156
starting 153
status 149, 155
target system 18
threshold
audit action 53
audit warning 52
CMPRCDCNT open commit 56
switch action 56
switch warning 56
synchronize size 55
user journal apply 51
timestamps 291
automatically created 291
creating additional 291
deleting 293
displaying 293
printing 293
tips
displaying data group spooled files 262
displaying long IFS object names 262
removing journaled changes 294
working with active file operations 263
tracking entry 21
file identifiers (FIDs) 273
IFS 219
incomplete 186
not journaled 102
object 219
removing duplicate 223
transfer definition 20
U
unconfirmed journal entries, removing 286
unplanned switch 246
performing journal analysis 295
unprocessed entries 196
upgrade
hardware, no disk image change 318
hardware, with a disk image change 321
new hardware 321
OS/400 311
user journal replication 15, 21
detailed status 112
errors automatically recovered 33
journaling requirements 231
non-file objects 271
tracking entries 219
tracking entry 21
V
verifying
communications link 281, 282
journaling, IFS tracking entries 240
journaling, object tracking entries 243
journaling, physical files 237
key attributes 289
viewing status, active file operations 263
W
wait time, data group controlled end 187
wait time, data group controlled end during switch 257