MIMIX® Availability™
Version 7.1
MIMIX Operations–5250

Notices

MIMIX Operations - 5250 User Guide
January 2014
Version: 7.1.19.00

© Copyright 1999, 2014 Vision Solutions®, Inc. All rights reserved.

The information in this document is subject to change without notice and is furnished under a license agreement. This document is proprietary to Vision Solutions, Inc., and may be used only as authorized in our license agreement. No portion of this manual may be copied or otherwise reproduced without the express written consent of Vision Solutions, Inc.

Vision Solutions provides no expressed or implied warranty with this manual.

The following are trademarks or registered trademarks of their respective organizations or companies:
• MIMIX and Vision Solutions are registered trademarks and AutoGuard, Data Manager, Director, Dynamic Apply, ECS/400, GeoCluster, IntelliStart, Integrator, iOptimize, iTERA, iTERA Availability, MIMIX AutoNotify, MIMIX Availability, MIMIX Availability Manager, MIMIX DB2 Replicator, MIMIX Director, MIMIX dr1, MIMIX Enterprise, MIMIX Global, MIMIX Monitor, MIMIX Object Replicator, MIMIX Professional, MIMIX Promoter, OMS/ODS, RecoverNow, Replicate1, RJ Link, SAM/400, Switch Assistant, Vision AutoValidate, and Vision Suite are trademarks of Vision Solutions, Inc.
• Double-Take Share, Double-Take Availability, and Double-Take RecoverNow: DoubleTake Inc.
• AIX, AIX 5L, AS/400, DB2, eServer, IBM, Informix, i5/OS, iSeries, OS/400, Power, System i, System i5, System p, System x, System z, and WebSphere: International Business Machines Corporation.
• Adobe and Acrobat Reader: Adobe Systems, Inc.
• HP-UX: Hewlett-Packard Company.
• Teradata: Teradata Corporation.
• Intel: Intel Corporation.
• Java, all Java-based trademarks, and Solaris: Sun Microsystems, Inc.
• Linux: Linus Torvalds.
• Internet Explorer, Microsoft, Windows, and Windows Server: Microsoft Corporation.
• Mozilla and Firefox: Mozilla Foundation.
• Netscape: Netscape Communications Corporation.
• Oracle: Oracle Corporation.
• Red Hat: Red Hat, Inc.
• Sybase: Sybase, Inc.
• Symantec and NetBackup: Symantec Corporation.
• UNIX and UNIXWare: the Open Group.

All other brands and product names are trademarks or registered trademarks of their respective owners.

If you need assistance, contact Vision Solutions’ CustomerCare team at:

CustomerCare
Vision Solutions, Inc.
Telephone: 1.800.337.8214 or 1.949.724.5465
Email: support@visionsolutions.com
Web Site: www.visionsolutions.com/Support/Contact-CustomerCare.aspx

Contents

Who this book is for
What is in this book
The MIMIX documentation set
Sources for additional information
How to contact us

Chapter 1 MIMIX overview
MIMIX concepts
Product concepts
System role concepts
Journaling concepts
Configuration concepts
Process concepts
Additional switching concepts
Best practices for maintaining your MIMIX environment
Authority to products and commands
Accessing the MIMIX Main Menu

Chapter 2 MIMIX policies
Environment considerations for policies
Policies in environments with more than two nodes or bi-directional replication
When to disable automatic recovery for replication and auditing
Disabling audits and recovery when using the MIMIX CDP feature
Setting policies - general
Changing policies for an installation
Changing policies for a data group
Resetting a data group-level policy to use the installation level value
Policies which affect an installation
Changing retention criteria for procedure history
Policies which affect replication
Errors handled by automatic database recovery
Errors handled by automatic object recovery
Policies which affect auditing
Policies for auditing runtime behavior
Policies for submitting audits automatically
When automatically submitted audits run
Changing auditing policies
Changing when automatic audits are allowed to run
Changing scheduling criteria for automatic audits
Changing the selection frequency of priority auditing categories
Changing the audit level policy when switching
Changing the system where audits are performed
Changing retention criteria for audit history
Restricting auditing based on the state of the data group
Preventing audits from running
Disabling all auditing for an installation
Disabling all auditing for a data group
Disabling automatically submitted audits
Policies for switching with model switch framework
Specifying a default switch framework in policies
Setting policies for MIMIX Switch Assistant
Setting policies when MIMIX Model Switch Framework is not used
Policy descriptions

Chapter 3 Checking status in environments with application groups
Checking application group status
Resolving problems reported in the Monitors field
Resolving problems reported in the Notifications field
Resolving problems reported in Status columns
Resolving a procedure status problem
Resolving an *ATTN status for an application group
Resolving other common status values for an application group
Status for Work with Node Entries
Status for Work with Data Resource Group Entries
Verifying the sequence of the recovery domain
Changing the sequence of backup nodes
Examples of changing the backup sequence

Chapter 4 Working with status of procedures and steps
Displaying status of procedures
Displaying status of the last run of all procedures
Displaying available status history of procedure runs
Resolving problems with procedure status
Responding to a procedure in *MSGW status
Resolving a *FAILED or *CANCELED procedure status
Displaying status of steps within a procedure run
Resolving problems with step status
Responding to a step with a *MSGW status
Resolving *CANCEL or *FAILED step statuses
Acknowledging a procedure
Running a procedure
Resuming a procedure
Overriding the attributes of a step
Canceling a procedure

Chapter 5 Monitoring status with MIMIX Availability Status
Checking replication status from the MIMIX Availability Status display
Checking audit and notification status from the MIMIX Availability Status display
Checking status of supporting services from the MIMIX Availability Status display

Chapter 6 Working with data group status
The Work with Data Groups display
Problems reflected in the Audits/Recov./Notif. field
Problems reflected in the Data Group column
Resolving problems highlighted in the Data Group column
Manager problems reflected in the Source and Target columns
Replication problems reflected in the Source and Target columns
Setting the automatic refresh interval
Working with the detailed status of data groups
Displaying data group detailed status
Merged view
Object detailed status views
Database detailed status views
Identifying replication processes with backlogs
Data group status in environments with journal cache or journal state
Resolving a problem with journal cache or journal state

Chapter 7 Working with audits
Auditing overview
Components of an audit
Phases of audit processing
Object selection methods for automatic audits
How priority auditing determines what objects to select
How audits are submitted automatically
Audit status and results
Audit compliance
Guidelines and considerations for auditing
Auditing best practices
Considerations for specific audits
Recommendations when checking audit results
Displaying audit runtime status
Running an audit immediately
Resolving audit problems
Checking the job log of an audit
Ending audits
Displaying audit history
Audits with no selected objects
Working with audited objects
Displaying audited objects from a specific audit run
Displaying a customized list of audited objects
Working with audited object history
Displaying the audit history for a specific object
Displaying audit compliance
Determining whether auditing is within compliance
Displaying scheduling information for automatic audits

Chapter 8 Working with system-level processes
Displaying status of system-level processes
Resolving *ACTREQ status for a system manager
Checking for a system manager backlog
Starting a system manager or a journal manager
Ending a system manager or a journal manager
Starting collector services
Ending collector services
Starting target journal inspection processes
Ending target journal inspection processes
Displaying status of target journal inspection
Displaying results of target journal inspection
Displaying details associated with target journal inspection notifications
Displaying messages for TGTJRNINSP notifications
Identifying the last entry inspected on the target system

Chapter 9 Working with notifications and recoveries
What are notifications and recoveries
Displaying notifications
What information is available for notifications
Detailed information
Options for working with notifications
Notifications for newly created objects
Displaying recoveries
What information is available for recoveries
Detailed information
Options for working with recoveries
Orphaned recoveries
Determining whether a recovery is orphaned
Removing an orphaned recovery

Chapter 10 Starting and ending replication
Before starting replication
Commands for starting replication
What is started with the STRMMX command
STRMMX and ENDMMX messages
What is started by the default START procedure for an application group
Choices when starting or ending an application group
What occurs when a data group is started
Journal starting point identified on the STRDG request
Journal starting point when the object send process is shared
Clear pending and clear error processing
Starting MIMIX
Starting an application group
Starting selected data group processes
Starting replication when open commit cycles exist
Checking for open commit cycles
Resolving open commit cycles
Before ending replication
Commands for ending replication
Command choice by reason for ending replication
Additional considerations when ending replication
Ending immediately or controlled
Controlling how long to wait for a controlled end to complete
Ending all or selected processes
When to end the RJ link
What is ended by the ENDMMX command
What is ended by the default END procedure for an application group
What occurs when a data group is ended
Ending MIMIX
Ending with default values
Ending by prompting the ENDMMX command
After you end MIMIX products
Ending an application group
Ending a data group in a controlled manner
Preparing for a controlled end of a data group
Performing the controlled end
Confirming the end request completed without problems
Ending selected data group processes
What replication processes are started by the STRDG command
What replication processes are ended by the ENDDG command

Chapter 11 Resolving common replication problems
Working with message queues
Working with the message log
Working with user journal replication errors
Working with files needing attention (replication and access path errors)
Working with journal transactions for files in error
Placing a file on hold
Ignoring a held file
Releasing a held file at a synchronization point
Releasing a held file
Releasing a held file and clearing entries
Correcting file-level errors
Correcting record-level errors
Record written in error
Working with tracking entries
Accessing the appropriate tracking entry display
Holding journal entries associated with a tracking entry
Ignoring journal entries associated with a tracking entry
Waiting to synchronize and release held journal entries for a tracking entry
Releasing held journal entries for a tracking entry
Releasing and clearing held journal entries for a tracking entry
Removing a tracking entry
Working with objects in error
Using the Work with DG Activity Entries display
Retrying data group activity entries
Retrying a failed data group activity entry
Determining whether an activity entry is in a delay/retry cycle
Removing data group activity history entries

Chapter 12 Starting, ending, and verifying journaling
What objects need to be journaled
Authority requirements for starting journaling
MIMIX commands for starting journaling
Journaling for physical files
Displaying journaling status for physical files
Starting journaling for physical files
Ending journaling for physical files
Verifying journaling for physical files
Journaling for IFS objects
Displaying journaling status for IFS objects
Starting journaling for IFS objects
Ending journaling for IFS objects
Verifying journaling for IFS objects
Journaling for data areas and data queues
Displaying journaling status for data areas and data queues
Starting journaling for data areas and data queues
Ending journaling for data areas and data queues
Verifying journaling for data areas and data queues

Chapter 13 Switching
About switching
Planned switch
Unplanned switch
Switching application group environments with procedures
Switching data group environments with MIMIX Model Switch Framework
Switching an application group
Switching a data group-only environment
Switching to the backup system
Synchronizing data and starting MIMIX on the original production system
Switching to the production system
Determining when the last switch was performed
Checking the last switch date
Problems checking switch compliance
Performing a data group switch
Switch Data Group (SWTDG) command

Chapter 14 Less common operations
Starting the TCP/IP server
Ending the TCP/IP server
Working with objects
Displaying long object names
Considerations for working with long IFS path names
Displaying data group spooled file information
Viewing status for active file operations
Displaying a remote journal link
Displaying status of a remote journal link
Identifying data groups that use an RJ link
Identifying journal definitions used with RJ
Disabling and enabling data groups
Procedures for disabling and enabling data groups
Determining if non-file objects are configured for user journal replication
Determining how IFS objects are configured
Determining how data areas or data queues are configured
Using file identifiers (FIDs) for IFS objects
Operating a remote journal link independently
Starting a remote journal link independently
Ending a remote journal link independently

Chapter 15 Troubleshooting - where to start
Gathering information before reporting a problem
Obtaining MIMIX and IBM i information from your system
Reducing contention between MIMIX and user applications
Data groups cannot be ended
Verifying a communications link for system definitions
Verifying the communications link for a data group
Verifying all communications links
Checking file entry configuration manually
Data groups cannot be started
Cannot start or end an RJ link
Removing unconfirmed entries to free an RJ link
RJ link active but data not transferring
Errors using target journal defined by RJ link
Verifying data group file entries
Verifying data group data area entries
Verifying key attributes
289 Working with data group timestamps ...................................................................... 291 Automatically creating timestamps .................................................................... 291 Creating additional timestamps ......................................................................... 291 Creating timestamps for remote journaling processing ..................................... 292 Deleting timestamps .......................................................................................... 293 Displaying or printing timestamps ..................................................................... 293 Removing journaled changes.................................................................................. 294 Performing journal analysis ..................................................................................... 295 Removing journal analysis entries for a selected file ........................................ 297 Appendix A Interpreting audit results - supporting information 299 Interpreting results for configuration data - #DGFE audit........................................ 300 When the difference is “not found” .......................................................................... 302 Interpreting results of audits for record counts and file data ................................... 303 What differences were detected by #FILDTA.................................................... 303 What differences were detected by #MBRRCDCNT ......................................... 304 Interpreting results of audits that compare attributes .............................................. 306 What attribute differences were detected .......................................................... 306 Where was the difference detected................................................................... 308 What attributes were compared ........................................................................ 
309 Appendix B IBM Power™ Systems operations that affect MIMIX 310 MIMIX procedures when performing an initial program load (IPL) .......................... 310 MIMIX procedures when performing an operating system upgrade........................ 311 Prerequisites for performing an OS upgrade on either system ......................... 312 MIMIX-specific steps for an OS upgrade on the backup system....................... 313 MIMIX-specific steps for an OS upgrade on the production system with switching 315 MIMIX-specific steps for an OS upgrade on the production system without switching............................................................................................................................ 316 MIMIX procedures when upgrading hardware without a disk image change .......... 318 Considerations for performing a hardware system upgrade without a disk image change..................................................................................................................... 318 MIMIX-specific steps for a hardware upgrade without a disk image change..... 319 Hardware upgrade without a disk image change - preliminary steps .......... 319 Hardware upgrade without a disk image change - subsequent steps ......... 320 MIMIX procedures when performing a hardware upgrade with a disk image change... 321 Considerations for performing a hardware system upgrade with a disk image change..................................................................................................................... 321 9 MIMIX-specific steps for a hardware upgrade with a disk image change.......... 322 Hardware upgrade with a disk image change - preliminary steps ............... 322 Hardware upgrade with a disk image change - subsequent steps .............. 323 Handling MIMIX during a system restore ................................................................ 325 Prerequisites for performing a restore of MIMIX ............................................... 
Index

Who this book is for

The MIMIX Operations - 5250 book describes how to perform routine operational tasks and basic troubleshooting for MIMIX® Enterprise™ and MIMIX® Professional™ from a 5250 emulator.

What is in this book

The MIMIX Operations - 5250 book provides these distinct types of information:

• A summary of concepts within MIMIX
• Application group and data group status and troubleshooting
• Audit status, troubleshooting, scheduling, and history
• Procedures for starting, ending, and switching replication
• Procedures for starting, ending, and verifying journaling
• Procedures for handling MIMIX when performing operations such as IPLs or hardware and operating system upgrades

The MIMIX documentation set

The following documents about MIMIX® Availability™ products are available:

Using License Manager

License Manager currently supports MIMIX® Availability™, iTERA Availability™, and iOptimize™. This book describes software requirements, system security, and other planning considerations for installing software and software fixes for Vision Solutions products that are supported through License Manager. The preferred way to obtain license keys and install software is by using Vision AutoValidate™ and the product’s Installation Wizard. However, if you cannot use the wizard or AutoValidate, this book provides instructions for obtaining licenses and installing software from a 5250 emulator. This book also describes how to use the additional security functions from Vision Solutions which are available for License Manager and MIMIX and implemented through License Manager.

MIMIX Administrator Reference

This book provides detailed conceptual, configuration, and programming information for MIMIX® Enterprise™ and MIMIX® Professional™. It includes checklists for setting up several common configurations, information for planning what to replicate, and detailed advanced configuration topics for custom needs.
It also identifies what information can be returned in outfiles if used in automation.

MIMIX Operations with IBM i Clustering

This book is for administrators and operators in an IBM i clustering environment who either use the basic support for IBM i clustering provided within MIMIX or who use MIMIX® Global™ to integrate cluster management with MIMIX logical replication or supported hardware-based replication techniques. This book focuses on addressing problems reported in MIMIX status and basic operational procedures such as starting, ending, and switching.

MIMIX Operations - 5250

This book provides high-level concepts and operational procedures for managing your high availability environment using MIMIX® Enterprise™ or MIMIX® Professional™ from a 5250 emulator. This book focuses on tasks typically performed by an operator, such as checking status, starting or stopping replication, performing audits, and basic problem resolution.

Using MIMIX Monitor

This book describes how to use the MIMIX Monitor user and programming interfaces available with MIMIX® Enterprise™ or MIMIX® Professional™. This book also includes programming information about MIMIX Model Switch Framework and support for hardware switching.

Using MIMIX Promoter

This book describes how to use MIMIX commands for copying and reorganizing active files. MIMIX Promoter is available with MIMIX® Enterprise™ and as a no-charge feature for MIMIX® Professional™.

MIMIX for IBM WebSphere MQ

This book identifies requirements for the MIMIX for MQ feature which supports replication in IBM WebSphere MQ environments. This book describes how to configure MIMIX for this environment and how to perform the initial synchronization and initial startup. Once configured and started, all other operations are performed as described in the MIMIX Operations - 5250 book.

Sources for additional information

This book refers to other published information.
The following information, plus additional technical information, can be located in the IBM System i and i5/OS Information Center. From the Information Center you can access these IBM Power™ Systems topics, books, and redbooks:

• Backup and Recovery
• Journal management
• DB2 Universal Database for IBM Power™ Systems Database Programming
• Integrated File System Introduction
• Independent disk pools
• OptiConnect for OS/400
• TCP/IP Setup
• IBM redbook Striving for Optimal Journal Performance on DB2 Universal Database for iSeries, SG24-6286
• IBM redbook AS/400 Remote Journal Function for High Availability and Data Replication, SG24-5189
• IBM redbook Power™ Systems iASPs: A Guide to Moving Applications to Independent ASPs, SG24-6802

The following information may also be helpful if you replicate journaled data areas, data queues, or IFS objects:

• DB2 UDB for iSeries SQL Programming Concepts
• DB2 Universal Database for iSeries SQL Reference
• IBM redbook AS/400 Remote Journal Function for High Availability and Data Replication, SG24-5189

How to contact us

For contact information, visit our Contact CustomerCare web page. If you are current on maintenance, support for MIMIX products is also available when you log in to Support Central. It is important to include product and version information whenever you report problems.

CHAPTER 1 MIMIX overview

This book provides operational information and procedures for using MIMIX® Enterprise™ and MIMIX® Professional™ through its 5250 emulator user interface. For simplicity, this book uses the term MIMIX to refer to the functionality provided by either product unless a more specific name is necessary.

MIMIX® Availability™ version 7.1 provides high availability for your critical data in a production environment on IBM Power™ Systems through real-time replication of changes and the ability to quickly switch your production environment to a ready backup system.
These capabilities allow your business operations to continue when you have planned or unplanned outages in your System i environment. MIMIX also provides advanced capabilities that can help ensure the integrity of your MIMIX environment.

Replication: MIMIX continuously captures changes to critical database files and objects on a production system, sends the changes to a backup system, and applies the changes to the appropriate database file or object on the backup system. The backup system stores exact duplicates of the critical database files and objects from the production system.

MIMIX uses two replication paths to address different pieces of your replication needs. These paths operate with configurable levels of cooperation or can operate independently.

• The user journal replication path captures changes to critical files and objects configured for replication through a user journal. When configuring this path, shipped defaults use the remote journaling function of the operating system to simplify sending data to the remote system. In previous versions, MIMIX DB2 Replicator provided this function.
• The system journal replication path handles replication of critical system objects (such as user profiles, program objects, or spooled files), integrated file system (IFS) objects, and document library objects (DLOs) using the system journal. In previous versions, MIMIX Object Replicator provided this function.

Configuration choices determine the degree of cooperative processing used between the system journal and user journal replication paths when replicating database files, IFS objects, data areas, and data queues.

Switching: One common use of MIMIX is to support a hot backup system to which operations can be switched in the event of a planned or unplanned outage. If a production system becomes unavailable, its backup is already prepared for users.
In the event of an outage, you can quickly switch users to the backup system where they can continue using their applications. MIMIX captures changes on the backup system for later synchronization with the original production system. When the original production system is brought back online, MIMIX assists you with analysis and synchronization of the database files and other objects.

Automatic verification and correction: MIMIX enables earlier and easier detection of problems known to adversely affect maintaining availability and switch-readiness of your replication environment. MIMIX automatically detects and corrects potential problems during replication and auditing. MIMIX also helps to ensure the integrity of your MIMIX configuration by automatically verifying that the files and objects being replicated are what is defined to your configuration.

MIMIX is shipped with these capabilities enabled. Incorporated best practices for maintaining availability and switch-readiness are key to ensuring that your MIMIX environment is in tip-top shape for protecting your data. User interfaces allow you to fine-tune to the needs of your environment.

Analysis: MIMIX also provides advanced analysis capabilities through the MIMIX portal application for Vision Solutions Portal (VSP). When using the VSP user interface, you can see what objects are configured for replication as well as what replicated objects on the target system have been changed by people or programs other than MIMIX. (Objects changed on the target system affect your data integrity.) You can also check historical arrival and backlog rates for replication to help you identify trends in your operations that may affect MIMIX performance.

Uses: MIMIX is typically used among systems in a network to support a hot backup system. Simple environments have one production system and one backup system. More complex environments have multiple production systems or backup systems. MIMIX can also be used on a single system.
You can view the replicated data on the backup system at any time without affecting productivity. This allows you to generate reports, submit (read-only) batch jobs, or perform backups to tape from the backup system. In addition to real-time backup capability, replicated databases and objects can be used for distributed processing, allowing you to off-load applications to a backup system.

The topics in this chapter include:

• “MIMIX concepts” summarizes key concepts that you need to know about MIMIX.
• “Best practices for maintaining your MIMIX environment” summarizes recommendations from Vision Solutions.
• “Authority to products and commands” identifies authority levels to MIMIX functions when additional security features provided by Vision Solutions are used.
• “Accessing the MIMIX Main Menu” describes the MIMIX Basic Main Menu and the MIMIX Intermediate Main Menu. The MIMIX Basic Main Menu is used to access the MIMIX Availability Status (WRKMMXSTS) display.

MIMIX concepts

The following subtopics organize the basic concepts associated with MIMIX® into related groups. More detailed information is available in the MIMIX Administrator Reference book.

Product concepts

MIMIX installation - The network of IBM Power™ Systems systems that transfer data and objects among each other using functions of a common MIMIX product. A MIMIX installation is defined by the way in which you configure the MIMIX product for each of the participating systems. A system can participate in multiple independent MIMIX installations.

Replication - The activity that MIMIX performs to continuously capture changes to critical database files and objects on a production system as they occur, send the changes to a backup system, and apply the changes to the appropriate database file or object on the backup system.
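
The capture, send, and apply sequence in the replication definition above can be pictured with a small simulation. This is a minimal sketch for illustration only and is not MIMIX code: the `Change` record, the `capture`, `send`, `apply_changes`, and `replicate` functions, and the dictionary standing in for the backup system are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical stand-in for one logged change to a replicated object."""
    object_name: str
    new_value: str


def capture(journal):
    """Capture: read the logged changes in the order they occurred."""
    return list(journal)


def send(changes):
    """Send: hand the changes to the backup side (a real transfer is assumed)."""
    return list(changes)


def apply_changes(changes, backup):
    """Apply: update the matching object on the backup copy, in order."""
    for change in changes:
        backup[change.object_name] = change.new_value
    return backup


def replicate(journal, backup):
    """Run the three stages in sequence for one batch of changes."""
    return apply_changes(send(capture(journal)), backup)


# Illustrative data only: three changes on the production side.
journal = [
    Change("ORDERS", "v1"),
    Change("CUSTOMERS", "v1"),
    Change("ORDERS", "v2"),
]
backup = replicate(journal, {})
print(backup)
```

Because the changes are applied in the order they were captured, the backup copy ends with the latest value of each object.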
Switch - The process by which a production environment is moved from one system to another system and the production environment is made available there. A switch may be performed as part of a planned event such as for system maintenance, or an unplanned event such as a power or equipment failure. MIMIX provides customizable functions for switching.

Audits - Audits are predetermined programs that are used to check for differences in replicated objects and other conditions between systems. Audits run and can correct detected problems automatically. Policies control when audits run and many other aspects of how audits are performed. Additional auditing concepts and recommendations are described in the auditing chapter of this book.

Automatic recovery - MIMIX provides a set of functions that provide the ability to automatically correct problems detected in a MIMIX installation during database replication, object replication, and auditing. During these activities, when MIMIX detects any of a set of scenarios known to interfere with maintaining your MIMIX environment, it will automatically start recovery actions to correct them. Through policies, you have the ability to disable automatic recovery in any of these areas at the installation or data group level.

Application group - A MIMIX construct used to group and control resources from a single point in a way that maintains relationships between them. The use of application groups is best practice for MIMIX® Professional™ and MIMIX® Enterprise™ and required for MIMIX® Global™.

Data group - A MIMIX construct that is used to control replication activities. A data group is a logical grouping of database files, data areas, objects, IFS objects, DLOs, or a combination thereof that defines a unit of work by which MIMIX replication activity is controlled. A data group may represent an application, a set of one or more libraries, or all of the critical data on a given system.
Application environments may define a data group as a specific set of files and objects.

Prioritized status - MIMIX assigns a priority to status values to ensure that problems with the highest priorities, those for detected problems or situations that require immediate attention or intervention, are reflected on the highest level of the user interface. Additional detail and lower priority items can be viewed by drilling down to the next level within the interfaces. Those interfaces are the Work with Systems display and, depending on your configuration, either the Work with Application Groups display or the Work with Data Groups display.

Policies - A policy is a mechanism used to enable, disable, or provide input to a function such as replication, auditing, or MIMIX Model Switch Framework. For most policies, the initially shipped values apply to an installation. However, policies can be changed and most can also be overridden for individual data groups. Policies that control when audits are automatically performed can be set only for each specific combination of audit rule and data group.

Notifications - A notification is the resulting automatic report associated with an event that has already occurred. The severity of a notification is reflected in the overall status of the installation. Notifications can be generated by a process, program, command, or monitor. Because the originator of notifications varies, it is important to note that notifications can represent both real-time events as well as events that occurred in the past but, due to scheduling, are being reported in the present.

Recoveries - The term recovery is used in two ways. The most common use refers to the recovery action taken by a replication process or an audit to correct a detected difference when automatic recovery policies are enabled.
The second use refers to a temporary report that provides details about a recovery action in progress that is created when the recovery action starts and is removed when it completes.

System role concepts

MIMIX uses several pairs of terms to refer to the role of a system within a particular context. These terms are not interchangeable.

Production system and backup system - These terms describe the role of a system relative to the way applications are used on that system. A production system is the system currently running the production workload for the applications. In normal operations, the production system is the system on which the principal copy of the data and objects associated with the application exists. A backup system is the system that is not currently running the production workload for the applications. In normal operations, the backup system is the system on which you maintain a copy of the data and objects associated with the application. These roles are not always associated with a specific system. For example, if you switch application processing to the backup system, the backup system temporarily becomes the production system. Typically, for normal operations in a basic two-system environment, replicated data flows from the system running the production workload to the backup system.

Source system and target system - These terms identify the direction in which an activity occurs between two participating systems.

A source system is the system from which MIMIX replication activity between two systems originates. In replication, the source system contains the journal entries. Information from the journal entries is either replicated to the target system or used to identify objects to be replicated to the target system.

A target system is the system on which MIMIX replication activity between two systems completes.
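
The distinction between the two pairs of role terms above can be illustrated with a short model. This is a hedged sketch, not part of MIMIX: the `Environment` class, its attributes, and the system names SYSA and SYSB are hypothetical, and it assumes the basic two-system case in which replicated data flows from the system running the production workload to the backup system.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical basic two-system environment (illustration only)."""
    production: str  # system currently running the production workload
    backup: str      # system not currently running the production workload

    @property
    def source(self):
        # Assumption: in this basic case, replication originates on the
        # system that runs the production workload.
        return self.production

    @property
    def target(self):
        # Replication completes on the other system.
        return self.backup

    def switch(self):
        """Move the production workload to the current backup system."""
        self.production, self.backup = self.backup, self.production


env = Environment(production="SYSA", backup="SYSB")
print(env.source, "->", env.target)

env.switch()  # the backup system temporarily becomes the production system
print(env.source, "->", env.target)
```

The system names never change, only the roles attached to them, which is why the role terms describe a context and do not name a specific machine.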
Management system and network system - These terms define the role of a system relative to how the products interact within a MIMIX installation. These roles remain associated with the system within the MIMIX installation to which they are defined. One system in the MIMIX installation is designated as the management system and the remaining one or more systems are designated as network systems.

A management system is the system in a MIMIX installation that is designated as the control point for all installations of the product within the MIMIX installation. The management system is the location from which work to be performed by the product is defined and maintained. Often the system defined as the management system also serves as the backup system during normal operations.

A network system is any system in a MIMIX installation that is not designated as the management system (control point) of that MIMIX installation. Work definitions are automatically distributed from the management system to a network system. Often a system defined as a network system also serves as the production system during normal operations.

Journaling concepts

MIMIX uses journaling to perform replication and to support newer analysis functionality.

Journaling and object auditing - Journaling and object auditing are techniques that allow object activity to be logged to a journal. Journaling logs activity for selected objects of specific object types to a user journal. Object auditing logs activity for all objects to the security audit journal (QAUDJRN, the system journal), including those defined to a user journal. MIMIX relies on these techniques and the entries placed in the journal receivers for replicating logged activity.

Journal - An IBM i system object that identifies the objects being journaled and the journal receivers associated with the journal. The system journal is a specialized journal on the system which MIMIX uses.
Journal receiver - An IBM i system object that is associated with a journal and contains the log of all activity for objects defined to the journal.

Journal entry - A record added to a journal receiver that identifies an event that occurred on a journaled object. MIMIX uses file and record level journal entries to recreate the object on a designated system.

Remote journaling - A function of IBM i that allows you to establish journals and journal receivers on one system and associate them with specific journals and journal receivers on another system. Once the association is established, the operating system can use the pair of journals to replicate journal entries in one direction, from the local journal to the remote journal on the other system. In some configurations, MIMIX uses remote journaling for transferring data to be replicated from the source system to the target system.

Configuration concepts

MIMIX configuration provides considerable flexibility to enable supporting a wide variety of customer environments. Configuration is implemented through sets of related commands. The following terms describe configuration concepts.

Definitions - MIMIX uses several types of named definitions to identify related configuration choices.

• System definitions identify systems that participate in a MIMIX installation. Each system definition identifies one system.
• Transfer definitions identify the communications path and protocol to be used between systems.
• Journal definitions identify journaling environments that MIMIX uses for replication. Each journal definition identifies a system and characteristics of the journaling environment on that system.
• Data group definitions identify the characteristics of how replication occurs between two systems.
Each data group definition determines the direction in which replication occurs between the systems, whether that direction can be switched, and the default processing characteristics for replication processes.
• Application group definitions identify whether the replication environment does or does not use IBM i clustering. When clustering is used, the application group also defines information about an application or proprietary programs necessary for controlling operations in the clustering environment.

Data group entries - A data group entry is a configuration construct that identifies a source of information to be replicated by or excluded from replication by a data group. Each entry identifies at least one object and its location on the source system. Classes of data group entries are based on object type. MIMIX uses data group entries to determine whether a journal entry should be replicated. Data groups that replicate from both the system journal and a user journal can have any combination of data group entries.

Remote journal link (RJ link) - An RJ link is a MIMIX configuration element that identifies an IBM i remote journaling environment used by user journal replication processes. An RJ link identifies the journal definitions that define the source and target journals, primary and secondary transfer definitions for the communications path used by MIMIX, and whether the IBM i remote journal function sends journal entries asynchronously or synchronously.

Cooperative processing - Cooperative processing refers to MIMIX techniques that efficiently replicate certain object types by using a coordinated effort between the system journal and user journal replication paths. Configuration choices in data group definitions and data group entries determine the degree of cooperative processing used between the system journal and user journal replication paths when replicating database files, IFS objects, data areas, and data queues.
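
The way data group entries decide whether a journal entry should be replicated, as described earlier in this topic, can be sketched as a simple include and exclude filter. This is an illustrative sketch under stated assumptions, not the MIMIX selection algorithm: the `DataGroupEntry` class, the wildcard name patterns, and the rule that any matching exclude entry wins over include entries are all assumptions made for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class DataGroupEntry:
    """Hypothetical entry: an object location plus an include/exclude choice."""
    library: str
    name_pattern: str      # assumed wildcard syntax, for illustration only
    include: bool = True   # False means "exclude from replication"


def should_replicate(library, object_name, entries):
    """Decide whether a journal entry for this object is replicated.

    Assumed rule for this sketch: the object must match at least one
    entry, and no matching entry may be an exclude entry.
    """
    matches = [
        e for e in entries
        if e.library == library and fnmatchcase(object_name, e.name_pattern)
    ]
    if not matches:
        return False
    return all(e.include for e in matches)


entries = [
    DataGroupEntry("APPLIB", "*"),                    # replicate the library
    DataGroupEntry("APPLIB", "TMP*", include=False),  # except work objects
]

print(should_replicate("APPLIB", "ORDERS", entries))
print(should_replicate("APPLIB", "TMPWORK", entries))
print(should_replicate("OTHERLIB", "ORDERS", entries))
```

An object that no entry identifies is simply outside the data group, which is why the last call reports that nothing is replicated for it.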
Tracking entries - Tracking entries identify objects that can be replicated using advanced journaling techniques and assist with tracking the status of their replication. A unique tracking entry is associated with each IFS object, data area, and data queue that is eligible for replication using advanced journaling. IFS tracking entries identify eligible, existing IFS objects while object tracking entries identify eligible, existing data areas and data queues.

Process concepts

The following terms identify MIMIX processes. Some, like the system manager, are required to allow MIMIX to function. Others, like procedures, are used only when invoked by users.

Replication path - A replication path is a series of processes used for replication that represent the critical path on which data to be replicated moves from its origin to its destination. MIMIX uses two replication paths to accommodate differences in how replication occurs for user journal and system journal entries. These paths operate with configurable levels of cooperation or can operate independently.

• The user journal replication path captures changes to critical files and objects configured for replication through a user journal. When configuring this path, shipped defaults use the remote journaling function of the operating system to simplify sending data to the remote system. The changes are applied to the target system.
• The system journal replication path handles replication of critical system objects (such as user profiles, program objects, or spooled files), integrated file system (IFS) objects, and document library objects (DLOs) using the system journal. Information about the changes is sent to the target system where it is applied.

System manager - The system manager is a pair of communications jobs between the management system and a network system which must be active to enable replication.
The system manager monitors for configuration changes and automatically moves any configuration changes to the network system. Dynamic status changes are also collected and returned to the management system. The system manager also gathers messages and timestamp information from the network system and places them in a message log and timestamp file on the management system. In addition, the system manager performs periodic maintenance tasks, including cleanup of the system and data group history files.

Journal manager - The journal manager is a job on each system that MIMIX uses to maintain the journaling environment on that system. By default, MIMIX performs both change management and delete management for journal receivers used by the replication process.

Collector services - A group of jobs that are necessary for MIMIX to track historical data and to support using the MIMIX portal application within the Vision Solutions Portal. One or more collector service jobs collect and combine MIMIX status from all systems.

Cluster services - When MIMIX Global is configured for IBM i clustering, MIMIX uses the cluster services function provided by IBM i to integrate the system management functions needed for clustering. Cluster services must be active in order for a cluster node to be recognized by the other nodes in the cluster. MIMIX integrates starting and stopping cluster services into status and commands for controlling processes that run at the system level.

Target journal inspection - A MIMIX process that reads a journal on a system being used as the target system for replication. The process identifies people or processes other than MIMIX that accessed replicated objects on the target system. Users can access the resulting information from the Replicated Objects portlet within the MIMIX portal application in Vision Solutions Portal.
Procedures and steps - Procedures and steps are a highly customizable means of performing operations for application groups. A set of default procedures for each application group provides the ability to start, end, perform pre-check activity for switching, and switch the application group. Each operation is performed by a procedure that consists of a sequence of steps and multiple jobs. Each step calls a predetermined step program to perform a specific sub-task of the larger operation. Steps also identify runtime attributes for handling before and after the program call within the context of the procedure.

Log space - A MIMIX object that provides an efficient storage and manipulation mechanism for replicated data that is temporarily stored on the target system during the receive and apply processes.

Additional switching concepts

The following concepts are specific to switching. Environments configured with application groups perform switching through procedures.

Planned switch - An intentional change to the direction of replication for any of a variety of reasons. You may need to take the system offline to perform maintenance on its hardware or software, or you may be testing your disaster recovery plan. In a planned switch, the production system (the source of replication) is available. When you perform a planned switch, replication is ended on both the source and target systems. The next time you start replication, it will be set to replicate in the opposite direction.

Unplanned switch - A change to the direction of replication as a response to a problem. Most likely the production system is no longer available. When you perform an unplanned switch, you must initiate the switch from the target system. Replication is ended on the target system. The next time you start replication, it will be set to replicate in the opposite direction.
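
The difference between the two kinds of switch described above comes down to where replication is ended; in both cases the next start replicates in the opposite direction. The following is a minimal sketch of that rule only. The `perform_switch` function and the system names are hypothetical and do not correspond to any MIMIX command or procedure.

```python
def perform_switch(source, target, planned):
    """Return (systems where replication is ended, direction for the next start).

    Illustration of the rule in the text:
    - planned switch: replication is ended on both source and target
    - unplanned switch: initiated from the target, ended on the target only
    - either way, the next start replicates in the opposite direction
    """
    if planned:
        ended_on = [source, target]
    else:
        ended_on = [target]
    next_direction = (target, source)  # (new source, new target)
    return ended_on, next_direction


# Planned: the production system (source) is still available.
print(perform_switch("SYSA", "SYSB", planned=True))

# Unplanned: the source is assumed to be unavailable.
print(perform_switch("SYSA", "SYSB", planned=False))
```

In a real environment the switch is carried out by procedures or by MIMIX Model Switch Framework; this sketch captures only the outcome described in the definitions.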
MIMIX Model Switch Framework - A set of programs and commands that provide a consistent framework to be used when performing planned or unplanned switches in environments that do not use application groups. Typically, a model switch framework is customized to your environment through its exit programs.

MIMIX Switch Assistant - A user interface that guides you through switching using your default MIMIX Model Switch Framework. MIMIX Switch Assistant is accessed from the MIMIX Basic Main Menu and does not support application groups.

Best practices for maintaining your MIMIX environment

MIMIX is shipped with default settings that incorporate many best practices for maintaining your environment. Others may require changing policies and adopting best practices within your organization. Best practices include:
• Allow MIMIX to automatically correct differences detected during database and object replication processes that would otherwise result in errors. If MIMIX is unable to perform the recovery, the problem is reported as a replication error (a file is placed in held error or an object is in error).
• Allow MIMIX to automatically perform audits and to automatically recover any differences detected by audits. Best practice is to allow regularly scheduled audits of all objects configured for replication and daily audits of prioritized categories of replicated objects. User interfaces summarize audit results and indicate whether MIMIX is unable to recover an object.
• Perform all audits with the audit level set at level 30 immediately prior to a planned switch to the backup system and before switching back to the production system.
• Perform switches on a regular basis. Best practice is to switch every three to six months. You need to set aside time for performing planned switches.
Environments that continue to use MIMIX Switch Assistant can use policies so that compliance with regular switching is automatically reported in the user interface.

Authority to products and commands

If your MIMIX environment takes advantage of the additional security available in the product and command authority functions which Vision Solutions provides through License Manager, you may need a higher authority level in order to perform MIMIX daily operations. A MIMIX administrator can change your authorization level to commands and displays.

Authorization levels typically fall into these categories:
• Viewing information requires display (*DSP) authority.
• Controlling operations requires operator (*OPR) authority.
• Creating or changing configuration requires management (*MGT) authority.

For example, consider audits. You can view an audit if you have display authority, perform audits if you have operator authority, and change policies that affect how auditing is performed if you have management authority.

For more information about these provided security functions, see the Using License Manager book.

Accessing the MIMIX Main Menu

The MIMIX command accesses the main menu for a MIMIX installation. The MIMIX Main Menu has two assistance levels, basic and intermediate. The command defaults to the basic assistance level, shown in Figure 1, with its options designed to simplify day-to-day interaction with MIMIX. Figure 2 shows the intermediate assistance level.

The options on the menu vary with the assistance level. In either assistance level, the available options also depend on the MIMIX products installed in the installation library and their licensing. The products installed and the licensing also affect subsequent menus and displays.

Accessing the menu - If you know the name of the MIMIX installation you want, you can use the name to library-qualify the command, as follows:

Type the command library-name/MIMIX and press Enter.
The default name of the installation library is MIMIX.

If you do not know the name of the library, do the following:
1. Type the command LAKEVIEW/WRKPRD and press Enter.
2. Type a 9 (Display product menu) next to the product in the library you want on the Vision Solutions Installed Products display and press Enter.

Changing the assistance level - The F21 key (Assistance level) on the main menu toggles between basic and intermediate levels of the menu. You can also specify the Assistance Level (ASTLVL) parameter on the MIMIX command.

Figure 1. MIMIX Basic Main Menu

                          MIMIX Basic Main Menu
                                                       System:   SYSTEM1
 MIMIX

 Select one of the following:

      1. Work with application groups                      WRKAG
      2. Start MIMIX
      3. End MIMIX
      4. Switch all application groups
      5. Start or complete switch using Switch Asst.
      6. Work with data groups                             WRKDG

     10. Availability status                               WRKMMXSTS
     11. Configuration menu
     12. Work with monitors                                WRKMON
     13. Work with messages                                WRKMSGLOG
     14. Cluster menu
                                                                  More...
 Selection or command
 ===>___________________________________________________________________
 _______________________________________________________________________
 F3=Exit   F4=Prompt   F9=Retrieve   F21=Assistance level   F12=Cancel
 (C) Copyright Vision Solutions, Inc., 1990, 2014.

Note: On the MIMIX Basic Main Menu, options 5 (Start or complete switch using Switch Asst.) and 10 (Availability Status) are not recommended for installations that use application groups.

Figure 2. MIMIX Intermediate Main Menu

                       MIMIX Intermediate Main Menu
                                                       System:   SYSTEM1
 MIMIX

 Select one of the following:

      1. Work with data groups                             WRKDG
      2. Work with systems                                 WRKSYS
      3. Work with messages                                WRKMSGLOG
      4. Work with monitors                                WRKMON
      5. Work with application groups                      WRKAG
      6. Work with audits                                  WRKAUD
      7. Work with procedures                              WRKPROC

     11. Configuration menu
     12. Compare, verify, and synchronize menu
     13. Utilities menu
     14. Cluster menu
                                                                  More...
 Selection or command
 ===>___________________________________________________________________
 _______________________________________________________________________
 F3=Exit   F4=Prompt   F9=Retrieve   F21=Assistance level   F12=Cancel
 (C) Copyright Vision Solutions, Inc., 1990, 2014.

CHAPTER 2 MIMIX policies

Each MIMIX policy is a mechanism used to enable, disable, or provide input to a function such as replication, auditing, or MIMIX Model Switch Framework. A policy may also determine how you are notified about certain problems that may occur.

For most policies, the initially shipped values apply to an installation. However, policies can be changed and most can also be overridden for individual data groups. When a policy is set for a data group, it takes precedence over the installation policy. Some policies, such as ones that control when audits are automatically submitted, apply to individual audit rules for specific data groups.

Policies must be changed from the management system. Changing policies requires that you have management-level authority to the Set MIMIX Policy (SETMMXPCY) command. You can set policies from a command line or from the Work with Audits, the MIMIX Availability Status, and the Work with DG Definitions displays.

The topics in this chapter include:
• “Environment considerations for policies” on page 27 describes additional considerations for setting policies for environments with more than two nodes or bi-directional replication. Also, applications and features can conflict with policy-controlled automatic recovery functions.
• “Setting policies - general” on page 29 provides basic procedures for changing policies. Other topics in this chapter include more in-depth procedures for specific policy-controlled functionality.
• “Policies which affect an installation” on page 31 identifies the policies that are set for an installation and which cannot be overridden by a data group-level setting.
Also, this includes procedures for changing retention criteria for procedure history.
• “Policies which affect replication” on page 32 identifies the policies associated with automatic error detection and correction during replication and identifies the common object and file error situations that can be automatically recovered.
• “Policies which affect auditing” on page 36 identifies policies that influence audit runtime behavior and control scheduling for automatically submitted audits. Shipped audits and their descriptions and default scheduling details are included.
• “Changing auditing policies” on page 41 provides additional information and procedures for changing policies associated with auditing. This includes changing the auditing level before switching, changing automatic audit scheduling, changing audit history retention, restricting auditing based on the state of data groups, and disabling auditing.
• “Policies for switching with model switch framework” on page 48 identifies the policies associated with model switch framework and includes instructions for changing these policies.
• “Policy descriptions” on page 50 describes policies used by MIMIX.

Environment considerations for policies

Default settings for policies are chosen to address the needs of a broad set of customer environments. However, in more complex environments, you need to consider the effect of policies. Also, applications and other MIMIX features in some environments can conflict with automatic recovery actions during replication and with auditing.

Policies in environments with more than two nodes or bi-directional replication

Policy values may affect data throughout your entire environment, not just a single installation or data group. This is of particular concern in environments that have more than two systems (nodes) or which have replication occurring simultaneously in more than one direction (bi-directional).
Specifically, be aware of the following:
• In these environments, the value *DISABLED for the Objects only on target policy is recommended. When the policy is disabled, audits will detect that objects exist only on the target system but will not attempt to correct them. The commands used by an audit are aware of all objects on the target system, not just those which originate from the source system of the data group associated with the audit. In these environments, the values *DELETE and *SYNC must be used with care. When the policy value is Delete, audits will delete objects which may have originated from systems not associated with the data group being audited. When the policy value is Synchronize, audits will synchronize the objects to the source system of the data group being audited, which may not be the source system from which they originated.
• Synchronization of user profiles and authorization lists associated with an object will occur unless the user profiles and authorization lists are explicitly excluded from the data group configuration. In the environments mentioned, this may result in user profiles and authorization lists being synchronized to other systems in your configuration. This behavior occurs whenever any of the automatic recovery policies are enabled (database, object, audit). To prevent this from occurring, you must explicitly exclude the user profiles and authorization lists from replication for any data group for which you do not want them synchronized.
• In a simultaneously bi-directional environment, determine which system ‘wins’ in the event of a data conflict, that is, which system will be considered as having the correct data. Choose one direction of replication that will be audited and allow auditing for those data groups. Disable audits for data groups that replicate in the opposite direction. For example, data groups AB and BA are configured for bi-directional replication between system A and system B.
Data group AB replicates from system A to system B and data group BA replicates in the opposite direction. System B is also the management system for this installation. You chose system A as the winning system and want to permit auditing in the direction from A to B. The Audit level policy for data group AB must be set to a level that permits audits to run (level 10 or higher). The Audit level policy for data group BA must be set to disable audits. The results of audits of data group AB will be available on system B, because system B is the management system and default policy values cause rules to be run from the management system.
• In environments with three or more systems in the same installation, you need to evaluate each pair of systems. For each pair of systems, evaluate the directions in which replication is permitted. If any pair of systems supports simultaneous bi-directional replication, determine the winning system in each pair and determine the direction to be audited. Set the audit level policy to permit auditing for the data group that replicates in the chosen direction. Disable auditing for the data group which replicates in the other direction. You may also want to consider changing the values of the Run rule on system policy for the installation or the audited data groups to balance processing loads associated with auditing.
• In environments that permit multiple management systems in the same installation, in addition to evaluating the direction of replication permitted within each pair of systems, you must also consider whether the systems defined by each data group are both management systems. If any pair of systems supports simultaneous bi-directional replication, choose the winning system and change the Audit level policies for each data group so that only one direction is audited.
You may need to change the Run rule on system policy to prevent certain data groups from being audited from specific management systems.

When to disable automatic recovery for replication and auditing

At times, you may need to disable automatic recoveries during replication and auditing for certain data groups because a feature in use or an application being replicated may interact with auditing in an undesirable way.

Features - Do not use automatic recoveries during auditing and replication in any data group that is using functions provided by the MIMIX CDP™ feature. This feature, which requires an additional license key, permits you to perform operations associated with maintaining continuous data protection. By configuring a recovery window for a data group, you introduce an automatic delay into when the apply processes complete replication. By setting a recovery point for a data group, you identify a point that, when reached, will cause the apply processes to be suspended. In both cases, source system changes have been transferred to the target system but have not been applied. In such an environment, comparisons will report differences and automatic recoveries will attempt recovery for items that have not completed replication. To prevent this from occurring, disable comparisons and automatic recoveries for any data group which uses the MIMIX CDP feature. For details, see “Disabling audits and recovery when using the MIMIX CDP feature” on page 29.

Applications - At times, data groups for some applications will encounter problems if the application cannot acquire locks on objects that are defined to MIMIX. These data groups may need to be excluded from auditing. MIMIX acquires locks occasionally to save and restore objects within the replication environment. Some applications may fail when they cannot acquire a lock on an object. Refer to our Support Central for FAQs that list specific applications whose data groups should be excluded from auditing.
For those excluded data groups, you can still run compares to determine if objects are not synchronized between source and target systems. Care must be taken to recover from these unsynchronized conditions. The applications may need to be ended prior to manually synchronizing the objects.

To exclude a data group from audits, use the instructions in “Preventing audits from running” on page 45.

Disabling audits and recovery when using the MIMIX CDP feature

The functions provided by the MIMIX CDP™ feature, which requires an additional license key, create an environment in which source system changes have been transferred to the target system but have not been applied. Any data group which uses this feature must disable automatic comparisons and automatic recovery actions for the data group.

Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name of the data group that uses the MIMIX CDP feature.
3. Press Enter to see all the policies and their current values.
4. For Automatic object recovery, specify *DISABLED.
5. For Automatic database recovery, specify *DISABLED.
6. For Automatic audit recovery, specify *DISABLED.
7. For Audit level, specify *DISABLED.
8. To accept the changes, press Enter.

Setting policies - general

Policies must be changed from the management system. Changing policies requires that you have management-level authority to the Set MIMIX Policy (SETMMXPCY) command.

The following topics describe the basic procedures for setting policies.

Changing policies for an installation

This procedure changes a policy value at the installation level. The installation level value will be overridden if a data group level policy has been specified with a value other than *INST.

Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. Specify a value for the policy you want. Use F1 (Help) to view descriptions of possible values.
5. To accept the changes, press Enter.

Changing policies for a data group

Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. Press Enter to see all the policies and their current values.
4. Specify a value for the policy you want defined for the data group. Use F1 (Help) to view descriptions of possible values.
5. To accept the changes, press Enter.

Resetting a data group-level policy to use the installation level value

Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. Press Enter to see all the policies and their current values.
4. For the policy you want to reset, specify *INST.
5. To accept the changes, press Enter.

Policies which affect an installation

While many policies can be set for an installation, the policies in Table 1 cannot be overridden for an individual data group. At the data group level, these policies always have a value of *INST.

Table 1. Policies that can be set only at the installation level and shipped default values.

Policy                                   Shipped Values (Installation)
Independent ASP library ratio            5
Procedure history retention
  • Minimum days                         7
  • Minimum runs per procedure           1
  • Min. runs per switch procedure       1

Changing retention criteria for procedure history

The procedure history retention policy determines how long to retain historical information about procedure runs that completed, completed with errors, or that failed or were canceled and have been acknowledged.
Environments configured with application groups use procedures to control operations such as starting, ending, or switching. History information for a procedure includes timestamps indicating when the procedure was run and detailed information about each step within the procedure.

The policy specifies how many days to keep history information and the minimum number of runs to keep. You can specify a different number of runs to keep for switch procedure runs than what is kept for other types of procedures. Each procedure run is evaluated individually against the policy and its history information is retained until the specified minimum days and minimum runs are both met. When a procedure run exceeds these criteria, system manager cleanup jobs will remove the historical information for that procedure run from all systems. The values specified at the time the cleanup jobs run are used for evaluation.

To change the procedure history retention policy for the installation, do the following:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value *INST is specified for the Data group definition prompt.
3. Press Enter to see all the policies and their current values.
4. Locate the Procedure history retention policy. The current values are displayed. Specify values for the elements you want to change.
5. To accept the changes, press Enter.

Policies which affect replication

Table 2 identifies the policies which can affect replication and their shipped default values.

Table 2. Policies associated with replication and shipped default values.
                                              Shipped Values                Replication Processes
Policy                                        Installation   Data Groups    System Journal   User Journal
Data group definition                         *INST          Name¹          Yes              Yes
Automatic system journal recovery             *ENABLED       *INST¹         Yes²             –
Automatic user journal recovery               *ENABLED       *INST          –                Yes²
System journal recovery notify on success     *YES           *INST          Yes              –
User journal recovery notify on success       *YES           *INST          –                Yes
DB apply cache                                *DISABLED      *INST          –                Yes
Access path maintenance³                                                    –                Yes
  • Optimize for DB apply                     *DISABLED      *INST
  • Maximum number of jobs                    99             *INST
Synchronize threshold size                    9,999,999      *INST          Yes              Yes
Number of third delay retry attempts          100            *INST          Yes              –
Third delay retry interval                    15             *INST          Yes              –

1. A data group definition value of *INST indicates the policy is installation-wide. A name indicates the policies are in effect only for the specified data group.
2. When this policy is enabled, the other policies in the same column are in effect unless otherwise noted.
3. This policy is available only on systems running service pack 7.1.15.00 or higher. When running on earlier levels, the Parallel AP maintenance provides similar functionality. For more information about both access path maintenance functions, see the MIMIX Administrator Reference book.

MIMIX can automatically attempt to correct problems it encounters during replication when the policies for Automatic system journal recovery and Automatic user journal recovery are enabled. The following topics identify what errors can be recovered in this way:
• “Errors handled by automatic database recovery” on page 33
• “Errors handled by automatic object recovery” on page 34

Errors handled by automatic database recovery

MIMIX can detect and correct the most common file error situations that occur during database replication. When the Automatic database recovery policy is enabled, database replication processes detect the types of errors listed in Table 3.
When an error is detected, MIMIX automatically attempts to correct the error by starting a job to perform an appropriate recovery action.

The recovery action also sends a report of a recovery in progress to the user interface. The reports are on the Work with Recoveries display (WRKRCY command). When the recovery action completes, the report is removed. The DB rcy. notify on success policy determines whether a successful recovery generates an informational notification. Only when all recovery options are exhausted without success is a file placed in hold error (*HLDERR) status. Recovery actions that end in an error do not generate a separate error notification because the error is already reflected in MIMIX status.

Table 3. Errors detected and corrected during database replication when automatic database recovery is enabled.

Error / Description

File level errors and unique-key record level errors
    Typically invoked when there is a missing library, file, or member. Also invoked when an attempt to write a record to a file results in a unique key violation. Without database autonomics, these conditions result in the file being placed in *HLDERR status.

Record level errors
    Invoked when the database apply process detects a data-level issue while processing record-level transactions. Without database autonomics, any configured collision resolution methods may attempt to correct the error. Otherwise, these conditions result in the file being placed in *HLDERR status.

Errors on IFS objects configured for user journal replication
    Invoked during the priming of IFS tracking entries when replicated IFS objects are determined to be missing from the target system. Priming of tracking entries occurs when a data group is started after a configuration change or when the Deploy Data Grp. Configuration (DPYDGCFG) command is invoked.

Errors on data area and data queue objects configured for user journal replication
    Invoked during the priming of object tracking entries when replicated data area and data queue objects are determined to be missing from the target system. Priming of tracking entries occurs when a data group is started after a configuration change or when the Deploy Data Grp. Configuration (DPYDGCFG) command is invoked.

Errors when DBAPY cannot open the file or apply transactions to the file
    Invoked when a temporary lock condition or an operating system condition exists that prevents the database apply process (DBAPY) from opening the file or applying transactions to the file. Without database autonomics, users typically have to release the file so the database apply process (DBAPY) can continue without error.

Errors handled by automatic object recovery

MIMIX can detect and correct the most common object error situations that occur during replication. When the Automatic object recovery policy is enabled, object replication processes detect the types of errors listed in Table 4.

When an error is detected, MIMIX automatically attempts to correct the error by starting a job to perform an appropriate recovery action. Unless the object is explicitly excluded from replication for a data group, the autonomic recovery action will synchronize the object to ensure that it is on the target system.

Note: Object automatic recovery does not detect or correct the following problems:
• Missing spooled files on the target system.
• Files and objects that are cooperatively processed. Although the files and objects are not addressed, problems with authorities for cooperatively processed files and objects are addressed.
• Activity entries that are “stuck” in a perpetual pending status (PR, PS, PA, or PB).

The recovery action also sends a report of a recovery in progress to the user interface.
In a 5250 emulator, the reports are on the Work with Recoveries display (WRKRCY command). When the recovery action completes, the report is removed. The Obj. rcy. notify on success policy determines whether a successful recovery generates an informational notification. Only when all recovery options are exhausted without success is an activity entry placed in error status. Recovery actions that end in an error do not generate a separate error notification because the error is already reflected in MIMIX status.

Table 4. Errors detected and recoveries attempted by object autonomics during object replication

Error / Description

Missing objects on target system¹
    An object (library-based, IFS, or DLO) exists on the source system and is within the name space for replication, but MIMIX detects that the object does not exist on the target system. Without object automatic recovery, this results in a failed activity entry.
    Notes:
    • Missing spooled files are not addressed.
    • Missing objects that are configured for cooperative processing are not synchronized. However, any problems with authorities (*AUTL or *USRPRF) for the missing objects are addressed.

Missing parent objects on target system¹
    Any operation against an object whose parent object is missing on the target system. Without object autonomics, this condition results in a failed activity entry due to the missing parent object.

Missing *USRPRF objects on target system¹
    Any operation that requires a user profile object (*USRPRF) that does not exist on the target system. Without object autonomics, this results in authority or object owner issues that cause replication errors.

Missing *AUTL objects on target system¹
    Any operation that requires an authority list (*AUTL) that does not exist on the target system. Without object autonomics, this results in authority issues that cause replication errors.

In-use condition
    Applications which hold persistent locks on objects can result in object replication errors if the configured values for delay/retry intervals are exceeded. Default values in the data group definition provide approximately 15 minutes during which MIMIX attempts to access the object for replication. If the object cannot be accessed during this time, the result is activity entries with errors of Failed Retrieve (for locked objects on the source system) and Failed Apply (for locked objects on the target system) and a reason code of *INUSE.
    Notes:
    1. The Number of third delay/retries policy and the Third retry interval policy determine whether automatic recovery is attempted for this error.
    2. Automatic recovery for this error is not attempted when the objects are configured for cooperative processing.

1. The synchronize command used to automatically recover this problem during replication will correct this error any time the command is used.

Policies which affect auditing

Policies for auditing are divided into these subsets:
• Policies that affect the behavior of all audits in an installation. These policies can be overridden at the data group level. When set for a specific data group, these policies affect all audits for the data group.
• Policies that affect when audits automatically run and how those audits select objects. These policies are set for each unique combination of audit and data group.

Policies for auditing runtime behavior

The policies identified in Table 5 affect all audit runs regardless of whether the audit was automatically submitted or manually invoked.
These policies can be set for the installation as well as overridden for an individual data group. The shipped default values for both levels are indicated.

When the Set MIMIX Policies (SETMMXPCY) command specifies a data group definition value of *INST, the policies being changed are effective for all data groups in the installation, unless a data group-level override exists. When the data group definition specifies a name, policies which specify the value *INST inherit their value from the installation-level policy value and policies which specify other values are in effect for only the specified data group.

Table 5. Shipped default values of policies associated with auditing runtime behavior.

                                              Shipped Values
Policy                                        Installation   Data Groups
Data group definition                         *INST          Name
Automatic audit recovery                      *ENABLED       *INST
Audit notify on success                       *RULE          *INST
Notification severity                         *RULE          *INST
Object only on target action                  *DISABLED      *INST
Journal attribute differences action
  • MIMIX configured higher                   *CHGOBJ        *INST
  • MIMIX configured lower                    *NOCHG         *INST
User journal apply threshold action           *END           *INST
Maximum rule runtime                          1440           *INST
Audit warning threshold¹                      7              *INST
Audit action threshold¹                       14             *INST
Audit level                                   *LEVEL30       *INST
Run rule on system                            *MGT           *INST
Action for running audits
  • Inactive data group                       *NOTRUN²       *INST
  • Repl. process in threshold                *NOTRUN        *INST
Audit history retention
  • Minimum days                              7              *INST
  • Minimum runs per audit                    1              *INST
  • Object details                            *YES           *INST
  • DLO and IFS details                       *YES           *INST
Synchronize threshold size                    9,999,999      *INST
CMPRCDCNT commit threshold                    *NOMAX         *INST

1. These policies are not limited to recovery actions.
2. This is the default shipped value on systems running MIMIX service pack 7.1.12.00 or higher. For earlier software levels, the shipped default value is *NONE.
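The way a data group-level value of *INST inherits the installation-level value, as described before Table 5, can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not MIMIX code: the function name effective_policy and the dictionaries are hypothetical, and the sample values are the shipped defaults listed in Table 5.

```python
# Hypothetical model of policy inheritance. MIMIX itself resolves policies
# internally; nothing here is a MIMIX API.
INST = "*INST"

# A few installation-level shipped defaults, copied from Table 5.
installation_policies = {
    "Automatic audit recovery": "*ENABLED",
    "Audit level": "*LEVEL30",
    "Run rule on system": "*MGT",
    "Maximum rule runtime": 1440,
}


def effective_policy(name, data_group_policies, installation_policies):
    """Return the value in effect for one policy of one data group.

    A data group value of *INST (or no data group value at all) inherits
    the installation-level value. Any other value overrides it for that
    data group only.
    """
    value = data_group_policies.get(name, INST)
    if value == INST:
        return installation_policies[name]
    return value


# Example data group that overrides only the Audit level policy.
dg_ab = {"Audit level": "*DISABLED", "Run rule on system": INST}

print(effective_policy("Audit level", dg_ab, installation_policies))
print(effective_policy("Run rule on system", dg_ab, installation_policies))
print(effective_policy("Maximum rule runtime", dg_ab, installation_policies))
```

In this sketch, resetting a data group-level policy to *INST is the same as removing the override: the next lookup falls through to the installation value.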
Policies for submitting audits automatically

The Audit rule, Audit schedule, and Priority audit policies control when audits are automatically submitted. These policies do not have a shipped value for the installation level. The shipped values for the data group level are listed in Table 6.

If the Audit level policy is disabled, all auditing is disabled, regardless of the values specified for Audit schedule and Priority audit policies. This includes manually submitted audits.

Each shipped audit rule has default values for submitting priority audits as well as scheduled audits. The shipped values for a rule are used for all new data groups. When you specify names for Data group definition and Audit rule on the SETMMXPCY command, you can adjust the values for a specific audit of a single data group.

Table 6. Shipped default values of policies for automatically submitting audits.

                                         Shipped Values
Policy                                   Installation   Data Groups
Data group definition                    *INST          Name
Audit rule                               –              Varies by rule
Audit schedule                           –
  State                                                 *ENABLED (1)
  Frequency                                             *WEEKLY (1)
  Scheduled date
  Scheduled day                                         *SUN (2)
  Scheduled time                                        Varies by rule, see Table 7.
  Relative day of month
Priority audit                           –
  State                                                 *ENABLED (3)
  Start after                                           030000 (3)
  Start until                                           080000
  New objects selected                                  *DAILY
  Changed objects selected                              *DAILY
  Unchanged objects selected                            *WEEKLY
  Audited with no differences                           *MONTHLY

1. The State element in the Audit schedule policy is available in MIMIX version 7.1.12.00 and higher. For data groups that existed before upgrading to version 7.1.12.00, if the Frequency specified was a value other than *NONE, that value is preserved by the upgrade process and the State is set to *ENABLED. If the Frequency value was *NONE, it is changed to *WEEKLY and the State set to *DISABLED.
2. The shipped default for Scheduled day changed in MIMIX version 7.1. For data groups created after installing version 7.1, the shipped default is *SUN (previously, it was *ALL). For data groups that existed before upgrading to version 7.1, the previous value for Scheduled day remains unchanged.
3. The Priority audit policy is new in MIMIX version 7.1. The State element for the Priority audit policy is available in MIMIX version 7.1.12.00 and higher. For data groups that existed before upgrading from any version 7.0 level to version 7.1.12.00 or higher, State is set to *DISABLED and Start after is set to 030000. For data groups that existed before upgrading from versions 7.1.01.00 through 7.1.11.00 to version 7.1.12.00 or higher, if the Start after value specified was a value other than *NONE, that value is preserved by the upgrade process and the State is set to *ENABLED. However if the Start after value was *NONE, it is changed to 030000 and State is set to *DISABLED.

When automatically submitted audits run

For each audit rule, its shipped values enable both prioritized audits and scheduled audits to run automatically. A prioritized audit starts one or more times an hour every day during the time range specified in the Priority audit policy. A scheduled audit runs once at its specified time on the days or dates for its frequency as specified in the Audit schedule policy. For scheduled audits, the shipped value for start time of each audit rule is staggered, beginning at 2 a.m. Table 7 shows the default times for priority audits versus scheduled audits.

Table 7. MIMIX rules and their shipped default times for Audit schedule (SCHEDULE) policy.

#DGFE
  Shipped priority start range: n/a (1)
  Shipped scheduled time: 2:00 a.m.
  Job name: sdn_DGFE
  Description: Checks configuration for files using cooperative processing. Uses the Check Data Group File Entries (CHKDGFE) command.

#OBJATR
  Shipped priority start range: 3 a.m. to 8 a.m.
  Shipped scheduled time: 2:05 a.m.
  Job name: sdn_OBJATR
  Description: Compares all attributes for all object types supported for replication. Uses the Compare Object Attributes (CMPOBJA) command.

#FILATR
  Shipped priority start range: 3 a.m. to 8 a.m.
  Shipped scheduled time: 2:10 a.m.
  Job name: sdn_FILATR
  Description: Compares all file attributes. Uses the Compare File Attributes (CMPFILA) command.

#IFSATR
  Shipped priority start range: 3 a.m. to 8 a.m.
  Shipped scheduled time: 2:15 a.m.
  Job name: sdn_IFSATR
  Description: Compares IFS attributes. Uses the Compare IFS Attributes (CMPIFSA) command.

#FILATRMBR
  Shipped priority start range: 3 a.m. to 8 a.m.
  Shipped scheduled time: 2:20 a.m.
  Job name: sdn_MBRATR
  Description: Compares basic file attributes at the member level. Uses the Compare File Attributes (CMPFILA) command.

#DLOATR
  Shipped priority start range: 3 a.m. to 8 a.m.
  Shipped scheduled time: 2:25 a.m.
  Job name: sdn_DLOATR
  Description: Compares all DLO attributes. Uses the Compare DLO Attributes (CMPDLOA) command.

#MBRRCDCNT
  Shipped priority start range: 3 a.m. to 8 a.m.
  Shipped scheduled time: 2:30 a.m.
  Job name: sdn_RCDCNT
  Description: Compares the number of current records (*CURRDS) and the number of deleted records (*NBRDLTRCDS) for physical files that are defined to an active data group. Uses the Compare Record Counts (CMPRCDCNT) command.
  Note: Equal record counts suggest but do not guarantee that files are synchronized. This audit does not have a recovery phase. Differences detected by this audit appear as not recovered in the Audit Summary.

#FILDTA (2)
  Shipped priority start range: 3 a.m. to 8 a.m.
  Shipped scheduled time: 2:35 a.m.
  Job name: sdn_FILDTA
  Description: Compares file contents. Uses the Compare File Data (CMPFILDTA) command.

1. The #DGFE audit is not eligible for prioritized auditing because it checks configuration data, not objects.
2. The #FILDTA audit and the Compare File Data (CMPFILDTA) command require TCP/IP communications as their communications protocol.

Changing auditing policies

This topic describes how to change specific policies that affect auditing behavior and when automatic audits will run. MIMIX service providers are specifically trained to provide a robust audit solution that meets your needs.

Changing when automatic audits are allowed to run

Policies control aspects of when both prioritized auditing and scheduled auditing are automatically submitted. To effectively audit your replication environment you may need to fine-tune when one or both types of audits are submitted.
For both types of auditing, consider:
• How much time or system resource can you dedicate to audit processing each day, week, or month?
• How often should all data within the database be audited? Business requirements as well as time and system resources need to be considered.
• Does automatic scheduling conflict with regularly scheduled backups?
• Are there jobs running at the same time as audits that could lock files needing to be accessed during recovery?

For scheduled auditing (which selects all objects), also consider:
• Are there a large number of objects to be compared?
• Are there a large number of objects for which a rule is expected to attempt recovery?
• Specific audits may have additional needs. See “Considerations for specific audits” on page 127.
• While you may decide to vary the scheduled times, it is recommended that you maintain the same relative order indicated in “When automatically submitted audits run” on page 38.

Changing scheduling criteria for automatic audits

Both scheduled audits and priority audits have scheduling information. A change to an audit’s scheduling information is effective immediately. If an audit is in progress at the time its scheduling information is changed, the change is effective on the next automatic run of the audit.

Do the following from the management system:
1. Do one of the following to access the Schedule view of the Work with Audits display:
• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Schedule view.
• Enter the command: installation-library/WRKAUD VIEW(*SCHEDULE)
2. Type 37 (Change audit schedule) next to the audit you want to change and press Enter.
3. The Set MIMIX Policies (SETMMXPCY) command appears, showing the selected audit rule and data group. The current values for the Audit schedule and Priority audit policies are displayed. Do one of the following:
• To change when MIMIX is scheduled to run the audit to check all configured objects, specify the values you want for elements of the Audit schedule policy.
• To change when MIMIX is allowed to submit priority-based runs of the audit every day, specify values for the Start after and Start until elements of the Priority audit policy.
4. To make the changes effective, press Enter.

Changing the selection frequency of priority auditing categories

When priority auditing is used, you can control how often objects within priorities are eligible for selection. Objects which had differences in their previous audit are always selected. For other priority classes, you can change how often objects within the class are eligible for selection by a prioritized audit. For descriptions of the priority classes with changeable frequencies, see the Priority audit policy description.

If an audit is in progress at the time its category frequency information is changed, the change is effective on the next automatic run of the audit.

Do the following from the management system:
1. Do one of the following to access the Work with Audits display:
• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Schedule view.
• Enter the command: installation-library/WRKAUD
2. Type 37 (Change audit schedule) next to the audit you want to change and press Enter.
3. The Set MIMIX Policies (SETMMXPCY) command appears, showing the selected audit rule and data group. Page Down to see the current values of the Priority audit policy.
4. Specify values in the following prompts that indicate how often objects in each category are eligible for selection by a priority audit.
• New objects selected
• Changed objects selected
• Unchanged objects selected
• Audited with no diff.
5. To make the changes effective, press Enter.
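To make the effect of the category frequencies more concrete, the following sketch models how a prioritized audit might decide whether an object is eligible for selection. It is an illustration only, not the MIMIX selection logic: the category keys, the function name, and the mapping of *DAILY, *WEEKLY, and *MONTHLY to 1, 7, and 30 days are assumptions. The frequency values per category follow the shipped data group defaults listed in Table 6.

```python
from datetime import date

# Assumed day counts for each frequency value; the guide does not define
# these numerically, so treat them as placeholders.
FREQUENCY_DAYS = {"*DAILY": 1, "*WEEKLY": 7, "*MONTHLY": 30}

# Shipped data group defaults for the Priority audit policy (Table 6).
# The dictionary keys are hypothetical labels for the four prompts.
SHIPPED_FREQUENCIES = {
    "new": "*DAILY",
    "changed": "*DAILY",
    "unchanged": "*WEEKLY",
    "no_differences": "*MONTHLY",
}


def is_eligible(category, last_audited, today, frequencies=SHIPPED_FREQUENCIES):
    """Return True if an object in this category may be selected today."""
    # Objects that had differences in their previous audit are always selected.
    if category == "differences":
        return True
    # Assumption: an object with no recorded audit is treated as eligible.
    if last_audited is None:
        return True
    interval = FREQUENCY_DAYS[frequencies[category]]
    return (today - last_audited).days >= interval
```

For example, with the shipped values an unchanged object audited five days ago would not be eligible in this model, while the same object audited seven days ago would be. Changing a category from *WEEKLY to *DAILY shortens that interval for every object in the category.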
Changing the audit level policy when switching

Regardless of the level you use for daily operations, Vision Solutions strongly recommends that you perform audits at audit level 30 before the following events to ensure that 100 percent of the data is valid on the target system:
• Before performing a planned switch to the backup system.
• Before switching back to the production system.

For more information about the risks associated with lower audit levels, see “Considerations for user-defined rules” on page 652.

From a 5250 emulator, do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. For Audit level, specify *LEVEL30. Then press Enter.

Changing the system where audits are performed

The Run rule on system policy determines the system on which audits run. The shipped default is to run all audits for the installation from the management system. When changing the value of this policy, also consider your switching needs. For additional information about the Run rule on system policy, see “Policy descriptions”.

Note: This procedure changes a policy value at the installation level. The installation level value can be overridden by a data group level policy value. Therefore, if a data group has a value other than *INST for this policy, that value remains in effect.

To change the policy for the installation, do the following:
1. On the management system, type the following command and press F4 (Prompt): installation-library/SETMMXPCY
2. Verify that the value *INST appears for the Data group definition.
3. Locate the Run rule on system policy. Specify the value you want.
4. Press Enter.
Changing retention criteria for audit history

The Audit history retention policy determines whether to retain information about the results of completed audits and the objects that were audited. The policy specifies how many days to keep history information and how many audit runs to keep, as well as whether details about audited library-based objects and audited DLO and IFS objects are to be kept with the history information.

Each audit is evaluated individually against the policy values. The policy is checked when an audit runs to determine whether to keep details about the objects audited by that run. The policy is also checked when system manager cleanup jobs run to determine if any audit has history information which exceeds both specified retention criteria. The policy value in effect at the time each check occurs determines the result.

To change the audit history retention policy, do the following:
1. From the MIMIX Intermediate Main Menu, select option 6 (Work with Audits) and press Enter.
2. Determine whether to change the policy for the installation or at the data group level. From the Work with Audits display, do one of the following:
• To change the policy for all audits in the installation, press F16 (Inst. policies). Then, press Enter when the Set MIMIX Policies (SETMMXPCY) command appears.
• To change the policy for all audits for a specific data group, type 36 (Change DG policies) next to any audit for the data group you want and press Enter.
3. Locate the Audit history retention policy. The current values for the level you chose in Step 2 are displayed. Specify values for the elements you want to change.
Note: When large quantities of objects are eligible for replication, specifying *YES to retain either Object details or DLO and IFS details may use a significant amount of disk storage. Consider the combined effect of the quantity of replicated objects for each data group, the number of days to retain history, the number of audits to retain, and the frequency with which audits are performed.
4. To accept the changes, press Enter.

Restricting auditing based on the state of the data group

You may want to control when audits are allowed to run based on the state of the data group at the time of the audit request. For example, if you end MIMIX so that a batch process can run, you may want to prevent audits from running while data groups are inactive. If a data group process has a backlog during peak activity, you may want to prevent audits from running while the backlog exists. Or, you may want to prevent only automatic recovery from occurring during a backlog or when the data group is inactive.

The Action for running audits policy provides the ability to define what audit activity will be permitted based on the state of the data group at the time of audit request. This policy can be set for an installation or for a specific data group.

Note: For installations running service pack 7.1.12.00 and higher, most audits check for threshold conditions in all database and object replication processes, including the RJ link. #FILDTA audits only check for threshold warning conditions in the RJ link and database replication processes. #DLOATR audits only check for threshold warning conditions in object replication processes. For installations running earlier service packs, only database and object apply processes are checked for thresholds.

Restricting audit activity in an installation based on data group state:

Do the following from the management system:
Note: This procedure changes a policy value at the installation level. The installation level value can be overridden by a data group level policy value. Therefore, if a data group has a value other than *INST for this policy, that value remains in effect.
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. For Action for running audits, do the following:
a. Specify the value you want for Inactive data group that indicates the audit actions to permit when the data group is inactive.
b. Specify the value you want for Repl. process in threshold that indicates the audit actions to permit when any replication process checked by an audit has reached its configured threshold.
5. To accept the changes, press Enter.

Restricting audit activity for a specific data group based on its state:

Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. Press Enter to see all the policies and their current values.
4. For Action for running audits, do the following:
a. Specify the value you want for Inactive data group that indicates the audit actions to permit when the data group is inactive.
b. Specify the value you want for Repl. process in threshold that indicates the audit actions to permit when any replication process checked by an audit has reached its configured threshold.
5. To accept the changes, press Enter.

Preventing audits from running

There may be scenarios when you need to disable auditing completely for either an installation or a specific data group. Auditing may not be desirable on a test data group or during system or network maintenance.

The Audit level policy can be used to disable all auditing, including manually invoked audits. The Audit level can be set for an installation or for specific data groups. Note that an explicitly set value for a data group will override the installation value and may still allow an audit to run.
You can also prevent audits for a data group from being submitted automatically but still allow them to be invoked manually. Automatic submission can be prevented for a specific audit of a data group by values specified for its priority audit and audit schedule policies.

In addition to auditing, automatic recovery during replication may need to be prevented from running due to issues with applications or MIMIX features. For more information, see “When to disable automatic recovery for replication and auditing” on page 28.

Disabling all auditing for an installation

Note: This procedure changes a policy value at the installation level. The installation level value can be overridden by a data group level policy value. Therefore, if a data group has a value other than *INST for this policy, that value remains in effect.

Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. Specify *DISABLED for the Audit level policy.
5. To accept the changes, press Enter.

Disabling all auditing for a data group

Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. Press Enter to see all the policies and their current values.
4. Specify *DISABLED for the Audit level policy.
5. To accept the changes, press Enter.

Disabling automatically submitted audits

You can control whether each audit for a data group can be submitted automatically by priority or by schedule. The Priority audit and Audit schedule policies act independently so that you can have both, one, or neither type of automatic auditing.

Disabling a scheduled audit:

Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. For Audit rule, specify the name of the MIMIX rule.
4. Press Enter to see the current values for the Audit schedule policy.
5. Do one of the following:
a. For installations running version 7.1.12.00 or higher, specify *DISABLED for the State prompt.
b. For installations running earlier software levels, specify *NONE for the Frequency prompt.
6. To accept the changes, press Enter.

Disabling a prioritized audit:

Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. For the Data group definition, specify the full three-part name.
3. For Audit rule, specify the name of the MIMIX rule.
4. Press Enter to see the current values for the Priority audit policy.
5. Do one of the following:
a. For installations running version 7.1.12.00 or higher, specify *DISABLED for the State prompt.
b. For installations running earlier software levels, specify *NONE for the Start after prompt.
6. To accept the changes, press Enter.

Policies for switching with model switch framework

In environments that do not use application groups, MIMIX Switch Assistant (which implements MIMIX Model Switch Framework) is usually used for switching. MIMIX Model Switch Framework cannot be used to switch application groups.

Table 8 identifies the policies associated with switching using MIMIX Model Switch Framework and the shipped default values of those policies. For these policies, MIMIX Switch Assistant uses only the policy values specified for the installation. If MIMIX cannot determine whether a MIMIX Model Switch Framework is defined, the switch framework policy is *DISABLED. If the SETMMXPCY command specifies a data group name, the switch framework is required to be *INST. The switch thresholds are *DISABLED by default but can be changed.
The policies in Table 8 have no effect on application group switching.

Table 8. Shipped values of policies used by MIMIX Switch Assistant.

                                         Shipped Values
Policy                                   Installation   Data Groups
Data group definition                    *INST          Name (1)
Switch warning threshold                 90             *DISABLED
Switch action threshold                  180            *DISABLED
Default model switch framework           MXMSFDFT       *INST

1. A data group definition value of *INST indicates the policy is installation-wide. A name indicates the policies are in effect only for the specified data group.

Specifying a default switch framework in policies

MIMIX Switch Assistant requires that you have a configured MIMIX Model Switch Framework and that you specify it in the default model switch framework policy for the installation. You may also want to adjust policies for thresholds associated with MIMIX Switch Assistant.

If you do not have a configured MIMIX Model Switch Framework, contact your Certified MIMIX Consultant.

From a 5250 emulator, do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. At the Default model switch framework prompt, specify the name of the switch framework to use for switching this installation.
5. To accept the changes, press Enter.

Setting policies for MIMIX Switch Assistant

If the value of the installation-level policy is disabled, you must change the policy in order to use MIMIX Switch Assistant.

Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. Specify values for the following fields:
a. For Switch warning threshold, the value 90 is recommended.
b. For Switch action threshold, the value 180 is recommended.
c. For Default model switch framework, specify the name of your MIMIX Model Switch Framework.
5. To accept the changes, press Enter.

Setting policies when MIMIX Model Switch Framework is not used

If you do not use MIMIX Model Switch Framework for switching, disable the default model switch framework policy at the installation level.

Do the following from the management system:
1. From the command line type SETMMXPCY and press F4 (Prompt).
2. Verify that the value specified for Data group definition is *INST.
3. Press Enter to see all the policies and their current values.
4. At the Default model switch framework prompt, specify *DISABLED.
5. To accept the change, press Enter.

Policy descriptions

There are minor differences in the names of policies between user interfaces for a 5250 emulator and Vision Solutions Portal. The names shown here are those used in the 5250 emulator. For a complete description of all policy values, see online help for the command.

Data group definition - Select the scope of the policies to be set. When the value *INST is specified, the policies being set by the command apply to all systems and data groups in the installation, with the exception of any policy for which a data group-level override exists. When a three-part qualified name of a data group is specified, the policies being set by the command apply to only that data group and override the installation-level policy values.

Audit rule - Select the MIMIX rule for which an audit schedule will be set for the specified data group definition. The Audit schedule policy determines when this rule will audit the data group. The audit rule must specify the value *NONE when changing any policy except the audit schedule.

Automatic object recovery - Determines whether to enable functions that automatically start recovery actions to correct detected common object errors that occur during replication from the system journal.
Automatic database recovery - Determines whether to enable functions that automatically start recovery actions to correct detected common file errors that occur during replication from the user journal.

Automatic audit recovery - Determines whether to enable audits to start automatic recovery actions to correct differences detected during their compare phase.

Object recovery notify on success - Determines whether automatic object recovery actions send an informational (*INFO) notification upon successful completion. This policy is only valid when the Automatic object recovery policy is enabled.

Database recovery notify on success - Determines whether automatic database recovery actions send an informational (*INFO) notification upon successful completion. This policy is only valid when the Automatic database recovery policy is enabled.

Audit notify on success - Determines whether activity initiated by audits, including recovery actions, should automatically send an informational (*INFO) notification upon successful completion. If an audit is run when the Automatic audit recovery policy is disabled, successful notifications are sent only for the compare phase of the audit.

Notification severity - Determines the severity level of the notifications sent when a rule ends in error. This policy determines the severity of the notification that is sent, not the severity of the error itself. The policy is in effect whether the rule is invoked manually or automatically.
This policy is useful for setting up an order precedence for notifications at the data group level. For example, if you set this policy for data group CRITICAL to be *ERROR when the value for the installation-level policy is *WARNING, any error notifications sent from data group CRITICAL will have a higher severity than those from other data groups.

Object only on target action - Determines how the recovery action for specific audits should handle objects that are configured for replication but exist only on the target system. The following rules check for the only-on-target error: #OBJATR, #IFSATR, #DLOATR, #FILATR, and #FILATRMBR. When the Automatic audit recovery (AUDRCY) policy is enabled, these rules use the value from this policy to attempt recovery for this error. See “Policies in environments with more than two nodes or bi-directional replication” on page 27 for additional information.

Journaling attribute difference action - Determines the recovery action to take for scenarios in which audits have detected differences between the actual and configured values of journaling attributes for objects journaled to a user journal. This type of difference can occur for the Journal Images attribute and the Journal Omit Open/Close attribute. Differences found on either the source or target object are affected by this policy.
  MIMIX configured higher: Determines the recovery action for correcting a difference in which the MIMIX configuration specifies an attribute value that results in a higher number of journal transactions than the object's journaling attribute.
  MIMIX configured lower: Determines the recovery action for correcting a difference in which the MIMIX configuration specifies an attribute value that results in a lower number of journal transactions than the object's journaling attribute.

DB apply threshold action - Determines what action to pass to the Compare File Data (CMPFILDTA) command or the Compare Record Count (CMPRCDCNT) command when it is invoked with *DFT specified for its DB apply threshold (DBAPYTHLD) parameter. The command’s parameter determines what to do if the database apply session backlog exceeds the threshold warning value configured for the database apply process. This policy applies whenever these commands are used and the backlog exceeds the threshold.
The shipped default for this policy causes the requested command to end and may cause the loss of repairs in progress or inaccurate counts for members. You can also set this policy to allow the request to continue despite the exceeded threshold.

DB apply cache - Determines whether to use database (DB) apply cache to improve performance for database apply processes. (1) When this policy is enabled, MIMIX uses buffering technology within database apply processes in data groups that specify *YES for journal on target (JRNTGT). This policy is not used by data groups which specify JRNTGT(*NO) or by data groups whose target journals use journal caching or journal standby functionality provided by the IBM feature for High Availability Journal Performance (IBM i option 42).
Note: When DB apply cache is used, before and after journal images are sent to the local journal on the target system. This will increase the amount of storage needed for journal receivers on the target system if before images were not previously being sent to the journal.

1. This policy is not available in MIMIX Availability Manager.

Access path maintenance - Determines whether MIMIX can optimize access path maintenance during database apply processing as well as the maximum number of jobs allowed per data group when performing delayed maintenance. Enabling optimized access path maintenance improves performance for the database apply process. To make any change to this policy effective, end and restart the database apply processes for the affected data groups.
This policy and the access path maintenance function it controls are available on systems running 7.1.15.00 or higher and replace the parallel AP maintenance (PRLAPMNT) policy and its related function offered in earlier software levels. For more information about either method of optimizing access path maintenance, see the MIMIX Administrator Reference book.
Optimize for DB apply Specify whether to enable optimized access path maintenance. When enabled, the database apply processes are allowed to temporarily change the value of the access path maintenance attribute for eligible replicated files on the target system. Eligible files include physical files, logical files, and join logical files with keyed access paths that are not unique and that specify *IMMED for their access path maintenance. Maximum number of jobs Specify the maximum number of access path maintenance jobs allowed for a data group when optimized access path maintenance is enabled. The actual number of jobs varies as needed between a minimum of one job and the specified value. The default value is 99. Maximum rule runtime — Determines the maximum number of minutes an audit can run when the Automatic audit recovery policy is enabled. The compare phase of the audit is always allowed to complete regardless of this policy’s value. The elapsed time of the audit is checked when the recovery phase starts and periodically during the recovery phase. When the time elapsed since the rule started exceeds the value specified, any recovery actions in progress will end. This policy has no effect on the #MBRRCDCNT audit because it has no recovery phase. The shipped default for this policy of 1440 minutes (24 hours) prevents running multiple instances of the same audit within the same day. Valid values are 60 minutes through 10080 minutes (1 week). Audit warning threshold — Determines how many days can elapse after an audit was last performed before an indicator is set. When the number of days that have elapsed exceeds the threshold, the indicator is set to inform you that auditing needs your attention. The shipped default value of 7 days is at the limit of best practices for auditing. Note: It is recommended that you set this value to match the frequency with which you perform audits. 
It is possible for an audit to be prevented from running for several days due to environmental conditions or the Action for running audit policy. You may not notice that the audit did not run when expected until the Audit warning threshold is exceeded, potentially several days later. If you run all audits daily, specify 1 for the Audit warning threshold policy. If you do not run audits daily, set the value to what makes sense in your MIMIX environment. For example, if you run the #FILDTA audit once a week and run all other audits daily, the default value of 7 would cause all audits except #FILDTA to have exposure indicated. The value 1 would be appropriate for the daily audits but the #FILDTA audit would be identified as approaching out of compliance much of the time.

Audit action threshold — Determines how many days can elapse after an audit was last performed before an indicator is set. When the number of days that have elapsed exceeds the threshold, the indicator is set to inform you that action is required because the audit is out of compliance. The shipped default of 14 days is the suggested value for this threshold, which is 7 days beyond the limit of best practices for auditing.

Note: It is recommended that you set this value to match the frequency with which you perform audits. It is possible for an audit to be prevented from running for several days due to environmental conditions or the Action for running audit policy. You may not notice that the audit did not run when expected until the Audit action threshold is exceeded, potentially several days later. If you run all audits daily, specify 1 for the Audit action threshold policy. If you do not run audits daily, set the value to what makes sense in your MIMIX environment. For example, if you run the #FILDTA audit once a week and run all other audits daily, the default value of 14 would cause all audits except #FILDTA to have exposure indicated.
The value 2 would be appropriate for the daily audits but the #FILDTA audit would be identified as approaching out of compliance much of the time.

Audit level — Determines the level of comparison that an audit will perform when a MIMIX rule which supports multiple levels is invoked against a data group. The policy is in effect regardless of how the rule is invoked. The amount of checking performed increases with the level number. This policy makes it easy to change the level of audit performed without changing the audit scheduling or rules. No auditing is performed if this policy is set to *DISABLED. The audit level you choose for audits depends on your environment, and especially on the data compared by the #FILDTA, #DLOATR, and #IFSATR audits. When choosing a value, consider how much data there is to compare, how frequently it changes, how long the audit runs, how often you run the audit, and how often you need to be certain that data is synchronized between source and target systems.

Note: Best practice is to use level 30 to perform the most extensive audit. If you use a lower level, consider its effect on how often you need to guarantee data integrity between source and target systems.

Regardless of the level you use for daily operations, Vision Solutions strongly recommends that you perform audits at audit level 30 before the following events to ensure that 100 percent of the data is valid on the target system:
• Before performing a planned switch to the backup system.
• Before switching back to the production system.

For additional information, see “Guidelines and considerations for auditing” on page 126 and “Changing auditing policies” on page 41.

Run rule on system — Determines the system on which to run audits. This policy is used when audits are invoked with *YES specified for the value of the Use run rule on system policy (USERULESYS) parameter on the Run Rule (RUNRULE) or Run Rule Group (RUNRULEGRP) command.
When *YES is specified in these commands, this policy determines the system on which to run audits. While this policy is intended for audits, any rule that meets the same criteria will use this policy. The policy’s shipped default value, *MGT, runs audits from the management system. In multi-management environments where both systems defined to a data group are management systems, the value *MGT will run audits only on the target system. You can also set the policy to run audits from the network system, the source or target system, or from a list of system definitions. When both systems of a data group are in the specified list, the target system is used. When choosing the value for the Run rule on system policy, also consider your switching needs.

Action for running audits — Determines the type of audit actions permitted when certain conditions exist in the data group. If a condition exists at the time of an audit request, audit activity is restricted to the specified action. If multiple conditions exist and the values specified are different, only the most restrictive of the specified actions is allowed. If none of the conditions are present, the audit requests are performed according to other policy values in effect.

Inactive data group
Specify the type of auditing actions allowed when any replication process required by the data group is inactive. For example, a data group of TYPE(*ALL) is considered inactive if any of its database or object replication processes is in a state other than active. This element has no effect on the #FILDTA and #MBRRCDCNT audits because these audits can run only when the data group is active.

Repl. process in threshold
Specify the type of auditing actions allowed when a threshold warning condition exists for any process used in replicating the class of objects checked by an audit.[1] If a checked process has reached its configured warning value, auditing is restricted to the specified actions.
Most audits check for threshold conditions in all database and object replication processes, including the RJ link. #FILDTA audits only check for threshold warning conditions in the RJ link and database replication processes. #DLOATR audits only check for threshold warning conditions in object replication processes.

[1] This behavior applies to instances running service pack 7.1.12.00 or higher. Instances running earlier service packs check for thresholds on only the database apply and object apply processes.

Audit history retention — Determines criteria for retaining historical information about audit results and the objects that were audited. History information for an audit includes timestamps indicating when the audit was performed, the list of objects that were audited, and result statistics. Each audit, a unique combination of audit rule and data group, is evaluated separately and its history information is retained until the specified minimum days and minimum runs are both met. When an audit exceeds these criteria, system manager cleanup jobs will remove the historical information for that audit from all systems and will remove the audited object details from the system on which the audit request originated. The values specified at the time the cleanup jobs run are used for evaluation.

Minimum days
Specify the minimum number of days to retain audit history for each completed audit. Valid values range from 0 through 365 days. The shipped default is 7 days.

Minimum runs per audit
Specify the minimum number of completed audits for which history is to be retained. Valid values range from 1 through 365 runs. The shipped default is 1 completed audit.

Object details
Specify whether to retain the list of audited objects and their audit status for each completed audit of library-based objects. The specified value in effect at the time an audit runs determines whether object details for that run are retained.
The specified value has no effect on cleanup of details for previously completed audit runs. Cleanup of retained details occurs at the time of audit history cleanup. The shipped default is *YES.

DLO and IFS details
Specify whether to retain the list of audited objects and their audit status for each completed audit of DLO and IFS objects. The specified value in effect at the time an audit runs determines whether object details for that run are retained. The specified value has no effect on cleanup of details for previously completed audit runs. Cleanup of retained details occurs at the time of audit history cleanup. The shipped default is *YES.

Note: When large quantities of objects are eligible for replication, specifying *YES to retain either Object details or DLO and IFS details may use a significant amount of disk storage. Consider the combined effect of the quantity of replicated objects for all data groups, the number of days to retain history, the number of audits to retain, and the frequency with which audits are performed.

Synchronize threshold size — Determines the threshold, in megabytes (MB), to use for preventing the synchronization of large objects during recovery actions. When any of the Automatic system journal recovery, Automatic user journal recovery, or Automatic audit recovery policies are enabled, all initiated recovery actions use this policy value for the corresponding synchronize command's Maximum sending size (MB) parameter. This policy is useful for preventing performance issues when synchronizing large objects.

Number of third delay retry attempts — Determines the number of times to retry a process during the third delay/retry interval. This policy is used when the Automatic system journal recovery policy is enabled. Object replication processes use this policy value when attempting recovery of an in-use condition that persists after the data group’s configured values for the first and second delay/retry intervals are exhausted.
The shipped default is 100 attempts.

This policy and its related policy, Third delay retry interval, can be disabled so that object replication does not attempt the third delay/retry interval but still allows recoveries for other errors.

Third delay retry interval — Determines the delay time (in minutes) before retrying a process in the third delay/retry interval. This policy is used when the Automatic system journal recovery policy is enabled. Object replication processes use this policy value when attempting recovery of an in-use condition that persists after the data group’s configured values for the first and second delay/retry intervals are exhausted. The shipped default is 15 minutes.

Switch warning threshold — Determines how many days can elapse after the last switch was performed before an indicator is set for the installation. When the number of days that have elapsed exceeds this threshold, the indicator is set to inform you that switching may need your attention. The shipped default is 90 days, which is considered at the limit of best practices for switching. The indicator is associated with the Last switch field. The Last switch field identifies when the last completed switch was performed using the default model switch framework (DFTMSF) policy.

Switch action threshold — Determines how many days can elapse after the last switch was performed before an indicator is set for the installation. When the number of days that have elapsed exceeds this threshold, the indicator is set to inform you that action is required. The shipped default of 180 days is the suggested value for this threshold, which is beyond the limit of best practices for switching. The indicator is associated with the Last switch field. The Last switch field identifies when the last completed switch was performed using the default model switch framework (DFTMSF) policy.
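The relationship between the two switch thresholds can be illustrated with a short sketch. The Python below is not part of MIMIX: the function name switch_indicator and the labels it returns are hypothetical, and it assumes only what the descriptions above state, namely that an indicator is set when the number of elapsed days exceeds a threshold, using the shipped defaults of 90 and 180 days.

```python
from datetime import date

# Shipped defaults described above; the constant names are illustrative only.
SWITCH_WARNING_DAYS = 90
SWITCH_ACTION_DAYS = 180


def switch_indicator(last_switch: date, today: date,
                     warning_days: int = SWITCH_WARNING_DAYS,
                     action_days: int = SWITCH_ACTION_DAYS) -> str:
    """Classify the time elapsed since the last completed switch.

    Returns "ok", "warning", or "action". These labels are hypothetical
    and are not values displayed by MIMIX.
    """
    elapsed = (today - last_switch).days
    # An indicator is set only when the elapsed days exceed the threshold.
    if elapsed > action_days:
        return "action"
    if elapsed > warning_days:
        return "warning"
    return "ok"
```

Under these assumptions, a switch performed exactly 90 days ago does not yet raise the warning indicator, while one performed 91 days ago does.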
Default model switch framework — Determines the default MIMIX Model Switch Framework to use for switching. This value is used by configurations which switch via model switch framework. The shipped default value is MXMSFDFT, which is the default model switch framework name for the installation. If the default name is not being used, this value should be changed to the name of the MIMIX Model Switch Framework used to switch the installation.

Independent ASP library ratio — Determines the number for n in a ratio (n:1) of independent ASP libraries (n) on the production system to SYSBAS libraries on the backup system.[1] For each switchable independent ASP defined to MIMIX by a device resource group, a monitor with the same name as the resource group checks this ratio. When the number of independent ASP libraries falls to a level that is below the specified ratio, the monitor sends a notification to inform you that action may be required. This signals that your recovery time objective could be in jeopardy because of a prolonged independent ASP switch time.

[1] The library ratio monitor and the policy it uses require a license key for MIMIX® Global™.

CMPRCDCNT commit threshold — Determines the threshold at which a request to compare record counts (CMPRCDCNT command or #MBRRCDCNT audit) will not perform the comparison due to commit cycle activity on the source system. The value specified is the maximum number of uncommitted record operations that can exist for files waiting to be applied at the time the compare request is invoked. Each database apply session is evaluated against the threshold independently. As a result, it is possible that record counts will be compared for files in one apply session but will not be compared for files in another apply session. For additional information see the MIMIX Administrator Reference book.
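The per-session evaluation described for the CMPRCDCNT commit threshold can be sketched as follows. This Python is an illustration only and does not reflect how MIMIX is implemented: the function name, the session labels, and the input dictionary are hypothetical, and it assumes that a comparison is skipped only when the count of uncommitted record operations is greater than the threshold, since the policy value is described as the maximum that can exist.

```python
from typing import Dict


def sessions_to_compare(uncommitted_by_session: Dict[str, int],
                        commit_threshold: int) -> Dict[str, bool]:
    """Decide, per apply session, whether record counts would be compared.

    uncommitted_by_session maps a hypothetical apply session label to the
    number of uncommitted record operations for files waiting to be applied.
    Each session is checked against the threshold independently.
    """
    return {
        session: count <= commit_threshold  # assumed: skip only above the maximum
        for session, count in uncommitted_by_session.items()
    }
```

With a threshold of 1000, a session holding 500 uncommitted operations would be compared while a session holding 1500 would not, which is the mixed outcome the policy description warns about.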
Procedure history retention — Specifies criteria for retaining historical information about procedure runs that completed or completed with errors. History information for a procedure includes timestamps indicating when the procedure was run and detailed information about each step within the procedure. Each procedure run, a unique combination of procedure name and application group, is evaluated separately and its history information is retained until the specified minimum days and minimum runs are both met. When a procedure run exceeds these criteria, system manager cleanup jobs will remove the historical information for that procedure run from all systems. The values specified at the time the cleanup jobs run are used for evaluation.

Minimum days
Specifies the minimum number of days to retain procedure run history. The default value is 7.

Minimum runs per procedure
Specifies the minimum number of completed procedure runs for which history is to be retained. This value applies to procedures of all types except *SWTPLAN and *SWTUNPLAN. The default value is 1.

Min. runs per switch procedure
Specifies the minimum number of completed switch procedure runs for which history is to be retained. This value applies to procedures of type *SWTPLAN and *SWTUNPLAN that are used to switch an application group. The default value is 12.

Audit schedule — Determines the scheduling information that MIMIX uses to automatically submit audit requests for the specified data group and rule that will check all objects selected by data group configuration entries. Only configuration entries associated with the specified type of rule are used. To allow an audit to be automatically submitted, *ENABLED must be specified for State.[1] Changes to this policy are effective immediately. If an audit is in progress at the time of the change, the change will be reflected in the next scheduled run of the audit. Scheduled dates are entered and displayed in job date format.
When the job date format is Julian, the equivalent month and day are used to determine when to schedule audit requests.

State[1]
Specify whether scheduled auditing is enabled or disabled for this data group and audit rule.

[1] The State element is available in installations running MIMIX version 7.1.12.00 or higher. In installations running earlier software levels, scheduled auditing requires specifying a value other than *NONE for Frequency and specifying values for Scheduled time and either Scheduled date or Scheduled day. Frequency is qualified by the values specified in the other elements.

Frequency
Specify how often the audit request is submitted. The values specified for other elements further qualify the specified frequency.

Scheduled date
Select a value or specify a date, in job date format, on which the audit request is submitted.

Scheduled day
Select the day or days of the week on which the audit request is submitted. If today is the day of the week that is specified and the scheduled time has not passed, the audit request is submitted today. Otherwise, the job is submitted on the next occurrence of the specified day. For example, if it is 11:00 a.m. on a Friday when you set the audit schedule to specify Friday for Scheduled day and 12:00:00 for Scheduled time, the audit request is submitted today. If you are setting the policy at 4:00 p.m. on a Friday or at 11:00 a.m. on a Monday, the audit request is submitted the following Friday.

Scheduled time
Select a value or specify a time in 24-hour format at which the audit request is submitted on the scheduled date or day. Although the time can be specified to the second, the activity involved in submitting a job and the load on the system may affect the exact time at which the job is submitted. Time can be specified with or without a time separator. Without a time separator, specify a string of 4 or 6 digits (hhmm or hhmmss) where hh = hours, mm = minutes, and ss = seconds.
Valid values for hh range from 00 to 23. Valid values for mm and ss range from 00 to 59. With a time separator, specify a string of 5 or 8 digits where the time separator specified for your job is used to separate the hours, minutes, and seconds. If this command is entered from the command line, the string must be enclosed in apostrophes. If a time separator other than the separator specified for your job is used, this command will fail.

Relative day of month
Select a value or specify one or more numbers with which to qualify what day a monthly audit request is submitted, relative to its occurrence in the month. A relative day is only valid when the schedule Frequency is Monthly and Scheduled day is a value other than None. For example, if Frequency is Monthly, Scheduled day is Tuesday and Thursday, and Relative day of month is 1, the audit request is submitted on the first Tuesday and first Thursday of every month. If both 1 and 4 are specified for relative day, the audit request is submitted on the first Tuesday, first Thursday, fourth Tuesday, and fourth Thursday of the month.

Priority audit — Determines when priority-based audit requests for the specified data group and rule are allowed to automatically start and how often replicated objects are eligible for auditing based on their priority classification. The #DGFE rule does not support priority auditing. To allow priority-based auditing to be performed, *ENABLED must be specified for State.[1] Changes to this policy are effective immediately. If an audit is in progress at the time of the change, the change will be reflected in the next priority-based run of the audit.

State[1]
Specify whether priority auditing is enabled or disabled for this data group and audit rule.

Start after
Select a value or specify a time after which priority-based audits are allowed to start. This is the beginning of a range of time during which priority-based audits can start each day.
The value *ANY allows priority-based audits to run repeatedly throughout the day.

Note: Times specified for the Start after and Start until elements are in 24-hour format and can be specified with or without a time separator. Without a time separator, specify a string of 4 or 6 digits (hhmm or hhmmss) where hh = hours, mm = minutes, and ss = seconds. Valid values for hh range from 00 to 23. Valid values for mm and ss range from 00 to 59. With a time separator, specify a string of 5 or 8 digits where the time separator specified for your job is used to separate the hours, minutes, and seconds. If this command is entered from the command line, the string must be enclosed in apostrophes. If a time separator other than the separator specified for your job is used, this command will fail.

Start until
Specify the end of the time range during which priority-based audits are allowed to start. Priority-based audits can start until this time. This value is ignored when Start after is *ANY.

New objects selected
Select the frequency at which new objects are considered for auditing. A new object is one that has not been audited since it was created.

Changed objects selected
Select the frequency at which changed objects are considered for auditing. A changed object is one that has been modified since the last time it was audited.

Unchanged objects selected
Select the frequency at which unchanged objects are considered for auditing. An unchanged object is one that has not been modified since the last time it was audited.

Audited with no diff.
Select the frequency at which objects with no differences are considered for auditing. An object with no differences is one that has not been modified since the last time it was audited and has been successfully audited on at least three consecutive audit runs.

[1] The State element is available in installations running MIMIX 7.1.12.00 or higher.
In installations running earlier software levels, priority auditing requires a value other than *NONE for Start after.

CHAPTER 3
Checking status in environments with application groups

Monitoring status of environments that use application groups begins at the level of the application group and may include investigation into additional displays for more detailed information. The following displays are typically used:
• Work with Procedure Status (WRKPROCSTS command)
• Work with Application Groups (WRKAG command)
• Work with Node Entries (WRKNODE command)
• Work with Data Rsc. Grp. Ent. (WRKDTARGE command)
• Work with Data Groups (WRKDG command)

Note: This chapter does not include status for application groups that are configured for an IBM i clustering environment. If you are using clustering or have MIMIX® Global™ configured, see the MIMIX Operations with IBM i Clustering book for status information within a clustering environment.

Checking application group status

The status view of the Work with Application Groups display provides a summary of all status associated with an environment configured with application groups.

1. Do one of the following to access the Work with Application Groups display:
   • Select option 1 (Work with application groups) from the MIMIX Basic Main Menu.
   • Select option 5 (Work with application groups) from the MIMIX Intermediate Main Menu.
   • Enter the command: WRKAG
2. If necessary, use F10 to access the status view.

Figure 3. Status view of Work with Application Groups display

                         Work with Application Groups
                                                          System:   SYSA
                                            Monitors . . . . . :   *ACTIVE
                                            Notifications  . . :   *NONE
    Type options, press Enter.
      1=Create   2=Change   4=Delete   5=Display   6=Print   9=Start
      10=End   12=Node entries   13=Data resource groups   15=Switch

         App         App      App Node   Data Rsc     Data Node   Repl.     Proc.
    Opt  Group       Status   Status     Grp Status   Status      Status    Status
    __   __________
    __   SAMPLEAG             *ACTIVE                             *ACTIVE   *COMP
                                                                           Bottom
    Parameters or command
    ===> _________________________________________________________________________
    F3=Exit   F4=Prompt   F5=Refresh   F6=Create   F9=Retrieve   F10=View config
    F12=Cancel   F13=Repeat   F18=Subset   F23=More options   F24=More keys

All status columns except the App Status column are summations of multiple processes. Investigation into lower-level displays may be necessary to determine the cause of a problem.

Ideal status conditions exist when the fields and columns have the following values:
• The Monitors field is *ACTIVE.
• The Notifications field is *NONE.
• The Proc. Status column is *COMP.
• For a non-cluster application group, the App Node Status and Repl. Status fields are *ACTIVE. The App Status, Data Rsc Grp Status, and Data Node Status columns will always be blank.

For any other status values, see the following:
• “Resolving problems reported in the Monitors field” on page 61
• “Resolving problems reported in the Notifications field” on page 63
• “Resolving problems reported in Status columns” on page 64

Resolving problems reported in the Monitors field

The Monitors field located in the upper right corner of the Work with Application Groups display summarizes the status of the MIMIX monitors on the local system. Each node or system in the product configuration has MIMIX monitors which run on that system to check for specific potential problems. A status of *ACTIVE indicates that all enabled monitors on the local system are active.

Table 9 shows possible status values for the Monitors field that require user action. For a complete list of possible values, press F1 (Help).

Table 9. Monitor field status values that may require user action

Monitor Status   Description
*ATTN            Either one or more monitors on the local system failed or there are both active and inactive monitors on the local system.
*INACTIVE        All enabled monitors on the local system are inactive.

Do the following:

1. Press F14 (Monitors) to display the list of monitors on the local system on the Work with Monitors display.
2. Check the Status column for status values of FAILED, FAILED/ACT, and INACTIVE.
3. If the monitor is needed on the local system as indicated in Table 10, use option 9 (Start) to start the monitor.

Table 10. Possible monitors and the nodes on which they should be active

Monitor: journal-name - remote journal link monitor
Checks the journal message queue for indications of problems with the remote journal link. A monitor exists for both the local and remote system of the RJ link.
When and where needed: Primary node and the current Backup node of application groups which perform logical replication.

Monitor: MMIASPMON - independent ASP threshold monitor
Checks the QSYSOPR message queue for indications that the independent ASP threshold has been exceeded. This monitor improves the ability to detect overflow conditions that put your high availability solution at risk due to insufficient storage.
When and where needed: On all nodes which control an independent ASP.

Monitor: MMNFYNEWE - monitor for new object notification entries
Monitors the source system for newly created libraries, folders, or directories that are not already included or excluded for replication by a data group configuration.
When and where needed: Primary node when the application group is configured for logical replication.

Monitor: short-data-group-name_PAPM - Parallel access path maintenance group monitor
When this monitor exists, there are always associated monitors of one of the following types:
• short-data-group-namePAPMnnn - Parallel access path maint monitor nnn
• short-data-group-nameJobname - Parallel access path maint monitor job-name
When and where needed: Target node of data group replication processes when Parallel access path maintenance policy has been enabled.
Note: These monitors and the policy which enables them are only available on systems running software levels earlier than 7.1.15.00. The replacement for this function on systems running 7.1.15.00 or higher does not use monitors. For more information about optimizing access path maintenance, see the MIMIX Administrator Reference book.

Resolving problems reported in the Notifications field

The Notifications field located in the upper right corner of the Work with Application Groups display summarizes the status of notifications that exist for the MIMIX installation. Notifications are sent by MIMIX processes, such as monitors or audits, to inform you of potential problems. A value of *NONE indicates that no new notifications exist.

Table 11 shows possible status values for the Notifications field that require user action.

Table 11. Notification field status values that may require user action

Notification Status   Description
*ERROR                Action is required. At least one new notification exists with a severity of *ERROR.
*WARNING              At least one new notification exists with a severity of *WARNING, which indicates that the operation may be successful but an error exists. There are no new notifications with a severity of *ERROR.
*INFO                 At least one new notification exists with a severity of *INFO. There are no new notifications with severity of *ERROR or *WARNING.

Do the following:

1. Press F15 (Notifications) to display the list of notifications for the installation on the Work with Notifications display.
2. Use option 5 (Display) to view any notifications with a status of *NEW.
3. Take any further action indicated to resolve the problem.
4. When the problem is resolved, use either option 46 (Acknowledge) or option 4 (Remove) to address the notification itself. Notifications can only be removed from the system on which they originated.
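The way the Notifications field summarizes individual notifications, as laid out in Table 11, can be sketched in a few lines. The Python below is a hedged illustration, not MIMIX code: the function name and the (status, severity) tuple layout are hypothetical, and any status other than *NEW is treated as not contributing to the field, which is an assumption drawn from the wording "new notification".

```python
from typing import Iterable, Tuple

# Severities from Table 11, most severe first.
SEVERITY_ORDER = ("*ERROR", "*WARNING", "*INFO")


def notifications_field(notifications: Iterable[Tuple[str, str]]) -> str:
    """Summarize notifications the way the Notifications field is described.

    Each notification is a hypothetical (status, severity) pair such as
    ("*NEW", "*WARNING"). Only notifications with status *NEW are counted.
    """
    new_severities = {severity for status, severity in notifications
                      if status == "*NEW"}
    # Report the most severe value present among new notifications.
    for severity in SEVERITY_ORDER:
        if severity in new_severities:
            return severity
    return "*NONE"
```

For example, a list holding one new *INFO and one new *WARNING notification would be summarized as *WARNING, and an empty list as *NONE.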
Resolving problems reported in Status columns

Except for the App Status column, all other columns on the Work with Application Groups display represent summations of status for multiple nodes or multiple data resource groups associated with the application groups. Investigation into lower-level displays may be necessary to determine the cause of the problem.

Troubleshooting Tip: When investigating problems, begin with the Proc. Status column. A problem with procedure status can affect values in other columns. When any procedure status problems are resolved, refresh the display. Then check the other columns, beginning with the left-most column that is reporting a problem. Resolve the most severe problem in that column first, then refresh the display. Investigate problems in the remaining columns from left to right.

To address the most common problems with status for application groups, do the following:

1. Resolve any problems reported in the Proc. Status column using Table 12.
2. Resolve any *ATTN status problems first, using Table 13.
3. Then address less severe problems, using Table 14.

For a complete list of status values for each column, press F1 (Help).

Resolving a procedure status problem

The Proc. Status column represents a summary of the most recent run of all procedures defined for the application group.

Table 12. Procedure Status values that require attention

Column Value   Description and Action
*ACTIVE        One or more of the last started runs of the procedures to run are still active or queued. Wait for the procedure to complete. Do not attempt to correct other status problems reported on the display until the procedure completes. Use option 21 (Procedure status) to view the status of the last started runs of procedures for the application group.
*ATTN          One or more of the last started runs of the procedures for the application group have a status that requires attention.
Use option 21 (Procedure status) to view the status of the last started runs of the procedures for the application group. On the resulting Work with Procedure Status display, procedures which have status values of *ATTN, *CANCELED, *FAILED, *MSGW, or *PENDCNL require user action. Also, it may be necessary to check status of the steps within the procedure to resolve a step problem before the procedure can continue. Do not attempt to correct other status problems reported on the Work with Application Groups display until the procedure problems have been resolved. For detailed information, see “Working with status of procedures and steps” on page 77.

Note: The status *COMP indicates that the most recently started run of each procedure for the application group has completed as directed. This includes procedures that completed with errors and cancelled or failed procedures whose status has been acknowledged by user action. For any individual procedure that completed with errors, user action is recommended to investigate the cause of the error and assess its implications.

Resolving an *ATTN status for an application group

The value *ATTN can appear in each column of the Work with Application Groups display to indicate that user action is required to correct a problem.

Important! Check the status of the Proc. Status column and address any problem indicated by *ATTN or *ACTIVE status before attempting to resolve any problem reported in other columns. Use “Resolving a procedure status problem” on page 64.

If there are no procedure problems, each of the other columns with an *ATTN status must be addressed individually, starting from the left-most column.

Table 13. Resolving *ATTN status for columns (except Proc. Status) on the Work with Application Groups display

*ATTN Status in Column   Description and Actions for *ATTN Status
App Node Status          The App Node Status column is a summary of the status of the nodes associated with the application group. The status includes the MIMIX system manager, journal manager, target journal inspection, and collector services jobs for the nodes in the application group. *ATTN indicates that the node status and the MIMIX manager status values do not match. Investigate the status of the associated nodes and MIMIX managers using option 12 (Node entries). For additional information see “Status for Work with Node Entries” on page 66.
Replication Status       The Replication Status column is a summary status of data replication activity for the data resource groups associated with an application group. *ATTN indicates that data replication for at least one data group for the data resource groups has a status that does not match the status of the appropriate data resource group, has a failed state, an error condition, is active with an incorrect source system, has audit errors, or has pending recoveries. To determine the cause, use option 13 (Data resource groups) to identify the data resource group where the problem exists. For more information, see “Status for Work with Data Resource Group Entries” on page 68.

Resolving other common status values for an application group

Table 14 lists other common problems with application group status and identifies how to begin their resolution.

Table 14. Other problem statuses which may appear in multiple columns on the Work with Application Groups display

Column Value   Description and Action
*ATTN          Each column has a unique recovery. See “Resolving an *ATTN status for an application group” on page 65.
*INACTIVE      The current status of the resource group or node is inactive. This status is possible in the Repl. Status column.
• If all columns with a status value are *INACTIVE, the application group may have been ended intentionally. Use option 9 (Start) to start the application group.
• If this value appears only in the App Node Status column, the application resource group nodes are all inactive and all MIMIX manager jobs are also inactive. Use option 12 (Node entries) to investigate further. For more information, see “Status for Work with Node Entries” on page 66.
• If this value appears only in the Repl. Status column, logical replication is not active. Use option 13 (Data resource groups) to investigate. For more information, see “Status for Work with Data Resource Group Entries” on page 68.

*UNKNOWN - The current status is unknown. The local node is a network node in a noncluster application group and does not participate in the recovery domain. Its status cannot be determined. When this status appears in all columns, do one of the following:
• Enter the command WRKSYS. On the Work with Systems display, check the status of Cluster Services for the local system definition. If necessary, use option 9 (Start) to start cluster services.
• Sign on to a node that is active and use the WRKAG command to check the application group status. If the status is still *UNKNOWN, use option 12 (Node entries) to check the status of Cluster Services on the node.

Status for Work with Node Entries

The Work with Node Entries display lists the nodes associated with an application group or a data resource group. The Resource group and Type fields at the top of the display indicate what the nodes are associated with.

Figure 4. Status view of Work with Node Entries display for an application group which does not participate in a cluster

                          Work with Node Entries
                                                       System:   SYSA
Application group . . . . . :   SAMPLEAG

Type options, press Enter.
  1=Add   2=Change   4=Remove   5=Display   6=Print   9=Start   10=End

                -------------Current-------------
Opt  Node       Role       Sequence  Data Provider  Manager Status
__   ________
__   SYSB       *PRIMARY             *PRIMARY       *ACTIVE
__   SYSA       *BACKUP    1         *PRIMARY       *ACTIVE
                                                              Bottom
Parameters or command
===> ______________________________________________________________
F3=Exit   F4=Prompt   F5=Refresh   F6=Add   F7=Systems   F9=Retrieve
F10=View config   F11=Sort by node   F12=Cancel   F18=Subset   F24=More keys

For each node listed, check the Manager Status column for status values that require attention. For a complete list of status values for each field and column, press F1 (Help).

Manager Status - This column indicates the status of all of the MIMIX system manager, journal manager, target journal inspection, and collector services jobs for the specified node.

Table 15. Manager Status values that require user action

Status Value - Description and Action

*ATTN - At least one of the system manager, journal manager, target journal inspection, or collector services jobs for the node has failed. When the nodes listed do not all have the same value, use F7 (Systems) to access the Work with Systems display.
• Check the status of the system and journal managers, target journal inspection, and collector services.
• Use option 9 (Start) to start the managers and services that are not active on the node.

*INACTIVE - All system manager, journal manager, target journal inspection, and collector services jobs for the specified node are inactive. This may be intentional when MIMIX is ended to perform certain activities. Use F7 (Systems) to access the Work with Systems display.

Status for Work with Data Resource Group Entries

The Work with Data Resource Group Entries display lists the data resource groups associated with an application group. Each entry identifies a data resource group and the summary of the replication status from its associated data groups.

Figure 5. The Work with Data Resource Group Entries display for an application group that does not participate in a cluster

                       Work with Data Rsc. Grp. Ent.
                                                       System:   SYSA
Application group . . . . . :   SAMPLEAG

Type options, press Enter.
  1=Add   2=Change   4=Remove   5=Display   6=Print   8=Data groups   9=Start
  10=End   12=Node entries   14=Build environment   15=Switch

     Resource              Resource Group  Node     Replication
Opt  Group       Type      Status          Status   Status
__   __________
__   AGRSGRP     *DTA                               *ACTIVE
                                                              Bottom
Parameters or command
===> ______________________________________________________________
F3=Exit   F4=Prompt   F5=Refresh   F6=Add   F9=Retrieve   F12=Cancel
F13=Repeat   F18=Subset   F19=Load   F21=Print list

Resource Group Status - This column identifies the status of the data resource group. In environments that do not include IBM i clustering, this column is always blank.

Node Status - This column identifies the status of the nodes for the data resource group. In environments that do not include IBM i clustering, this column is always blank.

Replication Status - The value in this column is a summary status of data replication activity for the data resource group. The status includes the status of all data group processes, replication direction, replicated object and file entries, audits, and recoveries.

Table 16 identifies status values for replication that require user action.

Table 16. Replication Status values that require user action

Status Value - Description and Action

*ATTN - One or more of the following problems exist for data groups within the data resource group.
• The source system of an active data group is not the primary node of its application group.
• A data group has a failed state, an error condition, audit errors, or pending recoveries.
To prevent damage to data in your environment, it is important that you begin by determining which system should be the source for the data groups. Do the following:
1. From this display, use option 8 (Data groups) to check which system is the current source for the data group.
2. Determine which node has the role of current primary for the application group. From the Work with Application Groups display, use option 12 (Node entries), then check the current node role.
If the current primary node is correct and a data group with an incorrect source system is active, end the data group and contact CustomerCare.
If the data groups in question have the correct source system but the primary node for the application group is not correct, you need to change the recovery domain for the application group to make the correct node become primary. Use “Changing the sequence of backup nodes” on page 71.
Once you have ensured that the data groups have the correct source system, resolve any error conditions reported on the Work with Data Groups display.
Note: Not all data groups should necessarily be active. Only the data groups currently being used for data replication should be active. You will need to look at the current node roles and data providers for the node entries to determine which data groups should be active.

*INACTIVE - All replication in the data resource group is inactive. This may be normal if replication was ended to perform certain activities. Use option 8 (Data groups) to access the Work with Data Groups display.

Verifying the sequence of the recovery domain

Ensuring that the sequence of the current backup nodes is set properly is critical to a successful and predictable switch process. The current sequence of backup nodes should match your recovery guidelines.

Do the following to confirm the sequence of the current backup nodes before performing a switch and before removing or restoring a backup node from the cluster.
1. From the MIMIX Intermediate Main Menu, type 5 (Work with application groups) and press Enter.
2. From the Work with Application Groups display, type 12 (Node entries) next to the application group you want and press Enter.
3. The Work with Node Entries display appears, showing current information for the nodes. Confirm that the current backup nodes have the sequence order that you expect.
Note: It is important that you are viewing current information on the status view of the display. Figure 6 shows an example of how the resulting Work with Node Entries display appears with current status information. If you see configured information instead, press F10 (View status).
4. If you need to change the sequence of current backup nodes, use “Changing the sequence of backup nodes” on page 71.

Figure 6. Example of displaying the current sequence information for backup nodes

                          Work with Node Entries
                                                       System:   NODED
Application group . . . . . :   APP1

Type options, press Enter.
  1=Add   2=Change   4=Remove   5=Display   6=Print   9=Start   10=End

                -------------Current-------------
Opt  Node       Role       Sequence  Data Provider  Manager Status
__   ________
__   NODEA      *PRIMARY             *NONE          *ACTIVE
__   NODEB      *BACKUP    1         NODEA          *ACTIVE
__   NODEC      *BACKUP    2         NODEA          *ACTIVE
__   NODED      *BACKUP    3         NODEA          *ACTIVE
                                                              Bottom
Parameters or command
===> ______________________________________________________________
F3=Exit   F4=Prompt   F5=Refresh   F6=Add   F7=Systems   F9=Retrieve
F10=View config   F11=Sort by node   F12=Cancel   F18=Subset   F24=More keys

Changing the sequence of backup nodes

Use this procedure if you need to change the sequence of the current backup nodes. This procedure may change the configured sequence for multiple nodes so that you can achieve the desired sequence for backup nodes. The changes are not effective until Step 5 is performed.

Do the following from an active application group:
1. From the Work with Application Groups display, type 12 (Node entries) next to the application group you want and press Enter.
2. The Work with Node Entries display appears.
Using F10 to toggle between the configuration and status views, confirm that the node with the configured role of *PRIMARY is the same node that is shown as the current *PRIMARY role.
• If the same node is identified as *PRIMARY for the current role and the configured role, skip to Step 4.
• If the configured *PRIMARY node does not match the current *PRIMARY node, perform Step 3 to correct this situation before making any changes to the configured sequence of backup nodes.
Figure 7 is an example of how configuration information appears on the Work with Node Entries display.
3. Perform this step only if you need to correct the configured primary node to match the current primary node. This step will demote the configured primary node to a backup, then promote the correct node to become the configured primary node. Do the following:
a. From the configuration view of the Work with Node Entries display, type 2 (Change) next to the configured primary node and press Enter.
b. On the Change Node Entry (CHGNODE) display, specify *BACKUP for Role and press Enter. Then specify *FIRST for List position and press Enter.
c. On the Work with Node Entries display, press F5 (Refresh) to view changes. All nodes in the configured view should have *BACKUP roles.
d. If necessary, toggle to the status view to confirm which node is the current primary node. Type 2 (Change) next to the current primary node and press Enter.
e. On the Change Node Entry (CHGNODE) display, specify *PRIMARY for Role and press Enter. Then press Enter two more times.
f. On the Work with Node Entries display, press F5 (Refresh) to view changes. You should see the correct node as the configured primary node.
Note: The numbering for the backup sequence may not update; however, the relative order for the configured backup sequence remains unchanged. Gaps in configured sequence numbers are ignored when switching to a backup.
As long as the relative order is correct, it is not necessary to change the configured sequence of backup nodes just to remove gaps in numbering.
g. If the configured backup sequence is what you expect, skip to Step 5 to make the change effective.
4. To change the sequence of backup nodes, do the following:
a. From the configured view of the Work with Node Entries display, type 2 (Change) next to the backup node whose sequence you want to change.
b. On the Change Node Entry (CHGNODE) display, specify *BACKUP for Role and press Enter. Then specify either *FIRST or a number for List position and press Enter.
Note: If you specify a number, it cannot already be used in the configured sequence list.
c. On the Work with Node Entries display, press F5 (Refresh) to view changes.
d. Repeat Step 4 until the correct sequence is shown on the configuration view.
Note: Gaps in configured sequence numbers are ignored when switching to a backup. For example, in a configuration with two backup nodes, there is no operational difference between a backup sequence of 1, 2 and a backup sequence of 2, 5 as long as the same nodes are specified in the same relative order.
5. To make the changes to the backup order effective, do the following:
a. Press F12 (Cancel) to return to the Work with Application Groups display.
b. Type 9 (Start) next to the application group you want and press F4 (Prompt).
c. On the Start Application Group (STRAG) display, specify *CONFIG for Current node roles and press Enter.
d. The Procedure prompt appears. If needed, specify a different value and then press Enter.
6. Confirm that the node entries have changed. Type 12 (Node entries) next to the application group and press Enter. If necessary, use F10 to access the status view. The current backup nodes should be in the new order.

Figure 7. Example of displaying the configured sequence information for backup nodes

                          Work with Node Entries
                                                       System:   NODED
Application group . . . . . :   APP1

Type options, press Enter.
  1=Add   2=Change   4=Remove   5=Display   6=Print   9=Start   10=End

                -----------Configured-----------
Opt  Node       Role       Sequence  Data Provider
__   ________
__   NODEA      *PRIMARY             *PRIMARY
__   NODEB      *BACKUP    1         *PRIMARY
__   NODEC      *BACKUP    4         *PRIMARY
__   NODED      *BACKUP    5         *PRIMARY
                                                              Bottom
Parameters or command
===> ______________________________________________________________
F3=Exit   F4=Prompt   F5=Refresh   F6=Add   F7=Systems   F9=Retrieve
F10=View status   F11=Sort by node   F12=Cancel   F18=Subset   F24=More keys

Examples of changing the backup sequence

The following examples illustrate problems with the current backup sequence and how to correct them.

Example 1 - Changing the backup sequence when primary node is OK

Table 17 shows a four-node environment where the current backup sequence does not reflect the desired behavior in the event of a switch. Also, the relative order of the configured backup sequence does not match the relative order of either the current sequence or the desired sequence.

Table 17. Example 1, showing discrepancies in backup sequences

Desired Order: NODEA as primary, then backups NODED, NODEB, NODEC (the final configured order shown in Table 18)

Initial Values, Example 1 (Work with Node Entries)

Status View
                -----------Current--------------
Opt  Node       Role       Sequence  Data Provider
__   ________
__   NODEA      *PRIMARY             *PRIMARY
__   NODEB      *BACKUP    1         *PRIMARY
__   NODED      *BACKUP    2         *PRIMARY
__   NODEC      *BACKUP    3         *PRIMARY

Configured View
                -----------Configured-----------
Opt  Node       Role       Sequence  Data Provider
__   ________
__   NODEA      *PRIMARY             *PRIMARY
__   NODEC      *BACKUP    1         *PRIMARY
__   NODEB      *BACKUP    2         *PRIMARY
__   NODED      *BACKUP    3         *PRIMARY

Each row in Table 18 shows a change to be made to the nodes on the configured view of the Work with Node Entries display. The rows are in the order that the changes need to occur to correct this example configuration to the desired order.

Table 18. Order in which to change nodes to achieve the desired configuration for example 1

Change 1
Node to Change: NODEB
Change To: Role = *BACKUP, Position = *FIRST
Effect on Configured Order, Example 1 (Configured View):
                -----------Configured-----------
Opt  Node       Role       Sequence  Data Provider
__   ________
__   NODEA      *PRIMARY             *PRIMARY
__   NODEB      *BACKUP    1         *PRIMARY
__   NODEC      *BACKUP    2         *PRIMARY
__   NODED      *BACKUP    3         *PRIMARY
Notes: Intermediate step.

Change 2
Node to Change: NODED
Change To: Role = *BACKUP, Position = *FIRST
Effect on Configured Order, Example 1 (Configured View):
                -----------Configured-----------
Opt  Node       Role       Sequence  Data Provider
__   ________
__   NODEA      *PRIMARY             *PRIMARY
__   NODED      *BACKUP    1         *PRIMARY
__   NODEB      *BACKUP    2         *PRIMARY
__   NODEC      *BACKUP    3         *PRIMARY
Notes: Desired configuration, but it is not effective until STRAG ROLE(*CONFIG) is performed.

Example 2 - Correcting the configured primary node and changing the backup sequence

Table 19 shows a four-node environment where the current backup sequence does not reflect the desired behavior in the event of a switch. Also, the current and configured primary nodes do not match. The configured primary node must be corrected first, before attempting to correct any backup node sequence problems.

Table 19. Example 2, showing discrepancies in primary node and backup sequences

Desired Order: NODEA as primary, then backups NODED, NODEB, NODEC (the final configured order shown in Table 20)

Initial Values, Example 2 (Work with Node Entries)

Status View
                -----------Current--------------
Opt  Node       Role       Sequence  Data Provider
__   ________
__   NODEA      *PRIMARY             *PRIMARY
__   NODEB      *BACKUP    1         *PRIMARY
__   NODED      *BACKUP    2         *PRIMARY
__   NODEC      *BACKUP    3         *PRIMARY

Configured View
                -----------Configured-----------
Opt  Node       Role       Sequence  Data Provider
__   ________
__   NODEB      *PRIMARY             *PRIMARY
__   NODEC      *BACKUP    1         *PRIMARY
__   NODEA      *BACKUP    2         *PRIMARY
__   NODED      *BACKUP    3         *PRIMARY

Each row in Table 20 shows a change to be made to the nodes on the configured view of the Work with Node Entries display. The rows are in the order that the changes need to occur to correct this example configuration to the desired order.

Table 20. Order in which to change nodes to achieve the desired configuration for example 2

Change 1
Node to Change: NODEB
Change To: Role = *BACKUP, Position = *FIRST
Effect on Configured Order, Example 2 (Configured View):
                -----------Configured-----------
Opt  Node       Role       Sequence  Data Provider
__   ________
__   NODEB      *BACKUP    1         *PRIMARY
__   NODEC      *BACKUP    2         *PRIMARY
__   NODEA      *BACKUP    3         *PRIMARY
__   NODED      *BACKUP    4         *PRIMARY
Notes: Intermediate step.

Change 2
Node to Change: NODEA
Change To: Role = *PRIMARY
Effect on Configured Order, Example 2 (Configured View):
                -----------Configured-----------
Opt  Node       Role       Sequence  Data Provider
__   ________
__   NODEA      *PRIMARY             *PRIMARY
__   NODEB      *BACKUP    1         *PRIMARY
__   NODEC      *BACKUP    2         *PRIMARY
__   NODED      *BACKUP    3         *PRIMARY
Notes: Intermediate step, corrects configured *PRIMARY. The sequence number for Backup 3 may appear as 4. The relative order is equivalent.

Change 3
Node to Change: NODED
Change To: Role = *BACKUP, Position = *FIRST
Effect on Configured Order, Example 2 (Configured View):
                -----------Configured-----------
Opt  Node       Role       Sequence  Data Provider
__   ________
__   NODEA      *PRIMARY             *PRIMARY
__   NODED      *BACKUP    1         *PRIMARY
__   NODEB      *BACKUP    2         *PRIMARY
__   NODEC      *BACKUP    3         *PRIMARY
Notes: Desired configuration, but it is not effective until STRAG ROLE(*CONFIG) is performed.
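The order of changes in Table 18 and Table 20 can be traced with a small model. The following Python sketch is illustrative only: it is not part of MIMIX and does not run any MIMIX command. The class and function names (ConfiguredRoles, change_to_backup_first, change_to_primary, example_1, example_2) are hypothetical. The model assumes that a change to Role = *BACKUP with Position = *FIRST moves the node to the front of the configured backup list, and that a node is promoted to *PRIMARY only after the previously configured primary has been demoted, as described in Step 3. It tracks relative order only, not sequence numbers, because gaps in numbering are ignored when switching.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConfiguredRoles:
    """Toy model of the configured view: one primary and an ordered backup list."""
    primary: Optional[str]
    backups: List[str] = field(default_factory=list)

    def change_to_backup_first(self, node: str) -> None:
        """Model a change to Role = *BACKUP, Position = *FIRST."""
        if node == self.primary:
            self.primary = None            # the configured primary is demoted
        elif node in self.backups:
            self.backups.remove(node)      # the backup leaves its old position
        else:
            raise ValueError(f"unknown node: {node}")
        self.backups.insert(0, node)       # the node becomes the first backup

    def change_to_primary(self, node: str) -> None:
        """Model a change to Role = *PRIMARY (assumes the old primary was demoted)."""
        if self.primary is not None:
            raise ValueError("demote the configured primary first")
        if node not in self.backups:
            raise ValueError(f"unknown node: {node}")
        self.backups.remove(node)
        self.primary = node


def example_1() -> ConfiguredRoles:
    """Apply the two changes from Table 18 to the configured view of Table 17."""
    cfg = ConfiguredRoles("NODEA", ["NODEC", "NODEB", "NODED"])
    cfg.change_to_backup_first("NODEB")
    cfg.change_to_backup_first("NODED")
    return cfg


def example_2() -> ConfiguredRoles:
    """Apply the three changes from Table 20 to the configured view of Table 19."""
    cfg = ConfiguredRoles("NODEB", ["NODEC", "NODEA", "NODED"])
    cfg.change_to_backup_first("NODEB")    # all nodes are now backups
    cfg.change_to_primary("NODEA")         # corrects the configured primary
    cfg.change_to_backup_first("NODED")
    return cfg


if __name__ == "__main__":
    for cfg in (example_1(), example_2()):
        print(cfg.primary, cfg.backups)
```

In this model both examples end with NODEA as the configured primary and the backups in the order NODED, NODEB, NODEC, which matches the final rows of Table 18 and Table 20. In MIMIX itself, the new order does not take effect until the application group is started with *CONFIG specified for Current node roles.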
Chapter 4 Working with status of procedures and steps

This chapter describes how to work with procedures and steps. Procedures are used to perform operations for application groups. All procedures are associated with an application group. This chapter does not apply to configurations that do not use application groups.

When working with status of procedures and steps, it is important to understand how multiple jobs are used to process the steps in a procedure. A procedure uses multiple asynchronous jobs to run the programs identified within its steps. Starting a procedure starts one job for the application group and an additional job for each of its data resource groups. These jobs operate independently and persist until the procedure ends. Each persistent job evaluates each step in sequence for work to be performed within its domain. When a job for a data resource group encounters a step that acts on data groups, it spawns an additional job for each subordinate data group. Each spawned data group job performs the work for that step and then ends.

This chapter contains the following topics:
• “Displaying status of procedures” on page 78 describes how to display the status of procedure runs, including the most recent run as well as runs kept for their status history.
• “Resolving problems with procedure status” on page 80 describes the conditions which cause each procedure status value and the actions required to resolve problem statuses. This includes how to resolve procedure inquiry messages and failed or canceled procedures.
• “Displaying status of steps within a procedure run” on page 83 describes how to display status of steps within a procedure as well as the differences between the collapsed and expanded views of the Work with Step Status display.
• “Resolving problems with step status” on page 85 describes the conditions which cause each step status value and the actions required to resolve problem statuses.
This includes how to resolve step inquiry messages and failed or canceled steps.
• “Acknowledging a procedure” on page 89 describes how to manually change a procedure with a status of *CANCELED, *FAILED, or *COMPERR to an acknowledged status.
• “Running a procedure” on page 90 describes how to start a user procedure and the parameter that controls the step at which the procedure begins.
• “Canceling a procedure” on page 92 describes how to cancel an active procedure.

Displaying status of procedures

You can view the status of runs of procedures from the Work with Procedure Status display. The term “the last run” of a procedure refers to the most recently started run of a procedure, which may be in progress or may have completed. Also, the status of other previously performed runs of procedures may be available, subject to the current settings of the Procedure history retention policy.

The Work with Procedure Status display lists procedures in reverse chronological order so that the most recently started procedures are at the top of the list. Procedures that have never been requested to run do not appear on this display.

Figure 8 shows an example of the Work with Procedure Status display subsetted to show only runs of a specific procedure and application group. F11 toggles between a view that shows the Start Time column and a view that shows columns for the duration of the procedure and the node on which the procedure was started.

Timestamps are in the local job time. If you have not already ensured that the systems in your installation use coordinated universal time, see “Setting the system time zone and time” on page 311.

Figure 8. A subsetted view of the Work with Procedure Status display

                        Work with Procedure Status
                                                       System:   SYSTEMA
Type options, press Enter.
  5=Display   6=Print   8=Step status   9=Run   11=Display message
  12=Cancel   13=Change status   14=Resume

Opt  Procedure   App Group   Type       Status       ---Start Time----
__   SWTPLAN     SAMPLEAG    *SWTPLAN   *COMPLETED   03/01/10 11:25:05
__   SWTPLAN     SAMPLEAG    *SWTPLAN   *COMPLETED   03/01/10 11:04:58
                                                              Bottom
Parameters or command
===> ______________________________________________________________
F3=Exit   F4=Prompt   F5=Refresh   F9=Retrieve   F11=Duration   F12=Cancel
F13=Repeat   F18=Subset   F21=Print list

Displaying status of the last run of all procedures

To display the status of the last run of all procedures for an application group, do the following:
1. From the MIMIX Basic Main Menu, select option 1 (Work with application groups).
2. The Work with Application Groups display appears. Type 21 (Procedure status) next to the application group you want and press Enter. The last runs of all procedures for the application group are listed on the Work with Procedure Status display.
3. Locate the procedure you want and check the value of the Status column.

Displaying available status history of procedure runs

To display status of all available runs of a selected procedure, do the following:
1. From the MIMIX Basic Main Menu, select option 1 (Work with application groups).
2. The Work with Application Groups display appears. Type 20 (Procedures) next to the application group you want and press Enter.
3. The Work with Procedures display appears, listing all procedures for the selected application group. Type 14 (Procedure status) next to the procedure you want and press Enter. All available runs for the selected procedure are listed on the Work with Procedure Status display. The most recently started procedure runs are at the top of the list, and may still be active.
4. Locate the run of the procedure you want and check the value of the Status column.
Note: To view status of all runs of all procedures for all application groups, you can do any of the following: press F20 (Procedure status) from the Work with Application Groups display, press F14 (Procedure status) from the Work with Procedures display, or enter the command WRKPROCSTS.

Resolving problems with procedure status

Table 21 identifies the possible status values that can appear on the Work with Procedure Status display and identifies the action to take to resolve reported problems.

Table 21. Procedure status values with action required

Status Value - Description and Action Required (grouped by Category)

Category: Active

*ACTIVE - The procedure is currently running. No steps require attention.

*ATTN - The procedure requires attention. Either there is a step with a status of *MSGW, or there is an active step and one or more steps with step status values of *ATTN, *CANCEL, *FAILED, or *IGNERR.
Action Required: Determine the status of each step and the action required to correct that status. See “Resolving problems with step status” on page 85.

*MSGW - A step within the procedure is waiting for a response to an inquiry message. The procedure cannot process the step or any subsequent steps without a reply to the message.
Action Required: Display and respond to the inquiry message using “Responding to a procedure in *MSGW status” on page 81.

*PENDCNL - A request to cancel the procedure is in progress. When the activity for the steps in progress at the time of the cancel request ends, the procedure status changes to *CANCELED.

*QUEUED - A request to run the procedure is currently waiting on the job queue. When the procedure becomes an active job, the procedure status changes to *ACTIVE.

Category: Resumable

*CANCELED - Either the procedure was canceled and did not complete, or steps within the procedure were canceled as a response to inquiry messages from the steps. The procedure was partially performed.
Action Required: Use “Resolving a *FAILED or *CANCELED procedure status” on page 82 to determine the state of your environment and whether to resume the procedure or to acknowledge its status.

*FAILED - The procedure failed. Jobs for one or more steps had errors. Those steps were configured to end if they failed. The procedure was partially performed.
Action Required: Use “Resolving a *FAILED or *CANCELED procedure status” on page 82 to determine the state of your environment and whether to resume the procedure or to acknowledge its status.

Category: Acknowledged

*ACKCANCEL - The procedure was canceled and a user action acknowledged the cancellation so that the procedure can no longer be resumed.

*ACKFAILED - The procedure failed and a user action acknowledged the failure so that the procedure can no longer be resumed.

*ACKERR - The procedure completed with errors and a user action acknowledged the procedure. It is assumed that the user reviewed the steps with errors. A status of completed with errors is only possible when the steps with errors had been configured (within the procedure) to ignore errors or a user’s response to a step in message wait status was to ignore the error and continue running the procedure. After the step is acknowledged, the procedure status changes to *ACKERR.

Category: Completed

*COMPERR - The procedure completed with errors. One or more steps had errors and were configured to continue processing after an error.
Action Recommended: Investigate the cause of the error and assess its implications.

*COMPLETED - The procedure completed successfully.

Responding to a procedure in *MSGW status

A procedure in *MSGW status is effectively paused at a known point in its processing as a result of a runtime attribute on one of its steps.
The procedure sent an inquiry message because a step specified *MSGW for its Action before step (BEFOREACT) attribute. All jobs for the procedure have completed processing all previous steps and are waiting to run the step’s program. An operator response is required.

To respond to a procedure in *MSGW status, do the following from the Work with Procedure Status display:
1. To see which step is waiting, type 8 (Step status) next to the procedure and press Enter.
2. The Work with Step Status display appears. The information on this display can be used to determine which step is waiting to start. You will see steps with values of *COMP, *IGNERR, or *DSBLD followed by no status for all remaining steps. The first step with no status is the step that is waiting to start. Based on that step, determine how to respond to the message and whether you are ready to respond.
3. You cannot display or respond to the procedure message from the Work with Step Status display. Press F12 to return to the Work with Procedure Status display.
4. Type 11 (Display message) next to the procedure in *MSGW status and press Enter.
5. You will see the message “Procedure name for application group name requires response. (G C).” Do one of the following:
• A response of G (Go) is required to start processing the step. Type G and press Enter.
• A response of C (Cancel) will cancel the procedure. Type C and press Enter.

Resolving a *FAILED or *CANCELED procedure status

When a procedure fails or is canceled, subsequent attempts to run the same procedure will fail until user action is taken. You need to determine the best course of action for your environment based on the implications of the partially performed procedure. This topic will assist you in evaluating the cause of the failure or cancellation, as well as the state of other steps within the procedure.

Important! Steps with failed or canceled jobs need to be resolved.
Other asynchronous jobs may have successfully processed the same step and continued on to process other subsequent steps before the procedure ended. The actions taken by those steps as well as by completed steps which preceded the problem are not reversed. Some steps may not have been processed at all.

Do the following from the Work with Procedure Status display:
1. Type 8 (Step status) next to the *FAILED or *CANCELED run of the procedure and press Enter.
2. The Work with Step Status display appears. Look for steps with a status of *CANCEL, *FAILED, or *ATTN. Also use F7 (Expand) to see status for the jobs which processed the steps.

A procedure with *FAILED status did not complete due to errors. In the collapsed status view, one or more steps will have a status of *ATTN or *FAILED. Other jobs may have processed subsequent steps before the procedure ended. In the expanded view, look for one or more jobs with a status of *FAILED. For detailed information use “Resolving problems with step status” on page 85.

A procedure with *CANCELED status did not complete due to user action. Any of the following may have occurred:
• A user canceled an inquiry message sent by the procedure because a step was configured to wait for a reply before starting. This scenario is identified by the absence of steps with status values of *FAILED, *CANCEL, or *ATTN. Instead, you will see steps with values of *COMP, *IGNERR, or *DSBLD followed by no status for all remaining steps. The first step with no status is the step that waited to start. Continue with Step 3.
• A user canceled an inquiry message sent by a step which had a job that ended in error. At least one step in the collapsed view will have a status of *ATTN or *CANCEL. One or more steps will have a job with a status of *CANCEL in the expanded view. Other jobs may have processed subsequent steps before the procedure ended. For detailed information use “Resolving problems with step status” on page 85.
• A user canceled the procedure by using option 12 (Cancel) from the Work with Procedure Status display or by using the Cancel Procedure (CNLPROC) command. Steps in the collapsed view could have any status except *ACTIVE or *MSGW. Determine if there are any jobs with status values of *FAILED or *CANCEL in the expanded view. Other jobs may have processed subsequent steps before the procedure ended. For detailed information use “Resolving problems with step status” on page 85.
3. After you have completed your evaluation and have taken any needed corrective action to resolve why jobs failed or were canceled, determine how to best complete the procedure. Choices are:
• Resume the procedure. If you resume a failed procedure, processing will begin with the step that failed. If you resume a canceled procedure, processing will begin with steps following the canceled step. Optionally, if you were unable to resolve a problem for a step in error, you can override the attributes of that step for when the procedure is resumed. See “Resuming a procedure” on page 91.
• Acknowledge the procedure status. Procedures with a status of *CANCELED or *FAILED can be acknowledged (set to *ACKCANCEL or *ACKFAILED, respectively) to indicate you have investigated the problem steps and want to run the procedure again starting at its first step. This option should only be used after you have evaluated the effect of activity performed by the procedure. See “Acknowledging a procedure” on page 89.

Displaying status of steps within a procedure run

The Work with Step Status display provides access to detailed information about status of steps for a specific run of a procedure for an application group. Timestamps are in the local job time. If you have not already ensured that the systems in your installation use coordinated universal time, see the MIMIX Administrator Reference book for the setting system time topic.
To display step status for a procedure run, do the following:
1. Use one of the following to access the run of the procedure you want:
• “Displaying status of the last run of all procedures” on page 78
• “Displaying available status history of procedure runs” on page 79
2. From the Work with Procedure Status display, type 8 (Step status) next to the run of the procedure you want and press Enter.
3. Press F7 (Expand) to view status of the individual jobs used to process each step.

The steps listed on the Work with Step Status display appear in sequence number order as defined by steps in the procedure. If the procedure is in progress, the display shows status for the steps that have run, the start time and status of the step that is in progress, and blank status and start time for steps that have not yet run.

Collapsed view - Figure 9 shows the initial collapsed view of the Work with Step Status display. In this view, each step of the procedure is shown as a single row and step status represents the summary of all jobs used by the step.

Figure 9. Collapsed view of the Work with Step Status display.

                            Work with Step Status
                                                        System:   SYSTEMA
Procedure:  SWTPLAN       App. group:  SAMPLEAG       Type:  *SWTPLAN
Procedure status:  *COMPLETED       Start time:  03/01/10 11:04:58

Type options, press Enter.
  5=Display   6=Print   8=Work with job   11=Display message

Opt  Step Program  Type     Node Type  Start Time  Duration  Status  Jobs Pend
__   MXCHKCOM      *AGDFN   *LOCAL     11:05:00    00:00:01  *COMP   *NO
__   MXCHKCFG      *DGDFN   *NEWPRIM   11:05:00    00:00:01  *COMP   *NO
__   ENDUSRAPP     *AGDFN   *PRIMARY   11:05:00    00:00:03  *COMP   *NO
__   MXENDDG       *DGDFN   *NEWPRIM   11:05:01    00:00:05  *COMP   *NO
__   MXENDRJLNK    *DGDFN   *NEWPRIM   11:05:16    00:00:01  *COMP   *NO
__   MXAUDACT      *DGDFN   *NEWPRIM   11:05:18    00:00:01  *COMP   *NO
__   MXAUDCMPLY    *DGDFN   *NEWPRIM   11:05:19    00:00:01  *COMP   *NO
__   MXAUDDIFF     *DGDFN   *NEWPRIM   11:06:10    00:00:54  *COMP   *NO
                                                                    More...
Parameters or command
===> _________________________________________________________________________
F3=Exit   F4=Prompt   F5=Refresh   F7=Expand   F9=Retrieve   F12=Cancel
F13=Repeat   F15=Cancel proc.   F18=Subset   F21=Print list

Expanded view - Figure 10 shows an example of an expanded view. In the expanded view, step programs of type *AGDFN will have one row for each node on which the step runs. Steps which run step programs at the level of the data resource group or data group are expanded to have multiple rows so that the status of the step for each data resource group or data group is visible. For step programs of type *DTARSCGRP, there will be a summary row for the application group followed by a row for each data resource group within the application group. For step programs of type *DGDFN, there will be a summary row for the application group, then for each data resource group, there is a summary row for the data resource group followed by a row for each of its data groups. Summary rows are identified by a dash (-) in the columns that are being summarized.

Also, for step programs of type *AGDFN, the Data Rsc. Grp. column and the Data Group column will always be blank. For step programs of type *DTARSCGRP, the Data Group column will always be blank.

Figure 10. Expanded view of the Work with Step Status display.

                            Work with Step Status
                                                        System:   SYSTEMA
Procedure:  SWTPLAN       App. group:  SAMPLEAG       Type:  *SWTPLAN
Procedure status:  *COMPLETED       Start time:  03/01/10 11:04:58

Type options, press Enter.
  5=Display   6=Print   8=Work with job   11=Display message

Opt  Step Program  Data Rsc. Grp.  Data Group  Node     Start Time  Duration  Status
__   MXCHKCOM                                  LTIAS01  11:05:00    00:00:01  *COMP
__   MXCHKCFG      -               -           LTIAS02  11:05:00    00:00:01  *COMP
__   MXCHKCFG      DRG1            -           LTIAS02  11:05:00    00:00:01  *COMP
__   MXCHKCFG      DRG1            DG1A        LTIAS02  11:05:00    00:00:01  *COMP
__   MXCHKCFG      DRG1            DG1B        LTIAS02  11:05:00    00:00:01  *COMP
__   MXCHKCFG      DRG1            DG1C        LTIAS02  11:05:00    00:00:01  *COMP
__   MXCHKCFG      DRG2            -           LTIAS02  11:05:00    00:00:01  *COMP
__   MXCHKCFG      DRG2            DG2A        LTIAS02  11:05:00    00:00:01  *COMP
                                                                    More...
Parameters or command
===> _________________________________________________________________________
F3=Exit   F4=Prompt   F5=Refresh   F7=Expand   F9=Retrieve   F12=Cancel
F13=Repeat   F15=Cancel proc.   F18=Subset   F21=Print list

Resolving problems with step status

When working with step status, it is important that you understand how multiple jobs are used to process the steps in a procedure. At any given time, job activity may be in progress for multiple steps. Or, one job may have failed processing a step while other jobs may have already processed that step and continued beyond it.

Important! Before you take action to resolve a problem with status for a step, be sure you understand the current state of your environment as a result of completed steps and steps in progress, as well as the effect of any action you take.

Table 22 identifies the possible status values that can appear on the Work with Step Status display and the action to take to resolve reported problems.

Table 22. Step status values with action required
Status Value / Description and Action Required

blank
The procedure has started but processing has not yet started for the step.

*ATTN
The step requires attention. The value *ATTN can only appear in the collapsed view or on a summary row in the expanded view. If the procedure status is considered active, at least one job submitted by this step has a status of *FAILED, *CANCEL, or *MSGW.
If the procedure status is *FAILED or *CANCELED, this step has at least one job that has not started or has a status of *CANCEL or *FAILED.
Action Required: Use F7 to see the expanded view. Determine the specific data resource group or data group for which the problem status exists. Then address the status indicated for that job.

*ACTIVE
The step is currently running.

*COMP
The step has successfully completed.

*DSBLD
The step has been disabled and did not run.

*CANCEL or *FAILED
One or more jobs used by the step ended in error. In the expanded view of status, the job is identified as *CANCEL or *FAILED. The status is due to the error action specified for the step.
• For *CANCEL status, user action canceled the step. The step ran, ended in error, and issued an inquiry message. The user’s response to the message was Cancel.
• For *FAILED status, the step ran and one or more jobs ended in error. The Action on error attribute specified to quit the job.
The type of step program used by the step determines what happens to other jobs for the step and whether subsequent steps are prevented from starting, as follows:
• If the step program is of type *DGDFN, jobs that are processing other data groups within the same data resource group continue. When they complete, the data resource group job ends. Subsequent steps that apply to that data resource group or its data groups will not be started. However, subsequent steps will still be processed for other data resource groups and their data groups.
• If the step program is of type *DTARSCGRP, subsequent steps that apply to that data resource group or its data groups will not be started. Jobs for other data resource groups may still be running and will process subsequent steps that apply to their data resource groups and data groups.
• If the step program is of type *AGDFN, subsequent steps that apply to the application group will not be started.
Jobs for data resource group or data group steps may still be running and will process subsequent steps that apply to their data resource groups and data groups.
When all asynchronous jobs for the procedure finish, the procedure status is set to *CANCELED or *FAILED, accordingly. If both canceled and failed steps exist when the procedure ends, the procedure status will be *FAILED.
Action Required: Determine the cause of the problem using “Resolving *CANCEL or *FAILED step statuses” on page 88.

*IGNERR
The step ran and an error occurred, but processing ignored the error and continued.
Action Recommended: Use option 8 (Work with job) to determine the cause of the failure. Consider whether any changes are needed to your procedure or step or to your operating environment to prevent this error from occurring again.

*MSGW
The step ran and issued a message that is waiting to be answered. One or more jobs for the step ended in error. Step attributes require that an operator respond to the message.
Action Required: Determine which job issued the message, investigate the problem, and then respond to the inquiry message using “Responding to a step with a *MSGW status” on page 87.

Responding to a step with a *MSGW status

When a step or a job for a step has a status of *MSGW, it is the result of an error condition. An inquiry message was sent because the step specified *MSGW for its Action on error attribute. An operator response is required before any additional processing for the job can occur.

To respond to a step in *MSGW status, do the following from the Work with Step Status display:
1. To see which job is waiting, use F7 to view the Expanded view.
2. To view information about what caused the job to end in error, type 8 (Work with job) next to the job with *MSGW status and press Enter.
3. On the Work with Job display, type 10 (Display job log, if active, on job queue, or pending) and press Enter.
4. The job log is displayed. Use F1 to view details of any of the messages. Find the error that caused the job to end. You will see the inquiry message in the job log; however, you cannot respond to it from here.
5. Press F12 twice to return to the Work with Step Status display.
6. Type 11 (Display message) next to the step job in *MSGW status and press Enter.
7. You will see the message “Error in step at sequence number number in procedure name. (R C I).” Do one of the following:
• A response of R (Retry) will retry processing the step program within the same job. Type R and press Enter.
• A response of C (Cancel) will set the job status to *CANCEL as indicated in the expanded view of step status. Subsequent steps are handled in the same manner as if the Action on error had specified the value *QUIT. Type C and press Enter.
• A response of I (Ignore) will set the job status to *IGNERR as indicated in the expanded view of step status, and processing continues as if the job had not ended in error. Type I and press Enter.

Resolving *CANCEL or *FAILED step statuses

Evaluate the cause of the failure or cancellation, as well as the state of other steps within the procedure. All steps with failed or canceled jobs need to be resolved.

Important! For any step which ended in error, other asynchronous jobs may have successfully processed the same step and continued on to process other subsequent steps. The actions taken by those steps as well as by completed steps which preceded the problem cannot be reversed.

Do the following from the Work with Step Status display:
1. Use F7 to view the Expanded view.
2. All steps which have a job with a status of *CANCEL or *FAILED must be evaluated and the cause of the problem must be resolved. To view information about why a job had an error processing a step, do the following:
a. Type 8 (Work with job) next to the job you want and press Enter.
b. On the Work with Job display, type 4 (Work with spooled file) and press Enter.
c. Display the spooled file for the job and check for the cause of the error.
d. Evaluate whether any immediate action is needed due to the condition which caused the error. Consider the nature and severity of the error.
3. If the procedure is still active and you need to take corrective action or perform additional investigation, cancel the procedure using F15 (Cancel proc.). Any steps that are currently running will complete, then the procedure status is set to *CANCELED.
4. Check which steps have completed, failed, were canceled, or have not yet started. Then evaluate the current state of your environment as a result. If needed, take corrective action that is appropriate for the extent of the errors and the extent to which steps completed.
Note: It is strongly recommended that you cancel the procedure, if it is active, before attempting any corrective action.
5. Determine how to best complete the procedure in the current state of your environment. When the procedure is *FAILED or *CANCELED, your choices are:
• Resume the procedure from the point where the procedure ended. If you resume a failed procedure, processing will begin with the step that failed. If you resume a canceled procedure, processing will begin with steps following the canceled step. Optionally, if you were unable to resolve a problem for a step in error, you can override the attributes of that step for when the procedure is resumed. See “Resuming a procedure” on page 91.
• Acknowledge the procedure status. Acknowledging a *CANCELED or *FAILED procedure indicates you have investigated the problem steps and want to run the procedure again starting at its first step.
This option should only be used after you have evaluated the effect of activity performed by the procedure. See “Acknowledging a procedure” on page 89.

Acknowledging a procedure

Acknowledging a procedure allows you to manually change the status of procedures that either failed or have errors in order to control where the next attempt to run the procedure will start.

Procedures with a status of *CANCELED, *FAILED, or *COMPERR can be acknowledged (set to *ACKCANCEL, *ACKFAILED, or *ACKERR, respectively) to indicate you have investigated the problem steps.

Acknowledging a procedure with a status of *CANCELED or *FAILED allows you to rerun the procedure from its first step. Once acknowledged, a procedure with either of these statuses cannot be resumed from the point where the procedure ended. This is appropriate when you have determined that your environment will not be harmed if the next attempt to run starts at the first step.

A *COMPERR procedure that is acknowledged (*ACKERR) can never be resumed because the procedure completed. By acknowledging a procedure with this status, you are confirming the problems have been reviewed.

The last run of a procedure with a status of *ACKCANCEL or *ACKFAILED and the last run of the set of start/end/switch procedures can be returned to their previous status (*CANCELED or *FAILED, respectively). The next attempt to run the procedure will resume at the failed or canceled step or at the first step that has not been started.

Note: Acknowledging the last run of a failed or canceled procedure will acknowledge all previous failed or canceled runs of the procedure.

Important! Before changing status of a procedure, it is important that you evaluate and understand the effect of the partially performed procedure on your environment. Changing procedure status does not reverse the actions taken by preceding steps that completed or the actions performed by other asynchronous jobs which did complete the same step and then processed subsequent steps.
It may not be appropriate for the next run of the procedure to begin with the first step, for example, if the failure occurred in a step which synchronizes data or changes states of MIMIX processes. Likewise, it may not be appropriate to return to the previous status to resume a procedure run that was not recently run.

To change the status of a procedure, do the following:
1. From the Work with Procedure Status display, type 13 (Change status) next to the failed or canceled procedure you want and press Enter.
2. The Change Procedure Status (CHGPROCSTS) display appears. Specify the value you want for the Status prompt and press Enter.
3. If you specified *ACK in Step 2, the Start time prompt appears, displaying the timestamp of the selected procedure run. Do one of the following:
• To acknowledge only the selected failed or canceled run, press Enter.
• To acknowledge all previously failed or canceled runs of the selected procedure, specify *ALL for Start time and press Enter.

Running a procedure

The procedure type determines what command to use to run the procedure.

For an application group, multiple procedures of type *USER can run at the same time if they have unique names. Only one run of a uniquely named procedure of type *USER can occur at a time.

All other procedure types must be invoked by the application group command associated with the procedure type. For example, a procedure of type *START can only be invoked by the Start Application Group (STRAG) command.

Where should the procedure begin?

The value specified for the Begin at step (STEP) parameter on the request to run the procedure determines the step at which the procedure will start. The status of the last run of the procedure determines which values are valid.

The default value, *FIRST, will start the specified procedure at its first step.
This value can be used when the procedure has never been run, when its previous run completed (*COMPLETED or *COMPERR), or when a user acknowledged the status of its previous run which failed, was canceled, or completed with errors (*ACKFAILED, *ACKCANCEL, or *ACKERR, respectively).

Other values are for resolving problems with a failed or canceled procedure. When a procedure fails or is canceled, subsequent attempts to run the same procedure will fail until user action is taken. You will need to determine the best course of action for your environment based on the implications of the canceled or failed steps and any steps which completed.

The value *RESUME will start the last run of the procedure beginning with the step at which it failed, the step that was canceled in response to an error, or the step following where the procedure was canceled. The value *RESUME may be appropriate after you have investigated and resolved the problem which caused the procedure to end. Optionally, if the problem cannot be resolved and you want to resume the procedure anyway, you can override the attributes of a step before resuming the procedure.

The value *OVERRIDE will override the status of all runs of the specified procedure that did not complete. The *FAILED or *CANCELED status of these runs is changed to acknowledged (*ACKFAILED or *ACKCANCEL) and a new run of the procedure begins at the first step.

To run a procedure of type *USER, do the following:
1. From the Work with Procedures or Work with Procedure Status display, type 9 (Run) next to the user procedure you want and press F4 (Prompt).
2. Specify the value you want for Begin at step and press Enter.

To run a procedure type other than *USER, do the following:
From a command line, enter the application group command associated with the procedure type. For example, a procedure of type *START can only be invoked by the Start Application Group (STRAG) command.
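The rules for the Begin at step (STEP) parameter described under “Where should the procedure begin?” can be summarized in a short sketch. The following Python is illustrative only and is not part of MIMIX; the function names are hypothetical. It assumes that *RESUME and *OVERRIDE apply only when the last run is *FAILED or *CANCELED, which is how the text describes them, and it does not model procedures that are still active.

```python
from typing import FrozenSet, Optional

# Last-run statuses after which the text says *FIRST can be used.
FIRST_ALLOWED = frozenset(
    {"*COMPLETED", "*COMPERR", "*ACKFAILED", "*ACKCANCEL", "*ACKERR"}
)
# Last-run statuses that require user action before the procedure can run.
NEEDS_USER_ACTION = frozenset({"*FAILED", "*CANCELED"})


def valid_begin_at_values(last_run_status: Optional[str]) -> FrozenSet[str]:
    """Return the Begin at step values usable for a given last-run status.

    None means the procedure has never been run.
    """
    if last_run_status is None or last_run_status in FIRST_ALLOWED:
        return frozenset({"*FIRST"})
    if last_run_status in NEEDS_USER_ACTION:
        # Assumption: only these two values apply to a failed or canceled run.
        return frozenset({"*RESUME", "*OVERRIDE"})
    # Active statuses are not covered by this part of the text.
    return frozenset()


def status_after_override(status: str) -> str:
    """Status that *OVERRIDE assigns to a run that did not complete."""
    return {"*FAILED": "*ACKFAILED", "*CANCELED": "*ACKCANCEL"}.get(status, status)
```

For example, `valid_begin_at_values("*FAILED")` returns the set containing *RESUME and *OVERRIDE, reflecting that a request using *FIRST would fail until the run is resumed, overridden, or acknowledged.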
To resume a procedure with a status of *CANCELED or *FAILED, see “Resuming a procedure” on page 91.

Resuming a procedure

To resume a procedure with a status of *CANCELED or *FAILED, do the following:
1. Investigate and resolve problems for steps with errors. See “Resolving problems with step status” on page 85.
2. Optional: If the problem cannot be resolved, and you want to resume the procedure anyway, use the Override Step (OVRSTEP) command to change the configured value of the step for when the procedure is resumed. See “Overriding the attributes of a step” on page 91.
3. For a procedure of type *USER, from the Work with Step Status display use F14 (Resume proc.). For all other procedure types, from a command line, enter the appropriate application group command and specify *RESUME as the value for Begin at step (STEP).

Overriding the attributes of a step

The attributes of a step can be overridden to change the configured value of the step for the current run of the procedure by using the Override Step (OVRSTEP) command. The attributes determine whether the step is run and what action is taken if the step ends in error for the current run of the procedure when it is resumed.

The OVRSTEP command can be used for a procedure that has a status considered active (*ACTIVE, *ATTN, *MSGW, *PENDCNL, or *QUEUED), *CANCELED, or *FAILED, and for steps that have a status of *CANCEL or *FAILED. The overridden values apply only for the current run of the procedure when it is resumed.

Note: Regardless of procedure status, attributes cannot be overridden for a required MIMIX step or any step with a step status of *COMP or *IGNERR.

A procedure with a status of *CANCELED or *FAILED requires user action to resolve a problem. If the problem cannot be resolved and you want to resume the procedure anyway, you can use the OVRSTEP command to disable the step in error or specify the error action to occur when the step is retried.

Important!
Overriding the attributes of a step should only be done after you have considered how rerunning the step impacts your environment. It is important that you understand the implications for steps which preceded the cancellation or failure in the last run of the procedure. Processing for steps that completed is not reversed.

The changes made when using the OVRSTEP command will only apply to the current run of the procedure. The attributes that can be changed will vary depending on the statuses of the specified procedure and step. Consider the following:
• When the specified procedure has a status of *ACKCANCEL, *ACKFAILED, *ACKERR, *COMPLETED, or *COMPERR, no attributes can be overridden on any step in the procedure.
• When the specified procedure has a status that is considered active (*ACTIVE, *ATTN, *MSGW, *PENDCNL, or *QUEUED), only the Action on error (ERRACT) can be overridden.
• When the specified procedure has a status that can be resumed (*CANCELED or *FAILED), the Action before step (BEFOREACT), Action on error (ERRACT), or State (STATE) can be overridden only on steps that have not yet run, that failed, or that were canceled.

Do the following from the Work with Step Status display:
1. Press F7 (Expand) to view status of the individual jobs used to process each step.
2. Type 13 (Override step) next to the step you want and press Enter.
3. On the Override Step (OVRSTEP) display, specify the values you want and press Enter.

From the Work with Step Status display, use F14 (Resume proc.) to resume the procedure. See “Resuming a procedure” on page 91.

Canceling a procedure

Use these instructions to cancel a procedure with a status that is considered active. This includes procedure statuses of *ACTIVE, *ATTN, *MSGW, *PENDCNL, and *QUEUED.

Important! Use this command with caution. Processing ends without reversing any actions performed by completed steps, which may leave your environment in an undesirable state.
For example, ending a switch procedure could result in partially switched data.

The status of the procedure will be changed to *PENDCNL. If there are any inquiry messages waiting for an operator response, they are processed as if the response was Cancel. When all activity for currently running steps ends, the status of the procedure will be automatically changed to *CANCELED.

To cancel an active procedure, do one of the following:
• From the Work with Procedure Status display, type 12 (Cancel) next to the procedure you want and press Enter.
• From the Work with Step Status display, press F15 (Cancel proc.).

A procedure that has been canceled can be resumed later, as long as its status has not been changed to *ACKCANCEL. When a canceled procedure is resumed, processing begins immediately after the point where it was ended.

CHAPTER 5 Monitoring status with MIMIX Availability Status

The MIMIX Availability Status display is useful in environments that do not use application groups.

Note: The MIMIX Availability Status display should not be used in environments that use application groups.

The MIMIX Availability Status display, shown in Figure 11, provides one location for quickly assessing the overall state of an entire MIMIX installation. The status values are prioritized and are a composite view reflecting both source and target systems. In addition to determining status, unique features of this display enable its use as the starting point for performing routine actions and resolving problems.

To access this display, do one of the following:
• Select option 1 on the MIMIX Basic Main Menu.
• Enter the command WRKMMXSTS and press Enter.

Figure 11 shows the MIMIX Availability Status display.

Figure 11. MIMIX Availability Status window. This example shows that MIMIX is active but the installation is not complying with best practices for switching (red) and audits (yellow).
Additional fields - In the upper right corner of the display, additional fields report information that is relevant to maintaining the installation.

Recoveries - Identifies the total number of recoveries in progress for the installation. Active recoveries represent problems detected and being corrected by MIMIX AutoGuard. Before certain activity, such as ending MIMIX, it is important that there are no recoveries in progress in the installation. If more than 9999 recoveries exist, the field displays ++++.

Last switch - This field is only displayed when there is a value specified for the Default model switch framework policy. The date indicates when the last completed switch was performed using the switch framework specified in the policy. If you have not yet performed a switch using the switch framework defined in policies, this date is when the MIMIX environment was first started or when the system managers were started and explicitly reset the configuration.

Activity/Status - The main area of the display provides a reporting area for status of activity in key areas: Replication, Audits and notifications, and Services. For each activity area, status represents a summation of multiple processes. The text shown within each activity area changes to identify the most severe problem within its processes. Text, as well as background color, also identifies the summarized status and indicates what action is appropriate. Blue indicates there are no problems with the activity and that no action is required. Yellow indicates warnings that may need your attention. Red indicates errors or inactive processes that require immediate action.

Options - On this display, the activity you select with an option and the status of the activity determine what you see as the result of using the option. This behavior is unlike that of options on other MIMIX displays. The following subtopics describe the results of using the available options.
Option 5 (Display details) from the MIMIX Availability Status display results in a display showing detailed status for the selected activity. Take option 5 next to the item to access detailed information for the activity.
• For Replication, the result is the Work with Data Groups display.
• For Audits and notifications, the result is the Summary view of the Work with Audits display. (To see details for notifications, press F20 (Command line), then enter the command WRKNFY.)
• For Services, the result is the Work with Systems display for status of the MIMIX managers. (To see details for monitors, press F4 (MIMIX Menu), then use option 12 (Work with monitors).)

Option 9 (Troubleshoot) from the MIMIX Availability Status display results in the appropriate display to use as a starting point for troubleshooting the stated problem for the selected activity. The stated problem reflects the highest severity problem present. Other less severe problems may exist. They may be reflected on the subsequent display but will not be reflected on the MIMIX Availability Status display until higher severity problems are resolved. Take option 9 next to the item to access detailed information for the activity.
• For Replication, the result is the Work with Data Groups display.
• For Audits and notifications, the result is dependent on the severity of the stated problem. All auditing conditions are prioritized before any notifications. For audits with status conditions, the result is the Summary view of the Work with Audits display. For audits with compliance conditions, the result is the Compliance view of the Work with Audits display. For notifications with errors, the result is the Work with Notifications display.
• For Services, the result is dependent on the severity of the stated problem. All system manager, journal manager, and target journal inspection errors are prioritized before any monitor errors.
For system manager, journal manager, and target journal inspection errors, the result is the Work with Systems display. For monitor errors, the result is the Work with Monitors display.

Checking replication status from the MIMIX Availability Status display

The first activity listed on the MIMIX Availability Status display is Replication, as shown in Figure 11. The replication area summarizes status of replication activity for all data groups in the installation. This includes processes required for replication and also reflects potential problems. Status values are shown by color while message text within the highlighted area indicates the nature of any problem.

Blue - There are no problems with replication processes and no action is required.

Yellow - Warnings exist that may need your attention. Possible causes include:
• A file is being synchronized by MIMIX AutoGuard. This condition usually resolves itself.
• A process has a backlog which has reached its threshold.
• An object on the target system is not journaled as expected.
• Journal state or cache is not as expected.

Red - Conditions exist that require immediate action or a switch is in progress. Possible scenarios that require immediate action include:
• Error conditions
• Processes required for replication are not active
• Some objects are not journaled and therefore cannot be replicated
• Journal state or cache is not as expected

Status may change due to warnings or problems with any of the replication processes, with replication errors associated with data group entries (file, object, IFS tracking, and object tracking), or with a change in switch status.

To begin resolving problems, use option 9 (Troubleshoot) to access the Work with Data Groups display, from which you can view detailed information and take action. See “The Work with Data Groups display” on page 99 for more information.
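The way an activity area reports one color for many processes can be pictured as taking the most severe status present. The following is a minimal Python sketch of that idea only. It is not MIMIX code, and the `Status` enum, the `summarize` function, and the sample per-data-group statuses are all hypothetical names and values chosen for illustration.

```python
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical severity ranking for the three colors described above."""
    BLUE = 0    # no problems, no action required
    YELLOW = 1  # warnings that may need attention
    RED = 2     # conditions that require immediate action


def summarize(statuses):
    """Return the most severe status found, or BLUE if nothing is reported."""
    return max(statuses, default=Status.BLUE)


# Illustrative per-data-group statuses (assumed values, not real output).
data_groups = {
    "APP1": Status.BLUE,
    "APP3": Status.YELLOW,
    "CRITICALAP": Status.RED,
}

overall = summarize(data_groups.values())
print(overall.name)  # RED
```

In this sketch a single red data group is enough to turn the whole Replication area red, which mirrors why less severe problems only become visible on the summary display after the more severe ones are resolved.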
Note: Replication status can indicate action required (red) while a switch is in progress. When you are ready to switch from the backup system to the production system, press F4 (MIMIX Menu). From there, use option 5 to continue switching.

Checking audit and notification status from the MIMIX Availability Status display

The middle activity listed on the MIMIX Availability Status display is Audits and notifications, as shown in Figure 11. This activity area summarizes status of all audit activity, problems with audit results, audit compliance, and new notifications for a MIMIX installation. Status values are shown by color while message text within the highlighted area indicates the nature of any problem.

Blue - No action is required. No audits are active, have differences, or are out of compliance, and there are no new error or warning notifications.

Yellow - An audit or notification may need your attention. An out-of-compliance audit is running its compare phase, an audit is approaching an out-of-compliance state, or a new warning notification exists.

Red - A condition exists that requires immediate action. An audit has failed, has unresolved differences, is out of compliance, or was prevented from running because of policy values, or a new error notification exists.

Status may change due to the highest severity condition with audits, audit results, audit compliance, or new notifications.

To begin resolving problems, use option 9 (Troubleshoot) to access the appropriate display for the indicated problem.
• For audit status problems, see “Resolving audit problems” on page 133.
• To resolve audit compliance problems, the audits must be run. See “Running an audit immediately” on page 131 and “Displaying audit compliance” on page 144.
• For additional information about notifications, see “Displaying notifications” on page 160.
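Which display option 9 opens for the Audits and notifications area depends on the stated problem. The Python sketch below restates that routing as a small function. It is an illustration only: the function name and its boolean parameters are hypothetical, and the ordering of audit status conditions ahead of audit compliance conditions is an assumption. The guide only states that all auditing conditions are prioritized before any notifications.

```python
def option9_display(audit_status_problem, audit_compliance_problem,
                    notification_error):
    """Return the name of the display that option 9 would lead to.

    Audit conditions are checked before notifications, as the guide
    describes. Checking status before compliance is an assumption.
    """
    if audit_status_problem:
        return "Work with Audits (Summary view)"
    if audit_compliance_problem:
        return "Work with Audits (Compliance view)"
    if notification_error:
        return "Work with Notifications"
    return None  # nothing to troubleshoot in this activity area


# An error notification is only reached once no audit condition remains.
print(option9_display(False, True, True))   # Work with Audits (Compliance view)
print(option9_display(False, False, True))  # Work with Notifications
```

The practical consequence is that a new error notification may not be the destination of option 9 while any audit problem is still outstanding.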
Checking status of supporting services from the MIMIX Availability Status display

The last activity listed on the MIMIX Availability Status display is Services, as shown in Figure 11. This area summarizes status and also reflects potential problems with system managers, journal managers, target journal inspection, collector services, and all enabled monitors for the installation. Status values are shown by color while message text within the highlighted area indicates the nature of any problem.

Blue - There are no problems for the managers, target journal inspection, collector services, and monitors. No action is required.

Red - A system manager, journal manager, target journal inspection, collector service, or a monitor is in a state that requires immediate action. The status text indicates which problem occurred and where you can see detailed information.

To begin resolving problems, use option 9 (Troubleshoot) to access the appropriate display. When the text in the Services area indicates a problem with system managers, journal managers, target journal inspection, or collector services, option 9 will access the Work with Systems display, from which you can view detailed information and take action. See “Working with system-level processes” on page 149 for more information. When the text in the Services area indicates a problem with a monitor, option 9 will access the Work with Monitors display. For more information about working with monitors, see the Using MIMIX Monitor book.

CHAPTER 6 Working with data group status

This chapter describes common MIMIX operations that help keep your MIMIX environment running. In order for MIMIX to provide a hot backup of your critical information, all processes associated with replication must be active at all times. Supporting service jobs must also be active. MIMIX allows you to display and monitor the statuses of these processes.
The topics included in this chapter are:
• “The Work with Data Groups display” on page 99 describes the errors reported on this display and provides procedures for resolving them.
• “Working with the detailed status of data groups” on page 105 describes how to access detailed status for a data group.
• “Identifying replication processes with backlogs” on page 115 describes what fields to check for detailed status of a data group.

The Work with Data Groups display

From the Work with Data Groups display you can start and end replication, track replication status, and perform a data group switch. You can also work with files, objects, and tracking entries in error, and access displays for data group entries and tracking entries.

Do one of the following to access the Work with Data Groups display:
• From the MIMIX Intermediate Main menu, select option 1 (Work with data groups) and press Enter.
• From the MIMIX Availability Status display, type 5 (Display details) next to Replication and press Enter.

Figure 12. Sample Work with Data Groups display. The display uses letters and colored highlighting to call your attention to warning and problem conditions. This example shows items in color which would appear with color highlighting on the display. If you are viewing this page in printed form, the color may not be shown.

Work with Data Groups                                CHICAGO 11:02:05
Type options, press Enter.         Audits/Recov./Notif.: 001 / 002 / 003
5=Display definition 8=Display status 9=Start DG 10=End DG
12=Files needing attention 13=Objects in error 14=Active objects
15=Planned switch 16=Unplanned switch ...
---------Source-------- --------Target-------- Errors
Opt Data Group System Mgr DB Obj DA System Mgr DB Obj DB Obj
__ APP1 LONDON A I CHICAGO A I
__ APP2 LONDON A A A CHICAGO A A A
__ APP3 LONDON A I CHICAGO A I 2
__ CRITICALAP LONDON A R A A CHICAGO A A A 1 4
__ RJAPP4 LONDON A L CHICAGO A I
Bottom
F3=Exit F5=Refresh F7=Audits F8=Recoveries F9=Automatic refresh
F10=Legend F13=Repeat F16=DG definitions F23=More options F24=More keys

For each data group listed, you can see the current source system and target system processes, and the number of errors reported. The following fields and columns are available.

Audits/Recov./Notif. - This field is located in the upper right corner of the Work with Data Groups display. The first number is the total number of audits that require action to correct a problem or that require your attention to prevent a situation from becoming a problem. The second number indicates the number of active recoveries, including those resulting from audits. The third number indicates the number of new notifications that require action or attention. If more than 999 items exist in any field, the field will display +++. When a field is highlighted in red, a problem exists. When a field is highlighted in yellow, at least one out-of-compliance audit is currently active or an audit is approaching out of compliance. For details, see “Problems reflected in the Audits/Recov./Notif. field” on page 101.

Data group - When a data group name is highlighted, a problem exists. For details, see “Problems reflected in the Data Group column” on page 101.

Source - The following columns provide summaries of processes that run on the source system. For details about status values, see “Replication problems reflected in the Source and Target columns” on page 103.

Mgr - Represents a summation of the system manager and the journal manager processes on the source system of the data group.
DB - Represents the status of the remote journal link. It is possible to have an active status in this column even though the data group has not been started. When the RJ link is active, database changes will continue to be sent to the target system. MIMIX can read and apply these changes once the data group is started. For data groups configured for source-send replication, this represents the status of the database send process.

Obj - Represents a summation of the object processes that run on the source system. These include the object send, object retrieve, and container send processes.

DA - This column represents the status of the data area polling process when the data group replicates data areas through the data area poller. This column does not contain data when data areas are replicated through the user journal with advanced journaling or through the system journal.

Target - The following columns provide summaries of processes that run on the target system. For details about status values, see “Replication problems reflected in the Source and Target columns” on page 103.

Mgr - Represents a summation of the system manager, journal manager, and target journal inspection processes on the target system of the data group. Target journal inspection status includes status of inspection jobs for both target journals (user and system) for the data group.

DB - Represents the summation of status for the database reader process, the database apply process, and access path maintenance jobs¹. For data groups configured for source-send replication, this column represents the summation of the status of database apply processes and access path maintenance jobs.

Obj - Represents the object apply processes.

Errors - When any errors are indicated in the following columns (DB and Obj), they are highlighted in red.
DB - Represents the sum of the number of database files, IFS objects, and *DTAARA and *DTAQ objects that are on hold due to errors, plus the number of logical (LF) and physical (PF) files that have access path maintenance¹ failures for the data group. To work with a subsetted list of file errors and access path errors, use option 12 (Files needing attention). For a subsetted list of IFS object errors, use option 51 (IFS tracking entries not active). For a subsetted list of *DTAARA and *DTAQ errors, use option 53 (Object tracking entries not active).

Obj - Represents a count of the number of objects for which at least one activity entry is in a failed state. To work with a subsetted list, use option 13 (Objects in error).

For additional information, see “Working with files needing attention (replication and access path errors)” on page 210, “Working with tracking entries” on page 219, and “Working with objects in error” on page 224.

1. Access path maintenance status and errors are reported on the Work with Data Groups display only in installations running MIMIX 7.1.15.00 or higher. Access path maintenance jobs run only if the access path maintenance (APMNT) policy is enabled.

Problems reflected in the Audits/Recov./Notif. field

When the Audits field is highlighted in reverse red, at least one audit has failed, has unresolved differences, is out of compliance, or was not run due to a policy. When it is highlighted in reverse yellow, at least one out-of-compliance audit is currently active or an audit is approaching out of compliance. For more information about audits, see “Displaying audit runtime status” on page 129.

The Recov. (recoveries) field indicates the number of active recoveries, including those resulting from audits. Active recoveries indicate problems that MIMIX AutoGuard has detected and is attempting to correct. For more information about recoveries, see “Displaying recoveries” on page 164.

When the Notif. (notifications) field is highlighted in reverse red, at least one new notification with a severity of *ERROR exists. When it is highlighted in reverse yellow, at least one new notification with a severity of *WARNING exists. For more information about notifications, see “Displaying notifications” on page 160.

Problems reflected in the Data Group column

When a data group name is highlighted in color, journaling problems exist that affect replication of one or more types of data.

Table 23. Conditions which highlight the data group name in color.

Color     Possible Problems

Red       One of the following conditions exists:
          • Files, IFS tracking entries, or object tracking entries defined to the data group are not journaled or not journaled correctly on the source system.
          • The source side journal is in standby or inactive state.

Yellow    One of the following conditions exists:
          • Files, IFS tracking entries, or object tracking entries defined to the data group are not journaled or not journaled correctly on the target system. This is only enforced if the data group is set up to journal on the target system as defined in the data group definition.
          • Data group file entries, IFS tracking entries, or object tracking entries are on hold for reasons other than an error.
          • The journal cache value for the source journal does not match the configured value in the journal definition.
          • The journal cache value for the target journal does not match the expected cache value and the database apply session is active. If another data group is using the journal definition as a source journal, the actual journal cache value may be different than the configured value.
          • The target journal state value for the target journal does not match the expected state value and the database apply session is active.
If another data group is using the journal definition as a source journal, the actual state may be different than the configured value.

Note: In a cooperative processing environment, files, IFS tracking entries, or object tracking entries being added dynamically to the configuration for user journal replication may reflect an intermediate state of not journaled until they have been synchronized and become active to MIMIX.

Resolving problems highlighted in the Data Group column

In most environments, the most likely causes of the conditions listed in Table 23 are problems with journaling. Problems associated with journal state or journal cache are only reported in data groups which are configured to use those high availability journal performance enhancements.

Journaling problems: If the data group name is highlighted in red or yellow, do the following to check for and resolve journaling problems:
1. Check for not journaled conditions for each of the following:
   • To determine which files are not journaled, use option 17 (File entries) for the data group. Then press F10 (journaled view) to see journaling status.
   • To determine which IFS tracking entries are not journaled, use option 50 (IFS tracking entries) for the data group. Then press F10 (journaled view) to see journaling status.
   • To determine which object tracking entries are not journaled, use option 52 (Object tracking entries) for the data group. Then press F10 (journaled view) to see journaling status.
2. To start journaling for a file or a tracking entry, use option 9 (Start journaling).
3. You can use option 11 (Verify journaling) to verify that journaling has started.

Journal cache or journal state problems: If the data group name is highlighted in red or yellow, do the following to check for and resolve problems:
1. From the Work with Data Groups display, use option 8 (Display status).
2. From the Data Group Status display, press F8 (Database).
3. The Jrn State and Cache Src and Tgt fields are located in the upper left corner of the Data Group Database Status display. For each system (Src or Tgt), the status of the journal state is shown first, followed by the status of the journal cache. The example below shows v for value in all four status positions. If any of these fields are highlighted, there is a problem. Use “Resolving a problem with journal cache or journal state” on page 119.

   Jrn State and Cache Src: v v Tgt: v v

Manager problems reflected in the Source and Target columns

The status of needed system-level processes is reflected in the Mgr column for the source and target system. The managers must be active for replication to occur. For any status other than A (active), use “Working with system-level processes” on page 149.

Replication problems reflected in the Source and Target columns

The status of each process is represented by a status letter and the color of the box surrounding the letter. Table 24 describes the letters and colors used for status of the replication process summaries shown in the Source and Target columns.

Table 24. Possible status values for source and target process summaries

I    Inactive (highlighted red) - The process is currently not active.
L    Inactive RJ link (highlighted red) - The RJ link is currently not active. This status is only displayed in the database source column when a data group uses MIMIX RJ support.
A    Active (highlighted blue) - The process is currently active. For the database source column, this value indicates that the send/receive processes are active.
C    RJ Catch-up mode (highlighted blue) - The remote journal is currently in catch-up mode. This status can only be displayed in the database source column for data groups that use remote journaling. Catch-up mode indicates that the operating system is transferring journal entries from the source system journal to the remote journal as quickly as possible. When the database reader process is active, MIMIX processes the journal entries as they reach the target system.
R    Active RJ link (highlighted blue) - The RJ link is currently active. This status is only displayed in the database source column when a data group uses MIMIX RJ support.
U    Unknown (highlighted white) - The status of the process cannot be determined, possibly because of an error or communications problem.
J    RJ Link in Threshold (highlighted turquoise) - The RJ link has fallen behind its configured threshold. View detailed status to determine the extent of the backlog.
T    Threshold reached (highlighted turquoise) - A process has fallen behind a configured threshold. View detailed status to determine which process has exceeded its backlog threshold and to determine the extent of the backlog. See “Working with the detailed status of data groups” on page 105.
X    Switch mode (highlighted red) - The data group is in the middle of switching the data source system and status may not be retrievable or accurate.
P    Partially active (highlighted red) - At least one subprocess is active, but one or more subprocesses is not active. This status is only displayed in process columns that represent multiple processes. The data group name may also be shown in a highlighted field of red. In the Target DB column, partial status is also possible when all other processes, including database apply, are active but access path maintenance¹ is enabled and does not have at least one active job.
D    Disabled - The process is currently not active and the data group is disabled.
     Note: The status value for a disabled data group is the letter D displayed in standard format. No colored blocks are used.
W    Waiting at a recovery point (highlighted red) - The process is currently suspended at a recovery point.

1. Access path maintenance is available only on installations running 7.1.15.00 or higher.
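If you copy status letters from the Source and Target columns into a script or a runbook, the legend in Table 24 can be held as a simple lookup. The Python sketch below is an illustration only and is not part of MIMIX. The names `STATUS_LEGEND` and `needs_attention` are hypothetical, and treating every highlighted, non-blue status as needing attention is an assumption made for this example.

```python
# Status letters and highlight colors transcribed from Table 24.
# D (Disabled) is shown in standard format, so it has no color here.
STATUS_LEGEND = {
    "I": ("Inactive", "red"),
    "L": ("Inactive RJ link", "red"),
    "A": ("Active", "blue"),
    "C": ("RJ catch-up mode", "blue"),
    "R": ("Active RJ link", "blue"),
    "U": ("Unknown", "white"),
    "J": ("RJ link in threshold", "turquoise"),
    "T": ("Threshold reached", "turquoise"),
    "X": ("Switch mode", "red"),
    "P": ("Partially active", "red"),
    "D": ("Disabled", None),
    "W": ("Waiting at a recovery point", "red"),
}


def needs_attention(letters):
    """Return the letters whose highlight is neither blue nor absent.

    Assumption for illustration: blue and unhighlighted statuses are
    treated as not needing attention; everything else is flagged.
    """
    return [s for s in letters if STATUS_LEGEND[s][1] not in ("blue", None)]


# Letters as they appear for two rows of the sample display in Figure 12.
print(needs_attention(["A", "R", "A", "A"]))  # []
print(needs_attention(["A", "L", "A", "I"]))  # ['L', 'I']
```

A lookup like this only restates the legend. The F10 (Legend) pop-up on the display remains the reference for the values and colors in your installation.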
Note: Use F10 (Legend) to view a pop-up window that displays the status values and colors. To remove the pop-up window, press Enter or F12 (Cancel).

Setting the automatic refresh interval

You can control how frequently the data shown on the Work with Data Groups display is refreshed by doing the following:
1. Press F9 (Automatic refresh).
2. The Automatic Refresh Value pop-up appears. Specify how long you want the system to wait before refreshing the information and press Enter.

The status displayed will automatically refresh when the specified interval passes. To end the automatic refresh process, press Enter.

Working with the detailed status of data groups

Basic support for detailed data group status is available in the 5250 emulator interface. The Data Group Status display (DSPDGSTS command) uses multiple views to present status of a single data group. The views identify and provide status for each of the processes used by the data group. Error conditions for the data group as well as process statistics and information about the last entry processed by each replication process are included. Some fields are repeated on more than one view.

The data group configuration determines what fields are visible. If the data group is database only, the object fields are not shown. Similarly, if the data group is object only, the database fields are not shown.

Displaying data group detailed status

Detailed status is available for one data group at a time. There are multiple ways of locating and subsetting to the data group.

Do the following to access detailed status for a data group:
1. Use one of the following to locate the data group you want:
   • To select a data group from a list of all data groups in the installation, select option 6 (Work with data groups) on the MIMIX Basic Main Menu and press Enter.
   • To select a data group from a subsetted list for an application group, from the Work with Application Groups display use option 13 (Data resource groups) to select a resource group. On the resulting display use option 8 (Data groups).
2. The Work with Data Groups display appears. Type an 8 (Display status) next to the data group you want and press Enter.
3. The Data Group Status display shows a merged view of data group activity on the source and target systems. (See Figure 13.) Only fields for the type of information replicated by the data group are displayed. For example, if the data group replicates only objects from the system journal, you will only see fields for system journal replication. If the data group replicates from both the system journal and the user journal, you will see fields for both. To see additional status information for object processes or database processes, do the following:
   • If the data group contains object information, press F7 (Object) to view additional object status displays. The Data Group Object Status display appears.
   • If the data group contains database information, press F8 (Database) to view additional database status displays. The Data Group Database Status display appears. Tracking entry information for advanced journaling is also available.
4. For object information, there are three views. For database information, there are four views available. Use F11 to change between views.

Note: If the data group contains both database and object information, you can toggle between object details and database details by using the F7 and F8 keys.

Merged view

The initial view displayed is the merged view. This view summarizes status for the replication paths configured for the data group. The status of each process is represented by the color of the box surrounding the process and a status letter. Table 25 shows possible status values.
Figure 13 shows a sample of the merged view of the Data Group Status display. The data group in this view is configured for user journal replication using remote journaling and for system journal replication. Also, access path maintenance is enabled.

Figure 13. Merged view of data group status. The inverse highlighted blocks are not shown in this example.

Data Group Status                                              17:39:36
Data group . . . . : CRITICALAP     Database errors . . . . : 1
Elapsed time . . . : 00:52:51       Objects in error/active : 4 / 0
Transfer definition: PRIMARY-A      State. . . . . . . . . : *ASYNCPEND
--------------------------- Source Statistics ---------------------------
System: LONDON-A   Jrn Mgr-A   RJLNK Mon-A
                       Receiver    Sequence #      Date     Time      Trans/Hour
Database   Source Jrn. LONDN0002   >0,000,002,591  4/20/08  11:02:35
 Link-A    RJ Tgt Jrn. LONDN0002   >0,000,002,591  4/20/08  11:02:35
           Last Read . LONDN0002   >0,000,002,591  4/20/08  11:02:35
           Entries not read: 0     Est. time to read:
Object     Current . . AUDRCV0108  22,314,732      4/22/08  17:37:13  748
 Send-I    Last Read . AUDRCV0103  22,175,464      4/21/08  11:05:56
 *SHARED   Entries not read : 139,268   Est. time to read:
--------------------------- Target Statistics ---------------------------
System: CHICAGO-A   Jrn Mgr-A   DB Rdr-A   AP Maint-A   RJLNK Mon-A
Sys Jrn Insp-A    Last Received   Unprocessed   Entry Count   Est Time
User Jrn Insp-A   Sequence #      Entry Count   Trans/Hour    To Apply
DB Apply-A        >0,000,002,590
Obj Apply-A       22,023,868      4
F3=Exit F5=Refresh F7=Object view F8=Database F9=Automatic refresh
F10=Restart statistics F12=Cancel F14=Start DG F24=More keys

Note: Journal sequence numbers shown in the Source Statistics and Target Statistics areas may be truncated if the journal supports *MAXOPT3 for the receiver size and the journal sequence number value exceeds the available display field. When truncation is necessary, the most significant digits (left-most) are omitted. Truncated journal sequence numbers are prefixed by '>'.
This is shown in Figure 13.

Table 25. Possible values for detailed status. Not all statuses are used by each process.

Color and Status    Description

Red                 When displayed on the Data group, Database errors, or Objects in error fields, a problem exists that requires action.
Red - I             The process is inactive.
Red - W             The process is suspended at a recovery point. This status is only available for apply processes.
Yellow              When displayed on the Data group field, a problem exists that may require attention.
Yellow - P          One or more of the processes is active but others are inactive. On the merged view, this status is only possible for the Object Send field.
Turquoise - T       The process has a backlog which exceeds its configured threshold. On fields which summarize status for multiple processes, use F7 and F8 to view the specific threshold. The -T is not shown in statistical fields. If a threshold condition persists over time, refer to the MIMIX Administrator Reference book for information about possible resolutions.
White - U           The status of the process is unknown.
Blue - A            The process is active.
Blue - C            The RJ Link is in catch-up mode. This status is only possible for the Database Link process in the merged view and the RJ link field in some database views.
Green - D           The data group is disabled. This also means the data group is currently inactive.

Top left corner: The top left corner of the Data Group Status display identifies the data group, the elapsed time, and the status of the transfer definition in use. The elapsed time is the amount of time that has elapsed since you accessed this display or used the F10 (Restart statistics) key.

Top right corner: The top right corner of the display identifies the number of errors identified by MIMIX. If the workstation supports colors, the number of files and objects in error will be displayed in red.
• The Database errors field identifies the number of errors in user journal replication processes.
This includes all file entries, IFS tracking entries, and object tracking entries in error. When access path maintenance¹ is enabled, this also includes the number of logical and physical files that have access path maintenance failures for the data group.
• The Objects in error/active fields indicate the number of objects that are failed and the number of objects with pending activity entries. The first number in these fields indicates the number of objects defined to the data group that have a status of *FAILED. The second number indicates the number of objects with active (pending) activity entries.
• The State field identifies the state of the remote journal link. The values for the state field are the same as those which appear on the Work with RJ Links display. This field is not shown if the data group uses source-send processes for user journal replication.

1. Access path maintenance is available only on installations running MIMIX 7.1.15.00 or higher.

Source statistics: The middle of the display shows status and summarized statistics for the journals being used for replication and the processes that read from them. The following process fields are possible:

System - Identifies the current source system definition. The status value is an indication of the success in communicating with that system.

Jrn Mgr - Displays the status of the journal manager process for the source system.

DA Poll - Displays the status of the data area poller. This field is present only if the data group replicates data areas using this process.

RJLNK Mon - Displays status of the RJLNK monitor on the source system. This field is present only for data groups that use remote journaling.

Database (Link or Send) - Identifies the status of the process which transfers user journal entries from the source system to the target system.

Link - Displayed when the data group is configured for remote journaling.
The status is that of the RJ link.

Send - Displayed when the data group is configured for MIMIX source-send processes. The status is that of the database send process.

Object Send - Displays a summation of status from the object send, object retrieve, and container send processes. The highest priority status from each process determines the status displayed. Use F7 (Object view) to see the individual processes. When the data group uses a shared object send job, either the value *SHARED or a three-character job prefix is displayed below the Send process status. The value *SHARED indicates that the data group uses the MIMIX generated shared object send prefix for this source system. A three-character prefix indicates this data group uses a shared object send job on this system that is shared only with other data groups which specify the same prefix.

For the Database and Object processes, additional fields identify current journal information, the last entry that has been read by the process, and statistics related to arrival rate, entries not read, and estimating the time to read.

Current - For the Database Send and Object Send processes, this identifies the last entry in the currently attached journal receiver. This information is used to show the arrival rate of entries to the journals.

Note: If the data group uses remote journaling, current information is displayed in two rows, Source jrn and RJ tgt jrn. The source journal sequence number refers to the last sequence number in the local journal on the source system. The remote journaling target journal sequence number refers to the last sequence number in the associated remote journal on the target system.

Transactions per hour - For current journal information, this is based on the number of entries to arrive on the journal over the elapsed time the statistics have been gathered.
For last read information, this is based on the actual number of entries that have been read over the elapsed time the statistics have been gathered.
Last Read - Identifies the journal entry that was last read and processed by the object send, database send, or database reader.
Transactions per hour - For current journal fields, this is based on the number of entries to arrive on the journal over the elapsed time the statistics have been gathered. For last read fields, this is based on the actual number of entries that have been read over the elapsed time the statistics have been gathered and will change due to elapsed time and the rate at which entries arrive in the journal.
Entries not read - This is a calculation of the number of journal entries between the last read sequence number and the sequence number of the last entry in the current receiver for the source journal. An asterisk (*) preceding this field indicates that the journal receiver sequence numbers have been reset between the last entry in the current receiver and the last read entry.
Estimated time to read - This is a calculation using the entries not read and the transactions per hour rate. This calculation is intended to provide an estimate of the length of time it may take the process (database reader, database send, or object send) to complete reading the journal entries.
Target statistics: The lower part of the display shows status and summarized statistics for all target system processing. The following process fields are possible:
System - Identifies the current target system definition. The status value is an indication of the success in communicating with that system.
Jrn Mgr - Displays the status of the journal manager process for the target system.
DB Rdr - Displays status of the database reader. This field is present only for data groups that use remote journaling.
AP Maint - Displays status of the access path maintenance processes.
This field is present only when optimized access path maintenance has been enabled. Access path maintenance is available only on installations running MIMIX 7.1.15.00 or higher. In earlier levels of MIMIX, if parallel access path maintenance is enabled, its status is displayed in the Prl AP Mnt field that appears in this location.
RJLNK Mon - Displays status of the RJLNK monitor on the target system. This field is present only for data groups that use remote journaling.
Sys Jrn Insp - Displays the status of target journal inspection for the system journal (QAUDJRN) on the target system of the data group. This field is displayed when the journal definition for the system journal on the current target system permits target journal inspection and the data group is enabled and has been started at least once.
User Jrn Insp - Displays the status of target journal inspection for the user journal on the target system of the data group. This field is displayed when the journal definition for the user journal on the current target system permits target journal inspection and the data group is enabled, performs user journal replication, permits journaling on target, and has been started at least once.
DB Apply and Obj Apply - Each field displays the combined status for the apply jobs in use by the process. For each process, additional fields show statistics for the last received journal sequence number, number of unprocessed entries, approximate number of transactions per hour being processed, and the approximate amount of time needed to apply the unprocessed transactions for all database or object apply sessions.
Object detailed status views
Figure 14, Figure 15, and Figure 16 show samples of the information available when you use F7 (Object) to view the detailed object information. Use F11 to move between the three views of detailed object status.
On each view, you can use the F1 (Help) key to see a description of that view’s contents.
In all object views, journal sequence numbers may be truncated if the journal supports *MAXOPT3 for the receiver size and the journal sequence number value exceeds the available display field. When truncation is necessary, the most significant digits (leftmost) are omitted. Truncated journal sequence numbers are prefixed by '>'.
The possible status values are indicated in Table 25, with the following additional status values that are unique to several system journal replication processes.
The Min, Act, and Max fields for the Retrieve, Send, and Apply processes indicate the minimum, active, and maximum number of jobs for each process. The number of active jobs varies based on the workload. The active count is highlighted with color for the following conditions:
Red - The number of active jobs is zero (0).
Yellow - The number of active jobs is greater than zero (0) but less than the minimum number of processes.
Turquoise - The process has a backlog that exceeds its configured threshold. When this occurs, the backlog field for the process is also highlighted in turquoise.
Blue - The number of active jobs is equal to or greater than the minimum number of processes.
Figure 14 and Figure 17 show the active count highlighted.
Figure 14. Data group detail status, object view 1.
Data Group Object Status Data group . . . . : Elapsed time . . . : CRITICALAP 00:52:51 System: Objects in error . . CHICAGO 17:50:00 4 Send Process -I *SHARED Jrn Manager -A Receiver Sequence # Date Time Trans/Hour Current . . AUDRCV0108 10,022,314,732 4/22/08 17:37:13 748 Last Read . AUDRCV0103 10,022,175,464 4/21/08 11:05:56 Entries not read: 139,268 Est.
time to read: --------------------- Object Retrieve/Container Send ---------------------Retrievers Retrieve Senders Send Containers Containers Min Act Max Backlog Min Act Max Backlog Sent Per Hour 1 0 5 1 0 5 1,145 ------------------------------- Object Apply ------------------------------Applies Apply Active Entries Entries Min Act Max Backlog Objects Sequence # Applied Per Hour 1 1 5 4 >0,022,023,871 1,133 F3=Exit F5=Refresh F9=Automatic refresh F7=Merged view F11=View 2 F8=Database view F12=Cancel F24=More keys
Figure 15. Data group detail status, object view 2.
Data Group Object Status Data group . . . . : Elapsed time . . . : CRITICALAP 00:52:51 System: Objects in error . . CHICAGO 17:57:31 4 Send Process -I *SHARED Jrn Manager -A Receiver Sequence # Date Time Trans/Hour Current . . AUDRCV0108 10,022,314,732 4/22/08 17:37:13 748 Last Read . AUDRCV0103 10,022,175,464 4/21/08 11:05:56 Entries not read: 139,268 Est. time to read: --------------------- Object Retrieve/Container Send ---------------------Retrievers Retrieve Senders Send Containers Containers Min Act Max Backlog Min Act Max Backlog Sent Per Hour 1 0 5 1 0 5 1,145 ------------------------------- Object Apply ------------------------------Applies Apply ------------- Last Applied ------------Min Act Max Backlog Sequence # Type Object 1 1 5 0 >0,022,023,871 *DOC BVT#I/PBBDOCXX.002 F3=Exit F5=Refresh F9=Automatic refresh F7=Merged view F11=View 3 F8=Database view F12=Cancel F24=More keys
Figure 16. Data group detail status, object view 3.
DG Object Journal Entry Detail Data group . . . .
: Source system: Current entry Last read entry Last received System: CHICAGO 18:01:20 CRITICALAP LONDON-A Entry TSF - Sequence # 10,022,314,732 10,022,175,464 Receiver AUDRCV0108 AUDRCV0103 - Date Time 4/22/08 17:37:13 4/21/08 11:05:56 Target system: CHICAGO-A ------------------------------- Object Send ------------------------------Entry Sequence # Date Time Type Object Active TCO >0,022,023,868 4/20/08 13:59:23 *DOC BVT#I/PBBDOCXX.002 Processed TCO >0,022,023,868 4/20/08 13:59:23 *DOC BVT#I/PBBDOCXX.002 ------------------------------- Object Apply ------------------------------Entry Sequence # Date Time Type Object Processed TCA >0,022,023,871 4/20/08 13:59:23 *DOC BVT#I/PBBDOCXX.002 F3=Exit F5=Refresh F9=Automatic refresh F7=Merged view F11=View 1 F8=Database view F12=Cancel F24=More keys Database detailed status views Figure 17, Figure 18, Figure 19, and Figure 20 show samples of the information available when you use F8 (Database) to view the detailed database information. On each view, you can use the F1 (Help) key to see a description of that view’s contents. In database views that include sequence numbers, the journal sequence numbers may be truncated if the journal supports *MAXOPT3 for the receiver size and the journal sequence number value exceeds the available display field. When truncation is necessary, the most significant digits (left-most) are omitted. Truncated journal sequence numbers are prefixed by '>'. Most fields that display status of a process have some or all of the possible values indicated in Table 25. Possible values for the Jrn State and Cache (Src and Tgt) fields are indicated in Table 27 (journal state) and Table 28 (journal cache). The data group configuration determines whether the Send process field is replaced by the RJ Link field. When remote journaling is configured, the RJ Link and DB Rdr fields are shown. 
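The sequence-number truncation described above can be illustrated with a short sketch. This is not MIMIX code: the function name, the 14-character field width, and the comma grouping are assumptions chosen only so that the result resembles the apply-session values in the sample displays.

```python
def format_sequence_number(sequence: int, field_width: int = 14) -> str:
    """Format a journal sequence number for a fixed-width display field.

    Hypothetical helper: field_width and the comma grouping are assumptions
    made for illustration, not documented MIMIX values.
    """
    formatted = f"{sequence:,}"
    if len(formatted) <= field_width:
        # The value fits, so it is shown in full.
        return formatted
    # Omit the most significant (leftmost) characters and reserve one
    # position for the '>' prefix that marks a truncated value.
    return ">" + formatted[-(field_width - 1):]


print(format_sequence_number(2591))
print(format_sequence_number(12345678900000002593))
```

A value that fits the field is shown unchanged. A longer value keeps only its rightmost characters, so two truncated numbers can be compared only when their omitted leading digits are known to match.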
The AP Maint field is displayed on views 1 and 2 (Figure 17 and Figure 18) only when the access path maintenance policy is enabled. Access path maintenance is available only on installations running 7.1.15.00 or higher. When present, this field displays the status of the access path maintenance job that persists while the database apply process is active.
In the top right corner of database views 1 and 2 (Figure 17 and Figure 18), these fields display combined counts of replicated entries and errors for file entries, IFS tracking entries, and object tracking entries:
• File and Tracking entries
• Not journaled Src Tgt - If the number of not journaled errors on either system exceeds 99,999, that system’s field displays +++++.
• Held due to error
• Access path maint. errors
• Held for other reasons
Database view 4 (Figure 20) separates this information into columns for file entries, IFS tracking entries, and object tracking entries.
If a data group has multiple database apply sessions, you will see an entry for each session in the Apply Status column on database views 1, 2, and 3 (Figure 17, Figure 18, and Figure 19). Each session has its own status value. In these sample figures there is only one apply session (A), which is active (-A).
Figure 17. Data group detail status, database view 1. In this example, the Link status of -A and the presence of the Reader status indicate that the data group uses remote journaling and access path maintenance. The display also shows that journal standby state is active and journal caching is not active. The unprocessed entry count indicates that the final journal entry has not been applied. The > character preceding sequence numbers for the apply session indicates truncated sequence numbers that are associated with *MAXOPT3 support.
CHICAGO 18:07:02 Data group . . . . : CRITICALAP File and Tracking entries : 12 Elapsed time . . .
: 00:52:51 Not journaled Src: 1 Tgt: 1 Jrn State and Cache Src: A N Tgt: A N Held due to error . . . . : 1 RJ Link-A AP Maint-A Access path maint. errors : 1 Jrn Mgr-A DB Rdr- A Held for other reasons . : 0 Receiver Sequence # Date Time Trans/Hour Source Jrn. LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Rj Tgt Jrn. LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Last Read . LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Entries not read: 0 Est. time to read: ------------------------------- Database Apply --------------------------Apply Received Processed Unprocessed Entry Count Est Time Open Status Sequence # Sequence # Entry Count Trans/Hour To Apply Commit A-A >0,000,002,593 >0,000,002,592 1 *NO Data Group Database Status F3=Exit F5=Refresh F9=Automatic refresh F7=Object view F11=View 2 System: F8=Merged view F12=Cancel F24=More keys
Figure 18. Data group database status, view 2. In this example, the Link status of A and the presence of the Reader status indicate that the data group uses remote journaling. The display also shows that access path maintenance is used and active, and that journal standby state is active and journal caching is not active.
CHICAGO 16:07:03 Data group . . . . : CRITICALAP File and Tracking entries. : 12 Elapsed time . . . : 00:52:51 Not journaled Src: 1 Tgt: 1 Jrn State and Cache Src: A N Tgt: A N Held due to error . . . . : 1 RJ Link-A AP Maint-A Access path maint. errors : 1 Jrn Mgr-A DB Rdr- A Held for other reasons . : 0 Receiver Sequence # Date Time Trans/Hour Source Jrn. LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Rj Tgt Jrn. LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Last Read . LONDN0002 12,345,678,900,000,002,591 4/20/08 11:02:35 Entries not read: 0 Est.
time to read: ------------------------------- Database Apply --------------------------Apply Received Apply point Clock Time Hold MIMIX Log Open Status Sequence # Sequence # Difference Sequence # Commit Id A-A >0,000,002,590 >0,000,002,590 Data Group Database Status F3=Exit F5=Refresh F9=Automatic refresh F7=Object view F11=View 3 System: F8=Merged view F12=Cancel F24=More keys
Figure 19. Data group database status, view 3.
System: DG Database Jrn Entry Detail Data group . . . . : Source system: Current entry RJ target entry Last read entry Last received CHICAGO 18:16:04 CRITICALAP LONDON-A Entry Sequence # UMX 12,345,678,900,000,002,591 UMX 12,345,678,900,000,002,591 UMX 12,345,678,900,000,002,591 - 12,345,678,900,000,002,590 Receiver LONDN0002 LONDN0002 LONDN0002 - Date 4/20/08 4/20/08 4/20/08 4/20/08 Time 11:02:35 11:02:35 11:02:35 11:01:04 Target system: CHICAGO-A ------------------------------- Database Apply ----------------------------Apply Entry Sequence # Date Time Object Library Member A-A UMX >0,000,002,590 4/20/08 11:01:04 F3=Exit F5=Refresh F9=Automatic refresh F7=Object view F11=View 1 F8=Merged view F12=Cancel F24=More keys
Figure 20. Data group detail status, database view 4. In this example, the combined counts of file and tracking entries shown in Figure 17 and Figure 18 are separated into columns for file entries, IFS tracking entries, and object tracking entries.
File and Tracking Entry Status Data group . . . . : System: CHICAGO 16:07:03 CRITICALAP Number of entries . . . . : Not journaled on source . : Not journaled on target . : Held due to error . . . . : Access path maint. errors : Held for other reasons .
.: F3=Exit F5=Refresh F9=Automatic refresh File Entries 7 1 0 0 1 0 IFS Trk Entries 3 0 1 1 0 F7=Object view F11=View 1 Obj Trk Entries 2 0 0 1 0 F8=Merged view F12=Cancel F24=More keys
Identifying replication processes with backlogs
If replication processes are active and have no reported error conditions, a replication process that has exceeded its backlog threshold will have a status that reflects this condition. However, if a replication process is inactive or has an error condition with a higher priority status, the threshold condition will not be visible in the process status until the process is started or the problem is resolved. Also, a backlog may exist but not be large enough to exceed the threshold setting, or the threshold warning setting may have been disabled (set to *NONE).
Do the following to check for a backlog condition:
1. To access the details for a data group, use the procedure in “Displaying data group detailed status” on page 105.
2. Use F7 or F8 on the Data Group Status display to locate the appropriate view for the process you want to check. Table 26 identifies this information and the appropriate fields for each process.
Table 26. Location of fields which identify backlogs and threshold conditions for replication processes
Process: RJ Link
Description: The backlog is the quantity of source journal entries that have not been transferred from the local journal on the source system to the remote journal on the target system. The time difference between the last entry in each journal can also be an indication of a backlog.
View: Merged view, Database views 1 and 2
Fields to Check for Backlog: Differences between journal entries identified by Source Jrn and RJ Tgt jrn for the database link.
Fields Highlighted When Threshold Exceeded: • RJ tgt jrn Sequence # (note 1) • RJ tgt jrn Date and Time (note 2)
Process: DB Reader or DB Send
Description: The backlog is the quantity of journal entries that are waiting to be read by the process. The time difference between the last entry that was read by the process and the last entry in the journal on the source system can also be an indication of a backlog. This may be a temporary condition due to maximized log space capacity. If the log space capacity was reached, the database reader job will be idle until the database apply job is able to catch up. If the condition is unable to resolve itself, action may be required.
View: Merged view, Database views 1 and 2
Fields to Check for Backlog: For remote journaling configurations, differences between journal entries identified by Source Jrn and Last read. For MIMIX source-send configurations, differences between journal entries identified by Current and Last Read.
Fields Highlighted When Threshold Exceeded: • Entries not read Sequence # (note 1) • Last Read Date and Time (note 2)
Process: DB Apply
Description: The backlog is the number of entries waiting to be applied to the target system. Each apply session is listed as a separate entry with its own backlog.
View: Database views 1, 2, and 3
Fields to Check for Backlog: Unprocessed Entry Count
Fields Highlighted When Threshold Exceeded: • Apply Status • Unprocessed Entry Count
Process: Object Send
Description: The backlog is the quantity of journal entries that have not been read from the system journal. The time difference between the last entry that was read by the process and the last entry in the system journal can also be an indication of a backlog. Multiple data groups sharing the object send job is one possible cause of a persistent backlog.
View: Merged view, Object views 1, 2, and 3
Fields to Check for Backlog: Differences between transactions identified for Object Current and Last Read
Fields Highlighted When Threshold Exceeded: • Entries not read Sequence # (note 1) • Last Read Date and Time (note 2)
Process: Object Retrieve
Description: The backlog is the number of entries for which MIMIX is waiting to retrieve objects.
View: Object views 1 and 2
Fields to Check for Backlog: Retrieve Backlog
Fields Highlighted When Threshold Exceeded: • Retrievers, Act column • Retrieve Backlog
Process: Container Send
Description: The backlog is the number of packaged objects for entries that are waiting to be sent to the target system.
View: Object views 1 and 2
Fields to Check for Backlog: Container Send Backlog
Fields Highlighted When Threshold Exceeded: • Senders, Act column • Container Send Backlog
Process: Object Apply
Description: The backlog is the number of entries waiting to be applied to the target system.
View: Object views 1 and 2
Fields to Check for Backlog: Apply Backlog
Fields Highlighted When Threshold Exceeded: • Applies, Act column • Apply Backlog
Notes:
1. When highlighted, the threshold journal entry quantity criterion is exceeded.
2. When highlighted, the threshold time criterion is exceeded.
Data group status in environments with journal cache or journal state
Additional information is reported within the status of data groups that are configured to use MIMIX support for IBM’s High Availability Journal Performance (IBM i option 42) Journal Standby feature and Journal caching. When these high availability journal performance enhancements are in use, conditions that require action or attention are reflected in these locations:
• The data group name is highlighted on the Work with Data Groups display. The possible problems associated with journal cache or journal state are identified in Table 23 in topic “Problems reflected in the Data Group column” on page 101.
• Jrn State and Cache (Src and Tgt) fields within the data group detailed status are highlighted. These fields are on database views 1 and 2 of the Data Group Database Status display (Figure 17 and Figure 18 respectively, shown in “Database detailed status views” on page 112).
The possible values for the Jrn State and Cache (Src and Tgt) fields are indicated in Table 27 (journal state) and Table 28 (journal cache).
The Jrn State and Cache (Src and Tgt) fields reflect the actual journal standby state and journal caching values for the journals when the IBM high availability performance enhancements are installed on the systems defined to the data group. These fields appear on database views 1 and 2 (Figure 17 and Figure 18). The target journal state and cache values are set on the journal when the database apply session is started.
Journal State - The status values indicate the actual state value for the source and target journals. Table 27 shows the possible values for each field.
Journal Cache - The status values indicate the actual cache value for the source and target journals. Table 28 shows the possible values for each field.
For each system (Src or Tgt), the status of the journal state is shown first, followed by the status of the journal cache. If a problem exists with journal state or journal cache, the data group name is also highlighted with the same color.
For information about resolving journal cache or journal state problems, see “Resolving a problem with journal cache or journal state” on page 119.
Table 27. Possible status values for Journal State fields
Field: Either system
White, U - Unknown. MIMIX was not able to retrieve values, possibly because the journal environment has not yet been built.
No color, A - Journal state is active.
No color, X - The required IBM feature, IBM i option 42 - High Availability Journal Performance, is not installed on this system.
No color, S - Journal is in standby state as expected.
Field: Source
Red, S - Source journal is in standby state but that state is not expected.
Red, I - Source journal is in inactive state but that state is not expected.
Field: Target
Yellow, S - Target journal state or cache is not as expected and the database apply session is active.
Yellow, I - Target journal state is inactive but that state is not expected.
blank, blank - The IBM feature is installed but the data group is configured to not journal on the target system.
Table 28. Possible status values for Journal Cache fields
Field: Either system
White, U - Unknown. MIMIX was not able to retrieve values, possibly because the journal environment has not yet been built.
No color, X - The required IBM feature, IBM i option 42 - High Availability Journal Performance, is not installed on this system.
No color, Y - Caching is active.
No color, N - Caching is not active.
Field: Source
Yellow, Y - Source journal cache value is not as expected.
Yellow, N - Source journal cache value is not as expected.
Field: Target
Yellow, Y - Target journal cache value is not as expected and the database apply session is active.
Yellow, N - Target journal cache value is not as expected and the database apply session is active.
blank, blank - The IBM feature is installed but the data group is configured to not journal on the target system.
Resolving a problem with journal cache or journal state
Problems with journal state or journal cache can cause the name of a data group to be highlighted on the Work with Data Groups display. If the data group name is highlighted in red or yellow, do the following to check for and resolve problems:
1. From the Work with Data Groups display, use option 8 (Display status).
2. From the Data Group Status display, press F8 (Database).
3. The Jrn State and Cache Src and Tgt fields are located in the upper left corner of the Data Group Database Status display. For each system (Src or Tgt), the status of the journal state is shown first, followed by the status of the journal cache. The example below shows v for value in all four status positions.
Based on the status displayed in these fields, you can take the actions described in the following steps to correct the problem:
Jrn State and Cache Src: v v Tgt: v v
4. Source system journal state (first Src: value) - If the source system state is red and the value for the journal state is standby (S) or inactive (I), the journal state must be changed and all data replicated through the user journal must be synchronized. Do the following:
a. Press F12 (Cancel) to return to the Work with Data Groups display. Note which system is specified as the source system for the data group.
b. Use option 45 (Journal Definitions) to view the journal definitions used for the data group in error.
c. On the Work with Journal Definitions display, determine the journal name and library specified for the system that is the source system for the data group.
d. Specify the name and library of the source system journal in the following command:
CHGJRN JRN(library/name) JRNSTATE(*ACTIVE)
e. All data replicated through the user journal must be synchronized. For detailed information about synchronizing a data group, refer to your Runbook or to the MIMIX Administrator Reference book.
5. Source system journal cache (second Src: value) - If the source system cache is yellow, the actual status does not match the configured value in the journal definition used on the source system. Do the following:
a. Press F12 (Cancel) to return to the Work with Data Groups display. Note which system is specified as the source system for the data group.
b. Use option 45 (Journal Definitions) to view the journal definitions used for the data group in error.
c. On the Work with Journal Definitions display, use option 5 (Display) next to the journal definition listed for the source system.
d. Check the value of the Journal caching (JRNCACHE) parameter.
e.
Determine which value is appropriate for journal cache, the configured value or the actual status value. Once you have determined this, either change the journal definition value or change the journal cache (CHGJRN command) so that the values match.
6. Target system state (first Tgt: value) or Target system cache (second Tgt: value) - If the target system state or cache is yellow, the actual value for state or cache does not match the configured value. Do the following:
a. Press F12 (Cancel) to return to the Work with Data Groups display. Note which system is specified as the target system for the data group.
b. Use option 45 (Journal Definitions) to view the journal definitions used for the data group in error.
c. On the Work with Journal Definitions display, use option 5 (Display) next to the journal definition listed for the target system.
d. Check the value of the following parameters, as needed:
• Target journal state (TGTSTATE)
• Journal caching (JRNCACHE)
e. Determine why the actual status of the journal state or journal cache does not match the configured value of the journal definition used on the target system.
f. Determine which values are appropriate for journal state and journal cache, the configured value or the actual status value. Once you have determined this, either change the journal definition value or change the journal state or cache (CHGJRN command) so that the values match.
CHAPTER 7 Working with audits
Audits are defined by and invoked through rules and influenced by policies. Aspects of audits include schedules, status, reported results, and their compliance status. MIMIX is shipped so that auditing can occur automatically. For day-to-day operations, auditing requires minimal interaction to monitor audit status and results. MIMIX user interfaces separate audit runtime status, compliance status, and scheduling information onto different views to simplify working with audits.
Compliance errors and runtime errors require different actions to correct problems.
This chapter provides information and procedures to support day-to-day operations as well as to change aspects of the auditing environment. The following topics are included.
• “Auditing overview” on page 122 describes concepts associated with auditing and describes the differences between automatic priority audits and automatic scheduled audits.
• “Guidelines and considerations for auditing” on page 126 identifies considerations for specific audits, auditing best practices, and recommendations for checking the audit results.
• “Displaying audit runtime status” on page 129 identifies the Audit Summary interfaces and provides procedures for common activities with audits, such as running audits immediately and resolving reported problems.
• “Displaying audit history” on page 137 describes how to display history for specific audits of a data group.
• “Working with audited objects” on page 139 describes how to display a list of objects compared by one or more audits.
• “Working with audited object history” on page 142 describes how to access the audit history for a specific object.
• “Displaying audit compliance” on page 144 identifies the Audit Compliance interfaces and describes how to determine if an audit has a compliance problem.
• “Displaying scheduling information for automatic audits” on page 147 describes how to access the Audit Schedule interfaces, how to display when prioritized audits will run, and how to display when scheduled audits will run.
Auditing overview
All businesses run under rules and guidelines that may vary in the degree and in the methods by which they are enforced. In a MIMIX environment, auditing provides rules and enforcement of practices that help maintain availability and switch-readiness at all times. Not using or limiting audit use does little to confirm the integrity of your data.
These approaches can mean lost time and issues with data integrity when you can least afford them. In reality, successful auditing means finding the right balance somewhere between these approaches: • Audit your entire replication environment every day. The benefit of this approach is knowing that your data integrity exposure is limited to data that changed since the last audit. The trade-off with this approach can be time and resources to perform audits. • Audit only replicated data that “needs” auditing. This approach can be faster and use fewer resources because each audit typically has fewer objects to check. The trade-offs are determining what needs auditing and knowing when objects were last audited. MIMIX makes auditing easy by automatically auditing all objects periodically and auditing a subset of objects every day. MIMIX also provides the ability to fine-tune aspects of auditing behavior and their automatic submission and the ability to manually invoke an audit at any time. Components of an audit Together, three components identify a unique audit. Each component must exist to allow an audit to run. Rule - A program by which an audit is defined and invoked. Each rule shipped with MIMIX pre-defines a compare command to be invoked and the possible actions that can be initiated, if needed, to correct detected problems. When invoked, each rule can check only the class of objects associated with its compare command. Names of rules shipped with MIMIX begin with the pound sign (#) character. Data group - A data group provides the context of what to check and how results are reported. Multiple audits (rules) exist for each data group. Note: Audits are not allowed to run against disabled data groups. Schedule - Each unique combination of audit rule and data group has its own schedule, by which it is automatically submitted to run. MIMIX ships default scheduling information associated with each shipped rule. 
Scheduling can be adjusted for individual audits through policies. A manually invoked audit can be thought of as an immediate override of scheduling information.
Although people use the terms “audit” and “rule” interchangeably, a rule is a component of an audit. The process of auditing runs a rule program.
Phases of audit processing
The process of auditing consists of a compare phase and a recovery phase.
In the compare phase of an audit, the identified audit rule initiates a specific compare command against the data group. The Audit level policy determines if an audit is allowed to run and how aggressively an audit checks your environment during its compare phase. If a shipped audit rule provides more than one audit level, each successive level provides more checking capability.
If there are detected differences when the compare phase completes, the audit enters its recovery phase to start automatic recovery actions as needed. MIMIX attempts to correct the differences and sends generated reports, called recoveries, to the user interface. MIMIX removes these generated reports when the recovery action completes successfully. If the recovery job fails to correct the problem, MIMIX removes the recovery and sends an error notification to the user interface.
Most audit rules support a recovery phase. MIMIX is shipped with defaults that enable audits to enter the recovery phase automatically when needed. The recovery phase can be optionally disabled in the Automatic audit recovery policy.
Object selection methods for automatic audits
MIMIX provides two approaches to performing audits automatically. The biggest difference between these approaches is how objects are selected to be audited. The other significant difference is when each type of audit is allowed to run.
• In scheduled object auditing, an audit run selects all objects that are configured for the data group and within the class of objects checked by the audit.
MIMIX automatically runs an audit according to its specified scheduling criteria. Each time a scheduled audit runs, all eligible configured objects are selected.
• In prioritized object auditing, an audit run selects replicated objects according to their internally assigned priority category and an auditing frequency assigned to the category. The result is often a subset of the objects replicated by the data group. Each time a prioritized audit runs, its subset of objects selected to check may be unique. MIMIX automatically runs a prioritized audit periodically within its specified time range every day. It may run approximately once per hour or more often during its time range.

An audit that is manually invoked from the Work with Audits display in a 5250 emulator is an immediate run of a scheduled audit. Priority audits cannot be manually invoked from this display. From Vision Solutions Portal, you have the ability to perform an immediate run of either method of auditing.

Prioritized auditing can reduce the impact of auditing on resources and performance. This benefits customers who cannot complete IFS audits, cannot audit every day, or do not audit at all because of time or resource issues. When both types of auditing are used, you can achieve a balance between verifying data integrity and resources. Either or both types of automatic auditing can be disabled, although that is not recommended.

How priority auditing determines what objects to select

MIMIX determines the auditing priority of each replicated object based on its most recent change, most recent audit, and the frequency specified for auditing priority categories. At any time, every replicated object falls within one of several predetermined categories. Objects in each category are eligible for selection according to the frequency assigned to their category.
Each prioritized audit runs approximately once per hour, or more often, every day during its time range specified in the Priority audit policy. Each time the audit starts, it selects only the objects eligible in each category.

Table 29. Priority auditing categories

Category: Objects not equal
Description: Objects that had any value other than equal (*EQ) in their most recent audit. This includes objects for which a detected difference was automatically resolved.
Eligibility frequency: Objects in this category have the highest priority and are always selected.

Category: New objects
Description: A new object is one that has not been audited since it was created.

Category: Changed objects
Description: A changed object is one that has been modified since the last time it was audited.

Category: Unchanged objects
Description: An unchanged object is one that has not been modified since the last time it was audited.

Category: Audited with no differences
Description: An object with no differences is one that has not been modified since the last time it was audited and has been successfully audited with no changes on at least three consecutive audit runs. Objects remain in this category until a change occurs.

Eligibility frequency (New objects, Changed objects, Unchanged objects, Audited with no differences): Objects in these categories are eligible for selection according to the category frequency specified in the Priority audit policy.

The #FILDTA audit always selects all members of a file for which auditing is less than 100 percent complete. This occurs in all of the above object selection categories.

Initially, the objects selected by a prioritized audit may be nearly the same as those selected by a scheduled audit. However, over time the number of objects selected by a prioritized audit stabilizes to a subset of those selected by a scheduled audit.

When both scheduled and priority audits are allowed for the same rule and data group, MIMIX may not start a prioritized audit if the scheduled audit will start in the near future.
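As an illustration only, the category descriptions in Table 29 can be read as a decision order. The following Python sketch is not MIMIX code: the class, its field names, the category labels, and the precedence among the checks are assumptions made for the example. Only the category definitions themselves come from the table.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ObjectAuditState:
    """Hypothetical per-object audit bookkeeping (not a MIMIX structure)."""
    last_changed: datetime
    last_audited: Optional[datetime]   # None if the object was never audited
    last_result_equal: bool            # True if the most recent audit reported *EQ
    consecutive_clean_audits: int      # audits in a row that found no differences


def priority_category(obj: ObjectAuditState) -> str:
    """Return the Table 29 category an object would fall into."""
    if obj.last_audited is None:
        return "new"                   # not audited since it was created
    if not obj.last_result_equal:
        return "not_equal"             # any value other than *EQ in the last audit
    if obj.last_changed > obj.last_audited:
        return "changed"               # modified since the last audit
    if obj.consecutive_clean_audits >= 3:
        return "no_differences"        # unchanged, three or more clean audits
    return "unchanged"                 # unchanged since the last audit
```

In this reading, an object that changes leaves the "no_differences" category immediately, which matches the statement that objects remain in that category until a change occurs.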
How audits are submitted automatically

When MIMIX is started (STRMMX command), all system-level processes necessary for replication and auditing are started, including the master monitor. On each system, the master monitor starts job scheduling activities for auditing. This ensures that audits are submitted automatically according to the policies in effect for when to run priority audits and scheduled audits. The time specified in policies is local to each system.

At the appropriate time for each audit, a job is initiated on each system in the data group. MIMIX uses the Run rule on system policy to determine where the audit should run and immediately ends the audit job if it is not on the appropriate system.

For a scheduled audit, the Audit schedule policy determines the time and frequency of when the audit runs. A scheduled audit can be set to run on specific dates or days of the week, or on relative days of the month.

For a prioritized audit, the Priority audit policy determines the range of time during which the audit can start each day. A prioritized audit can run multiple times during the specified range, approximately once per hour or more often.

If you start replication through procedures or processes that invoke the Start Data Group (STRDG) command, you also need to ensure that the master monitor is started on all systems in your installation (STRMSTMON command) so that automatic auditing can occur.

Audit status and results

When audits complete or end in error, their status is reported in the audit summary user interfaces. In a 5250 emulator, this is on the Work with Audits display (WRKAUD command). In Vision Solutions Portal, this is the Audits portlet. A summary of all audit status also “bubbles up” to the level of data group interfaces.
The information available about each audit identifies the status of actions performed by its rule, how the audit selected objects for comparison, the audit’s compliance status, policy values which affect the actions of each phase, and scheduling information. When a phase completes, its timestamps and statistics are also available.

When audit recoveries are enabled, you can use the Notification severity policy to control the severity level of the notifications that are returned when the rule ends in error. You can also view job logs associated with notifications and recoveries. Job logs are accessible from the system on which the audit comparison or recovery job ran.

Audit compliance

Compliance is an indication of whether an audit ran within the time frame of the compliance thresholds set in auditing policies. For audits configured for scheduled object auditing or both scheduled and prioritized object auditing, compliance status is based on the last run of a scheduled audit or a user-invoked audit. For audits configured for only prioritized object auditing, compliance status is based on the last run, which may have been a prioritized audit or a user-invoked audit. A user-invoked audit or a scheduled audit checks all objects that are configured for the data group and within the class of objects checked by the audit, whereas a prioritized audit may have checked only a subset of those objects.

Guidelines and considerations for auditing

Auditing is most effective when it is performed regularly and you take action to investigate and resolve any reported differences that cannot be automatically corrected.

Auditing best practices

Regular auditing helps you detect problems in a timely manner and can help you to address detected problems during normal operations instead of during a crisis. Policy values for auditing are shipped with defaults set to values that Vision Solutions recommends as best practice.
New data groups and new installations will automatically use these policy values. If you determine that default policy values do not meet your auditing needs, you can customize the policy settings. Auditing best practices include: Automatically auditing: MIMIX is shipped so that auditing occurs automatically. • Allow both priority audits and scheduled audits to run automatically. This provides a balance between checking all objects periodically and checking a subset of objects every day. You can adjust the Priority audit and Audit schedule policies that control when each type of audit is automatically submitted to meet the needs of your environment. • Allow audits to perform the most extensive comparison possible. The shipped value (level 30) for the Audit level policy enables this. If you choose to run audits at a lower audit level, be aware of the risks, especially when switching. • Allow audits to perform automatic recovery actions. This provides automatic correction of detected problems. Recovery is possible when the Automatic audit recovery policy is enabled. • Allow MIMIX to run all audits even if you do not replicate certain object types (such as DLOs). This ensures that if you add new objects in the future, you will be automatically auditing them. Audits that do not have any objects to check complete quickly with little use of system resources. Manually auditing: In addition, manually invoke audits in these conditions: • Before switching, run all audits at audit level 30. Click this link to see additional information about the audit level policy. • If you make configuration changes, run the #DGFE audit to check actual configuration data against what is defined to your configuration. Click this link to see additional information about when to run the #DGFE audit. Where to run audits: Run audits from a management system. For most environments, the management system is also the target system. 
If you cannot run rules from the management system due to physical constraints or because of complex configurations, you can change the Run rule on system policy to meet your needs. Click this link to see additional information about the Run rule on system policy.

Considerations for specific audits

#DGFE audit - This audit is not eligible for prioritized auditing because it checks configuration data, not objects. As a result, configuration problems for a data group can only be detected when a scheduled audit or a manually invoked audit runs.

Run the #DGFE audit during periods of minimal MIMIX activity to ensure that replication is caught up and that added or deleted objects are reflected correctly in the journal. If the audit is run during peak activity, its results may contain errors or indicate that files are in transition.

In addition to regularly scheduled audits, check your configuration using the #DGFE audits for your data groups whenever you make configuration changes, such as adding an application or creating a library. Running the audit prior to audits that compare attributes ensures that those audits will compare the objects and attributes you expect to be present in your environment.

#DLOATR audit - This audit supports multiple levels of comparisons. The level used is controlled by the value of the Audit level policy in effect when the audit runs. The #DLOATR audit compares attributes as well as data for objects defined to a data group when audit level 20 or 30 is used. Audit level 10 compares only attributes. When data is compared, the audit may take longer to run and may affect performance.

#FILDTA audit - This audit supports multiple levels of comparisons. The level used is controlled by the value of the Audit level policy in effect when the audit runs. The #FILDTA audit compares all data for file members defined to a data group only when audit level 30 is used.
Level 10 and level 20 compare 5 percent and 20 percent of data, respectively. Lower audit levels may take days or weeks to completely audit file data. New files created during that time may not be audited. Regardless of the audit level you use for regular auditing, Vision Solutions strongly recommends running a level 30 audit before switching.

#IFSATR audit - This audit supports multiple levels of comparisons. The level used is controlled by the value of the Audit level policy in effect when the audit runs. The #IFSATR audit compares data when audit level 20 or 30 is used. At level 10, only attributes are compared. Regardless of the audit level you use for regular auditing, Vision Solutions strongly recommends running a level 30 audit before switching.

#MBRRCDCNT audit - This audit compares the number of current records (*CURRDS) and the number of deleted records (*NBRDLTRCDS) for physical files that are defined to an active data group. Equal record counts suggest but do not guarantee that files are synchronized. The #MBRRCDCNT audit does not have a recovery phase. Differences detected by this audit appear as not recovered in the Audit Summary. In some environments using commitment control, the #MBRRCDCNT audit may be long-running. Refer to the MIMIX Administrator Reference book for information about improving performance of this audit.

Recommendations when checking audit results

Consider these recommendations when you check results of audits:
• Always review the results of the audits. Audit results reflect only what was actually compared. Some objects may not have been compared due to object activity or due to the audit level policy value in effect, even when no differences (*NODIFF) are reported. You may need to take actions other than running an audit to correct detected issues. For example, you may need to change a procedure so that target system objects are only updated by replication processes.
• Be aware of priority auditing behavior. Priority audits differ from other audits in how they select objects to audit and in the number of objects selected. Be aware of the implications of those differences when checking audit results. Priority audits select replicated objects based on their auditing eligibility. As a result, priority audits cannot check newly created source objects until after their create transactions have been replicated. Priority audits can return results indicating that zero (0) objects were selected. This occurs when no objects were eligible for selection by an audit.
• Deleted objects reported as not found. Audits can report not found conditions for objects that have been deleted. A not found condition is reported when a delete transaction is in progress for an object eligible for selection when the audit runs. This is more likely to occur when there are replication errors or backlogs at the time the audit runs.
• Fixing one error may expose another. It may take multiple iterations of running audits with recoveries before the results are clean. Recovering from one error may result in a different error surfacing the next time the audit is performed. For example, a recovery that adds data group file entries may result in detecting a database relationship difference (*DBRIND) error the next time the audit is performed, where the root problem is that a library of logical files is not identified for replication.
• Watch for trends in the audit results. Trends may indicate situations that need further investigation. For example, objects that are being recovered for the same reason every time you run an audit can be an indication that something in your environment is affecting the objects between audits. In this case, investigating the environment for the cause may determine that a change is needed in the environment, in the MIMIX configuration, or in both.
Trends may also indicate a MIMIX problem, such as reporting an object as being recovered when it was not. Report these scenarios to MIMIX CustomerCare. You can do this by creating a new case using the Case Management page in Support Central.

Displaying audit runtime status

The audit summary view of the Work with Audits display shows audit runtime status in the Audit Status column. F11 toggles between variations of audit summary views.

Do the following:
1. Do one of the following to access the Summary view of the Work with Audits display:
• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Audit summary view.
• Enter the command: installation-library/WRKAUD VIEW(*AUDSTS)
2. The Work with Audits display appears. If audit compliance problems exist, you may see a different view of the Work with Audits display. Use F10 to access the Summary view.
3. Check the value shown in the Audit Status column. Press F1 (Help) for a description of status values.
4. To view additional information about an audit, use option 5 (Display).

On the summary view of the Work with Audits display, audits are sorted and displayed so that the highest severity item is at the top of the list.

In addition to audit runtime status, the initial summary view (Figure 21) also includes the full name of the data group and the following information:

The Object Diff column identifies the number of audited objects with differences remaining after the audit completed.

The Objects Selected column indicates how objects were selected for auditing in the most recent run of the audit.

Figure 21. Audit Summary view - data group definition columns

                               Work with Audits
                                                            System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   46=Mark recovered   ...

      Audit     Audit       ---------Definition---------   Object  Objects
 Opt  Status    Rule        DG Name    System 1  System 2  Diff    Selected
 __   *NOTRUN   #OBJATR     EMP        AS01      AS02      0       *PTY
 __   *CMPACT   #DLOATR     EMP        AS01      AS02      0       *PTY
 __   *CMPACT   #FILATR     EMP        AS01      AS02      0       *PTY
 __   *CMPACT   #FILATRMBR  EMP        AS01      AS02      0       *PTY
 __   *RCYACT   #FILDTA     EMP        AS01      AS02      0       *PTY
 __   *QUEUED   #IFSATR     EMP        AS01      AS02      0       *PTY
 __   *QUEUED   #MBRRCDCNT  EMP        AS01      AS02      0       *PTY
 __   *NODIFF   #DGFE       EMP        AS01      AS02      0       *ALL
                                                                    Bottom
 Parameters or command
 ===> ______________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Compliance summary   F11=Last run
 F14=Audited objects   F16=Inst. policies   F23=More options   F24=More key

Note: Audit runtime status and compliance status values are prioritized and are also “bubbled up” to the next higher level in the user interface, which is the installation. In a 5250 emulator, audit status is included in the summarized replication status displayed on the Work with Application Groups display. The Work with Data Groups display provides an indication of the number of audits that require action or attention.

The additional view of audit summary information (Figure 22) displays policies in effect when the audit was last run.

Figure 22. Audit Summary view - last run columns

                               Work with Audits
                                                            System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   46=Mark recovered   ...

      Audit     Audit                  ------Last Run------   Object
 Opt  Status    Rule        DG Name    Recovery    Level      Diff
 __   *NOTRUN   #OBJATR     EMP        *ENABLED    *LEVEL30   0
 __   *CMPACT   #DLOATR     EMP        *ENABLED    *LEVEL30   0
 __   *CMPACT   #FILATR     EMP        *ENABLED    *LEVEL30   0
 __   *CMPACT   #FILATRMBR  EMP        *ENABLED    *LEVEL30   0
 __   *RCYACT   #FILDTA     EMP        *ENABLED    *LEVEL30   0
 __   *QUEUED   #IFSATR     EMP        *ENABLED    *LEVEL30   0
 __   *QUEUED   #MBRRCDCNT  EMP        *ENABLED    *LEVEL30   0
 __   *NODIFF   #DGFE       EMP        *ENABLED    *LEVEL30   0
                                                                    Bottom
 Parameters or command
 ===> ______________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Compliance summary   F11=Last run
 F14=Audited objects   F16=Inst. policies   F23=More options   F24=More key

The Last Run columns show the values of policies in effect at the time the audit was last run through its compare phase.

Recovery identifies the value of the automatic audit recovery policy. When this policy is enabled, after the comparison completes, MIMIX automatically starts recovery actions to correct differences detected by the audit. Recovery may also indicate a value of *DISABLED if a condition checked by the Action for running audits (RUNAUDIT) policy existed and the policy value for that condition specified *CMP, preventing audit recoveries from running.

Level identifies the value of the audit level policy. The audit level determines the level of checking performed during the compare phase of the audit.

If an audit was never run, the value *NONE is displayed in both columns.

Running an audit immediately

You always have the option of running an audit immediately. You can do this by running the MIMIX rule associated with the audit. From a 5250 emulator, audits invoked in this manner always select all replication-eligible objects associated with the class of object for the audit. When running an audit immediately from Vision Solutions Portal, you have the ability to select whether the audit will select all replication-eligible objects or only prioritized objects.
In most cases, you want to run the audit from the management system. Policies determine whether a request to run an audit can be performed on the requesting system. Most users should perform this procedure from the management system.

To run a rule immediately, do the following:
1. From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter.
2. Type option 9 (Run rule) next to the audit you want and press Enter.

Note: Audits are not allowed to run against disabled data groups.

For more information, see “Resolving audit problems” on page 133.

Resolving audit problems

When viewing results of audits, the starting point is the Summary view of the Work with Audits display. You may also need to view the output file or the job log, which are only available from the system where the audits ran. In most cases, this is the management system.

Do the following from the management system:
1. Do one of the following to access the Work with Audits display.
• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Audit summary view.
• From a command line, enter WRKAUD VIEW(*AUDSTS)
2. Check the Audit Status column for values shown in Table 30. Audits with potential problems are at the top of the list. Take the action indicated in Table 30.

Table 30. Addressing audit problems

Status: *FAILED
Action: If the failed audit selected objects by priority and its timeframe for starting has not passed, the audit will automatically attempt to run again. The audit failed for one of these possible reasons.
Reason 1: The rule called by the audit failed or ended abnormally.
• To run the rule for the audit again, select option 9 (Run rule). This will check all objects regardless of how the failed audit selected objects to audit.
• To check the job log, see “Checking the job log of an audit” on page 135.
Reason 2: The #FILDTA audit or the #MBRRCDCNT audit required replication processes that were not active.
1. From the command line, type WRKDG and press Enter.
• If all processes for the data group are active, skip to Step 2.
• If processes for the data group show a red I, L, or P in the Source and Target columns, use option 9 (Start DG).
2. When the data group is active, return to the Work with Audits display and use option 9 (Run rule) to run the audit. This will check all objects regardless of how the failed audit selected objects to audit.
3. If the audit fails again, check the job log using “Checking the job log of an audit” on page 135.

Status: *DIFFNORCY
Action: The comparison performed by the audit detected differences. No recovery actions were attempted because of a policy in effect when the audit ran. Either the Automatic audit recovery policy is disabled or the Action for running audits policy prevented recovery actions while the data group was inactive or had a replication process which exceeded its threshold.
If policy values were not changed since the audit ran, checking the current settings will indicate which policy was the cause. Use option 36 to check data group level policies and F16 to check installation level policies.
• If the Automatic audit recovery policy was disabled, the differences must be manually resolved.
• If the Action for running audits policy was the cause, either manually resolve the differences or correct any problems with the data group status. You may need to start the data group and wait for threshold conditions to clear. Then run the audit again.
To manually resolve differences, do the following:
1. Type 7 (History) next to the audit with *DIFFNORCY status and press Enter.
2. The Work with Audit History display appears with the most recent run of the audit at the top of the list. Type 8 (Display difference details) next to an audit to see its results in the output file.
3.
Check the Difference Indicator column. All differences shown for an audit with *DIFFNORCY status need to be manually resolved. For more information about the possible values, see “Interpreting audit results - supporting information” on page 299.
To have MIMIX always attempt to recover differences on subsequent audits, change the value of the automatic audit recovery policy.

Status: *NOTRCVD
Action: The comparison performed by the audit detected differences. Some of the differences were not automatically recovered. The remaining detected differences must be manually resolved.
Note: For audits using the #MBRRCDCNT rule, automatic recovery is not possible. Other audits, such as #FILDTA, may correct the detected differences.
Do the following:
1. Type 7 (History) next to the audit with *NOTRCVD status and press Enter.
2. The Work with Audit History display appears with the most recent run of the audit at the top of the list. Type 8 (Display difference details) next to an audit to see its results in the output file.
3. Check the Difference Indicator column. Any differences with values other than *RECOVERED must be manually resolved. For more information about the possible values, see “Interpreting audit results - supporting information” on page 299.

Status: *NOTRUN
Action: The audit was prevented from running by the Action for running audits policy. Either the data group was inactive or a replication process exceeded its threshold. This may be expected during periods of peak activity or when data group processes have been ended intentionally. However, if the audit is frequently not run due to this policy, action may be needed to resolve the cause of the problem.

For more information about the values displayed in the audit results, see “Interpreting audit results - supporting information” on page 299.

Checking the job log of an audit

An audit’s job log can provide more information about why an audit failed. If it still exists, the job log is available on the system where the audit ran.
Typically, this is the management system.

You must display the notifications from an audit in order to view the job log. Do the following:
1. From the Work with Audits display, type 7 (History) next to the audit and press Enter.
2. The Work with Audit History display appears with the most recent run of the audit at the top of the list.
3. Use option 12 (Display job) next to the audit you want and press Enter.
4. The Display Job menu opens. Select option 4 (Display spooled files). Then use option 5 (Display) from the Display Job Spooled Files display.
5. Look for messages from the job log for the audit in question. Usually the most recent messages are at the bottom of the display.

Message LVE3197 is issued when errors remain after an audit completed. Message LVE3358 is issued when an audit failed. Check the job log for the following messages, which indicate a communications problem (LVE3D5E, LVE3D5F, or LVE3D60) or a problem with data group status (LVI3D5E, LVI3D5F, or LVI3D60).

Ending audits

Only active or queued audits can be ended. This includes audits with the following statuses: Currently comparing (*CMPACT), Currently recovering (*RCYACT), or Currently waiting to run (*QUEUED).

You must end active or queued audits from the system that originated the audit.

You can end active or queued audits from any view of the Work with Audits display. This procedure uses the Status view.

To end an active or queued audit, do the following:
1. From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Audit summary view.
2. Check the value shown in the Audit Status column. Press F1 (Help) for a description of status values.
3. Type option 10 (End) next to the active or queued audit you want to end and press Enter.
4. Audits in *CMPACT or *QUEUED status are set back to their previous status values.
Audits in *RCYACT status are set according to the completed comparison result as well as the results of any completed recovery actions. 136 Displaying audit history Displaying audit history The Work with Audit History display lists the available history for completed runs of a specific combination of audit rule and data group. Each item listed is a history of a completed audit run, shown in reverse chronological order so that the completed audit with the most recent start time is at the top of the list. Audits that are new or that have an active status are not included in this list. Do the following to access retained history for a specific audit and data group combination: 1. From the MIMIX Intermediate Main Menu, type 6 (Work with audits) and press Enter. 2. From the Work with Audits display, type 7 (History) next to the audit and data group you want and press Enter. The amount of history information available is determined by how frequently an audit runs and the settings of the Audit history retention policy. Having retained audit history enables you to look for trends across multiple runs of an audit that may be an indication of a configuration problem or some other issue with an object. For example, the Work with Audit History display makes it easy to notice that particular audit of one data group always has a similar number of recovered objects or always has differences that cannot be recovered automatically. The initial view shows (Figure 23) the final audit status and recovery phase statistics. Figure 23. Work with Audit History - view of recovery results. Work with Audit History SYSTEM: Audit rule . . . . . . : Data group definition . : AS01 #FILATR EMP AS01 AS02 Type options, press Enter. 
5=Display 6=Print 8=View difference details 12=Display job 14=Audited objects 46=Mark recovered ------------------Objects----------------Audit Total Not Not Opt Compare Start Status Selected Compared Recovered Recovered __ 01/02/10 15:25:31 *NODIFF 91 0 0 0 __ 12/31/09 09:06:04 *NODIFF 0 0 0 0 __ 12/30/09 08:50:29 *AUTORCVD 4 0 0 3 BOTTOM Parameters or command ===> _________________________________________________________________________ F3=Exit F4=Prompt F5=Refresh F9=Retrieve F11=Summary results F12=Cancel F13=Repeat F14=Audited objects F21=Print list F11 toggles between this view and additional views. 137 Displaying audit history The summary results view (Figure 24) shows the total number of objects selected by the audit and whether the objects selected were the result of a priority audit or a scheduled audit. Figure 24. Work with Audit History - view of summary results. Work with Audit History SYSTEM: Audit rule . . . . . . : Data group definition . : Type options, press Enter. 5=Display 6=Print 8=View difference details 14=Audited objects 46=Mark recovered Opt __ __ __ Compare 01/02/10 12/31/09 12/30/09 AS01 #FILATR EMP AS01 AS02 Audit Start Status 15:25:31 *NODIFF 09:06:04 *NODIFF 08:50:29 *AUTORCVD Total Selected 91 0 4 12=Display job Objects Selected *ALL *PTY *PTY BOTTOM Parameters or command ===> _________________________________________________________________________ F3=Exit F4=Prompt F5=Refresh F9=Retrieve F11=Compare results F12=Cancel F13=Repeat F14=Audited objects F21=Print list The compare results view (Figure 25) shows the duration of the audit as well as statistics for the compare phase of the audit. Figure 25. Work with Audit History - view of compare results. Work with Audit History SYSTEM: Audit rule . . . . . . : Data group definition . : AS01 #FILATR EMP AS01 AS02 Type options, press Enter. 
   5=Display   6=Print   8=View difference details   12=Display job
   14=Audited objects   46=Mark recovered

                                                ------------Objects------------
                          Audit      Audit                 Not        Detected
 Opt  Compare Start       Status     Duration   Compared   Compared   Not Equal
 __   01/02/10 15:25:31   *NODIFF    00:00:04       91         0          0
 __   12/31/09 09:06:04   *NODIFF    00:00:01        0         0          0
 __   12/30/09 08:50:29   *AUTORCVD  00:00:01        4         0          3
                                                                       BOTTOM
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F9=Retrieve   F11=Recovery results
 F12=Cancel   F13=Repeat   F14=Audited objects   F21=Print list

When viewing the Work with Audit History display from the system on which the audit request originated, you can use options to view the object difference details detected by the audit (option 8), the job log for the audit (option 12), and a list of objects that were audited (option 14).

Audits with no selected objects

On the Work with Audit History display, it is possible to see repeated audit runs that have zero (0) objects selected during the time frame that prioritized audits are allowed to run each day. Zero objects selected means that, at the time the prioritized audit ran, no objects were eligible under the frequencies specified in the criteria for selecting objects.

Consider this example of how prioritized audits operate. Audit #FILATR is set to run priority audits using its shipped default values for priority auditing. This means the audit will run approximately once per hour between 3 and 8 a.m. every day. Each audit run will select the following:

• Any replicated objects that were not equal in their last audit.
• Any new replicated objects that had never been audited.
• Any replicated objects that changed in the past 24 hours.
• Any replicated objects that did not change since they were audited a week ago.
• Any replicated objects that did not change since their last audit a month (30 days) ago and have a history of repeated consecutive successful audits.
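The selection categories listed above can be modeled outside of MIMIX to see why later runs in the same morning often select nothing. The following Python sketch is illustrative only and is not MIMIX code: the AuditedObject record, its field names, and the selection_category function are hypothetical, and the rule that a changed object is skipped once it has been audited after the change is an assumption made to match the example, not documented product behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AuditedObject:
    """Hypothetical record of what is known about one replicated object."""
    name: str
    last_audit: Optional[datetime]           # None means never audited
    last_equal: bool = True                  # result of the last audit
    last_changed: Optional[datetime] = None  # None means no known change
    clean_history: bool = False              # repeated consecutive successful audits


def selection_category(obj: AuditedObject, now: datetime) -> Optional[str]:
    """Return the category that makes obj eligible at `now`, or None."""
    if obj.last_audit is None:
        return "new"
    if not obj.last_equal:
        return "not equal"
    if (obj.last_changed is not None
            and now - obj.last_changed <= timedelta(hours=24)
            # Assumption: an object already audited since it changed is skipped.
            and obj.last_changed > obj.last_audit):
        return "changed"
    age = now - obj.last_audit
    if obj.clean_history:
        return "no differences" if age >= timedelta(days=30) else None
    return "unchanged" if age >= timedelta(days=7) else None


first_run = datetime(2014, 1, 6, 3, 30)
second_run = first_run + timedelta(hours=1)
changed = AuditedObject("LIB/CHANGED",
                        last_audit=first_run - timedelta(days=2),
                        last_changed=first_run - timedelta(hours=5))
print(selection_category(changed, first_run))   # eligible in the first run
changed.last_audit = first_run                  # audited and found equal
print(selection_category(changed, second_run))  # no longer eligible
```

In this model, an object that was audited and found equal in the first run falls out of every category for the second run, which is one way a run can end with zero objects selected.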
For the first run (between 3 and 4 a.m.) of a normal work day, it is likely the audit selected objects in the new and changed in the past day categories, and may have selected some objects in other categories as well. The second run is likely to have selected fewer objects, and may have selected only objects that had differences from the earlier run. If those differences were resolved, then the subsequent runs that day are likely to have selected no objects because none were eligible. While such a daily pattern may repeat, it is also subject to replication and other auditing activity within your environment.

Working with audited objects

The Work with Audited Objects display shows a list of objects compared by one or more audits. This information is available only on the originating system, and only for audits that were performed while the Audit history retention (AUDHST) policy in effect specified that details relevant to the type of audit be kept and that have not exceeded the current policy's retention criteria.

The list of objects is sorted by severity of their final audit status (the status after comparisons and any recovery actions complete), with the most severe status first. Because this display lists audited object history, the #DGFE rule, which compares configuration data, is not included.

When the objects listed are for only one audit, the display appears as shown in Figure 26. This layout is used when the display is invoked by option 14 (Audited objects) on the Work with Audits display or the Work with Audit History display. Note that the Audit start field is located at the top of the display in this case. If the selected audit is the audit run with the latest start date, (*LAST) will also appear in the Audit start field.

Figure 26. Work with Audited Objects display for a single audit.

                          Work with Audited Objects
                                                        SYSTEM:   AS01
 Data group:   EMP   AS01   AS02
 Audit rule:   #FILATR            Audit start:   06/17/09 15:01:34 (*LAST)

 Type options, press Enter.
   5=Display   6=Print   9=Object history

      Audited
 Opt  Status   Type     Object Name
 _    *NE      *FILE    L00SAMPLEA/RJFILE1
 _    *EQ      *FILE    L00SAMPLEA/RJFILE2
 _    *EQ      *FILE    L00SAMPLEA/RJFILE3
 _    *RCVD    *FILE    L00SAMPLEA/RJFILE4
                                                                       BOTTOM
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F9=Retrieve   F12=Cancel   F13=Repeat
 F18=Subset   F21=Print list   F22=Display entire field

When the list includes objects from multiple audits, the display appears as shown in Figure 27, with the specific audit rule and start time displayed in columns.

A > symbol next to an object name indicates a long object path name exists which can be viewed with F22.

File member information is not automatically displayed. However, you can use F18 to change subsetting criteria to include members. When member information is displayed, the name is in the format: library/file(member). Also, the information displayed for file members may not be from the most recently performed audit. Because members can be compared by several audits, the most recent run of each of those audits is evaluated. The evaluated audit run with the most severe status is displayed, even if it is not the most recently performed audit of the evaluated audit runs. For all other objects, the information displayed is from the most recent audit run that compared the object.

Figure 27. Work with Audited Objects display with all audits displayed.

                          Work with Audited Objects
                                                        System:   AS01
 Data group:   EMP   AS01   AS02
 Audit rule:   *ALL

 Type options, press Enter.
   5=Display   6=Print   9=Object history

      Audited                                   -----------Audit-----------
 Opt  Status   Type      Object Name            Rule       Date      Time
 _    *NE      *DTAARA   L00SAMPLEA/AJDTAARA1   #OBJATR    12/11/09  09:40:27
 _    *NE      *DTAARA   L00SAMPLEA/AJDTAARA2   #OBJATR    12/11/09  09:40:27
 _    *NE      *STMF     /L00DIR/ALPHA.STM      #IFSATR    12/11/09  09:47:57
 _    *EQ      *DTAARA   L00SAMPLEA/DTAARA1     #OBJATR    12/11/09  09:40:27
 _    *EQ      *DTAARA   L00SAMPLEA/DTAARA2     #OBJATR    12/11/09  09:40:27
 _    *EQ      *FILE     L00SAMPLEA/RJFILE1     #FILATR    12/11/09  09:43:13
 _    *EQ      *FILE     L00SAMPLEA/RJFILE2     #FILATR    12/11/09  09:43:13
 _    *EQ      *FILE     L00SAMPLEA/RJFILE3     #FILATR    12/11/09  09:43:13
                                                                      More...
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F9=Retrieve   F12=Cancel   F13=Repeat
 F18=Subset   F21=Print list   F22=Display entire field

You can select option 5 to view the details of the audit in which the object was compared, such as the audit compare and recovery timestamps, and option 9 to view auditing history for a specific object.

Displaying audited objects from a specific audit run

Use this procedure to display the list of objects compared by a specific audit run. For prioritized audits, not every object is audited in every audit run.

From the Work with Audits display or the Work with Audit History display, do the following:

1. Ensure that you are on the system where the audit originated. The originating system is included in the audit details, which you can view using option 5 (Display).
2. Type 14 (Audited objects) next to the audit run that you want and press Enter.
3. If necessary, press F18 (Subset) to specify criteria for filtering the list by object type, name, or audited status.

Displaying a customized list of audited objects

Use this procedure to list all objects compared by a data group or to specify filtering criteria such as object type, name, or audited status.

From the Work with Audits display or the Work with Audit History display, do the following:

1.
Ensure that you are on the system where the audit originated. The originating system is included in the audit details, which you can view using option 5 (Display).
2. Press F14 (Audited objects). The Work with Audited Objects (WRKAUDOBJ) command appears.
3. Specify the Data group definition for which you want to see audited objects.
4. Specify the value you want for Object type and press Enter.
5. Additional fields appear based on the value specified in Step 4. Specify values to define the criteria for selecting the objects to be displayed.
Note: The value specified for Member (MBR) determines whether member-level objects are selected for their object history. The members selected are not automatically displayed in the list. To include any selected members, press F10 (Additional parameters), then specify *YES for Include member (INCMBR).
6. Press Enter to display the list of objects from the retained history details.

Working with audited object history

The Work with Audited Obj. History display lists the available audit history for a single object compared by the indicated audit rules within the indicated data group. This makes it possible to check for trends for a specific object, such as repeated automatic recovery of a difference.

The audit history for an object is available only on the originating system, and only for audits that were performed while the Audit history retention (AUDHST) policy in effect specified that details relevant to the type of audit be kept and that have not exceeded the current policy's retention criteria.

The list is sorted in reverse chronological order so that the audit history having the most recent start date is at the top of the list.

When the displayed object history is for a file member, the member is represented as object type *FILE with its name formatted as library/file(member).
The Audit Rule column appears in the list to identify which audit rule compared the member in the audit run, as shown in Figure 28. When the audit history for any other object type is displayed, there is only one possible audit rule so the Audit rule field is located at the upper right of the display.

Figure 28. Work with Audited Obj. History display showing audit history for a file member

                        Work with Audited Obj. History
                                                        System:   AS01
 Data group:   EMP   AS01   AS02
 Type:         *FILE
 Name:         ABCLIB/PF1(MBR1)

 Type options, press Enter.
   5=Display   6=Print   8=View difference details

      Audit        ----Compare Information---   -------Recovery Information------
 Opt  Rule         Date      Time      Status   Date      Time      Status
 _    #FILDTA      06/17/09  15:22:48  *EQ
 _    #MBRRCDCNT   06/17/09  15:01:34  *EQ
 _    #FILATRMBR   06/16/09  15:01:27  *NE      06/16/09  15:04:25  *RECOVERED
                                                                       Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F9=Retrieve   F12=Cancel   F13=Repeat
 F21=Print list   F22=Display entire field

From this display you can use option 5 to view details of the audit in which the object was compared, such as its audit compare and recovery timestamps, and option 8 to view object difference details that were detected by the audit.

Displaying the audit history for a specific object

Use this procedure to display the retained audit histories for a specific object.

From the Work with Audits display or the Work with Audit History display, do the following:

1. Display a list of objects audited for a data group using either of the following procedures:
• “Displaying audited objects from a specific audit run” on page 141
• “Displaying a customized list of audited objects” on page 141
2. From the Work with Audited Objects display, type 9 (Object history) next to the object you want and press Enter.
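Checking an object's history for a trend, such as repeated automatic recovery of a difference, amounts to counting recoveries per audit rule. The following Python sketch shows that idea on rows transcribed by hand from a display like Figure 28. It is illustrative only: HistoryRow, recovery_trend, and the threshold of two recoveries are hypothetical and are not part of MIMIX, and the sketch does not read data from the product.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Optional


class HistoryRow(NamedTuple):
    """One line of object history, transcribed by hand from the display."""
    rule: str
    compare_status: str
    recovery_status: Optional[str]  # None when no recovery was attempted


def recovery_trend(rows: Iterable[HistoryRow], threshold: int = 2) -> Dict[str, int]:
    """Return audit rules with at least `threshold` recoveries in the history."""
    counts = Counter(r.rule for r in rows if r.recovery_status == "*RECOVERED")
    return {rule: n for rule, n in counts.items() if n >= threshold}


history = [
    HistoryRow("#FILDTA", "*EQ", None),
    HistoryRow("#MBRRCDCNT", "*EQ", None),
    HistoryRow("#FILATRMBR", "*NE", "*RECOVERED"),
    # A hypothetical earlier run, added to illustrate a repeating pattern.
    HistoryRow("#FILATRMBR", "*NE", "*RECOVERED"),
]
print(recovery_trend(history))
```

A rule that appears in the result has recovered the same object repeatedly, which may be worth investigating as a possible configuration problem.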
Displaying audit compliance

The audit compliance view of the Work with Audits display (Figure 29) shows audit compliance status in the Compliance column. F11 toggles between variations of audit compliance views.

Note: If other audit problems exist, you may see a different view of the Work with Audits display. Use F10 to access the Compliance view.

On the compliance view of the Work with Audits display, the list is initially sorted by compliance status. To sort the list by scheduled time, use F17.

In addition to audit compliance status, the initial compliance view (Figure 29) shows the timestamp of when the compare phase ended in the Compare End column. Compliance is checked based on the last completed compare date.

Compliance determines whether the date of the last compare completed by an audit is within the range set by policies. The Audit warning threshold policy and the Audit action threshold policy define when to indicate that an audit is approaching or exceeding that range.

For audits configured for scheduled object auditing or both scheduled and prioritized object auditing, compliance status is based on the last run of a scheduled audit or a user-invoked audit.

For audits configured for only prioritized object auditing, compliance status is based on the last run, which may have been a prioritized audit or a user-invoked audit.

A user-invoked audit or a scheduled audit checks all objects that are configured for the data group and within the class of objects checked by the audit, whereas a prioritized audit may check only a subset of those objects.

Figure 29. Audit Compliance, view - data group definition columns.

                              Work with Audits
                                                        System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   36=Change DG policies   37=Change audit schedule

                   Audit        ---------Definition---------  ---Compare End---
 Opt  Compliance   Rule         DG Name   System 1  System 2  Date      Time
 __   *OK          #DGFE        EMP       AS01      AS02      09/25/08  12:15:34
 __   *OK          #DLOATR      EMP       AS01      AS02      09/25/08  12:15:34
 __   *OK          #FILATR      EMP       AS01      AS02      09/25/08  12:15:34
 __   *OK          #FILATRMBR   EMP       AS01      AS02      09/25/08  12:15:35
 __   *OK          #FILDTA      EMP       AS01      AS02      09/25/08  12:15:38
 __   *OK          #IFSATR      EMP       AS01      AS02      09/25/08  12:15:36
 __   *OK          #MBRRCDCNT   EMP       AS01      AS02      09/25/08  12:15:38
 __   *OK          #OBJATR      EMP       AS01      AS02      09/25/08  12:15:37
                                                                       Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Schedule summary   F11=Next scheduled
 F14=Audited objects   F17=Sort sched. time   F24=More keys

The additional view of audit compliance information (Figure 30) displays when the scheduled audit run will occur. The scheduled date and time in this view do not apply to prioritized audit runs.

Figure 30. Audit Compliance, view 2 - next scheduled time columns.

                              Work with Audits
                                                        System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   36=Change DG policies   37=Change audit schedule

                   Audit                  -Scheduled Time--   ---Compare End---
 Opt  Compliance   Rule         DG Name   Date      Time      Date      Time
 __   *OK          #DGFE        EMP       09/26/08  02:00:00  09/25/08  12:15:34
 __   *OK          #DLOATR      EMP       09/26/08  02:25:00  09/25/08  12:15:34
 __   *OK          #FILATR      EMP       09/26/08  02:10:00  09/25/08  12:15:34
 __   *OK          #FILATRMBR   EMP       09/26/08  02:20:00  09/25/08  12:15:35
 __   *OK          #FILDTA      EMP       09/26/08  02:35:00  09/25/08  12:15:38
 __   *OK          #IFSATR      EMP       09/26/08  02:15:00  09/25/08  12:15:36
 __   *OK          #MBRRCDCNT   EMP       09/26/08  02:30:00  09/25/08  12:15:38
 __   *OK          #OBJATR      EMP       09/26/08  02:05:00  09/25/08  12:15:37
                                                                       Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Schedule summary   F11=DG definition
 F14=Audited objects   F17=Sort sched. time   F24=More keys

Note: Audit runtime status and compliance status values are prioritized and are “bubbled up” to the next higher level in the user interface, which is the installation. In a 5250 emulator, audit status is included in the summarized replication status displayed on the Work with Application Groups display. The Work with Data Groups display provides an indication of the number of audits that require action or attention.

Determining whether auditing is within compliance

Regular auditing detects and often repairs problems in the replication environment. Compliance with the best practice of regular auditing is determined for each individual audit based on the date when the audit last completed its compare phase.

Audit compliance problems are identified by the following status values:

*ATTN - The audit is approaching an out of compliance state as determined by the Audit warning threshold policy. Attention is required to prevent the audit from becoming out of compliance.

*ACTREQ - The audit is out of compliance with the Audit action threshold policy. Action is required.
Perform an audit of the data group.

An audit with a compliance problem must be run to resolve the problem.

Do the following to check for compliance problems:

1. Do one of the following to access the Compliance view of the Work with Audits display:
• From the MIMIX Intermediate Main Menu, select option 6 (Work with audits) and press Enter. Then use F10 as needed to access the Compliance view.
• Enter the command: installation-library/WRKAUD VIEW(*COMPLY)
2. Check the Compliance column for values of *ATTN and *ACTREQ.
3. To resolve a problem with audit compliance, the audit in question must be run and complete its compare phase.
• To see when the scheduled run of the audit will occur, press F11. To see when both scheduled and prioritized audits will run, press F10 to access the Audit summary view, then use F11 to toggle between views.
• To run the audit now, select option 9 (Run rule) and press Enter. This action will select all replicated objects associated with the class of the audit. For more information, see “Running an audit immediately” on page 131.

Displaying scheduling information for automatic audits

An audit can be configured to run by schedule, by priority, by schedule and priority, or not at all. The schedule summary views of the Work with Audits display allow you to see scheduling information for each audit.

Do the following to view when an audit can occur for a specific audit and data group combination:

1. From the MIMIX Intermediate Main Menu, type 6 (Work with audits) and press Enter.
2. The Work with Audits display appears, showing either the audit summary or compliance summary view. Press F10 as needed to access the Schedule summary view.
3. The initial view of the Schedule summary is displayed. Use F11 to toggle between additional variations of audit schedule views.
• The initial view (Figure 31) shows the date and time of the next scheduled audit run.
You cannot view the exact time of when the next prioritized audit will run.
• To view current scheduled auditing settings, press F11 (Figure 32).
• To view current priority auditing settings, press F11 twice (Figure 33). Prioritized audit runs are allowed to start every day only during the specified time range. Multiple runs of an audit may occur during that time.

The list is initially sorted by rule and data group name. To sort the list by scheduled time, use F17.

Figure 31. Audit Schedule Summary, view - next scheduled time.

                              Work with Audits
                                                        System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   36=Change DG policies   37=Change audit schedule

      Audit        ---------Definition---------             -Scheduled Time--
 Opt  Rule         DG Name   System 1  System 2  Frequency  Date      Time
 __   #DGFE        EMP       AS01      AS02      *WEEKLY    09/25/08  02:00:00
 __   #DLOATR      EMP       AS01      AS02      *WEEKLY    09/25/08  02:25:00
 __   #FILATR      EMP       AS01      AS02      *WEEKLY    09/25/08  02:10:00
 __   #FILATRMBR   EMP       AS01      AS02      *WEEKLY    09/25/08  02:20:00
 __   #FILDTA      EMP       AS01      AS02      *WEEKLY    09/25/08  02:35:00
 __   #IFSATR      EMP       AS01      AS02      *WEEKLY    09/25/08  02:15:00
 __   #MBRRCDCNT   EMP       AS01      AS02      *WEEKLY    09/25/08  02:30:00
 __   #OBJATR      EMP       AS01      AS02      *WEEKLY    09/25/08  02:05:00
                                                                       Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Audit summary   F11=Schedule settings
 F14=Audited objects   F17=Sort sched. time   F24=More keys

Figure 32. Audit Schedule Summary, view - schedule settings.

                              Work with Audits
                                                        System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   36=Change DG policies   37=Change audit schedule

      Audit                                      Weekday  Rel.Day
 Opt  Rule         DG Name  Frequency  Date      SMTWTFS  12345L   Time
 __   #DGFE        EMP      *WEEKLY    *NONE     SMTWTFS           02:00:00
 __   #DLOATR      EMP      *WEEKLY    *NONE     SMTWTFS           02:25:00
 __   #FILATR      EMP      *WEEKLY    *NONE     SMTWTFS           02:10:00
 __   #FILATRMBR   EMP      *WEEKLY    *NONE     SMTWTFS           02:20:00
 __   #FILDTA      EMP      *WEEKLY    *NONE     SMTWTFS           02:35:00
 __   #IFSATR      EMP      *WEEKLY    *NONE     SMTWTFS           02:15:00
 __   #MBRRCDCNT   EMP      *WEEKLY    *NONE     SMTWTFS           02:30:00
 __   #OBJATR      EMP      *WEEKLY    *NONE     SMTWTFS           02:05:00
                                                                       Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Audit summary   F11=Priority settings
 F14=Audited objects   F17=Sort sched. time   F24=More keys

Figure 33. Audit Schedule Summary, view - priority settings.

                              Work with Audits
                                                        System:   AS01
 Type options, press Enter.
   5=Display   6=Print   7=History   8=Recoveries   9=Run rule   10=End
   14=Audited objects   36=Change DG policies   37=Change audit schedule

      Audit                 -Start Range-  ------Priority Objects Selected------
 Opt  Rule         DG Name  After   Until  New      Chg      Unchg     No Diff
 __   #DGFE        EMP      *NONE
 __   #DLOATR      EMP      03:00   08:00  *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #FILATR      EMP      03:00   08:00  *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #FILATRMBR   EMP      03:00   08:00  *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #FILDTA      EMP      03:00   08:00  *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #IFSATR      EMP      03:00   08:00  *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #MBRRCDCNT   EMP      03:00   08:00  *DAILY   *DAILY   *WEEKLY   *MONTHLY
 __   #OBJATR      EMP      03:00   08:00  *DAILY   *DAILY   *WEEKLY   *MONTHLY
                                                                       Bottom
 Parameters or command
 ===> _________________________________________________________________________
 F3=Exit   F4=Prompt   F5=Refresh   F10=Audit summary   F11=Priority settings
 F14=Audited objects   F17=Sort sched. time   F24=More keys

Chapter 8 Working with system-level processes

MIMIX uses several processes that run at the system level to support the replication environment and provide additional functionality. System-level processes include the system manager, journal manager, target journal inspection, collector services, and if needed, cluster services. These processes can be accessed from the Work with Systems display (WRKSYS command).

Typically, these processes are automatically started and ended when MIMIX is started or ended. However, you may need to start or end individual processes when resolving problems.

The following topics are included in this chapter to help you resolve problems with system-level processes:

• “Displaying status of system-level processes” on page 149 describes how to check for expected status values and resolve problems with system-level processes. This includes procedures for starting and ending managers, target journal inspection, and collector services.
• “Resolving *ACTREQ status for a system manager” on page 151 describes how to resolve a status of action required.
• “Checking for a system manager backlog” on page 151 describes how to check if there is a backlog of unprocessed entries that require action.
• “Displaying status of target journal inspection” on page 155 describes how to display the status of a single inspection job on a system and how to resolve problems with its status.
• “Displaying results of target journal inspection” on page 156 describes where to find information about the objects identified by target journal inspection.
• “Identifying the last entry inspected on the target system” on page 158 describes how to determine the last entry in the target journal and the last entry processed by target journal inspection.

Displaying status of system-level processes

Status of processes that run at the system level can be viewed from the Work with Systems display.

1.
Do one of the following:
• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. The first system definition in the list is the local system. Figure 34 shows expected status values for most two-system environments. For any other status values, continue with the next step.

Expected Status Values:
• System managers and journal managers have an expected status of *ACTIVE on all systems.
• For target journal inspection, the expected status is that systems that are currently the target for replication have a status of *ACTIVE and other systems have a status of *NOTTGT.
• For cluster services, most installations are not licensed for MIMIX® Global™ and have an expected status of *NONE. If a system participates in an IBM i cluster, the expected value is *ACTIVE. For more information about operation for MIMIX® Global™, see the MIMIX Operations with IBM i Clustering book.

Figure 34. Expected status on the Work with Systems display for a two-system environment.

                             Work with Systems
                                                                     OSCAR
 Local system definition . . :   OSCAR
 Cluster . . . . . . . . . . :   *NONE

 Type option, press Enter
   4=Remove cluster node   5=Display   6=Print   7=System manager status
   8=Work with data groups   9=Start   10=End   11=Jrn inspection status

      System             ----- Managers -----  Journal   ----- Services -----
 Opt  Definition  Type   System    Journal     Inspect.  Collector  Cluster
 ___  OSCAR       *MGT   *ACTIVE   *ACTIVE     *ACTIVE   *ACTIVE    *NONE
 ___  HENRY       *NET   *ACTIVE   *ACTIVE     *NOTTGT   *ACTIVE    *NONE
                                                                       Bottom
 F3=Exit   F5=Refresh   F9=Automatic refresh   F10=Legend   F12=Cancel
 F13=Repeat   F16=System definitions

3. If one or more processes are *INACTIVE, do one of the following:
• Type a 9 (Start) next to the system you want and press Enter. The Start MIMIX Managers display appears.
Any processes except cluster services that are not active on the system are preselected. (To start cluster services, MIMIX® Global™ users must specify *YES for the Start cluster services prompt.) Press Enter.
4. For any other status values on a system, do the following:
• If one or more processes are *UNKNOWN, use the procedure in “Verifying all communications links” on page 282.
• For a system manager status of *ACTREQ, use “Resolving *ACTREQ status for a system manager” on page 151.
• To check for a system manager backlog, use “Checking for a system manager backlog” on page 151.
• For target journal inspection status values other than *ACTIVE or *NOTTGT, see “Displaying status of target journal inspection” on page 155.

Resolving *ACTREQ status for a system manager

A system manager status of *ACTREQ indicates that at least one of the system manager pairs in which the system is a participant has failed. The system manager must be started.

To start the system manager, type a 9 (Start) next to the system and press Enter.

Checking for a system manager backlog

The Work with System Pair Status panel includes the count of unprocessed entries for the source system job of the system manager process along with the timestamp of the oldest unprocessed entry. A count of unprocessed entries means that a backlog exists and action may be required.

A status of *INACTIVE indicates the system manager needs to be started. Type a 9 (Start) next to the system and press Enter.

A status of *ACTIVE with unprocessed entries indicates further action may be required. Since this data is a snapshot of work currently being done, it is important to refresh this panel (F5) to ensure data is up to date.
Evaluate data for unprocessed entries with a status of *ACTIVE as follows:
• If the status is *ACTIVE and there is a high number of unprocessed entries for your environment or the timestamp is not changing when data is refreshed (F5), contact CustomerCare.
• If the status is *ACTIVE and there is a low number of unprocessed entries for your environment, refresh data (F5) and check whether the timestamp is changing. If the timestamp changes, the entries are being processed.

Starting a system manager or a journal manager

To selectively start a system manager or journal manager for a system, do the following:

1. Do one of the following:
• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. Type a 9 (Start) next to the system definition you want and press Enter.
3. The Start MIMIX Managers display appears. By default, any manager that is not running will be selected to start. Specify the value for the type of manager you want to start at the Manager prompt and press Enter.

Ending a system manager or a journal manager

To end a system manager or journal manager, do the following:

1. Do one of the following:
• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears with a list of the system definitions defined for the MIMIX installation. Type a 10 (End) next to the system definition you want and press Enter.
3. The End MIMIX Managers display appears. Specify the value for the type of manager you want to end at the Manager prompt and press Enter. The selected managers are ended.
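The evaluation described under “Checking for a system manager backlog” can be summarized as a small decision function. The following Python sketch is illustrative only and does not query MIMIX: the backlog_advice function and its parameters are hypothetical, the values are assumed to be read by hand from the Work with System Pair Status panel before and after pressing F5, and high_threshold stands for whatever count is considered high in your environment.

```python
def backlog_advice(status: str, unprocessed: int,
                   timestamp_changed: bool, high_threshold: int) -> str:
    """Suggest the next step for one system manager pair.

    status            -- status shown for the system manager
    unprocessed       -- count of unprocessed entries after refreshing (F5)
    timestamp_changed -- True if the oldest-entry timestamp moved between refreshes
    high_threshold    -- site-specific count regarded as high (an assumption)
    """
    if status == "*INACTIVE":
        return "Start the system manager: type 9 (Start) and press Enter."
    if status != "*ACTIVE":
        return "See the procedure for this status value."
    if unprocessed == 0:
        return "No backlog."
    if unprocessed >= high_threshold or not timestamp_changed:
        return "Contact CustomerCare."
    return "Entries are being processed."


print(backlog_advice("*ACTIVE", unprocessed=12, timestamp_changed=True,
                     high_threshold=1000))
```

The sketch treats a timestamp that does not change between refreshes as a reason to contact CustomerCare regardless of the count, which follows the first evaluation rule.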
Starting collector services

To start collector services for a system, do the following:

1. Do one of the following:
• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. Type a 9 (Start) next to the system definition you want and press Enter.
3. The Start MIMIX Managers display appears. At the Collector services prompt, verify the value is *YES and press Enter.

Ending collector services

To end collector services for a system, do the following:

1. Do one of the following:
• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. Type a 10 (End) next to the system definition you want and press Enter.
3. The End MIMIX Managers display appears. At the Collector services prompt, type *YES and press Enter.

Starting target journal inspection processes

These instructions will start target journal inspection processes on a selected system. If the system is the target system of one or more data groups whose journal definitions are configured for target journal inspection, a journal inspection job is started for the system journal and for each user journal on the system. If the system is the target system for replication, an inspection job is started for the system journal and for each user journal on the system that is identified within data groups replicating to the system.
Target journal inspection processes start at the last sequence number in the currently attached journal receiver in the following cases:
• When it is the first time a target journal inspection process is started
• When starting after being ended and the last processed receiver is no longer available
• When starting after enabling target journal inspection in a journal definition where it was previously disabled

When starting target journal inspection after it was previously ended, processing begins with the next sequence number after the last processed sequence number.

To start target journal inspection processes for a system, do the following:

1. Do one of the following:
• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. Type a 9 (Start) next to the system definition you want and press Enter.
3. The Start MIMIX Managers display appears. At the Target journal inspection prompt, verify the value is *YES and press Enter.

Ending target journal inspection processes

These instructions will end target journal inspection processes on a selected system. If the system is the target system for replication, the inspection process for the system journal is ended and all inspection processes are ended for the user journals identified as the target journal in data groups replicating to the system.

To end target journal inspection processes for a system, do the following:

1. Do one of the following:
• From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
• From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. Type a 10 (End) next to the system definition you want and press Enter.
3.
The End MIMIX Managers display appears. At the Target journal inspection prompt, verify the value is *YES and press Enter.

Displaying status of target journal inspection

Target journal inspection consists of a set of jobs that read journals on the target system to check for people or processes other than MIMIX that have modified replicated objects on the target system. Best practice is to allow target journal inspection for all systems in your replication environment.

Each target journal inspection process runs on a system only when that system is the target system for replication. The number of inspection processes depends on how many journals are used by data groups replicating to that system. On a target system, there is one inspection job for the system journal and one job for each target user journal identified in data groups replicating to that system.

Because target journal inspection processes run at the system level, the best location to begin checking status is the Work with Systems display.
1. Do one of the following:
   • From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
   • From the Work with Application Groups display, use option 12 (Node entries). On the resulting Work with Node Entries display, press F7 (Systems).
2. The Work with Systems display appears. The Journal Inspect. column shows the summarized status of all journal inspection processes on a system.
   • Expected values are either *ACTIVE or *NOTTGT.
   • For all other status values, type 11 (Jrn inspection status) next to the system you want and press Enter.
3. The Work with Journal Inspection Status display appears, listing the subset of journal definitions for the selected system. The status displayed is for target journal inspection for the journal associated with a journal definition.
   Note: Journal definitions whose journals are not eligible for target journal inspection are not displayed.
This includes journal definitions that identify the remote journal used in RJ configurations (whose names typically end with @R) as well as journal definitions JRNMMX and MXCFGJRN, which are for internal use.

Table 31 identifies the status for the inspection job associated with a journal and how to resolve problems.

Table 31. Status values for a single target journal inspection process.
Journal Inspection Status - Description and Action
*INACTIVE (inverse red) - Journal inspection is not active. Use option 9 (Start) to start all eligible target journal inspection processes on the system identified in the selected journal definition.
*UNKNOWN (inverse white) - The status of the process on the system cannot be determined, possibly because of an error or communications problem. Use the procedure in “Verifying all communications links” on page 282.
*ACTIVE (inverse blue) - Target journal inspection is active for the journal identified in the journal definition.
*NEWDG - Target journal inspection has not run because all enabled data groups that use the journal definition as a target journal have never been started. The inspection process will start when one or more of the data groups are started.
*NOTCFG - Either the journal definition does not allow target journal inspection or all enabled data groups that use the journal definition (user journal) prevent journaling on the target system. Target journal inspection is not performed for the journal. For instructions for configuring target journal inspection, see topics “Determining which data groups use a journal definition” and “Enabling target journal inspection” in the MIMIX Administrator Reference book.
*NOTTGT - The journal definition is not used as a target journal definition by any enabled data group. Target journal inspection is not performed for the journal.
This is the expected status when the journal definition is properly configured for target journal inspection but the system is currently a source system for all data groups using this journal definition.

Displaying results of target journal inspection

Target journal inspection sends a warning notification for each user other than MIMIX who changed objects on the target system since the inspection job started. Because inspection jobs restart daily with other system level processes, a notification would typically be sent once per day per user. The notification identifies only the first object changed by the user.

Note: The MIMIX portal application for Vision Solutions Portal provides enhanced capabilities for displaying target journal inspection results. Notifications from target journal inspection processes are identified as originating from TGTJRNINSP in the Notifications portlet on the Summary page. Actions available for these notifications include displaying notification details as well as displaying a list of the objects changed on the target node by the user identified in the notification. Also, you can access a list of all objects changed on the target node by all users from the Replicated Objects portlet on the Analysis page.

Displaying details associated with target journal inspection notifications

This procedure displays notifications sent by target journal inspection and describes how to display related information from a 5250 emulator.

To check for notifications for target journal inspection, do the following:
1. Do one of the following:
   • On the Work with Application Groups display, the Notifications field indicates whether any warning notifications exist. Press F15 (Notifications).
   • On the Work with Data Groups display, the third number in the Audits/Recov./Notif. field displays the number of new notifications. Press F8 (Recoveries), then press F10 (Work with Notifications).
   • From a command line, enter: WRKNFY.
2. The Work with Notifications display appears. Notifications from target journal inspection are identified by the name TGTJRNINSP in the Source column.
3. Type a 5 (Display) next to the notification you want and press Enter to view the notification details.
4. On the Display Notification Details display, check these fields:
   • The Originating system field identifies the system on which target journal inspection ran and sent the notification.
   • The Notification details field identifies the user or program that made the change, the first object changed, the location where it was found in the inspected journal, and a command string to run to see journal entries generated by the user.
   Note: If the text of the Notification details field is truncated, you can view the full text of the message associated with the notification from the MIMIX message log. Use “Displaying messages for TGTJRNINSP notifications” on page 157.
5. Investigate why the identified user changed objects on the target system. Objects may need to be repaired.

Displaying messages for TGTJRNINSP notifications

The text of notifications sent by target journal inspection varies slightly with the object type of the reported object. When a notification is sent, an associated message is sent to the MIMIX message log.
You can use the following commands to view the full text of notification messages. Use the name of the originating system (Step 4 in the previous procedure) as the value of the originating system (ORGSYS) parameter in these commands:
• For library-based objects, enter: WRKMSGLOG MSGID(LVE3902) PRC(TGTJRNINSP) ORGSYS(name)
• For IFS objects, enter: WRKMSGLOG MSGID(LVE3903) PRC(TGTJRNINSP) ORGSYS(name)
• For DLO objects, enter: WRKMSGLOG MSGID(LVE3904) PRC(TGTJRNINSP) ORGSYS(name)

Identifying the last entry inspected on the target system

For each target journal inspection process, you can view details that identify the last journal entry inspected and identify the last entry in the current journal receiver. Do the following:
1. From MIMIX Intermediate Main Menu, select option 2 (Work with Systems) and press Enter.
2. The Work with Systems display appears. Type 11 (Jrn inspection status) next to the system you want and press Enter.
3. The Work with Journal Inspection Status display appears. Type 5 (Display) next to the journal definition on the system you want and press Enter.
4. The Display Journal Inspection Status Details display appears.
   • The following fields identify the currently attached journal receiver and the last entry in the current receiver: Journal, Journal receiver, Last journal entry sequence, and Last journal entry time.
   • The Target journal inspection fields identify the last entry processed by target journal inspection.

CHAPTER 9 Working with notifications and recoveries

This chapter describes what notifications and recoveries are and how to work with them. This chapter includes the following topics:
• “What are notifications and recoveries” defines terms used for discussing notifications and recoveries and identifies the sources that create them.
• “Displaying notifications” identifies where notifications are viewed in the user interfaces and how to work with them.
• “Notifications for newly created objects” describes the MIMIX® AutoNotify™ feature which can be used to monitor for newly created libraries, folders, or directories.
• “Displaying recoveries” identifies where recoveries in progress are viewed in the user interfaces and how to work with them.

What are notifications and recoveries

A notification is the resulting automatic report associated with an event that has already occurred. The severity of a notification is reflected in the overall status of the installation. Notifications can be generated in a variety of ways:
• Target journal inspection processes generate notifications when users or programs other than MIMIX have changed objects on the target node.
• Rules that are not associated with the audits provided by MIMIX also generate notifications to indicate that rule processing either ended in error or, if requested, completed successfully.
• Shipped monitors, such as the MMNFYNEWE monitor for the MIMIX® AutoNotify™ feature, generate notifications.
• Custom automation may initiate user-generated notifications when user-defined events are detected. User-generated notifications can be set to indicate a failure, a warning, or a successful operation.
• Audits generate notifications as a secondary mechanism for reporting when the activities performed by an audit complete or end in error. These notifications are automatically marked as acknowledged. (The primary mechanism is to report errors through replication processes and the audit summary.) Policies provide considerable control over notifications generated by audits.

Because the manner in which notifications are generated can vary, it is important to note that notifications can represent both real-time events as well as events that occurred in the past but, due to scheduling, are being reported in the present.
For example, the ownership of a file is changed on the target system at 8:00 PM. If your audit (CMPFILA) is scheduled to run at 1:00 AM, MIMIX will detect the change and push a notification to the user interface when the audit completes. Previously, detection of the change was contingent upon you viewing a report after the audit completed and noticing the difference.

Recoveries - The term recovery is used in two ways. The most common use refers to the recovery action taken by audits or replication processes to correct a detected difference when automatic recovery policies are enabled. The second use refers to a temporary report that provides details about a recovery action in progress. The report is automatically created when the recovery action starts and is removed when it completes. While it exists, the report identifies what originated the action and what is being acted upon, and may include access to an associated output file (outfile) and the job log for the associated job. The action which generated a report may also generate a notification when the recovery action ends.

Displaying notifications

Do one of the following to check for notifications:

Note: Notifications from audits are automatically set to a status of acknowledged. Audit status and results should be checked from the Work with Audits (WRKAUD) display.

• If there are no audit problems in the installation, the MIMIX Availability Status display will indicate whether there are any notifications requiring attention or immediate action that are from sources other than audits. From the MIMIX Availability Status display, type a 5 (Display details) next to Audits and notifications and press Enter.
• Notifications from all sources are listed on the Work with Notifications display. To access the Work with Notifications display, enter the command WRKNFY. The list is sorted so that new notifications appear at the top.
  To see details for a notification, type a 5 (Display) next to the notification you want and press Enter.
• The Work with Data Groups display includes the number of new notifications that require action or attention. From the MIMIX Basic Main Menu, type 6 (Work with data groups) and press Enter. The Work with Data Groups display appears. The Audits/Recov./Notif. fields are located in the upper right corner.

What information is available for notifications

The following information is available for notifications listed on the Work with Notifications display. The F11 key toggles between views of status, timestamp, and text of the notification. Additional details are available for each notification through the Display Notification Details display.

Status - The Work with Notifications display lists notifications grouped by their status.
   *NEW - New notifications have not been acknowledged or removed and their status is reflected in higher level status.
   *ACK - Acknowledged notifications are archived as viewed and their status is no longer reflected in higher level status.
Severity - Identifies the severity level of the notification.
   *ERROR - An error occurred that requires immediate action.
   *WARNING - Investigation may be necessary. An operation completed but an error may exist. For example, the MIMIX AutoNotify feature issues notifications with this severity that identify newly created objects that are not identified for replication.
   *INFO - No user intervention is required.
Notification - Displays the notification text sent by audits, automatic recoveries, target journal inspection, monitors, user-defined or MIMIX rules, or a user-generated notification. To view the full text, use option 5 to display the notification details.
Data group - Identifies the data group associated with the notification. User-defined notifications and notifications from monitors or user-defined rules may indicate that there is no associated data group.
   Note: On the Status view of the Work with Notifications display, the F7 key toggles between the Source column and the Data Group column. The full three-part name is available in the Timestamp view (F11).
Date - Indicates the date the notification was sent.
Time - Indicates the time the notification was sent.
Source - Identifies the process, program, or command that generated the notification. Names that begin with the character # are generated by automatic recovery actions for audits or database replication or by a MIMIX rule. Names that begin with the characters ## are generated by automatic recovery actions for object replication.
From System - Identifies the name of the system on which the notification was generated. The name From System is used on the Timestamp view (F11) of the Work with Notifications display. When you display the notification details from the 5250 emulator, this is called the Originating system.

Detailed information

When you display a notification, you see its description, status, severity, data group, source, and sender as described above. You also have access to the following information:

Details - When the source of the notification is a rule, this identifies the command that was initiated by the rule. When the source of the notification is user-generated, this indicates the notification detail text specified when the notification entry was added. When the source of the notification is a monitor, this describes the events that resulted in the notification.
Output File - If available, this identifies an associated output file. Output file information associated with a notification is only available from the sender system. For user-generated notifications, output file information is available only if it was specified when the notification was added.
Job - If available, this identifies the job that generated the notification. Job information associated with a notification is only available from the sender system.
For user-generated notifications, this information is available only if it was specified when the notification was added.

Options for working with notifications

Table 32 identifies the possible actions you can take for a notification. From the Notifications window, the Actions list for each notification contains only the actions possible for the selected notification.

Table 32. Options available for notifications
Option - Description
4=Remove - Deletes the notification. You are prompted to confirm this choice. For a notification generated by an audit or a MIMIX rule, the associated job and output files are also deleted. This must be performed from the system on which the notification originated.
5=Display - Displays available additional information associated with the notification. For notifications generated by rules, this includes the details of the rule that generated the notification, including the substitution variables for the command the rule initiated.
6=Print - Prints the information associated with the notification.
8=View results - When the information is available, this provides the Name and Library of the output file (outfile) associated with the notification. This option is only available from the system on which the notification originated. (1)
12=Display job - Displays the job log for the job which generated the notification, if it is available. This option is only available from the system on which the notification originated.
46=Acknowledge - Sets the selected notification status to *ACK (Acknowledged).
47=Mark as new - Sets the selected notification status to *NEW (New).
1. MIMIX manages an output file associated with a notification from an automatic recovery action or a MIMIX rule when the output file exists in a specific library. The format of the library name for such an output file is MIMIX-installation-library_0.
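The system restrictions in Table 32 can be summarized with a short Python sketch. It only restates the table: options 4, 8, and 12 are filtered out when the local system is not the system on which the notification originated. The function name and the system names in the test values are hypothetical, and the sketch is not a MIMIX interface.

```python
from typing import List

# Option numbers and names as listed in Table 32.
NOTIFICATION_OPTIONS = {
    4: "Remove",
    5: "Display",
    6: "Print",
    8: "View results",
    12: "Display job",
    46: "Acknowledge",
    47: "Mark as new",
}

# Options that Table 32 restricts to the system on which the notification originated.
ORIGINATING_SYSTEM_ONLY = {4, 8, 12}


def usable_options(local_system: str, originating_system: str) -> List[int]:
    """Return the option numbers that are not ruled out by the system restriction.

    Options 8 and 12 additionally depend on an output file or job log
    being available, which this sketch does not model.
    """
    on_originating_system = local_system == originating_system
    return [
        option for option in sorted(NOTIFICATION_OPTIONS)
        if on_originating_system or option not in ORIGINATING_SYSTEM_ONLY
    ]
```

A check of this kind is mainly useful as a reminder to sign on to the originating system before attempting to remove a notification or view its job log.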
Notifications for newly created objects

The MIMIX® AutoNotify™ feature can be used to monitor for newly created libraries, folders, or directories. The AutoNotify feature uses a shipped journal monitor called MMNFYNEWE to monitor for new objects in an installation that are not already included or excluded for replication by a data group. The AutoNotify feature monitors the security audit journal (QAUDJRN), and when new objects are detected, issues a warning notification.

The MMNFYNEWE monitor is shipped in a disabled state. In order to use this feature, the MMNFYNEWE monitor must be enabled on the source system within your MIMIX environment. Once enabled, this monitor will automatically start with the master monitor.

Notifications will be sent when newly created objects meet the following conditions:
• The installation must have a data group configured whose source system is the system the monitor is running on.
• The journal entry must be a create object (T-CO) or object management change (T-OM).
• If the journal entry is a create object (T-CO), then the type must be new (N).
• The journal entry must be for a library, folder, or directory.
• If the journal entry is for a library, it cannot be a MIMIX generated library since MIMIX generated libraries are not replicated by MIMIX.
• If the journal entry is for a directory, it cannot be the /LAKEVIEWTECH directory, or any directory under /LAKEVIEWTECH.
• If the journal entry is for a directory, it must be a directory that is supported for replication by MIMIX.
• The object is not already known (included or excluded) in the installation.

Notifications can be viewed from the Work with Notifications (WRKNFY) display. The notification message will indicate required actions.

Displaying recoveries

Active recoveries are an indication of problems detected and being corrected by MIMIX AutoGuard.
Before certain activity, such as ending MIMIX, it is important that no recoveries are in progress in the installation.

You can check for recoveries from either user interface. You can see how many recoveries are in progress from the MIMIX Availability Status display or the Work with Data Groups display. The Work with Recoveries display lists recoveries and provides options for working with held recoveries associated with an audit or a MIMIX rule.

To see a count of recoveries in progress, do one of the following:
• To access the MIMIX Availability Status display, enter the command WRKMMXSTS. The Recoveries field in the upper right corner of the display shows the number of active recoveries in progress for the installation.
• To access the Work with Data Groups display, use option 5 (Display details) next to the Replication area.

Figure 35 shows the Audits/Recov./Notif. fields in the upper right corner of the Work with Data Groups display. The first number is the total number of audits that require action to correct a problem or require your attention to prevent a situation from becoming a problem. The second number indicates the number of active recoveries, including those resulting from audits. The third number indicates the number of new notifications that require action or attention. If more than 999 items exist in any field, the field will display +++.

A consistently high number of recoveries suggests that there may be configuration issues with one or more data groups.

To select a recovery to view or work with a held recovery, do the following:
1. To access the Work with Recoveries display, do one of the following:
   • From the Work with Audits display, use option 8 (Recoveries) to see a list of recoveries associated with an audit.
   • From the Work with Data Groups display, use F8 to see all recoveries.
   • On a command line, enter the command WRKRCY.
2.
To see details for a recovery, type a 5 (Display) next to the recovery you want and press Enter.

Figure 35. Work with Data Groups display showing recoveries in progress

                         Work with Data Groups                    CHICAGO
                                                                  10:49:06
Type options, press Enter.             Audits/Recov./Notif.: 001 / 002 / 003
  5=Display definition   8=Display status   9=Start DG   10=End DG
  12=Files not active   13=Objects in error   14=Active objects
  15=Planned switch   16=Unplanned switch ...

                ---------Source--------   -------Target-------   -Errors-
Opt Data Group  System    Mgr DB Obj DA   System    Mgr DB Obj   DB   Obj
__  TESTDG34    LONDON     A  A  A        CHICAGO    A  A  A
__  TESTDG43    LONDON     A  A  A        CHICAGO    A  A  A
                                                                    Bottom
F3=Exit   F5=Refresh   F7=Audits   F8=Recoveries   F9=Automatic refresh
F10=Legend   F16=DG definitions   F23=More options   F24=More keys

What information is available for recoveries

The following information is available for recoveries listed on the Work with Recoveries display. The F11 key toggles between views of status, timestamp, and text of the recoveries. Additional details are available for each recovery through the Display Recovery Details display. Each recovery provides a brief description of the recovery process taking place as well as its current status.

Status - Shows the status of the recovery action.
   *ACTIVE - The job associated with the recovery is active.
   *ENDING - The job associated with the recovery is ending.
   *HELD - The job associated with the recovery is held. A recovery whose source is a replication process cannot be held.
Data group - Identifies the data group associated with the recovery.
   Note: On the Status view of the Work with Recoveries display, the F7 key toggles between the Source column and the Data Group column. The full three-part name is available in the Timestamp view (F11).
Date - Indicates the date the recovery process started.
Time - Indicates the time the recovery process started.
Source - Identifies the process, program, or command that generated the recovery.
Names that begin with the character # are generated by automatic recovery actions for audits or database replication or by a MIMIX rule. Names that begin with the characters ## are generated by automatic recovery actions for object replication.
Sender or From System - Identifies the system from which the recovery originated.

Detailed information

When you display a recovery, you see its description, status, data group, source, and sender as described above. You also have access to the following information.

Details - When the source of the recovery is a rule, this identifies the command run by the rule in an attempt to recover from the detected error.
Output File - If available, this identifies an associated output file that lists the detected errors the recovery is attempting to correct. Output file information associated with a recovery is only available from the sender system.
Job - If available, this identifies the job that is performing the recovery action. Job information associated with a recovery is only available from the sender system.

Options for working with recoveries

Table 33 identifies the possible actions you can take for a recovery. From the Recoveries window, the Actions list for each recovery contains only the actions possible for the selected recovery.

Table 33. Options available for recoveries
WRKRCY Option - Description
4=Remove - Removes the specified recovery, if it is not held or active. A confirmation panel is displayed after pressing Enter. Use this option to remove orphaned recoveries whose associated recovery job ended. This option is only available from the system on which the recovery job ran.
5=Display - Displays available additional information associated with the recovery.
6=Print - Prints the information associated with the recovery.
8=View progress - Displays a filtered view of the output file associated with the recovery.
MIMIX updates the output file while the recovery is in progress, identifying the detected errors it is attempting to correct and marking corrected errors as being recovered. This option is only available from the system on which the recovery job is running.
10=End job - Ends an active recovery job. This action is valid for recoveries with names that begin with # and is only available from the system on which the recovery job is running.
12=Display job - Displays the job log for the job associated with the recovery in progress. This option is only available from the system on which the recovery job is running.
13=Hold job - Places an active recovery job on hold. This action is valid for recoveries with names that begin with # and is only available from the system on which the recovery job is running.
14=Release job - Releases a held recovery job. This action is valid for recoveries with names that begin with # and is only available from the system on which the recovery job is held.

Orphaned recoveries

There are times when recoveries exist but are no longer associated with a job. The following conditions could cause recoveries to become orphaned:
• An unplanned switch has occurred
• The MIMIX subsystem was ended unexpectedly
• A recovery job was ended unexpectedly

When automatic audit recovery is enabled, orphaned recoveries are converted to error notifications during system cleanup. If the orphaned recovery is older than the cleanup time specified in the system definition, it is deleted. When automatic database recovery or automatic object recovery is enabled, orphaned recoveries are deleted, when possible.

Because recoveries are displayed on both systems, but jobs associated with them are only accessible from the originating system, you need to verify that the recovery is orphaned before removing it.
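The cleanup behavior described above can be read as a small decision rule. The Python sketch below shows one such reading and is not a description of how MIMIX implements cleanup. The function name, the recovery kinds "audit", "database", and "object", and the use of a single numeric age compared against the cleanup time are all assumptions for illustration. The sketch also assumes that the age check applies to orphaned audit recoveries only, which the text does not state explicitly.

```python
def orphaned_recovery_action(recovery_kind: str, age: float, cleanup_time: float) -> str:
    """Return what system cleanup would do with an orphaned recovery.

    `age` and `cleanup_time` must use the same unit. The unit itself is
    not specified here and is left to the caller (assumption).
    """
    if recovery_kind == "audit":
        # Orphaned audit recoveries older than the cleanup time are deleted.
        if age > cleanup_time:
            return "delete"
        # Otherwise they are converted to error notifications.
        return "convert to error notification"
    if recovery_kind in ("database", "object"):
        # Orphaned database and object recoveries are deleted when possible.
        return "delete when possible"
    raise ValueError(f"unknown recovery kind: {recovery_kind!r}")
```

Whatever cleanup does automatically, a recovery that is removed manually should first be confirmed as orphaned from the originating system.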
Determining whether a recovery is orphaned

Do the following to determine whether a recovery is orphaned:
1. From a command line, type WRKRCY and press Enter.
2. Press F11 to display the Timestamp view. This view allows you to see the From System column which lists the system from which the recovery originated.
3. Ensure you are operating from the originating system. Then type a 12 next to the recovery and press Enter.
4. Do one of the following:
   • If an error message is displayed indicating that the job associated with the recovery is not found, follow the steps in “Removing an orphaned recovery” on page 168.
   • When the Display Job display appears, type a 10 in the Selection field and press Enter. The status of the job is displayed. If the job associated with the recovery is no longer valid, follow the steps in “Removing an orphaned recovery” on page 168.

Removing an orphaned recovery

These procedures assume that you have already confirmed that the recovery is orphaned using the procedures in “Determining whether a recovery is orphaned” on page 167.

Do the following to remove an orphaned recovery:
1. From the originating system, type WRKRCY on the command line and press Enter.
2. After you have ensured that the recovery is orphaned, type a 4 next to the orphaned recovery you wish to remove and press Enter.
3. Press Enter to confirm your request to remove the recovery.

CHAPTER 10 Starting and ending replication

MIMIX uses a number of processes to perform replication. These processes, along with a number of supporting processes, must be active to enable MIMIX to function.

These pairs of commands will start and end replication:
• The Start MIMIX (STRMMX) and End MIMIX (ENDMMX) commands will start or stop replication processes as well as all supporting processes for the products in a MIMIX installation library in a single operation. These commands are the preferred method for starting and ending MIMIX.
• The Start Application Group (STRAG) and End Application Group (ENDAG) commands will start or stop replication processes in environments configured with application groups. Each command calls a default procedure with steps to perform its operations and can be customized.
• The Start Data Group (STRDG) and End Data Group (ENDDG) commands will start or stop data group replication processes. These commands are the basis for controlling replication processes and are invoked programmatically by the previously identified commands.

This chapter provides information about and procedures for using each set of commands. The following topics are included:
• “Before starting replication” applies to all methods of starting replication.
• “Commands for starting replication” describes the STRMMX, STRAG, and STRDG commands and considerations for their use.
• “What occurs when a data group is started” describes what the STRDG command does in addition to starting replication, choices for specifying a journal starting point, and options for clearing pending and error entries.
• “Starting MIMIX” provides a procedure for using the STRMMX command.
• “Starting an application group” provides a procedure for using the STRAG command.
• “Starting selected data group processes” provides a procedure for using the STRDG command and identifies when the start request should include clearing pending entries.
• “Starting replication when open commit cycles exist” describes when MIMIX cannot start replication due to open commit cycles and how to resolve them and start replication.
• “Before ending replication” applies to all methods of ending replication.
• “Commands for ending replication” describes the ENDMMX, ENDAG, and ENDDG commands and considerations for their use, such as when to perform a controlled end or when to end the RJ link.
• “What occurs when a data group is ended” on page 190 describes the behavior of the ENDDG command.
• “Ending MIMIX” on page 179 provides procedures for using the ENDMMX command and describes when you may also need to end the MIMIX subsystem.
• “Ending an application group” on page 194 provides a procedure for using the ENDAG command.
• “Ending a data group in a controlled manner” on page 195 provides procedures for preparing to end, ending, and confirming that the end completed without problems.
• “Ending selected data group processes” on page 198 provides a procedure using the ENDDG command.
• “What replication processes are started by the STRDG command” on page 199 describes which replication processes are started with each possible value of the Start processes (PRC) parameter. Both data groups configured for remote journaling and data groups configured for MIMIX source-send processing are addressed.
• “What replication processes are ended by the ENDDG command” on page 203 describes what replication processes are ended with each possible value for the End Options (PRC) parameter. Both data groups configured for remote journaling and data groups configured for MIMIX source-send processing are addressed.

Before starting replication

Consider the following:
• Before starting replication, the database files and objects to be replicated by a data group must be synchronized between the systems defined to the data group. For more information about performing the initial synchronization, see the MIMIX Administrator Reference book.
• If you are using the MIMIX for MQ function, you must use the procedures in the MIMIX for IBM WebSphere MQ book for initial synchronization and initial start of data groups that replicate data for IBM WebSphere MQ.
• Data groups that are in a disabled state are not started. Only data groups that have been enabled can be started.

Commands for starting replication

These commands start replication processes.
The significant differences between these commands are:

Start MIMIX (STRMMX) – The STRMMX command will start all MIMIX processes in a MIMIX installation, including those used for replication, in a single operation regardless of how replication is configured. This is the preferred method of starting MIMIX. Optionally, this command can be used to start all MIMIX processes on the local system only.

Start Application Group (STRAG) – The STRAG command will start replication processes for data groups that are part of an application group. This is the preferred method of starting replication in application groups. The command invokes a procedure which performs the operations to start replication for the participating data groups.

Start Data Group (STRDG) – The STRDG command will start replication processes for a data group and the remote journal link, if necessary. This command is the basis for all other methods of starting replication. Optionally, this command can specify a starting point in the journals, clear any pending or error entries, set object auditing levels, and start a subset of the replication processes.

What is started with the STRMMX command

The STRMMX command is shipped with default values that will start all MIMIX processes on all systems in the installation. Optionally, the command can be used to start MIMIX processes on only the local system. Processes are started in the following order:

MIMIX managers and services - All jobs for the system managers, journal managers, target journal inspection, and collector services are started on the specified systems. If you are using MIMIX with IBM i clustering, Cluster Services are started for all specified systems that are configured for clustering.

Data groups - For enabled data groups, starts the replication processes, remote journal links, and automatic recovery processes on the specified systems.
Each data group starts from the journal receiver in use when the data group ended and with the sequence number following the last sequence number processed.

Master monitor - Starts the master monitor on each of the specified systems.

Monitors - On each of the specified systems, the master monitor starts monitors that are not disabled and which are configured to start with the master monitor.

Application groups - If all systems are specified, all application groups and any associated data resource groups are started. If IBM i clustering is used, default processing will start the IBM application CRG.

Note: The STRMMX command does not start promoter group activity. Start promoter group activity using procedures in the Using MIMIX Promoter book.

STRMMX and ENDMMX messages

Once you have run the STRMMX or ENDMMX command, one of the following messages is displayed:
• Completion LVI0902 – This message indicates that all MIMIX products were started or ended successfully.
• Escape LVE0902 – This message indicates one or more MIMIX products failed to start or end.

What is started by the default START procedure for an application group

When an application group is created, a default procedure named START is created for it from a shipped default procedure. The Start Application Group (STRAG) command automatically uses the application group’s default START procedure unless you specify a different procedure. Steps in the shipped default START procedure are described in the MIMIX Administrator Reference book.

Choices when starting or ending an application group

For the purpose of describing their use, the Start Application Group (STRAG) and End Application Group (ENDAG) commands are quite similar. This topic describes their behavior for application groups that do not participate in a cluster controlled by the IBM i operating system (*NONCLU application groups).

What is the scope of the request?
The following parameters identify the scope of the requested operation:

Application group definition (AGDFN) - Specifies the requested application group. You can either specify a name or the value *ALL.

Resource groups (TYPE) - Specifies the types of resource groups to be processed for the requested application group.

Data resource group entry (DTARSCGRP) - Specifies the data resource groups to include in the request. The default is *ALL or you can specify a name. This parameter is ignored when TYPE is *ALL or *APP.

What is the requested behavior?

The following parameters, when available, define the expected behavior:

Current node roles (ROLE) - Only available on the STRAG command, this parameter is ignored for non-cluster application groups.

What procedure will be used?

The following parameters identify the procedure to use and its starting point:

Begin at step (STEP) - Specifies where the request will start within the specified procedure. This parameter is described in detail below.

Procedure (PROC) - Specifies the name of the procedure to run to perform the requested operation when starting from its first step. The value *DFT will use the procedure designated as the default for the application group. The value *LASTRUN uses the same procedure used for the previous run of the command. You can also specify the name of a procedure that is valid for the specified application group and type of request.

Where should the procedure begin?

The value specified for the Begin at step (STEP) parameter on the request to run the procedure determines the step at which the procedure will start. The status of the last run of the procedure determines which values are valid. The default value, *FIRST, will start the specified procedure at its first step.
This value can be used when the procedure has never been run, when its previous run completed (*COMPLETED or *COMPERR), or when a user acknowledged the status of its previous run which failed, was canceled, or completed with errors (*ACKFAILED, *ACKCANCEL, or *ACKERR respectively).

Other values are for resolving problems with a failed or canceled procedure. When a procedure fails or is canceled, subsequent attempts to run the same procedure will fail until user action is taken. You will need to determine the best course of action for your environment based on the implications of the canceled or failed steps and any steps which completed.

The value *RESUME will start the last run of the procedure beginning with the step at which it failed, the step that was canceled in response to an error, or the step following where the procedure was canceled. The value *RESUME may be appropriate after you have investigated and resolved the problem which caused the procedure to end. Optionally, if the problem cannot be resolved and you want to resume the procedure anyway, you can override the attributes of a step before resuming the procedure.

The value *OVERRIDE will override the status of all runs of the specified procedure that did not complete. The *FAILED or *CANCELED status of these runs is changed to acknowledged (*ACKFAILED or *ACKCANCEL) and a new run of the procedure begins at the first step.

For more information about starting a procedure with the step at which it failed, see “Resuming a procedure” on page 91.

What occurs when a data group is started

The Start Data Group (STRDG) command will start the replication processes for the specified data group. The STRDG command can be used interactively or programmatically. Default values for the command are used when it is invoked by the STRMMX command or by the STRAG command running the default START procedure.
When a STRDG request is processed, MIMIX may take a few minutes while it does the following for each specified data group:
• Determines whether the RJ link is active and whether all required system managers and journal managers on each system are started. If necessary, the managers and the remote journal function defined by the RJ link are started.
• Determines the starting point for replication (database, object, or both, as configured).
• Locates the starting point in the appropriate journal receiver. This will be the starting point for send processes.
• If necessary, changes the object audit level of existing objects identified for replication. This occurs when starting following a switch or a configuration change to any data group object, IFS, or DLO entry. This ensures that all replicated objects identified by all entries of each entry type are set with an object audit level suitable for replication. The processing order for data group entries can affect the auditing value of IFS objects. For examples and for information about manually specifying the audit level of objects, see the MIMIX Administrator Reference book.
• Submits the appropriate start requests for the processes specified on the start request.
• Makes configuration changes for the data group effective. If a configuration change affects the set of objects to be replicated, the start request also automatically deploys the configuration changes to an internal list used by other functions. This may cause the start request to take longer.
• Attempts to recover any existing access path maintenance(1) errors for the data group, if the Access path maintenance (APMNT) policy is enabled.
• If specified on the start request, clears all pending entries for apply processes and clears all error entries identified in replication processing for the data group. There are times when it is necessary to clear pending entries, error entries, or both, to establish a new synchronization point for the data group.
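The interaction between the Clear pending (CLRPND) and Clear error (CLRERR) values can be sketched in a few lines of Python. This is only an illustration of two rules stated in this chapter: CLRERR(*CLRPND) takes on the value chosen for CLRPND, and CLRPND(*YES) does not start a data group that has open commit cycles. The function names are hypothetical and are not part of MIMIX, the accepted values are limited to those shown in Table 34, and nothing here runs a MIMIX command.

```python
from typing import Tuple

# Values shown for each prompt in Table 34 (assumed to be the full set here).
CLRPND_VALUES = {"*NO", "*YES"}
CLRERR_VALUES = {"*NO", "*YES", "*CLRPND"}


def effective_clear_options(clrpnd: str, clrerr: str) -> Tuple[str, str]:
    """Resolve CLRERR(*CLRPND) to the value selected for CLRPND."""
    if clrpnd not in CLRPND_VALUES:
        raise ValueError(f"unexpected CLRPND value: {clrpnd}")
    if clrerr not in CLRERR_VALUES:
        raise ValueError(f"unexpected CLRERR value: {clrerr}")
    if clrerr == "*CLRPND":
        clrerr = clrpnd
    return clrpnd, clrerr


def start_blocked_by_open_commit(clrpnd: str, open_commit_cycles: bool) -> bool:
    """A clear pending start is refused while open commit cycles exist."""
    return clrpnd == "*YES" and open_commit_cycles


if __name__ == "__main__":
    print(effective_clear_options("*YES", "*CLRPND"))
    print(start_blocked_by_open_commit("*YES", open_commit_cycles=True))
```

A check like this can be useful in a script that prepares start requests, so that a clear pending start is not attempted before open commit cycles have been resolved.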
Starting a data group may take longer if the remote journal function is operating in catchup mode.

1. The access path maintenance function is available on installations running MIMIX 7.1.15.00 or higher. Access path maintenance replaces the parallel access path maintenance function available on installations running earlier software levels. On earlier software levels, a start data group request creates and activates the monitors used by the parallel access path maintenance function if the parallel access path maintenance (PRLAPMNT) policy is enabled.

Journal starting point identified on the STRDG request

On the STRDG command, you can optionally specify the point at which to start replication in the journal receivers. The parameters for database and object journal receivers and sequence numbers provide this capability. You may need to use these parameters when starting data groups for the first time.
• For user journal replication, the IBM i remote journal function controls where processing starts in the source journal receiver. The values specified for the Database journal receiver (DBJRNRCV) and Database large sequence number (DBSEQNBR2) identify the starting location for the database reader process and the database apply process.
• For system journal replication, the values specified for Object journal receiver (OBJJRNRCV) and Object large sequence number (OBJSEQNBR2) identify the starting location for the object send process and the object apply process.

Note: The parameters Database sequence number (DBSEQNBR) and Object sequence number (OBJSEQNBR) continue to be valid for journal definitions which specify *MAXOPT2 for the Receiver size option (RCVSIZEOPT) and for values that do not exceed 10 digits. To ensure continued compatibility, the use of parameters DBSEQNBR2 and OBJSEQNBR2 is recommended.
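As a minimal illustration of the note above, the following Python sketch assembles the journal starting-point keywords for a STRDG request. It always emits the large sequence number parameters (DBSEQNBR2 and OBJSEQNBR2), as recommended, and reports whether a value would still fit within the 10-digit limit stated for DBSEQNBR and OBJSEQNBR. The function names and the receiver name RCV0001 are hypothetical and are not part of MIMIX. The sketch only formats text; it does not run any command, and it does not address how a receiver name must be entered or qualified on the actual prompt.

```python
from typing import List, Optional

LEGACY_MAX_DIGITS = 10  # limit stated for DBSEQNBR and OBJSEQNBR


def fits_legacy_parameter(sequence_number: int) -> bool:
    """Return True when the value has no more than 10 digits."""
    if sequence_number < 0:
        raise ValueError("sequence number must not be negative")
    return len(str(sequence_number)) <= LEGACY_MAX_DIGITS


def starting_point_keywords(
    db_receiver: Optional[str] = None,
    db_sequence: Optional[int] = None,
    obj_receiver: Optional[str] = None,
    obj_sequence: Optional[int] = None,
) -> str:
    """Build only the starting-point portion of a STRDG request.

    Values left as None are omitted so the command defaults apply.
    """
    parts: List[str] = []
    if db_receiver is not None:
        parts.append(f"DBJRNRCV({db_receiver})")
    if db_sequence is not None:
        parts.append(f"DBSEQNBR2({db_sequence})")
    if obj_receiver is not None:
        parts.append(f"OBJJRNRCV({obj_receiver})")
    if obj_sequence is not None:
        parts.append(f"OBJSEQNBR2({obj_sequence})")
    return " ".join(parts)


if __name__ == "__main__":
    # RCV0001 is a placeholder receiver name used only for illustration.
    print(starting_point_keywords(db_receiver="RCV0001", db_sequence=12345678901))
    print(fits_legacy_parameter(12345678901))
```

Using the large sequence number keywords in every generated request avoids having to decide, per journal definition, whether the older 10-digit parameters would still be accepted.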
Journal starting point when the object send process is shared

When starting data groups that share an object send process, the first data group to start will start the shared job at that data group’s starting point in the system journal (QAUDJRN). As additional data groups start, each recognizes that the shared object send job is active. The object send job determines whether the starting point for that data group is earlier or later than the sequence number being read.

If the data group’s starting point is later, replication will begin when the shared job reaches the data group's starting point.

If the data group’s starting point is earlier, the shared job completes its current block of entries, then returns to the earliest point for any of the data groups being started. The shared job reads the earlier entries and routes the transactions to the data group being started. When the shared job reaches the last entry it read at the time of the STRDG request, it resumes routing transactions to all active data groups using the shared job.

If the starting data group has a significant object send backlog, the other data groups sharing the job will not receive transactions to replicate while the backlog for the starting data group is being addressed. Therefore, when a significant backlog exists, it is recommended that you change the data group configuration to use a dedicated job (*DGDFN for object send prefix), start the data group, and allow it to catch up to the current location of the shared job. Then end the data group, change its configuration to use the desired shared job, and restart the data group.

Clear pending and clear error processing

The Clear pending and Clear error prompts on the STRDG command provide flexibility when starting a data group by allowing you to optionally reset error status conditions on data group file entries and discard pending journal entries that are stored in the journal log space.
Clear pending resets the starting point for all data group file entries and object entries. Clear error clears the hold log spaces.

When clearing pending entries, you can optionally specify which system to use for determining database file network relationships when distributing files among database apply sessions. The System for DB file relations (DBRSYS) prompt identifies which system is used to assign data group file entries to apply sessions when the start request specifies to clear pending entries in all apply sessions.

Table 34 shows the processing that occurs based on the selection made for the Clear pending (CLRPND) and Clear error (CLRERR) prompts. The Clear pending and Clear error prompts work independently. For example, when CLRPND(*NO) is selected, no clear pending processing occurs.

Table 34. CLRPND and CLRERR processing

CLRPND: *NO   CLRERR: *NO
Processing description: Data groups start with regular processing:
• Data group file entry status remains unchanged.
• Hold logs remain unchanged.

CLRPND: *NO   CLRERR: *CLRPND
Processing description: The value selected for the CLRPND parameter is used for CLRERR. Same processing as CLRPND(*NO) CLRERR(*NO).

CLRPND: *NO   CLRERR: *YES
Processing description:
• Data group file entries in *HLDERR, *HLDRGZ, *HLDRNM, *HLDPRM, and *HLDRLTD status are cleared.
• Tracking entries in *HLDERR status are cleared.
• Hold log space is deleted.
Notes: See File entry states. See Log spaces.

CLRPND: *YES   CLRERR: *NO
Processing description:
Note: CLRPND(*YES) will not start a data group when there are open commit cycles on files defined to the data group.
• Data group file entries in *HLDRGZ, *HLDRNM, and *HLDPRM status are cleared and reset to active.
• Data group tracking entries in *HLDRNM are cleared and reset to active.
• Data group file entries and tracking entries in *HLDERR status remain unchanged.
• If there is a requested status at the time of starting, it is cleared.
• Journal, hold, tracking entry hold, and apply history log spaces are deleted.
• The apply session to which data group file entries are assigned may change.
Notes: See File entry apply session assignment. See Single apply session processing. See Log spaces.

CLRPND: *YES   CLRERR: *YES
Processing description:
Note: CLRPND(*YES) will not start a data group when there are open commit cycles on files defined to the data group.
• Data group file entries in *HLDERR, *HLDRGZ, *HLDRNM, *HLDPRM, and *HLDRLTD status are cleared and reset to active.
• Data group file entries in *HLDRTY status remain unchanged.
• Data group object activity entries in any failed or active status are changed to CC (Completed by clear request).
• Tracking entries in *HLDERR and *HLDRNM status are cleared and reset to active.
• Tracking entries are primed if any configuration changes occurred for data group object entries or data group IFS entries.
• If there is a requested status at the time of starting, it is cleared.
• Journal, hold, and apply history log spaces are deleted.
• The apply session to which data group file entries are assigned may change.
Notes: See File entry states. See File entry apply session assignment. See Single apply session processing. See Log spaces.

CLRPND: *YES   CLRERR: *CLRPND
Processing description:
Note: CLRPND(*YES) will not start a data group when there are open commit cycles on files defined to the data group.
The value selected for the CLRPND parameter is used for CLRERR. Same processing as CLRPND(*YES) CLRERR(*YES).
Notes: See File entry states. See File entry apply session assignment. See Single apply session processing. See Log spaces.

File entry states: Files in specific states will not reset to active when you specify *YES on the Clear Error prompt. If you have set data group file entries to any of these states, the following process exception applies:
Note: The only states that can be set using the Set Data Group File Entry (SETDGFE) command are *HLD, *RLSWAIT, *ACTIVE, and *HLDIGN. All other states are the result of internal processing.
• *HLD - Journal entries cached before *YES is specified are discarded. If *ALL or *ALLSRC is specified on the Start processes prompt, all subsequent entries from the specified starting point will be cached again.
• *RLSWAIT - Journal entries are discarded as they wait for the synchronization point to arrive in the journal stream. This occurs regardless of the value specified for Clear Error or Clear Pending.
• *HLDIGN - Journal entries are discarded until the file status is changed to something else.
• *HLDSYNC - Journal entries are ignored since an external process is actively synchronizing the file. When that event completes normally, the file is set to *RLSWAIT.

File entry apply session assignment: Clear pending processing attempts to load balance the data group file entries among the defined apply sessions. If the requested apply session in the data group file entry definition is *ANY, or if it is *DGDFT and the requested apply session for the data group definition is *ANY, then the apply session to which the data group file entry is assigned may be changed when processing occurs. For data groups configured to replicate through the user journal, the requested apply session may be ignored to ensure that related files are handled by the same apply session. The value specified for System for DB file relations (DBRSYS) determines the system used to determine database file relationships while assigning files to apply sessions. This parameter is evaluated only when the start request specifies to clear pending entries for all database apply sessions. The default value, *TGT, uses the target system to determine the file relationships.

Single apply session processing: In most situations, you will perform clear pending processing on all apply sessions belonging to a data group by specifying *ALL or *DBALL on the Start processes (PRC) prompt. MIMIX also supports the ability to perform clear pending processing on a single apply session, which is useful for recovery purposes in certain error situations. The System for DB file relations (DBRSYS) parameter is ignored when the start request specifies a specific apply session. To perform clear pending processing on a single apply session, specify PRC(*DBAPY) and the specific apply session (APYSSN).

Log spaces: Because they have not been applied, journal entries that exist in the journal log space are considered pending. Journal entries that exist in the hold log space, however, are considered in error. The Clear pending and Clear error prompts affect which log spaces are deleted (and recreated) when a data group is started.

Starting MIMIX

To start all MIMIX products within an installation library, do the following:
1. If you are starting MIMIX for the first time or starting MIMIX after a system IPL, do the following:
a. Use the command WRKSBSJOB SBS(MIMIXSBS) to verify that the MIMIX subsystem is running. If the MIMIXSBS is not already active, start the subsystem using the STRSBS SBSD(MIMIXQGPL/MIMIXSBS) command.
b. If MIMIX uses TCP/IP for system communication, the TCP/IP servers must be running. If TCP/IP is not already active, start TCP/IP using the port number defined in the transfer definitions and the procedures described in “Starting the TCP/IP server” on page 260.
2. Do one of the following:
• From the MIMIX Basic Main Menu, select option 2 (Start MIMIX) and press Enter.
• From a command line type STRMMX and press Enter.
3. The Start MIMIX (STRMMX) display appears. Accept the default value for the System definition prompt and press Enter.
4. If you see a confirmation display, press Enter to start MIMIX.

Starting an application group

For an application group, a procedure for only one operation (start, end, or switch) can run at a time.
For information about parameters and shipped procedures, see “What is started by the default START procedure for an application group” on page 172 and “Choices when starting or ending an application group” on page 172.

To start an application group, do the following:
1. From the Work with Application Groups display, type 9 (Start) next to the application group you want and press F4 (Prompt).
2. Verify that the values you want are specified for Resource groups and Data resource group entry.
3. If you are starting after addressing problems with the previous start request, specify the value you want for Begin at step. Be certain that you understand the effect the value you specify will have on your environment.
4. Press Enter.
5. The Procedure prompt appears. Do one of the following:
• To use the default start procedure, press Enter.
• To use a different start procedure for the application group, specify its name. Then press Enter.

Starting selected data group processes

This procedure can be used to do any of the following:
• Start all or selected processes for a data group, or start a specific database apply process
• Specify a starting point for journal receivers when starting a data group
• Clear pending and error entries when starting a data group

Data groups that are in an application group: The preferred method of starting data groups that are part of an application group is to use the Start Application Group (STRAG) command. Beginning with service pack 7.1.06.00, the default behavior of the STRDG command helps to enforce this best practice when necessary by not allowing the command to run when the data group is participating in a resource group with three or more nodes. (A data resource group provides the association between one or more data groups and an application group.) The STRDG request will run when the data group is participating in a resource group with two nodes.
In earlier software levels, default behavior does not allow a start request when the data group is part of an application group.

In application group environments with three or more nodes, it is particularly important to treat all members of an application group as one entity. For example, a configuration change that is made effective by starting and ending a single data group would not be propagated to the other data groups in the same resource group. However, the same change would be propagated to the other data groups if it is made effective by ending and starting the parent application group.

When to clear pending entries and entries in error: Table 35 identifies when it is necessary to clear pending entries for apply processes and clear logs of entries indicating files in error to establish a new synchronization point when starting a data group. The reason for starting the data group determines whether you need to clear only pending entries for transactions waiting to be applied, clear only errors, or both.

Before clearing pending entries, determine if there are any file entries on hold. These are the transactions that will be lost by clearing pending entries.

When clearing pending entries, most environments can accept the default value for the System for DB file relations prompt. If necessary, you can specify a value when directed to by your MIMIX administrator.

Table 35. When to clear pending entries and entries in error when starting a data group
If starting the data group in any of these conditions:
Specify these values on the STRDG command:
After enabling a previously disabled data group
Clear pending entries.
Specify *YES for the Clear pending prompt
After changing the Number of DB apply sessions (NBRDBAPY) parameter on the data group definition
After synchronizing database files and objects between two systems
Note: This assumes that you have synchronized the objects and database files and have changed the journal receivers using TYPE(*ALL) on the CHGDGRCV command.
After switching the direction of the data group, when starting replication on the system that now becomes the source system
Clear pending entries and entries in error.
• Specify *YES for the Clear pending prompt
• Specify *CLRPND or *YES for the Clear error prompt

For additional information about the STRDG command, refer to the following topics:
• “What occurs when a data group is started” on page 174
• “What replication processes are started by the STRDG command” on page 199

To start a data group, do the following:
1. From the Work with Data Groups display, type a 9 (Start DG) next to the data group that you want to start and press Enter. The Start Data Group (STRDG) display appears.
2. At the Start processes prompt, specify the value for the processes you want to start. If you are starting the data group for the first time, specify *ALL. To see a list of values, press F4 (Prompt).
3. Press Enter.
4. Additional prompts appear. For most situations, you should accept the default values. If necessary, specify the following:
• At the Database journal receiver and Database large sequence number prompts, identify where the database reader and apply processes begin.
• At the Object journal receiver and Object large sequence number prompts, identify where the object send and apply processes begin.
• If you are starting the data group for any of the reasons listed in Table 35, specify the indicated values for that reason in the Clear pending and Clear error prompts.
• If you are submitting this command for batch processing, you should specify *NO for the Show confirmation screen prompt.
5. To start the data group, press Enter.

Starting replication when open commit cycles exist

Open commit cycles may be present when a data group ends, or if a system event or failure occurred. In most conditions, an open commit cycle present at the time that a data group ended will not prevent a request to start replication from running. However, MIMIX will prevent the data group from starting when either of these conditions exists:
• When the start request specifies to clear pending entries. Certain procedures may require a clear pending start. Message LVE387F is issued with reason code AP.
• When the commit mode specified for the database apply process changed. Changing the commit mode is not a common occurrence. Message LVEC0B3 is issued.
When these conditions exist, the open commit cycles must be resolved.

Checking for open commit cycles

Do the following to check for open commit cycles:
1. From the MIMIX Basic Main Menu, type a 6 (Work with data groups) and press Enter.
2. The Work with Data Groups display appears. Type an 8 (Display status) next to the data group you ended and press Enter.
3. Press F8 (Database) to view the Data Group Detail Status display.
4. For each apply session listed, check the value shown in the Open Commit column at the right side of the display. If the value is *YES, open commit cycles exist for the data group.

Resolving open commit cycles

This procedure assumes that the data group is ended and that you have confirmed the presence of open commit cycles.
1. Start the data group, specifying *NO for the Clear pending prompt.
2. You must take action to resolve the open commit cycles, such as ending or quiescing the application or closing the commit cycle. MIMIX will process the open commit cycles when they are resolved.
3. Perform a controlled end of the data group.
4. When the data group is ended, check for open commit cycles again.
You may need to repeat this procedure until all open commit cycles have been resolved.

Before ending replication

Consider the following:
• If the next time you start the data groups requires that you clear pending entries, or if you will be performing a switch, you should verify that no activity is still in progress before you perform these activities. Use the command WRKDGACTE STATUS(*ACTIVE) to ensure all activity entries completed.
• Data groups that are in a disabled state are not ended. Only data groups that have been enabled and have been started can be ended.

Commands for ending replication

These commands end replication processes. The significant differences between these commands are:
• End MIMIX (ENDMMX) - The ENDMMX command will end all MIMIX processes in a MIMIX installation, including those used for replication, in a single operation. Optionally, this command can be used to end all MIMIX processes on the local system only.
• End Application Group (ENDAG) - The ENDAG command will end replication processes for data groups that are part of an application group. This is the preferred method of ending replication in application groups. The command invokes a procedure which performs the operations to end replication for the participating data groups.
• End Data Group (ENDDG) - The ENDDG command will end the specified replication processes for the data group either immediately or in a controlled manner. This command is the basis for all other methods of ending replication, and is also called by commands that perform switch operations. Optionally, this command can end a subset of replication processes or a selected database apply process, specify a wait time and end option for controlled ends, and end the remote journal link.

Command choice by reason for ending replication

Table 36 lists common reasons for ending MIMIX activity and the appropriate command to use.
Depending on why you are ending replication, you may need to choose values other than the defaults.

Table 36. Choosing the appropriate command to end replication

| Reason for ending replication | Use command | Additional information |
|---|---|---|
| Ending communications for any reason | ENDMMX | |
| Performing a full save and restore of data that is defined to MIMIX | ENDMMX | |
| Performing a save from the source system | ENDAG or ENDDG | When application groups are used, use the ENDAG command with its default END procedure. See “What is ended by the default END procedure for an application group” on page 189. For ENDDG, specify *ALL for the Process (PRC) parameter. See “What replication processes are ended by the ENDDG command” on page 203. The save request may not be able to save all the files or objects if they are opened or locked by MIMIX. |
| Performing a save from the target system | If using step programs and procedures, run ENDTGT or ENDDG PRC(*ALLTGT) | See “Ending all or selected processes” on page 187. You may be able to end only selected processes on the target system. See “Ending selected data group processes” on page 198. The save request may not be able to save all the files or objects if they are opened or locked by MIMIX. |
| Preparing to update MIMIX software | ENDMMX | See controlled end information in “Ending immediately or controlled” on page 186. |
| Performing an IPL of either system | ENDMMX | Also end the RJ link. |
| Upgrading the operating system release on either system | ENDMMX | Also end the RJ link. |
| Performing a switch in preparation for performing maintenance on either system | --- | Let your switching mechanism end replication (switch procedure for application group, MIMIX Switch Assistant, or MIMIX Model Switch Framework). |
| Ending only a selected replication process | ENDDG | See “Ending selected data group processes” on page 198. |
| Changing configuration, such as adding or changing data group entries | ENDAG or ENDDG | When application groups are used, use the ENDAG command. The changes are not available to active replication processes until the data group processes are ended and restarted. |

Additional considerations when ending replication

The following questions will help you determine additional options you may need when ending replication. All methods of ending replication can accomplish these activities, but in some, the action is not the default or may require additional programming.

• Do processes need to end in a controlled manner or can they be ended immediately? The ENDMMX and ENDDG commands both support these options directly; for ENDAG, the end procedure determines how processes end. For more information, see “Ending immediately or controlled” on page 186.
• Do you need to end only a subset of the replication processes? Only ENDDG supports ending selected processes. For more information, see “Ending all or selected processes” on page 187.
• Does the RJ link also need to end? For data groups that use remote journaling, you may also choose whether to end the RJ link. In most cases, the RJ link can remain active. For more information, see “When to end the RJ link” on page 188.

Ending immediately or controlled

Both the ENDMMX and ENDDG commands provide the ability to choose whether replication processes end immediately or in a controlled manner through the End process (ENDOPT) parameter. For the ENDAG command, the specified end procedure determines whether replication processes end immediately or in a controlled manner. If the procedure specifies a controlled end, the procedure also determines wait time and time out options.

When you perform an immediate end, the processes end independently of each other. For example, it is possible for the apply process to end before the send or receive process. Each replication process verifies that its processing is at a point that will permit ending, then ends.
The amount of time it takes for an immediate end varies depending on the delay values set for each manager and what each process is doing at the time. An immediate end does not ensure that all journal entries generated are sent to or applied on the target system. If an incomplete IFS or object tracking entry for a data group is being processed during an immediate end, the entire entry may not be applied. When the data group is restarted, the entire incomplete entry is rewritten to ensure the integrity of the object.

When you perform a controlled end, MIMIX creates either a journal entry or log space entry. This entry proceeds through the replication path. The date and time of the entry are compared to the date and time of when the process being considered was started. If the entry is earlier than the process start time, the end request is ignored. If the entry is later than when the process being considered was started, the process is ended.

A controlled end ensures that processes end in order and that each process completes any queued or in-progress transactions before the next process is permitted to end. This ensures that you have a known point in each journal at which you can restart replication.

If any processes have a backlog of entries, it may take some time for the entry created by the request to be processed through the replication path. Any entries that precede the entry requesting to end are processed first.

A data group that is ended in a controlled manner is prepared for a more effective and safer start when the start request specifies to clear pending entries.

The existence of commit cycles implies that there is application activity on the source system that should not be interrupted; replication should be allowed to continue through the end of the commit cycle. It is preferable to ensure that commit cycles are resolved or removed before ending a data group. There are conditions in which a data group will not start if open commit cycles exist. For more information, see “Starting replication when open commit cycles exist” on page 183.

If the request to perform a controlled end also includes ending the RJ link, the RJ link is ended after all requested processes end.

Either type of end request may be ignored if the request is submitted just before the time that MIMIX jobs are restarted daily. For more information about restarting jobs, see “Configuring restart times for MIMIX jobs” in the MIMIX Administrator Reference book.

Controlling how long to wait for a controlled end to complete

On the ENDMMX or ENDDG command, when you request a controlled end you can determine how long to wait for all specified data group processes to end.

The Wait time (seconds) (WAIT) parameter specifies how long to wait for all of the specified data group processes to end. MIMIX will attempt to resolve all pending activity entries before ending the data groups. If a numeric value was specified, and the selected processes do not end within the specified time, the action specified for the Timeout option (TIMOUTOPT) will occur.

The WAIT parameter also supports special values of *SBMRQS and *NOMAX. When these values are used, the TIMOUTOPT parameter is ignored.

Note: If *ALL is specified for any part of the data group definition, the Wait time value must be *SBMRQS (submit request).

Ending all or selected processes

MIMIX determines which data group replication processes to end based on the command specified and options on the command.

The ENDMMX command ends all replication processes for all data groups on the systems specified on the end request. The default END procedure for the ENDAG command uses the default settings of the ENDDG command. MIMIX also ships an ENDTGT procedure that, when specified on the ENDAG command, will end only processes on the target system.
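The controlled-end rules described under “Ending immediately or controlled” and “Controlling how long to wait for a controlled end to complete” can be illustrated with a small model. The following Python sketch is not MIMIX code; the function names, the tuple used for the data group definition, and the returned dictionary are hypothetical choices made only for illustration. The parameter values (*SBMRQS, *NOMAX, *QUIT, *ENDIMMED, *NOTIFY) are the ones named in this book. The text does not say what happens when the end entry and the process start carry identical timestamps, so the sketch arbitrarily treats that case as ignored.

```python
from datetime import datetime

# Special WAIT values and TIMOUTOPT values named in this book.
SPECIAL_WAITS = {"*SBMRQS", "*NOMAX"}
TIMEOUT_OPTIONS = {"*QUIT", "*ENDIMMED", "*NOTIFY"}


def process_honors_end_entry(entry_time: datetime, process_start: datetime) -> bool:
    """Model of the controlled-end check: a process ends only when the
    end entry is later than the time the process was started.
    Equal timestamps are treated as ignored (an assumption)."""
    return entry_time > process_start


def plan_controlled_end(wait, timeout_option, dg_definition):
    """Return the effective wait and timeout action for a controlled end.

    dg_definition is a hypothetical 3-part tuple (name, system1, system2).
    """
    # Rule from the Note: *ALL anywhere in the definition requires *SBMRQS.
    if "*ALL" in dg_definition and wait != "*SBMRQS":
        raise ValueError("WAIT must be *SBMRQS when *ALL is specified")

    # Special values: the TIMOUTOPT parameter is ignored.
    if wait in SPECIAL_WAITS:
        return {"wait": wait, "timeout_action": None}

    # Numeric wait: the timeout option applies if the time is reached.
    if not isinstance(wait, int):
        raise ValueError("WAIT must be a number of seconds or a special value")
    if timeout_option not in TIMEOUT_OPTIONS:
        raise ValueError("unknown timeout option")
    return {"wait": wait, "timeout_action": timeout_option}
```

For example, `plan_controlled_end(300, "*ENDIMMED", ("DG1", "SYSA", "SYSB"))` models a request that waits up to 300 seconds and then falls back to an immediate end, while the same call with "*NOMAX" reports no timeout action because the timeout option is ignored.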
Only the ENDDG command supports the ability to end selected replication processes through its Process (PRC) parameter. The default value is to end all replication processes for the specified data groups. The configuration of each data group determines which processes end with each possible value for the PRC parameter. If you choose to use this parameter, be sure that you understand what processes will end. See “What replication processes are ended by the ENDDG command” on page 203.

When to end the RJ link

The RJ link remains active unless you change the value of the End remote journaling (ENDRJLNK) parameter on the ENDMMX command or the ENDDG command. The RJ link can normally remain active unless you have a need to prevent data from being sent to the target system. Some situations where you need to end the RJ link include:

• Following a switch, to prevent data from returning to the system on which it originated (round-tripping), and to reduce communications and DASD usage
• Before performing an IPL on either the source system or target system
• Before upgrading the IBM i release on either the source system or the target system
• Before performing a hardware upgrade

The default END procedure for the ENDAG command uses the default values for the ENDDG command. MIMIX also ships a step program, MXENDRJLNK, that can be added into the END procedure if necessary.

What is ended by the ENDMMX command

The ENDMMX command will end all MIMIX processes needed for replication on the specified systems in the installation. If you are using application groups, the application group is not specifically ended, and the associated end procedure will not be run. Any processes for user applications or IBM cluster resource groups must be ended separately.

When you use this command, the following occurs:

Data groups - The end process specified is used to end all enabled data groups and their supporting processes, including automatic recovery, on the specified systems. This includes data groups associated with data resource groups. Default values end data groups in a controlled manner.

Remote journal links - If you selected to end remote journaling, all remote journal links associated with the specified systems are ended.

MIMIX managers and services - Ends the system managers, journal managers, target journal inspection, and collector services on the specified systems.

Monitors - Ends all individual monitors currently active in the installation library on the specified systems.

Master monitor - Ends the master monitor on each of the specified systems.

MIMIX Promoter - Ends promoter group activity on the specified systems.

Audits and Recoveries - All queued audits, all audits in progress, and all recoveries in progress that are associated with the specified systems are ended. This includes jobs with locks on the installation library. Queued audits are set to *NOTRUN and audits in comparison phase are set to *FAILED. Audits in recovery phase reflect their state of processing at the time of the end request, which may be *NOTRCVD.

Note: Cluster services are not ended when MIMIX managers end because cluster services may be necessary for other applications.

What is ended by the default END procedure for an application group

When an application group is created, a default procedure named END is created for it from a shipped default procedure. The End Application Group (ENDAG) command automatically uses the application group’s default END procedure unless you specify a different procedure.

Steps in the shipped default END procedure, as well as steps in additional shipped procedures that end application groups, are described in the MIMIX Administrator Reference book.
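The “Audits and Recoveries” item under “What is ended by the ENDMMX command” describes how audit status is set when ENDMMX runs. As a reading aid only, that rule can be written as a small lookup. This Python sketch is not part of MIMIX; the phase labels ("queued", "comparison", "recovery") and the function name are hypothetical, while the status values *NOTRUN, *FAILED, and *NOTRCVD are the ones given in the text.

```python
def audit_status_after_endmmx(phase, status_at_end_request=None):
    """Model the audit status that results from an ENDMMX request.

    phase is a hypothetical label for where the audit was when the
    end request arrived: "queued", "comparison", or "recovery".
    """
    if phase == "queued":
        # Queued audits are set to *NOTRUN.
        return "*NOTRUN"
    if phase == "comparison":
        # Audits in comparison phase are set to *FAILED.
        return "*FAILED"
    if phase == "recovery":
        # Audits in recovery phase keep whatever state they had
        # reached at the time of the end request (may be *NOTRCVD).
        return status_at_end_request
    raise ValueError("unknown audit phase: %r" % (phase,))
```

Calling `audit_status_after_endmmx("recovery", "*NOTRCVD")` shows the one case where the result depends on how far processing had progressed rather than on a fixed value.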
What occurs when a data group is ended

The End Data Group (ENDDG) command will end replication processes for the specified data group. The ENDDG command can be used interactively or programmatically. This command is invoked by the ENDMMX command and by the ENDAG command running the default END procedure, using values other than default for some parameters.

When an ENDDG request is processed, MIMIX may take a few minutes while it does the following for each specified data group:

• Determines which data group replication processes to end based on the value you specify for the Process (PRC) parameter. The default value ends all MIMIX replication processes.
• When ending data groups that use a shared object send job, the job is ended by the last data group to end.
• When ending data groups that perform access path maintenance [1], the database apply process signals the access path maintenance job and then ends. The access path maintenance job uses additional jobs, if needed, to change the access path maintenance attribute to immediate on all files that MIMIX had previously changed to delayed. Any files that could not be changed are identified as having an access path maintenance error before the maintenance jobs end.
• Ends the specified replication processes in the manner specified for the End process (ENDOPT) parameter. The command defaults to processing the end request immediately (*IMMED). When invoked by the ENDMMX command, the default value specified on ENDMMX is *CNTRLD, which takes precedence. When invoked by a procedure specified on the ENDAG command, the procedure determines whether ENDDG is passed parameter values or uses the command defaults.
• Uses the specified Wait time and Timeout options if a controlled end is requested.
• If requested, ends the RJ link. The RJ link is not automatically ended. In most cases, the default value *NO for the End remote journaling (ENDRJLNK) parameter is appropriate. Keeping the RJ link active allows database changes to continue to be sent to the target system even though the data group is not active.
• If you have used the MIMIX CDP feature to set a recovery point in a data group and then end the data group, the recovery point will be cleared. When the data group is started again, the apply processes will process any available transactions, including those which may have had corruptions. (Recovery points are set with the Set DG Recovery Point (SETDGRCYP) command.) If a recovery window is configured for the data group, its configured duration is not affected by requests to end or start the data group.
• On installations running software earlier than 7.1.15.00, if the parallel access path maintenance function has been enabled, the End parallel AP maintenance (PRLAPMNT) parameter [2] determines whether MIMIX will end the monitors used by this function when the data group ends. The default value, *DFT, will end the monitors when the value specified for Processes (PRC) includes database processes that run on the target system (*ALL, *ALLTGT, *DBALL, *DBTGT, or *DBAPY) and the value *ALL is specified for the Apply session (APYSSN) parameter.

[1] The access path maintenance function is available on installations running MIMIX 7.1.15.00 or higher and is the replacement for the parallel access path maintenance function in earlier software levels.
[2] This parameter is not available on installations running MIMIX 7.1.15.00 or higher.

The ENDDG command does not end the system manager, journal manager, or other processes that run at the node level. To end those processes, either use the ENDMMX command or use the End MIMIX Managers (ENDMMXMGR) command after replication processes have ended.

Ending MIMIX

For most configurations, it is recommended that you end MIMIX products from the management system, which is usually the backup system.
If your installation is configured so that the backup system is a network system, you should end MIMIX from the network system.

Notes:

• If you are ending MIMIX for a software upgrade or to install a service pack, use the procedures in the software’s ReadMe document.
• The ENDMMX command cannot run when application groups are configured and there are any active, failed, or canceled procedures.

To end MIMIX, use the following procedures:

1. Use one of the following procedures:
• “Ending with default values” on page 192
• “Ending by prompting the ENDMMX command” on page 192
2. Complete any needed follow-up actions using the information and procedures in “After you end MIMIX products” on page 193.

Ending with default values

Use this procedure to end all MIMIX products in an installation library.

1. From the MIMIX Basic Main Menu, select option 3 (End MIMIX) and press Enter. You will see a confirmation display.
2. From the confirmation display, you can press F1 (Help) to see a description of the default values that will be used. To end MIMIX, press Enter.

Ending by prompting the ENDMMX command

To end all MIMIX processes for the specified systems within an installation library, do the following:

1. From a command line, type ENDMMX and press F4 (Prompt).
2. The End MIMIX display appears. At the End process prompt, specify *CNTRLD for a controlled end or *IMMED for an immediate end. This parameter applies to the application group (ENDAG) and data group (ENDDG) processes only.
Note: When ENDMMX ends data groups, it waits for each data group to end before attempting to end the next MIMIX product.
3. At the End remote journaling prompt, specify whether you want to end remote journaling.
Note: If you specify *YES, all data groups using the remote journal link in the installation library will be affected. If other data groups are using the same remote journal link, you should specify *NO.
4. If you specified *CNTRLD in Step 2, ensure that the values for the Wait time (seconds) and Timeout option prompts are what you want for the controlled end.
5. At the System definition prompt, indicate the scope of the request by specifying either *ALL or *LOCAL. This determines the systems on which to end MIMIX processes.
6. To end MIMIX processes, press Enter.

After you end MIMIX products

Some pending transactions may not be handled before the end process completes. You may need to ensure that all activity entries are complete before you issue additional commands. Examples of scenarios where it is important to check whether all pending transactions are completed include:

• Switching a data group (SWTDG command)
• Starting a data group with clear pending entries (STRDG CLRPND(*YES))

To check for active entries, use the command WRKDGACTE STATUS(*ACTIVE).

When to also end the MIMIX subsystem - You will also need to end the MIMIX subsystem when you need to IPL the system, when upgrading MIMIX software, and when installing a MIMIX software service pack. The MIMIX subsystem must be ended from the 5250 emulator. To end the subsystem, do the following:

1. If you use MIMIX Availability Manager to monitor earlier releases of MIMIX, do the following:
a. Ensure that all users have logged out of MIMIX Availability Manager.
b. From the 5250 emulator, enter LAKEVIEW/ENDMMXAM.
2. Enter the command WRKSBS. The Work with Subsystems display appears.
3. Type an 8 (Work with subsystem jobs) next to subsystem MIMIXSBS and press Enter.
4. End any remaining jobs in a controlled manner. Type a 4 (End) next to the job and press F4 (Prompt). The How to end (OPTION) parameter should have a value of *CNTRLD. Press Enter. If you see a confirmation display, press Enter to continue.
5. Press F12 (Cancel) to return to the Work with Subsystems display.
6. Type a 4 (End subsystem) next to subsystem MIMIXSBS and press Enter.
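One follow-up action named in “After you end MIMIX products” is a start with clear pending entries. Whether such a start will be refused depends on the rule in “Starting replication when open commit cycles exist”. The following Python sketch restates that rule as a check you could reason through before issuing the start request. It is an illustration only: the function and argument names are hypothetical, and returning both messages when both conditions hold is an assumption, since the book does not say which message is issued in that case.

```python
def start_blockers(open_commit_cycles, clear_pending, commit_mode_changed):
    """Return the messages this book associates with a refused start.

    An empty list means neither documented blocking condition applies.
    """
    blockers = []
    if not open_commit_cycles:
        # Without open commit cycles, neither condition prevents the start.
        return blockers
    if clear_pending:
        # Start request specifies to clear pending entries.
        blockers.append("LVE387F (reason code AP)")
    if commit_mode_changed:
        # Commit mode for the database apply process changed.
        blockers.append("LVEC0B3")
    return blockers
```

A non-empty result means the open commit cycles must be resolved first, using the procedure in “Resolving open commit cycles”.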
Ending an application group

For an application group, a procedure for only one operation (start, end, or switch) can run at a time. For information about parameters and shipped procedures, see “What is ended by the default END procedure for an application group” on page 189 and “Choices when starting or ending an application group” on page 172.

To end an application group, do the following:

1. From the Work with Application Groups display, type 10 (End) next to the application group you want and press F4 (Prompt).
2. Verify that the values you want are specified for Resource groups and Data resource group entry.
3. If you are starting the procedure after addressing problems with the previous end request, specify the value you want for Begin at step. Be certain that you understand the effect the value you specify will have on your environment.
4. Press Enter.
5. The Procedure prompt appears. Do one of the following:
• To use the default end procedure, press Enter.
• To use a different end procedure for the application group, specify its name. Then press Enter.

Ending a data group in a controlled manner

The following procedures describe how to check for errors before requesting a controlled end of a data group, how to perform the controlled end request, and how to confirm that the end completed. Held files must be released and the apply process must complete operations for journal entries stored in log spaces before you end data group activity.

Data groups that are in an application group: The preferred method of ending data groups that are part of an application group is to use the End Application Group (ENDAG) command.

Preparing for a controlled end of a data group

It is good practice to ensure that errors are resolved before requesting a controlled end of a data group. Do the following:

1. From the Work with Data Groups display, type an 8 (Display status) next to the data group you want to end and press Enter.
2. The Data Group Status display appears. In the upper right of the display, you should see either one or both of the following fields. A non-zero value in these fields will not prevent the end request from completing.
• Database errors identifies the number of items replicated through the user (database) journal that have a status of *HLDERR. This number should be 0 before you end the data group.
• Object in error/active identifies two key statistics associated with objects replicated through the system journal. The first number identifies the number of objects that have a status of *FAILED and the second number identifies the number of objects with active (pending) activity entries. Both numbers should be 0 before you end the data group.
Note: Only information for the type of information replicated by the data group appears on the status displays. For example, if the data group does not contain database files, you will only see fields for object information.
3. For data groups which replicate from the user journal, you also need to check for any files that are held for other reasons. Press F8 (Database). The Held for other reasons field in the upper right of the Data Group Database Status display should also be 0 before you end the data group. A non-zero value may or may not prevent the end request from completing. For more information, see “Working with files needing attention (replication and access path errors)” on page 210.

Performing the controlled end

1. From the Work with Data Groups display, type a 10 (End DG) next to the data group you want to end and press Enter.
2. The End Data Group (ENDDG) display appears. Specify *CNTRLD for the End processes prompt.
3. If the data group uses remote journaling, verify that the value of the End remote journaling prompt is what you want.
4. Because you specified *CNTRLD in Step 2, you can also use the Wait Time (WAIT) parameter to specify how long MIMIX should try to end the selected processes in a controlled manner. Use F1 (Help) to see additional information about the possible options.
• Specify *SBMRQS to submit a request to end the data groups. The appropriate actions are issued to end the specified processes and control is returned to the caller immediately. When you specify this value, the TIMOUTOPT parameter (Step 5) is ignored.
• Specify *NOMAX. When you specify this value, MIMIX will wait until all specified MIMIX processes are ended.
• Specify a numeric value (number-of-seconds). MIMIX waits the specified time for a controlled end to complete before using the option specified in the TIMOUTOPT parameter.
5. If you specified a numeric value for the WAIT parameter in Step 4, you can also use the Timeout Option (TIMOUTOPT) parameter. You can specify what action you want the ENDDG command to perform if the time specified in the WAIT parameter is reached:
• The current process should quit and return control to the caller (*QUIT).
• A new request should be issued to end all processes immediately (*ENDIMMED). When this value is specified, pending activity entries may still exist after the data group processes are ended.
• An inquiry message should be sent to the operator notifying of a possible error condition (*NOTIFY). If you specify this value, the command must be run from the target system.
6. Press Enter to process the command.

Confirming the end request completed without problems

After you request a controlled end of a data group, the Work with Data Groups display appears. Do the following:

1. From the Work with Data Groups display, type an 8 (Display status) next to the data group you ended and press Enter.
2. The Data Group Status display appears. In the Target Statistics section near the middle of the display, the Unprocessed Entry Count column should be blank for any database apply processes and any object apply processes. If unprocessed entries exist when you end the data group and perform a switch, you may lose these entries when the data group is started following the switch.
Note: To ensure that you are aware of any possible pending or delayed activity entries, enter the WRKDGACTE STATUS(*ACTIVE) command. Any activities that are still in progress will be listed. Ensure that all activities are completed.
3. Ensure that there are no open commit cycles. The next attempt to start the data group will fail if open commit cycles exist and either the start request specified to clear pending entries (CLRPND(*YES)) or the commit mode specified in the data group definition changed. (Certain processes, such as performing a hardware upgrade with a disk image change, converting to MIMIX Dynamic Apply, or enabling a disabled data group, require a clear pending start.) To verify commit cycles, do the following:
a. Press F8 (Database) to view the Data Group Detail Status display.
b. For each apply session listed, verify that the value shown in the Open Commit column at the right side of the display is *NO.
c. If open commit cycles exist, restart the data group. You must take action to resolve the open commit cycles, such as ending or quiescing the application or closing the commit cycle. Then repeat the controlled end.

Ending selected data group processes

This procedure can be used to end all or selected processes for a data group, or end a specific database apply process.

Data groups that are in an application group: The preferred method of ending data groups that are part of an application group is to use the End Application Group (ENDAG) command. Beginning with service pack 7.1.06.00, the default behavior of the ENDDG command helps to enforce this best practice when necessary by not allowing the command to run when the data group is participating in a resource group with three or more nodes. (A data resource group provides the association between one or more data groups and an application group.) The ENDDG request will run when the data group is participating in a resource group with two nodes. In earlier software levels, default behavior does not allow an end request when the data group is part of an application group.

In application group environments with three or more nodes, it is particularly important to treat all members of an application group as one entity. For example, a configuration change that is made effective by ending and starting a single data group would not be propagated to the other data groups in the same resource group. However, the same change would be propagated to the other data groups if it is made effective by ending and starting the parent application group.

For additional information about the ENDDG command, refer to the following topics:

• “What occurs when a data group is ended” on page 190
• “What replication processes are ended by the ENDDG command” on page 203

To selectively end processes for a data group, do the following:

1. From the Work with Data Groups display, type a 10 (End DG) next to the data group that you want to end and press Enter.
2. The End Data Group (ENDDG) display appears. At the Process prompt, specify the value for the processes you want to end. To see a list of values, press F4 (Prompt).
3. At the End process prompt, specify the value you want.
4. If the data group uses remote journaling, verify that the value of the End remote journaling prompt is what you want.
5. If you want to end only a selected apply session, press F10 (Additional parameters). Then specify the value for the session you want to end at the Apply session prompt.
6.
To end the selected processes, press Enter.

What replication processes are started by the STRDG command

MIMIX determines how each data group is configured and starts the appropriate replication processes based on the value you specify for the Start processes (PRC) parameter. Default configuration values create data groups that use MIMIX Remote Journal support (MIMIX RJ support) for database replication and source-send technology for object replication.

Table 37 identifies the processes that are started when MIMIX RJ support is used for database replication for each of the possible values on the PRC parameter. An RJ link identifies the IBM i remote journal function, which transfers data to the target system. On the target system, the data is processed by the MIMIX database reader (DBRDR) before the database apply process (DBAPY) completes replication.

For data groups that use MIMIX RJ support, it is standard practice to leave the RJ link active when the data groups are ended. If the RJ link is not already active when starting data groups, MIMIX starts the RJ link when the value specified for the PRC parameter includes database source system processes or all processes. The RJ Link column in Table 37 shows the result of each process when the RJ link is not active while the Notes column identifies behavior that may not be anticipated when the RJ link is already active.

Table 37. Processes started by data groups configured for MIMIX Remote Journal support. This assumes that all replication processes are inactive when the STRDG request is made.

Column groups: Source processes are RJ Link (DB replication) and OBJSND, OBJRTV, CNRSND, STSRCV (object replication). Target processes are DBRDR, DBAPY (DB replication) and OBJRCV, CNRRCV, STSSND, OBJAPY (object replication).

| Value for PRC | Notes | RJ Link [1] | OBJSND | OBJRTV | CNRSND | STSRCV | DBRDR | DBAPY [2] | OBJRCV | CNRRCV | STSSND | OBJAPY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| *ALL | E | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts |
| *ALLSRC | A, E | Starts | Starts | Starts | Starts | Starts | Inactive | Inactive | Starts | Starts | Starts | Inactive |
| *ALLTGT | A, B | Inactive | Inactive | Inactive | Inactive | Inactive | Starts | Starts | Inactive | Inactive | Inactive | Starts |
| *DBALL | A, E | Starts | Inactive [3] | Inactive [3] | Inactive [3] | Inactive [3] | Starts | Starts | Inactive [3] | Inactive [3] | Inactive [3] | Inactive [3] |
| *OBJALL | A, C | Inactive | Starts | Starts | Starts | Starts | Inactive [4] | Inactive [4] | Starts | Starts | Starts | Starts |
| *DBSRC | A, C, E | Starts | Inactive [3] | Inactive [3] | Inactive [3] | Inactive [3] | Inactive | Inactive | Inactive [3] | Inactive [3] | Inactive [3] | Inactive [3] |
| *DBTGT | A, B | Inactive | Inactive [3] | Inactive [3] | Inactive [3] | Inactive [3] | Starts | Starts | Inactive [3] | Inactive [3] | Inactive [3] | Inactive [3] |
| *OBJSRC | A, C | Inactive | Starts | Starts | Starts | Starts | Inactive [4] | Inactive [4] | Starts | Starts | Starts | Inactive |
| *OBJTGT | A, C | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive [4] | Inactive [4] | Inactive | Inactive | Inactive | Starts |
| *DBRDR | A, D | Inactive | Inactive [3] | Inactive [3] | Inactive [3] | Inactive [3] | Starts | Inactive | Inactive [3] | Inactive [3] | Inactive [3] | Inactive [3] |
| *DBAPY | A, C | Inactive | Inactive [3] | Inactive [3] | Inactive [3] | Inactive [3] | Inactive [4] | Starts [4] | Inactive [3] | Inactive [3] | Inactive [3] | Inactive [3] |

Notes:
A. Data groups which use cooperative processing should have both database and object processes started to prevent objects and data on the target system from becoming not fully synchronized.
B. When the RJ link is already active, database replication becomes operational.
C. When the RJ link is already active, database journal entries continue to transfer to the target system over the RJ link.
D. When the RJ link is already active, database journal entries continue to transfer to the target system over the RJ link, where they will be processed by the DBRDR.
E. If data group data area entries are configured, the data area polling process also starts when values which start database source processes are selected.

[1] This column shows the effect of the specified value on the RJ link when the RJ link is not active. See the Notes for the effect of values when the RJ Link is already active, which is default behavior.
[2] If the access path maintenance (APMNT) policy has been enabled at the installation or data group level, an access path maintenance job is also started. Access path maintenance is available on installations running 7.1.15.00 or higher.
[3] These object replication processes are not available in data groups configured for database-only replication.
[4] These database replication processes are not available in data groups configured for object-only replication.

Optionally, data groups can use source-send technology instead of remote journaling for database replication. Data groups created on earlier levels of MIMIX may still be configured this way.

Table 38 identifies the processes that are started by each value for Start processes when source-send technology is used for database replication. The MIMIX database send (DBSND) process and database receive (DBRCV) process replace the IBM i remote journal function and the DBRDR process, respectively.

Table 38. Processes started by data groups configured for Source Send replication. This assumes that all replication processes are inactive when the STRDG request is made.
Source processes: DBSND (database replication); OBJSND, OBJRTV, CNRSND, STSRCV (object replication).
Target processes: DBRCV, DBAPY (database replication); OBJRCV, CNRRCV, STSSND, OBJAPY (object replication).

| Value for PRC | Notes | DBSND (1) | OBJSND | OBJRTV | CNRSND | STSRCV | DBRCV | DBAPY (2) | OBJRCV | CNRRCV | STSSND | OBJAPY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| *ALL | - | Starts (1) | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts | Starts |
| *ALLSRC | A | Starts (1) | Starts | Starts | Starts | Starts | Starts | Inactive | Starts | Starts | Starts | Inactive |
| *ALLTGT | A | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive | Starts | Inactive | Inactive | Inactive | Starts |
| *DBALL | A | Starts (1) | Inactive (3) | Inactive (3) | Inactive (3) | Inactive (3) | Starts | Starts | Inactive (3) | Inactive (3) | Inactive (3) | Inactive (3) |
| *OBJALL | A | Inactive (4) | Starts | Starts | Starts | Starts | Inactive (4) | Inactive (4) | Starts | Starts | Starts | Starts |
| *DBSRC | A | Starts (1) | Inactive (3) | Inactive (3) | Inactive (3) | Inactive (3) | Starts | Inactive | Inactive (3) | Inactive (3) | Inactive (3) | Inactive (3) |
| *DBTGT | A | Inactive | Inactive (3) | Inactive (3) | Inactive (3) | Inactive (3) | Inactive | Starts | Inactive (3) | Inactive (3) | Inactive (3) | Inactive (3) |
| *OBJSRC | A | Inactive (4) | Starts | Starts | Starts | Starts | Inactive (4) | Inactive (4) | Starts | Starts | Starts | Inactive |
| *OBJTGT | A | Inactive (4) | Inactive | Inactive | Inactive | Inactive | Inactive (4) | Inactive (4) | Inactive | Inactive | Inactive | Starts |
| *DBRDR (5) | - | - | Inactive (3) | Inactive (3) | Inactive (3) | Inactive (3) | - | - | Inactive (3) | Inactive (3) | Inactive (3) | Inactive (3) |
| *DBAPY | A | Inactive (4) | Inactive (3) | Inactive (3) | Inactive (3) | Inactive (3) | Inactive (4) | Starts (4) | Inactive (3) | Inactive (3) | Inactive (3) | Inactive (3) |

Notes:
A. Data groups which use cooperative processing should have both database and object processes started to prevent objects and data on the target system from becoming not fully synchronized.

1. When the database send (DBSND) process starts, the data area polling process also starts.
2. If the access path maintenance (APMNT) policy has been enabled at the installation or data group level, an access path maintenance job is also started. Access path maintenance is available on installations running 7.1.15.00 or higher.
3. These object replication processes are not available in data groups configured for database-only replication.
4. These database replication processes are not available in data groups configured for object-only replication.
5. The database reader (*DBRDR) process is not used by data groups configured for source-send replication.

What replication processes are ended by the ENDDG command

MIMIX determines how each data group is configured and ends the appropriate replication processes based on the value you specify for the Process (PRC parameter). Default configuration values create data groups that use MIMIX Remote Journal support (MIMIX RJ support) for database replication and source-send technology for object replication.

Table 39 identifies the processes that are ended by each value for PRC when MIMIX RJ support is used for database replication. An RJ link identifies the IBM i remote journal function, which transfers data to the target system. On the target system, the data is processed by the MIMIX database reader (DBRDR) before the database apply process (DBAPY) completes replication.

The communications defined by the RJ link remain active and are not affected by any value for PRC. In most cases, leaving the RJ link active is preferable. If necessary, you can end the RJ link by changing the value for End remote journaling (ENDRJLNK parameter). “When to end the RJ link” on page 188 describes when you need to end the RJ link.

Table 39. Processes ended by data groups configured for MIMIX Remote Journal support. This assumes that all replication processes are active when the ENDDG request is made and that the request does not specify to end the RJ link.
Source processes: RJ Link (database replication); OBJSND, OBJRTV, CNRSND, STSRCV (object replication).
Target processes: DBRDR, DBAPY (database replication); OBJRCV, CNRRCV, STSSND, OBJAPY (object replication).

| Value for PRC | Notes | RJ Link (1) | OBJSND | OBJRTV | CNRSND | STSRCV | DBRDR | DBAPY (2) | OBJRCV | CNRRCV | STSSND | OBJAPY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| *ALL | E | Active | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends |
| *ALLSRC | A, E | Active | Ends | Ends | Ends | Ends | Active | Active | Ends | Ends | Ends | Active |
| *ALLTGT | - | Active | Active | Active | Active | Active | Ends | Ends | Active | Active | Active | Ends |
| *DBALL | B, E | Active | Active (3) | Active (3) | Active (3) | Active (3) | Ends | Ends | Active (3) | Active (3) | Active (3) | Active (3) |
| *OBJALL | A, B | Active | Ends | Ends | Ends | Ends | Active (4) | Active (4) | Ends | Ends | Ends | Ends |
| *DBSRC | A, B, E | Active | Active (3) | Active (3) | Active (3) | Active (3) | Active | Active | Active (3) | Active (3) | Active (3) | Active (3) |
| *DBTGT | B | Active | Active (3) | Active (3) | Active (3) | Active (3) | Ends | Ends | Active (3) | Active (3) | Active (3) | Active (3) |
| *OBJSRC | A, B | Active | Ends | Ends | Ends | Ends | Active (4) | Active (4) | Ends | Ends | Ends | Active |
| *OBJTGT | A, B | Active | Active | Active | Active | Active | Active (4) | Active (4) | Active | Active | Active | Ends |
| *DBRDR | B, C | Active | Active (3) | Active (3) | Active (3) | Active (3) | Ends | Active | Active (3) | Active (3) | Active (3) | Active (3) |
| *DBAPY | B, D | Active | Active (3) | Active (3) | Active (3) | Active (3) | Active | Ends | Active (3) | Active (3) | Active (3) | Active (3) |

Notes:
A. Has no effect on database-only replication. New database journal entries continue to transfer to the target system over the RJ link, where they will be processed.
B. Data groups that use cooperative processing may be affected by the result of this value. Ending database processes while object processes remain active may result in object activity entries being placed on hold. Similarly, ending object processes while database processes remain active may result in files being placed on hold due to error.
C. New database journal entries continue to transfer to the target system over the RJ link. Existing entries stored in the log space on the target system before the end request was processed will be applied.
D. New database journal entries continue to transfer to the target system over the RJ link, where they will be processed by the DBRDR.
E. The data area polling process ends when values which end database source processes are specified.

1. The RJ link is not ended by the End options (PRC) parameter. New database journal entries continue to transfer to the target system over the RJ link. See the Notes column for additional details.
2. On installations running 7.1.15.00 or higher, if access path maintenance is enabled, the database apply process signals the access path maintenance job and then ends. The access path maintenance job uses additional jobs, if needed, to change the access path maintenance attribute to immediate on all files that MIMIX had previously changed to delayed. Any files that could not be changed are identified as having an access path maintenance error before the maintenance jobs end. On installations running software earlier than 7.1.15.00, if the parallel access path maintenance function is enabled, the associated monitors are also ended when the ENDDG command specifies *DFT for End parallel AP maintenance (PRLAPMNT) and *ALL for the Apply session (APYSSN). When *YES is specified for PRLAPMNT, the function is always ended regardless of the values specified for PRC or APYSSN.
3. These object replication processes are not available in data groups configured for database-only replication.
4. These database replication processes are not available in data groups configured for object-only replication.

Optionally, data groups can use source-send technology instead of remote journaling for database replication. Data groups created on earlier levels of MIMIX may still be configured this way.

Table 40 identifies the processes that are ended by each value for End options when source-send technology is used for database replication. The MIMIX database send (DBSND) process and database receive (DBRCV) process replace the IBM i remote journal function and the DBRDR process, respectively.

Table 40. Processes ended by data groups configured for Source Send replication. This assumes that all replication processes are active when the ENDDG request is made.
Source processes: DBSND (database replication); OBJSND, OBJRTV, CNRSND, STSRCV (object replication).
Target processes: DBRCV, DBAPY (database replication); OBJRCV, CNRRCV, STSSND, OBJAPY (object replication).

| Value for PRC | Notes | DBSND (1) | OBJSND | OBJRTV | CNRSND | STSRCV | DBRCV | DBAPY (2) | OBJRCV | CNRRCV | STSSND | OBJAPY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| *ALL | - | Ends (1) | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends | Ends |
| *ALLSRC | - | Ends (1) | Ends | Ends | Ends | Ends | Ends | Active | Ends | Ends | Ends | Active |
| *ALLTGT | - | Active | Active | Active | Active | Active | Active | Ends | Active | Active | Active | Ends |
| *DBALL | A | Ends (1) | Active (3) | Active (3) | Active (3) | Active (3) | Ends | Ends | Active (3) | Active (3) | Active (3) | Active (3) |
| *OBJALL | A | Active (4) | Ends | Ends | Ends | Ends | Active (4) | Active (4) | Ends | Ends | Ends | Ends |
| *DBSRC | A | Ends (1) | Active (3) | Active (3) | Active (3) | Active (3) | Ends | Active | Active (3) | Active (3) | Active (3) | Active (3) |
| *DBTGT | A | Active | Active (3) | Active (3) | Active (3) | Active (3) | Active | Ends | Active (3) | Active (3) | Active (3) | Active (3) |
| *OBJSRC | A | Active (4) | Ends | Ends | Ends | Ends | Active (4) | Active (4) | Ends | Ends | Ends | Active |
| *OBJTGT | A | Active (4) | Active | Active | Active | Active | Active (4) | Active (4) | Active | Active | Active | Ends |
| *DBRDR (5) | - | - | Active (3) | Active (3) | Active (3) | Active (3) | - | - | Active (3) | Active (3) | Active (3) | Active (3) |
| *DBAPY | A | Active (4) | Active (3) | Active (3) | Active (3) | Active (3) | Active (4) | Ends (4) | Active (3) | Active (3) | Active (3) | Active (3) |

Notes:
A. Data groups that use cooperative processing may be affected by the result of this value. Ending database processes while object processes remain active may result in object activity entries being placed on hold. Similarly, ending object processes while database processes remain active may result in files being placed on hold due to error.

1. When the database send (DBSND) process ends, the data area polling process also ends.
2. On installations running 7.1.15.00 or higher, if access path maintenance is enabled, the database apply process signals the access path maintenance job and then ends. The access path maintenance job uses additional jobs, if needed, to change the access path maintenance attribute to immediate on all files that MIMIX had previously changed to delayed. Any files that could not be changed are identified as having an access path maintenance error before the maintenance jobs end. On installations running software earlier than 7.1.15.00, if the parallel access path maintenance function is enabled, the associated monitors are also ended when the ENDDG command specifies *DFT for End parallel AP maintenance (PRLAPMNT) and *ALL for the Apply session (APYSSN). When *YES is specified for PRLAPMNT, the function is always ended regardless of the values specified for PRC or APYSSN.
3. These object replication processes are not available in data groups configured for database-only replication.
4. These database replication processes are not available in data groups configured for object-only replication.
5. The database reader (*DBRDR) process is not used by data groups configured for source-send replication.

CHAPTER 11 Resolving common replication problems

Occasionally, a journaled transaction for a file or object may fail to replicate. User intervention is required to correct the problem. This chapter provides procedures to help you resolve problems that can occur during replication processing.

The following topics are included in this chapter:
• “Working with message queues” on page 208 describes how to use the MIMIX primary and secondary message queues from a 5250 emulator.
• “Working with the message log” on page 209 describes how to access the MIMIX message log from either user interface.
• “Working with user journal replication errors” on page 210 includes topics for how to resolve a file that is held due to an error. It also includes topics about options for placing a file on hold and releasing held files.
• “Working with tracking entries” on page 219 describes how to use tracking entries to resolve replication errors for IFS objects, data areas, or data queues that are replicated cooperatively with the user journal. It also includes topics about options for placing a tracking entry on hold and releasing held tracking entries.
• “Working with objects in error” on page 224 describes how to resolve objects in error by working with the data group activities used for system journal replication. This topic includes information about how to retry failed activity entries and how to determine whether MIMIX is automatically attempting to retry an activity.
• “Removing data group activity history entries” on page 229 describes how to manually remove completed entries for system journal replication activity. This may be necessary if you need to conserve disk space.

Working with message queues

You can access the MIMIX primary and secondary message queues to display messages or manage the list of messages.

Do the following to access a MIMIX message queue:
1. Type the command DSPMMXMSGQ and press F4 (Prompt).
2. Specify either *PRI or *SEC to access the message queue you want and press Enter.
3. The Display MIMIX Message Queue display appears listing all of the current messages. To view all of the information for a message, place the cursor on the message you want and press Enter. You can also use the function keys on this display to perform several message-related tasks. Refer to the help text (F1 key) for information about these function keys.

Note: The MIMIX primary and secondary message queues are defined for each system definition. You can control the severity and type of messages to be sent to each message queue through parameters on the system definition.

Working with the message log

The MIMIX message log provides a common location for you to see all messages related to MIMIX products. A consolidated list of messages for all systems in the installation library is available on the management system.

Note: The target system only shows messages that occurred on the target system.

LVI messages are informational messages and LVE messages are error or diagnostic messages. CPF messages are generated by an underlying operating function and may be passed up to the MIMIX product.

Do the following to access the MIMIX message log:
1. Do one of the following to access the message log display:
• From the MIMIX Basic Main Menu, select option 13 (Work with messages) and press Enter.
• From the MIMIX Intermediate Main Menu, select option 3 (Work with messages) and press Enter.
2. The Work with Message Log appears with a list of the current messages. The initial view shows the message ID and text.
3. Press F11 to see additional views showing the message type, severity, the product and process from which it originated, whether it is associated with a group (for MIMIX, a data group), and the system on which it originated.
4. You can subset the messages shown on the display. A variety of subsetting options are available that allow you to manage the message log more efficiently.
5. To work with a message, type the number of the option you want and press Enter. The following options are available:
• 4=Remove - Use this option if you want to delete a message. When you select this option, a confirmation display appears. Verify that you want to delete the messages shown and press Enter. The message is deleted only from the local system.
• 5=Display message - Use this option to view the full text of the first level message and gain access to the second level text.
• 6=Print - Use this option to print the information for the message.
• 8=Display details - Use this option to display details for a message log entry including its from and to program information, job information, group information, product, process, originating system, and call stack information.
• 9=Related messages - Use this option to display a list of messages that relate to the selected message. Related messages include a summary and any detail messages immediately preceding it. This can be helpful when you have a large message log list and you want to show the messages for a certain job.
• 12=Display job - If job information exists on the system, you can use this option to access job information for a message log entry. The Work with Jobs display appears from which you can select options for displaying specific information about the job.

Working with user journal replication errors

MIMIX reports user journal replication errors for files as status on the associated data group file entry. This status is also reported at the data group level in a consolidated form. File replication problems are categorized as follows:

Held due to error - If a journal transaction is not replicated successfully, the file entry is placed in *HLDERR status. This indicates a problem that must be resolved.

Held for other reasons - File entries can also be placed in a variety of other held statuses by user action or by MIMIX. Generally, these statuses are also considered problems; some are transitional conditions that resolve automatically while others require user action. To determine if there are files on hold for other reasons, use the procedure in “Working with the detailed status of data groups” on page 105.

For information about resolving problems with IFS objects and library-based objects that are replicated by user journal, see “Working with tracking entries” on page 219.

Working with files needing attention (replication and access path errors)

The DB Errors column on the Work with Data Groups display identifies the number of errors for user journal replication.
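The number in this column is a single total built from several kinds of entries, which the text that follows enumerates. As a rough sketch of that arithmetic only, the total could be modeled as shown here. The class and field names are hypothetical and are not MIMIX interfaces.

```python
from dataclasses import dataclass


@dataclass
class HeldCounts:
    """Hypothetical per-data-group tallies; these names are not MIMIX fields."""
    files_hlderr: int = 0      # database files held due to error (*HLDERR)
    ifs_hlderr: int = 0        # IFS objects in *HLDERR
    dtaara_hlderr: int = 0     # *DTAARA objects in *HLDERR
    dtaq_hlderr: int = 0       # *DTAQ objects in *HLDERR
    ap_maint_failed: int = 0   # LF and PF files with access path maintenance failures


def db_errors(counts: HeldCounts) -> int:
    """Add the entries held due to error to the access path maintenance failures."""
    return (counts.files_hlderr + counts.ifs_hlderr + counts.dtaara_hlderr
            + counts.dtaq_hlderr + counts.ap_maint_failed)
```

Under this sketch, a data group with three files, one IFS object, and two data queues in *HLDERR plus one access path maintenance failure would show a total of 7.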
Specifically, this column identifies the sum of the number of database files, IFS, *DTAARA, and *DTAQ objects on hold due to errors (*HLDERR) plus the number of LF and PF files that have access path maintenance failures for a data group. (Errors for the access path maintenance function are included on installations running MIMIX 7.1.15.00 or higher.) Data group file entries and tracking entries should not be left in *HLDERR state for any extended time. Access path maintenance errors occur when MIMIX could not change a file’s access path maintenance attribute back to immediate.

To access a list of files in error for a data group, do the following:
1. From the MIMIX Basic Main Menu select option 6 (Work with data groups) and press Enter.
2. The Work with Data Groups display appears. Type 12 (Files needing attention) next to the data group you want which has errors identified in the DB Errors column and press Enter.
3. The Work with DG File Entries display appears with a list of file entries for the data group that have replication errors, access path maintenance errors, or both. (Access path maintenance errors can only be reported on data group file entries in installations running MIMIX 7.1.15.00 or higher.) Do the following:
a. The initial view shows the current replication status of file entries. Any entry with a status of *HLD, *HLDERR, *HLDIGN or *HLDRLTD indicates that action is required. Use Table 41 to identify choices based on the file entry status.
Note: MIMIX retains log spaces for file entries with these statuses so that the journal entries that are being held can be released and applied to the target system. File entries should not be left in these states for an extended period.
b. Use Table 41 to identify choices based on the file entry status and Table 42 to identify available options from this display.
c. If necessary, take action to prevent the error from happening again. Refer to the following topics:
• “Correcting file-level errors” on page 216
• “Correcting record-level errors” on page 217
4. Press F10 as needed on the Work with DG File Entries display until you see the access path maintenance view. The AP Maint. Status column identifies any AP maintenance errors for a file with the value *FAILED and failures for logical files associated with a file as *FAILEDLF. Immediate action may not be necessary because MIMIX will attempt to retry access path maintenance when the data group ends and when it is restarted. To attempt an immediate retry, use option 40 (Retry AP maintenance).

Table 41. Possible actions based on replication status of a file entry

| Status | Preferred Action (1) |
|---|---|
| *ACTIVE | Unless an error has occurred, no action is necessary. Entries in the user journal for the file are replicated and applied. If necessary, any of the options to hold journal entries can be used. |
| *HLD | User action is required to release the file entry (option 26) so that held journal entries from the user journal can be applied to the target system. |
| *HLDERR | User action is required. Attempt to resolve the error by synchronizing the file (option 16). Note: Transactions and hold logs are discarded for file entries with a status of *HLDERR and an error code of IG. Such a file must be synchronized. |
| *HLDIGN | User action is required to either synchronize the file (option 16) or to change the configuration if you no longer want to replicate the file. Journal entries for the file are discarded. Replication is not occurring and the file may not be synchronized. Depending on the circumstances, Release may also be an option. |
| *HLDRGZ, *HLDRNM, *HLDPRM, *HLDSYNC | These are transitional states that should resolve to *ACTIVE. If these statuses persist, check the journaling status for the entry. MIMIX retains log spaces for the held journal entries for the duration of these temporary hold requests. |
| *HLDRTY | The file entry is held because an entry could not be applied due to a condition which required waiting on some other condition (such as the file being in use). After a short delay, the database apply job will automatically attempt to process this entry again. The preferred action is to allow MIMIX to periodically retry the file entry. By default, the database apply job will automatically attempt to process the entry every 5 minutes for up to 1 hour. Manually releasing the file entry will cause MIMIX to attempt to process the entry immediately. |
| *HLDRLTD | User action is required for a file in the same network. View the related files (option 35). A file that is related due to a dependency, such as a constraint or a materialized query table, is held. Resolving the problem for the related held file will resolve this status. |
| *RLSWAIT | The file is waiting to be released by the DB apply process and will be changed to *ACTIVE. If the status does not change to *ACTIVE, check the journaling status. If this status persists, you may need to synchronize (option 16). |
| *CMPACT, *CMPRLS, *CMPRPR | These are transitional states that should resolve automatically. The file entry represents a member that is being processed cooperatively between the CMPFILDTA command and the database apply process. |

1. Evaluate the cause of the problem before taking any action.

Table 42. Options for working with file entries from the Work with DG File Entries display

| Option | Additional Information |
|---|---|
| 9=Start journaling | See “Starting journaling for physical files” on page 235. |
| 10=End journaling | See “Ending journaling for physical files” on page 236. |
| 11=Verify journaling | See “Verifying journaling for physical files” on page 237. |
| 16=Sync DG file entry | See topic ‘Synchronizing database files’ in the MIMIX Administrator Reference book. |
| 20=Work with file error entries | See topic “Working with journal transactions for files in error” on page 213. |
| 23=Hold file | See topic “Placing a file on hold” on page 214. |
| 24=Ignore file | See topic “Ignoring a held file” on page 214. |
| 25=Release wait | See topic “Releasing a held file at a synchronization point” on page 215. |
| 26=Release | See topic “Releasing a held file” on page 215. |
| 27=Release clear | See topic “Releasing a held file and clearing entries” on page 216. |
| 31=Repair member data | Available for entries with a status of *HLDERR that identify a member. See topic ‘Comparing and repairing file data - members on hold (*HLDERR)’ in the MIMIX Administrator Reference book. |
| 35=Work with related files | Displays file entries that are related to the selected file by constraints or by other dependencies such as materialized query tables. |
| 40=Retry AP maintenance | Retries access path maintenance operations on the target system for the selected file. This option is only valid on data group file entries that have an access path maintenance status of *FAILED or *FAILEDLF. |

Working with journal transactions for files in error

When resolving problems for a file that is in *HLDERR state, a MIMIX administrator may find it useful to examine the journal entries that are being held by MIMIX. Although you can determine why a file is in error from either the source or target system, to view the actual journal entries, you must be on the target system. If you attempt to view the journal entries from the source system, MIMIX will indicate that you are on the incorrect system to view the information.

Do the following:
1. From the subsetted list of files in error for a data group on the Work with DG File Entries display, type 20 (Work with file error entries) next to the file entry you want and press Enter.
2. The Work with DG FE on Hold display appears. A variety of information about the transaction appears on the display.
Note: The values shown in the Sequence number column may be truncated if the journal supports *MAXOPT3 for the receiver size and the journal sequence number value exceeds the available display field. When truncation is necessary, the most significant digits (left-most) are omitted. Truncated journal sequence numbers are prefixed by '>'. The First journal sequence number field displays the full sequence number of the first item displayed in the list.
a. Locate the transaction that caused the file to be placed on hold. Use the Position to field to position the list to a specific sequence number.
b. Select the option (Table 43) you want to use on the journal transaction:

Table 43. Options available from the Work with DG FE on Hold display.

| Option | Description |
|---|---|
| 2=Change | You can change the contents or characteristics of the journal entry. Use this option with caution. Any changes can affect the validity of data in the journal entry. |
| 4=Delete | You can delete the journal entry. |
| 5=Display | You can display details for the specified journal entry associated with the data group file entry in question. |
| 9=Immediate apply | You can immediately apply a transaction that has caused a file to go on hold. The entry you selected is immediately applied to the file outside of the apply process. If the apply is successful, the error/hold entry that was applied is removed from the error/hold log. However, if the apply fails, a message is issued and the entry remains in the error/hold log. This process does not release the file; it only applies the selected entry. |

Placing a file on hold

Use this procedure to hold any journal entries for a file identified by a data group file entry. Avoid leaving a file entry on hold for any extended period.

File entries with a status of *ACTIVE, *HLDRGZ, *HLDRNM, *HLDPRM, *HLDSYNC, *HLDRLTD, and *RLSWAIT can be placed on hold.
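The eligibility rule can be expressed as a small check. The sketch below is illustrative only: the function name is hypothetical, the resulting *HLD status is the one this procedure produces, and rejecting every other status is an assumption drawn from the list above rather than documented MIMIX behavior.

```python
# Statuses from which a file entry can be placed on hold, as listed in the text.
HOLDABLE_STATUSES = frozenset({
    "*ACTIVE", "*HLDRGZ", "*HLDRNM", "*HLDPRM",
    "*HLDSYNC", "*HLDRLTD", "*RLSWAIT",
})


def hold_file_entry(status: str) -> str:
    """Return the status after a hold request (option 23) on a file entry.

    Hypothetical helper: raising an error for other statuses is an assumption.
    """
    if status not in HOLDABLE_STATUSES:
        raise ValueError(f"a file entry in status {status} cannot be placed on hold")
    return "*HLD"
```

For example, `hold_file_entry("*ACTIVE")` yields `"*HLD"`, while a status that is not in the list, such as `"*HLDERR"`, is rejected by this sketch.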
The request changes the file entry status to *HLD. Any journal entries for the associated file are replicated but not applied. If the file is being processed by an active apply session, suspending of the update process can take a short time to complete. You will receive a message when the file is held. MIMIX retains log spaces containing any replicated journal entries in anticipation that the file entry will be released. When the file is released, the accumulated journal entries will be applied. The *HLD status remains until additional action is taken. Do the following: 1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter. 2. The Work with Data Groups display appears. Type 17 (File entries) next to the data group you want and press Enter. 3. The Work with DG File Entries display appears. Type 23 (Hold file) next to the entry you want and press Enter. Ignoring a held file Use this procedure to ignore any journal entries for an file identified by a data group file entry. The request changes the file entry status to *HLDIGN. Any journal entries for the associated file, including any hold logs, are discarded. The *HLDIGN status remains until additional action is taken. Note: Be certain that you want to use the ignore feature. Any ignored transactions cannot be retrieved. You must replace the object on the target system with a current version from the source system. If a file has been on hold for a long time or you expect that it will be, the amount of storage used by the error/hold log space can be quite large. If you anticipate that you 214 Working with user journal replication errors will need to save and restore the file or replace it for any other reason, it may be best to just ignore all current transactions. Do the following: 1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter. 2. The Work with Data Groups display appears. 
Type 17 (File entries) next to the data group you want and press Enter.
3. The Work with DG File Entries display appears. Type 24 (Ignore file) next to the entry you want and press Enter.
The status of the file is changed to *HLDIGN. The file entry is ignored. Journal entries for the file entry, including any hold logs, are discarded.

Releasing a held file at a synchronization point
Use this procedure to wait for a synchronization point to release any held journal entries for a file identified by a data group file entry, then resume replication.
The request changes the file entry status to *RLSWAIT. Any journal entries for the associated file are discarded until a File member saved (F-MS) journal entry or a Start of save of a physical file member using save-while-active function (F-SS) journal entry is encountered. This is the synchronization point. The file entry status is then changed to *ACTIVE and all journal entries that were held after the synchronization point are applied. If the F-MS or F-SS journal entry is not in the log space, the file entry remains in *RLSWAIT status.
If you are unsure as to how many save requests might accumulate for an object, you can synchronize the file associated with the file entry. The entry status will become *ACTIVE.
To wait for a synchronization point before releasing a held file, do the following:
1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.
2. The Work with Data Groups display appears. Type 17 (File entries) next to the data group you want and press Enter.
3. The Work with DG File Entries display appears. Type 25 (Release wait) next to the entry you want and press Enter.

Releasing a held file
Use this procedure to immediately release any held journal entries for a file identified by a data group file entry with a status of *HLD and resume replication.
The request changes the file entry status to *ACTIVE. Any held journal entries for the associated file are applied.
Normal replication of the file resumes.
While a file is being released, the appropriate apply session suspends its operations on other files. This allows the released file to catch up to the current level of processing. If a file or member has been on hold for a long time, this can be lengthy.
Do the following to immediately release a held file or file member:
1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.
2. The Work with Data Groups display appears. Type 17 (File entries) next to the data group you want and press Enter.
3. The Work with DG File Entries display appears. Type 26 (Release) next to the entry you want and press Enter.

Releasing a held file and clearing entries
Use this procedure to clear any held journal entries for a file identified by a data group file entry, then resume replication.
The request changes the file entry status to *ACTIVE. Any held journal entries for the associated file are discarded. Journal entries received after the file entry status became *ACTIVE are applied, resuming normal replication.
If a file entry is on hold and its associated file has been synchronized in such a way that the held entries already exist in the restored file, this procedure will ensure that those entries are not re-applied. This procedure will not work if the file is being actively updated on the source system.
Do the following to release a held file and clear any journal entries that were replicated but not applied:
1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.
2. The Work with Data Groups display appears. Type 17 (File entries) next to the data group you want and press Enter.
3. The Work with DG File Entries display appears.
Type 27 (Release clear) next to the entry you want and press Enter.

Correcting file-level errors
Typically, file-level errors can be categorized as one of the following:
• A problem with the configuration of files defined for replication.
• A discrepancy in the file descriptions between the management and network systems.
• An operational error.
This topic identifies the most common file-level errors and measures that you can take to prevent the problem from recurring. See also “Correcting record-level errors” on page 217.
Once you diagnose and correct a file-level error, the problem rarely manifests itself again.
Some of the most common file-level errors are:
• Authority: The MIMIXOWN user profile defined in the MIMIX job description does not have authority to perform a function on the target system. You can prevent this problem by ensuring that the MIMIXOWN user profile has all object authority (*ALLOBJ). This guarantees that the user profile has all the necessary authority to run IBM i commands and has the ability to access the library and files on the management system. Refer to the Using License Manager book for more information about the MIMIXOWN user profile and authority.
• Object existence or corruption: MIMIX cannot run a function against a file on the target system because the file or a supporting object (such as logical files) does not exist or has become damaged. System security is the only way to prevent an object from being accidentally deleted from the target system. Make sure that only the correct personnel have the ability to remove objects from the target system where replicated data is applied. Also, ensure that application programs do not delete files on the target system when there are no apply sessions running.
• MIMIX subsystem ended: If the MIMIX subsystem is ended in an immediate mode while MIMIX processes are still active, files may be placed in a “Held” status.
This is a result of MIMIX being unable to complete a transaction normally. After MIMIX is restarted, you only need to release the affected files.

Correcting record-level errors
Record-level errors occur when MIMIX updates or attempts to update a file and the feedback from the update process indicates a discrepancy between the files on the management and network systems.
Record-level errors can usually be traced back to problems with one of the following:
• The system.
• Unique application environments, such as System 36 code running in native IBM i.
• Operational errors.
This section describes the most common record-level errors.

Record written in error
MIMIX DB Replicator was able to write the record on the target system; however, it wrote to the wrong relative record number. In most situations, the IBM i database function writes a new record to the end of a file. MIMIX did so, but it did not match the relative record number of the sending system. Usually this error occurs when transactions (journal entries) are skipped on the send system.
Common reasons why records are written in error include the following:
• Journaling was ended: When journaling is ended, transaction images are not being collected. If users update the files while journaling is not running, no journal entries are created and MIMIX DB Replicator has no way of replicating the missing transactions. The best way to prevent this error is to restrict the use of the Start Journaling Physical File (STRJRNPF) and End Journaling Physical File (ENDJRNPF) commands.
• User journal replication was restarted at the wrong point: When you change the starting point of replication for a data group, it is imperative that transactions are not skipped.
• Apply session restarted after a system failure: This is caused when the target system experiences a hard failure. MIMIX always updates its user spaces with the last updated and sent information.
When a system fails, some information may not be forced to disk storage. The data group definition parameter for database apply processing determines how frequently to force data to disk storage. When the apply sessions are restarted, MIMIX may attempt to rewrite records to the target system database.
• Unable to write/update a record: This error is caused when MIMIX cannot access a record in a file. This is usually caused when there are problems with the logical files associated with the file or when the record does not exist. The best way to prevent this error is to make sure that replication is started in the correct position. This error can also be due to one of the problems listed in topic “Correcting file-level errors” on page 216.
• Unable to delete a record: This is caused when MIMIX is trying to delete a record that does not exist or has a corrupted logical file associated with the physical file. This error can also be due to one of the problems listed in topic “Correcting file-level errors” on page 216.

Working with tracking entries
Tracking entries identify library-based objects (data areas and data queues) and IFS objects configured for cooperative processing (advanced journaling).
You can access the following displays to work with tracking entries in any status:
• Work with DG IFS Trk. Entries display (WRKDGIFSTE command)
• Work with DG Obj. Trk. Entries display (WRKDGOBJTE command)
These displays provide access for viewing status and working with common problems that can occur while replicating objects identified by IFS and object tracking entries.
Held tracking entries: Status for the replicated objects is reported on the associated tracking entries. If a journal transaction is not replicated successfully, the tracking entry is placed in *HLDERR status. This indicates a problem that must be resolved.
Tracking entries can also be placed in *HLD or *HLDIGN status by user action. These statuses are reported as ‘held for other reasons’ and also require user action.
When a tracking entry has a status of *HLD or *HLDERR, MIMIX retains log spaces so that journal entries that are being held can be released and applied to the target system. Tracking entries should not be left in these states for an extended period.
Additional information: To determine if a data group has any IFS objects, data areas, or data queues configured for advanced journaling, see “Determining if non-file objects are configured for user journal replication” on page 271.
When working with tracking entries, especially for IFS objects, you should be aware of the information provided in “Displaying long object names” on page 262.

Accessing the appropriate tracking entry display
To access IFS tracking entry or object tracking entry displays for a data group, do the following:
1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.
2. The Work with Data Groups display appears.
3. Next to the data group you want, type the number for the option you want and press Enter. Table 44 shows the options for tracking entries.

Table 44. Tracking entry options on the Work with Data Groups display
Select Option: Result
50=IFS trk entries: Lists all IFS tracking entries for the selected data group on the Work with DG IFS Trk. Entries display.
51=IFS trk entries not active: Lists IFS tracking entries for the selected data group with inactive status values (*HLD, *HLDERR, *HLDIGN, *HLDRNM, and *RLSWAIT) on the Work with DG IFS Trk. Entries display.
52=Obj trk entries: Lists all object tracking entries for the selected data group on the Work with DG Obj. Trk. Entries display.
53=Obj trk entries not active: Lists object tracking entries for the selected data group with inactive status values (*HLD, *HLDERR, *HLDIGN, and *RLSWAIT) on the Work with DG Obj. Trk. Entries display.

4. The tracking entry display you selected appears. Significant capability is available for addressing common replication problems and journaling problems. Do the following:
a. Use F10 to toggle between views showing status, journaling status, and the database apply session in use.
b. Any entry with a status of *HLD, *HLDERR or *HLDIGN indicates that action is required. The identified object remains in this state until action is taken. Statuses of *HLD and *HLDERR result in journal entries being held but not applied. Use Table 45 to identify choices based on the tracking entry status.
c. Use options identified in Table 46 to address journaling problems or replication problems.

Table 45. Possible actions based on replication status of a tracking entry
Status: Preferred Action (see note 1)
*ACTIVE: Unless an error has occurred, no action is necessary. Entries in the user journal for the object are replicated and applied. If necessary, any of the options to hold journal entries can be used.
*HLD: User action is required to release the entry (option 26) so that held journal entries from the user journal can be applied to the target system.
*HLDERR: User action is required. Attempt to resolve the error by synchronizing the object (option 16).
*HLDIGN: User action is required to either synchronize the object (option 16) or to change the configuration if you no longer want to replicate the object. Journal entries for the object are discarded. Replication is not occurring and the object may not be synchronized. Depending on the circumstances, Release may also be an option.
*HLDRNM: This is a transitional state for IFS tracking entries that should resolve to *ACTIVE. If this status persists, check the journaling status for the entry. Object tracking entries cannot have this status.
*RLSWAIT: If the status does not change to *ACTIVE, you may need to synchronize (option 16).
Note 1: Evaluate the cause of the problem before taking any action.

Table 46. Options for working with tracking entries
Option: Additional Information
4=Remove: See “Removing a tracking entry” on page 223.
5=Display: Identifies an object, its replication status, journaling status, and the database apply session used.
6=Print: Creates a spooled file which can be printed.
9=Start journaling: See “Starting journaling for IFS objects” on page 238 and “Starting journaling for data areas and data queues” on page 241.
10=End journaling: See “Ending journaling for IFS objects” on page 239 and “Ending journaling for data areas and data queues” on page 242.
11=Verify journaling: See “Verifying journaling for IFS objects” on page 240 and “Verifying journaling for data areas and data queues” on page 243.
16=Synchronize: Synchronizes the contents, attributes, and authorities of the object represented by the tracking entry between the source and target systems. For more information, see topic ‘Synchronizing tracking entries’ in the MIMIX Administrator Reference book.
23=Hold: See “Holding journal entries associated with a tracking entry” on page 221.
24=Ignore: See “Ignoring journal entries associated with a tracking entry” on page 222.
25=Release wait: See “Waiting to synchronize and release held journal entries for a tracking entry” on page 222.
26=Release: See “Releasing held journal entries for a tracking entry” on page 223.
27=Release clear: See “Releasing and clearing held journal entries for a tracking entry” on page 223.

Holding journal entries associated with a tracking entry
Use this procedure to hold any journal entries for an object identified by a tracking entry. Avoid leaving a tracking entry on hold for any extended period.
The request changes the tracking entry status to *HLD.
Any journal entries for the associated IFS object, data area, or data queue are replicated but not applied. MIMIX retains log spaces containing any replicated journal entries in anticipation that the tracking entry will be released. When the tracking entry is released, the accumulated journal entries will be applied. The *HLD status remains until additional action is taken.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.
2. Type 23 (Hold) next to the tracking entry for the object you want and press Enter.

Ignoring journal entries associated with a tracking entry
Use this procedure to ignore any journal entries for an object identified by a tracking entry.
The request changes the tracking entry status to *HLDIGN. Any journal entries for the associated IFS object, data area, or data queue, including any hold logs, are discarded. The *HLDIGN status remains until additional action is taken.
Note: Be certain that you want to use the ignore feature. Any ignored transactions cannot be retrieved. You must replace the object on the target system with a current version from the source system.
If a tracking entry has been on hold for a long time or you expect that it will be, the amount of storage used by the error/hold log space can be quite large. If you anticipate that you will need to save and restore the object or replace it for any other reason, it may be best to just ignore all current transactions.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.
2. Type 24 (Ignore) next to the tracking entry for the object you want and press Enter.
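
The difference between holding and ignoring described above can be illustrated with a small conceptual model. This is only a sketch for reasoning about the statuses; it is not MIMIX code. The class name TrackingEntryModel, its methods, and the in-memory lists standing in for the hold log space and the target system are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TrackingEntryModel:
    """Toy model of one tracking entry (hypothetical; not a MIMIX API)."""
    status: str = "*ACTIVE"
    held: list = field(default_factory=list)     # stands in for the hold log space
    applied: list = field(default_factory=list)  # entries applied on the target

    def receive(self, journal_entry):
        """Handle one replicated journal entry according to the current status."""
        if self.status == "*HLD":
            self.held.append(journal_entry)      # replicated but not applied
        elif self.status == "*HLDIGN":
            pass                                 # discarded; cannot be retrieved
        else:
            self.applied.append(journal_entry)   # normal replication

    def hold(self):
        """Models option 23: later entries accumulate in the hold log."""
        self.status = "*HLD"

    def ignore(self):
        """Models option 24: hold logs and later entries are discarded."""
        self.status = "*HLDIGN"
        self.held.clear()

    def release(self):
        """Models a release: accumulated entries are applied, status returns to *ACTIVE."""
        self.applied.extend(self.held)
        self.held.clear()
        self.status = "*ACTIVE"


entry = TrackingEntryModel()
entry.receive("e1")
entry.hold()
entry.receive("e2")                        # kept in the hold log
entry.release()
held_then_released = list(entry.applied)   # both entries reach the target

entry.ignore()
entry.receive("e3")                        # dropped while in *HLDIGN
after_ignore = list(entry.applied)         # e3 never reaches the target
```

In the model, an entry received while in *HLD is recoverable on release, while an entry received while in *HLDIGN is gone, which is why the object must be replaced on the target system after an ignore.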

Waiting to synchronize and release held journal entries for a tracking entry
Use this procedure to wait for a synchronization point to release any held journal entries for an object identified by a tracking entry, then resume replication.
The request changes the tracking entry status to *RLSWAIT. Any journal entries for the associated IFS object, data area, or data queue are discarded until an object saved journal entry is encountered. This is the synchronization point. The tracking entry status is then changed to *ACTIVE and all journal entries that were held after the synchronization point are applied. If the object saved journal entry is not in the log space, the tracking entry remains in *RLSWAIT status.
If you are unsure as to how many save requests might accumulate for an object, you can synchronize the object associated with the tracking entry. The tracking entry status will become *ACTIVE.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.
2. Type 25 (Release wait) next to the tracking entry for the object you want and press Enter.

Releasing held journal entries for a tracking entry
Use this procedure to immediately release any held journal entries for an object identified by a tracking entry with a status of *HLD or *HLDERR and resume replication.
The request changes the tracking entry status to *ACTIVE. Any held journal entries for the associated IFS object, data area, or data queue are applied. Normal replication of the object resumes.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.
2.
Type 26 (Release) next to the tracking entry for the object you want and press Enter.

Releasing and clearing held journal entries for a tracking entry
Use this procedure to clear any held journal entries for an object identified by a tracking entry, then resume replication.
The request changes the tracking entry status to *ACTIVE. Any held journal entries for the associated IFS object, data area, or data queue are discarded. Journal entries received after the tracking entry status became *ACTIVE are applied, resuming normal replication.
If a tracking entry is on hold and its associated object has been synchronized in such a way that the held entries already exist in the restored object, this procedure will ensure that those entries are not re-applied.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.
2. Type 27 (Release clear) next to the tracking entry for the object you want and press Enter.

Removing a tracking entry
Use this procedure to remove a duplicate tracking entry for an IFS object, data area, or data queue. A tracking entry with a status of *HLDERR cannot be removed.
Note: Do not use this procedure to prevent user journal replication of an object represented by a tracking entry. If you need to exclude the object from replication or have it replicated through the system journal instead of the user journal, change or create the appropriate data group IFS entry or object entry.
Do the following:
1. Access the IFS or object tracking entry display as described in “Accessing the appropriate tracking entry display” on page 219.
2. Type 4 (Remove) next to the tracking entry you want to remove and press Enter.
3. You will see a confirmation display. To remove the tracking entry, press Enter.

Working with objects in error
Use this topic to work with replication errors for objects replicated through the system journal.
To access a list of objects in error for a data group, do the following:
1. From the MIMIX Basic Main Menu select option 6 (Work with Data Groups) and press Enter.
2. The Work with Data Groups display appears. Type 13 (Objects in error) next to the data group you want which has values shown in the Obj Errors column and press Enter.
3. The Work with Data Group Activity display appears with a list of the objects in error for the data group you selected. You can do any of the following:
• Use F10 (Error view) to see the reason why the object is in error.
• Use F11 to change between views for objects, DLOs, IFS objects, and spooled files.
• Use the options identified in Table 47 to resolve the errors. Type the number of the option you want next to the object and press Enter.

Table 47. Options on the Work with Data Group Activity display for working with objects in error.
4=Remove: Use this option to remove an entry with a *COMPLETED or *FAILED status from the list. For entries with *FAILED status, this option removes only the failed entry. Prompting is available for extended capability. You may need to take action to synchronize the object associated with the entry. Note: If an entry with a status of *FAILED has related entries in *DELAYED status, you can remove both the failed and the delayed entries in one operation by using option 14 (Remove related). For more information, see “Removing data group activity history entries” on page 229.
7=Display message: Use this option to display any error message that is associated with the entry.
8=Retry: Use this option to retry the data group activity. MIMIX changes the entry status to pending and attempts the failed operation again. Note: It is possible to schedule the request for a time when the retry is more likely to be successful. For more information about retrying failed entries, see “Retrying data group activity entries” on page 227.
12=Work with entries: Use this option to access the Work with DG Activity Entries display. From the display you can display additional information about replicated journal transactions for the object, including the journal entry type and access type (if available), as well as see whether the object is undergoing delay retry processing. You can also take options to display related entries, view error messages for a failure, and synchronize the object. For more information, see “Using the Work with DG Activity Entries display” on page 225.
14=Remove related: Use this option to remove an entry with a status of *FAILED and any related entries that have a status of *DELAYED. You may need to take action to synchronize the object associated with the entry.

Using the Work with DG Activity Entries display
From the Work with DG Activity Entries display, you can display information about and take actions on activity entries for a replicated object.
To access the display, select option 12 (Work with entries) from the Work with Data Group Activity display. Table 48 lists the available options.

Table 48. Options available on the Work with DG Activity Entries display.
4=Remove: Use this option to remove an individual entry with a *COMPLETED or *FAILED status from the list. For entries with *FAILED status, this option removes only the failed entry. You may need to take action to synchronize the object associated with the entry. Note: No prompting is available when using this option from this display. To prompt for additional capability, use the option to remove from the Work with Data Group Activity display. For more information, see “Removing data group activity history entries” on page 229.
5=Display: Use this option to display details about the individual entry.
The information available about the object includes whether the object is undergoing delay retry processing, and journal entry information, including access type information for T-SF, T-YC, and T-ZC journal entry types. For more information, see “Determining whether an activity entry is in a delay/retry cycle” on page 228.
6=Print: Use this option to print the entry.
7=Display message: Use this option to display the error message associated with the processing failure for the entry.
8=Retry: Use this option to retry the data group activity entry. MIMIX changes the entry status to pending and attempts the failed operation again as soon as possible.
9=Display related: Displays entries related to the specified object. For example, use this option to see entries associated with a move or rename operation for the object.
12=Display job: Displays the job that was processing the object when the error occurred, if the job information still exists and is on this system.
16=Synchronize: Use this option to synchronize objects defined to MIMIX for system journal replication (objects that are not configured for cooperative processing). Activity entries with *ACTIVE or *COMPLETED status can be synchronized, as well as entries with a *FAILED status and with the following journal types: T-CO, T-CP, T-OR, T-SE, T-ZC (see notes), T-YC, and T-SF (see notes). A confirmation display allows you to confirm your choices before the request is processed. Entries are placed in a ‘pending synchronization’ status. When the data group is active, the contents of the object, its attributes, and its authorities are synchronized between the source and target systems. The status of the activity entry is set to ‘completed by synchronization.’
Notes:
• To synchronize files defined for cooperative processing, use the Synchronize DG File Entry (SYNCDGFE) command.
• Spooled files (T-SF journal entries) with the following access types can be synchronized: C = spooled file created; U = spooled file changed.
• Changed objects (T-ZC journal entries) with the following access types can be synchronized: 1 (Add); 7 (Change); 25 (Initialize); 29 (Merge); 30 (Open); 34 (Receive); 36 (Reorganize); 50 (Set); and 51 (Send).

Retrying data group activity entries
Data group activity entries that did not successfully complete replication have a status of *FAILED. These failed data group activity entries are also called error entries. You can request to retry processing for these activity entries.
Activity entries with a status of *ACTIVE can also be retried in some circumstances. For example, you may want to retry an entry that is delayed but which has no preceding pending activity entry. Or, you may want to retry a pending entry that is undergoing processing in a delay retry cycle.
The retry request places the activity entry in the queue for processing by the system journal replication process where the failure or delay occurred. Activity entries with a status of *FAILED or *DELAYED are set to *PENDING until they are processed.

Retrying a failed data group activity entry
You can manually request that MIMIX retry processing for a data group activity entry that has a status of *FAILED. The retry can be requested from either the Work with Data Group Activity display or from the Work with DG Activity Entries display.
Note: Only the Work with Data Group Activity display supports the ability to schedule the retry request for a time in the future when the request is more likely to be successful.
To retry failed (error) activity entries, do the following:
1. From the Work with Data Groups display, type a 13 (Objects in error) next to the data group you want that has values shown in the Obj Errors column and press Enter.
2. The Work with Data Group Activity display appears with a list of the objects in error for the data group selected.
Type an 8 (Retry) next to the entry you want and do one of the following:
• To submit the retry request for immediate processing, press Enter. Then skip to Step 4.
• To schedule the retry request for a time at which it is more likely to be successful, press F4 (Prompt).
3. On the Retry DG Activity Entries (RTYDGACTE) display, specify a value for the Time of day to retry prompt. Then press Enter. You can specify a specific time within 24 hours. The scheduled time is based on the time on the system from which the request is submitted regardless of the system on which the activity to retry occurs. When you submit a retry request for a scheduled time, MIMIX will make the entry active and will wait until the specified time before retrying the request. The scheduled time is the earliest the request will be processed. Be sure to consider any time zone differences between systems as you determine a scheduled time. For additional information and examples, press F1 (Help).
4. The Confirm Retry of DG Activity display appears. Press Enter.
If failed activity entries occur frequently, consider using the third delay retry cycle. When the Automatic object recovery policy is enabled, a third retry cycle is performed using the settings in effect from the Number of third delay/retries and Third retry interval (min.) policies. These policies can be set for the installation or for a specific data group.

Determining whether an activity entry is in a delay/retry cycle
This procedure allows you to check the status of an activity entry to determine whether MIMIX is attempting automatic delay retry cycles for the object.
1. From the Work with Data Groups display, type a 14 (Active objects) next to the data group you want and press Enter.
2. The Work with Data Group Activity display appears with a list of the objects that are actively being replicated.
3. Type a 12 (Work with Entries) next to the list entry for the object you want and press Enter.
The Work with DG Activity Entries display appears with the list of activity entries for the object you selected.
4. To view additional details for an entry, type a 5 (Display) next to the activity entry you want and press Enter. The Display DG Activity Details display appears.
5. Check the value listed in the Waiting for retry field. The value *YES is displayed when the activity entry is undergoing automatic delay/retry processing. Delayed or failed activity entries and pending activity entries that are not in a delay retry cycle will always have a value of *NO.
6. When the value of the Waiting for retry field is *YES, the Delay/Retry Processing Information fields are also available and provide the following information:
• The Retries attempted field identifies the number of times that MIMIX has attempted to process the activity entry.
• The Retries remaining field identifies the remaining number of times that MIMIX can automatically attempt to retry the activity entry. MIMIX uses only as many of the remaining retry attempts as necessary to achieve a successful attempt.
• The Delay interval (seconds) field identifies the number of seconds between the previous attempt and the next retry attempt.
• The Timestamp of next attempt field identifies the approximate date and time that MIMIX will make the next attempt to process the activity entry. If object replication processes are busy processing other entries, there may be a delay between this time and when processing of this entry is actually attempted. The value *PENDING indicates that the time of the next attempt has passed and processing for the entry is waiting while other entries are being processed. This field is displayed only on the system of the process that is in delay/retry.
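
To make the relationship between the Delay/Retry Processing Information fields concrete, the following sketch shows one way the values could relate to each other. It is an illustration under stated assumptions, not how MIMIX computes them: the function names, the idea of a fixed total number of retries, and the sample timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def next_attempt_field(previous_attempt, delay_seconds, now):
    """Illustrative value for the Timestamp of next attempt field.

    Assumes the next attempt is the previous attempt plus the delay interval,
    and shows *PENDING once that time has passed.
    """
    next_attempt = previous_attempt + timedelta(seconds=delay_seconds)
    return "*PENDING" if now > next_attempt else next_attempt


def retries_remaining(total_retries, retries_attempted):
    """Illustrative Retries remaining value (total_retries is a hypothetical setting)."""
    return total_retries - retries_attempted


# Sample values chosen only for the illustration.
previous = datetime(2014, 1, 15, 10, 0, 0)
waiting = next_attempt_field(previous, 300, datetime(2014, 1, 15, 10, 2, 0))
overdue = next_attempt_field(previous, 300, datetime(2014, 1, 15, 10, 6, 0))
remaining = retries_remaining(3, 1)
```

With a 300-second delay interval, a check made two minutes after the previous attempt shows a timestamp five minutes after that attempt, while a check made six minutes after it shows *PENDING.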

Removing data group activity history entries
MIMIX maintains history of successfully completed distribution requests to provide a record of all object, DLO, and IFS replication activity completed by system journal replication processes. While MIMIX efficiently uses disk space and removes completed requests according to the value specified in the Keep data group history parameter of the system definition, you may occasionally need to manually remove completed activity entries. One reason to manually remove completed entries may be to conserve disk space, while another may be to clean up entries for an object that has been removed from replication as a result of a configuration change.
Note: Your business policies and procedures may require that you archive completed activity entries to tape before you delete them.
To remove completed activity entries, do the following:
1. From the Work with Data Groups display, type 28 (Completed objects) next to the data group you want and press Enter. The Work with Data Group Activity display appears with a list of objects with completed entries.
2. Type a 4 (Remove) next to the entry you want and do one of the following:
• To remove all available completed entries for the selected object, press Enter. Then continue with Step 4.
• To change the selection criteria to include entries for additional objects or to limit the entries based on a time range, press F4 (Prompt). The Remove DG Activity Entries (RMVDGACTE) display appears.
3. To change the selection criteria, do the following as needed:
• To remove a subset of completed entries for the selected object based on the timestamp of the replicated journal entries, specify values for Starting date and time and Ending date and time prompts.
• To expand the set of objects for which completed entries will be removed, change the values of the following prompts as needed:
– For an expanded set of object types, use the Object type prompt.
– For a library based object, use the Object and Library prompts.
– For a DLO, use the Document and Folder prompts.
– For an IFS object, use the IFS object prompt.
– For a spooled file, use the Spooled file name, Output queue, and Library prompts.
4. A confirmation display appears. Press Enter.

Chapter 12 Starting, ending, and verifying journaling
This chapter describes procedures for starting and ending journaling. Journaling must be active on all files, IFS objects, data areas and data queues that you want to replicate through a user journal. Normally, journaling is started during configuration. However, there are times when you may need to start or end journaling on items identified to a data group.
The topics in this chapter include:
• “What objects need to be journaled” on page 231 describes, for supported configuration scenarios, what types of objects must have journaling started before replication can occur. It also describes when journaling is started implicitly, as well as the authority requirements necessary for user profiles that create the objects to be journaled when they are created.
• “MIMIX commands for starting journaling” on page 233 identifies the MIMIX commands available for starting journaling and describes the checking performed by the commands.
• “Journaling for physical files” on page 235 includes procedures for displaying journaling status, starting journaling, ending journaling, and verifying journaling for physical files identified by data group file entries.
• “Journaling for IFS objects” on page 238 includes procedures for displaying journaling status, starting journaling, ending journaling, and verifying journaling for IFS objects replicated cooperatively (advanced journaling). IFS tracking entries are used in these procedures.
• “Journaling for data areas and data queues” on page 241 includes procedures for displaying journaling status, starting journaling, ending journaling, and verifying journaling for data area and data queue objects replicated cooperatively (advanced journaling). Object tracking entries are used in these procedures.

What objects need to be journaled
A data group can be configured in a variety of ways that involve a user journal in the replication of files, data areas, data queues and IFS objects. Journaling must be started for any object to be replicated through a user journal or to be replicated by cooperative processing between a user journal and the system journal.
Requirements for system journal replication - System journal replication processes use a special journal, the security audit (QAUDJRN) journal. Events are logged in this journal to create a security audit trail. When data group object entries, IFS entries, and DLO entries are configured, each entry specifies an object auditing value that determines the type of activity on the objects to be logged in the journal. Object auditing is automatically set for all objects defined to a data group when the data group is first started, or any time a change is made to the object entries, IFS entries, or DLO entries for the data group. Because security auditing logs the object changes in the system journal, no special action is needed.
Requirements for user journal replication - User journal replication processes require that journaling be started for the objects identified by data group file entries. Both MIMIX Dynamic Apply and legacy cooperative processing use data group file entries and therefore require journaling to be started. Configurations that include advanced journaling for replication of data areas, data queues, or IFS objects also require that journaling be started on the associated object tracking entries and IFS tracking entries, respectively.
Starting journaling ensures that changes to the objects are recorded in the user journal, and are therefore available for MIMIX to replicate.
During initial configuration, the configuration checklists direct you when to start journaling for objects identified by data group file entries, IFS tracking entries, and object tracking entries. The MIMIX commands STRJRNFE, STRJRNIFSE, and STRJRNOBJE simplify the process of starting journaling. For more information about these commands, see “MIMIX commands for starting journaling” on page 233.
Although MIMIX commands for starting journaling are preferred, you can also use IBM commands (STRJRNPF, STRJRN, STRJRNOBJ) to start journaling if you have the appropriate authority for starting journaling.
Requirements for implicit starting of journaling - Journaling can be automatically started for newly created database files, data areas, data queues, or IFS objects when certain requirements are met. The user ID creating the new objects must have the required authority to start journaling and the following requirements must be met:
• IFS objects - A new IFS object is automatically journaled if the directory in which it is created is journaled as a result of a request that permitted journaling inheritance for new objects. Typically, if MIMIX started journaling on the parent directory, inheritance is permitted. If you manually start journaling on the parent directory using the IBM command STRJRN, specify INHERIT(*YES). This will allow IFS objects created within the journaled directory to inherit the journal options and journal state of the parent directory.
• Database files created by SQL statements - A new file created by a CREATE TABLE statement is automatically journaled if the library in which it is created contains a journal named QSQJRN.
• New *FILE, *DTAARA, *DTAQ objects - The default value (*DFT) for the Journal at creation (JRNATCRT) parameter in the data group definition enables MIMIX to support both release-specific techniques that the operating system uses to automatically start journaling for physical files, data areas, and data queues when they are created.
– On systems running IBM i 6.1 or higher releases, MIMIX uses the support provided by the IBM i command Start Journal Library (STRJRNLIB). Customers are advised not to re-create the QDFTJRN data area on systems running IBM i 6.1 or higher.
– On systems running IBM i 5.4, MIMIX uses the QDFTJRN data area for journal at creation. The operating system will automatically journal a new object if it is created in a library that contains a QDFTJRN data area and the data area has enabled automatic journaling for the object type.
When configuration requirements are met, MIMIX will either start library journaling or create the QDFTJRN data area for the appropriate libraries as well as enable automatic journaling for the configured cooperatively processed object types. When journal at creation configuration requirements are met, all new objects of that type are journaled, not just those which are eligible for replication.
When the data group is started, MIMIX evaluates all data group object entries for each object type. (Entries for *FILE objects are only evaluated when the data group specifies COOPJRN(*USRJRN).) Entries properly configured to allow cooperative processing of the object type determine whether MIMIX will enforce library journaling or create the QDFTJRN data area. MIMIX uses the data group entry with the most specific match to the object type and library that also specifies *ALL for its System 1 object (OBJ1) and Attribute (OBJATR).
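The selection rule in the preceding paragraph can be sketched in Python. This is an illustration only, not how MIMIX implements the rule. The ObjectEntry class and the helper names are hypothetical, and the ranking used for "most specific" (an exact name outranks a generic name ending in *, which outranks *ALL) is an assumption, because the text does not define how specificity is ranked. The sketch models only the selection step, not the cooperative processing checks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ObjectEntry:
    """Hypothetical stand-in for a data group object entry."""
    lib1: str      # System 1 library
    obj1: str      # System 1 object
    objtype: str   # Object type, for example "*FILE"
    objatr: str    # Attribute


def matches(pattern: str, name: str) -> bool:
    """True if an entry value covers the given name."""
    if pattern == "*ALL":
        return True
    if pattern.endswith("*"):              # generic name such as MY*
        return name.startswith(pattern[:-1])
    return pattern == name


def specificity(pattern: str) -> int:
    """ASSUMPTION: exact name (2) > generic name (1) > *ALL (0)."""
    if pattern == "*ALL":
        return 0
    return 1 if pattern.endswith("*") else 2


def select_entry(entries: List[ObjectEntry], library: str,
                 objtype: str) -> Optional[ObjectEntry]:
    """Pick the entry that decides journal-at-creation for a library."""
    candidates = [
        e for e in entries
        # Only entries with OBJ1(*ALL) and OBJATR(*ALL) are considered.
        if e.obj1 == "*ALL" and e.objatr == "*ALL"
        and matches(e.lib1, library) and matches(e.objtype, objtype)
    ]
    if not candidates:
        return None
    return max(candidates,
               key=lambda e: (specificity(e.lib1), specificity(e.objtype)))
```

With the two MYLIB entries used as the example in this topic, select_entry returns the OBJ1(*ALL) OBJATR(*ALL) entry and never considers the OBJ1(MYAPP) OBJATR(DSPF) entry.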
Note: MIMIX prevents library journaling from starting or the QDFTJRN data area from being created in the following libraries: QSYS*, QRECOVERY, QRCY*, QUSR*, QSPL*, QRPL*, QRCL*, QGPL, QTEMP and SYSIB*.
For example, if MIMIX finds only the following data group object entries for library MYLIB, it would use the first entry when determining whether to enforce library journaling or create the QDFTJRN data area because it is the most specific entry that also meets the OBJ1(*ALL) and OBJATR(*ALL) requirements. The second entry is not considered in the determination because its OBJ1 and OBJATR values do not meet these requirements.
LIB1(MYLIB) OBJ1(*ALL) OBJTYPE(*FILE) OBJATR(*ALL) COOPDB(*YES) PRCTYPE(*INCLD)
LIB1(MYLIB) OBJ1(MYAPP) OBJTYPE(*FILE) OBJATR(DSPF) COOPDB(*YES) PRCTYPE(*INCLD)

Authority requirements for starting journaling
Normal MIMIX processes run under the MIMIXOWN user profile, which ships with *ALLOBJ special authority. Therefore, it is not necessary for other users to account for journaling authority requirements when using MIMIX commands (STRJRNFE, STRJRNIFSE, STRJRNOBJE) to start journaling.
When the MIMIX journal managers are started, or when the Build Journaling Environment (BLDJRNENV) command is used, MIMIX checks the public authority (*PUBLIC) for the journal. If necessary, MIMIX changes public authority so the user ID in use has the appropriate authority to start journaling.
Authority requirements must be met to enable the automatic journaling of newly created objects and if you use IBM commands to start journaling instead of MIMIX commands.
• If you create database files, data areas, or data queues for which you expect automatic journaling at creation, the user ID creating these objects must have the required authority to start journaling.
• If you use the IBM commands (STRJRNPF, STRJRN, STRJRNOBJ) to start journaling, the user ID that performs the start journaling request must have the appropriate authority.
For journaling to be successfully started on an object, one of the following authority requirements must be satisfied:
• The user profile of the user attempting to start journaling for an object must have *ALLOBJ special authority.
• The user profile of the user attempting to start journaling for an object must have explicit *ALL object authority for the journal to which the object is to be journaled.
• Public authority (*PUBLIC) must have *OBJALTER, *OBJMGT, and *OBJOPR object authorities for the journal to which the object is to be journaled.

MIMIX commands for starting journaling
Before you use any of the MIMIX commands for starting journaling, the data group file entries, IFS tracking entries, or object tracking entries associated with the command’s object class must be loaded.
The MIMIX commands for starting journaling are:
• Start Journal Entry (STRJRNFE) - This command starts journaling for files identified by data group file entries.
• Start Journaling IFS Entries (STRJRNIFSE) - This command starts journaling of IFS objects configured for advanced journaling. Data group IFS entries must be configured and IFS tracking entries must be loaded (LODDGIFSTE command) before running the STRJRNIFSE command to start journaling.
• Start Journaling Obj Entries (STRJRNOBJE) - This command starts journaling of data area and data queue objects configured for advanced journaling. Data group object entries must be configured and object tracking entries must be loaded (LODDGOBJTE command) before running the STRJRNOBJE command to start journaling.
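The prerequisites for each command can be summarized as a small lookup, sketched here in Python as a planning aid. The command names come from the list above; the PREREQUISITES table, the wording of each step, and the missing_prerequisites function are hypothetical and are not part of MIMIX.

```python
from typing import Iterable, List

# Prerequisites stated in this topic for each MIMIX start-journaling command.
# The step descriptions are free text chosen for this sketch.
PREREQUISITES = {
    "STRJRNFE": ["data group file entries loaded"],
    "STRJRNIFSE": ["data group IFS entries configured",
                   "IFS tracking entries loaded (LODDGIFSTE)"],
    "STRJRNOBJE": ["data group object entries configured",
                   "object tracking entries loaded (LODDGOBJTE)"],
}


def missing_prerequisites(command: str, completed: Iterable[str]) -> List[str]:
    """Return the prerequisite steps for `command` that are not yet completed."""
    done = set(completed)
    return [step for step in PREREQUISITES[command] if step not in done]
```

An empty result means the listed prerequisites are complete and the command can be run; the function says nothing about authority or journal configuration.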
If you attempt to start journaling for a data group file entry, IFS tracking entry, or object tracking entry and the files or objects associated with the entry are already journaled, MIMIX checks that the physical file, IFS object, data area, or data queue is journaled to the journal associated with the data group. If the file or object is journaled to the correct journal, the journaling status of the data group file entry, IFS tracking entry, or object tracking entry is changed to *YES. If the file or object is not journaled to the correct journal or the attempt to start journaling fails, an error occurs and the journaling status is changed to *NO.

Journaling for physical files
Data group file entries identify physical files to be replicated. When data group file entries are added to a configuration, they may have an initial status of *ACTIVE. However, the physical files which they identify may not be journaled. In order for replication to occur, journaling must be started for the files on the source system.
This topic includes procedures to display journaling status, and to start, end, or verify journaling for physical files.

Displaying journaling status for physical files
Use this procedure to display journaling status for physical files identified by data group file entries. Do the following:
1. From the MIMIX Intermediate Main Menu, type 1 and press Enter to access the Work with Data Groups display.
2. On the Work with Data Groups display, type 17 (File entries) next to the data group you want and press Enter.
3. The Work with DG File Entries display appears. The initial view shows the current and requested status of the data group file entry. Press F10 (Journaled view).
At the right side of the display, the Journaled System 1 and System 2 columns indicate whether the physical file associated with the file entry is journaled on each system.
Note: Logical files will have a status of *NA.
Data group file entries exist for logical files only in data groups configured for MIMIX Dynamic Apply.

Starting journaling for physical files
Use this procedure to start journaling for physical files identified by data group file entries. In order for replication to occur, journaling must be started for the file on the source system.
This procedure invokes the Start Journal Entry (STRJRNFE) command. The command can also be entered from a command line.
Do the following:
1. Access the journaled view of the Work with DG File Entries display as described in “Displaying journaling status for physical files” on page 235.
2. From the Work with DG File Entries display, type a 9 (Start journaling) next to the file entries you want. Then do one of the following:
• To start journaling using the command defaults, press Enter.
• To modify command defaults, press F4 (Prompt) then continue with the next step.
3. The Start Journal Entry (STRJRNFE) display appears. The Data group definition prompts and the System 1 file prompts identify your selection. Accept these values or specify the values you want.
4. Specify the value you want for the Start journaling on system prompt. Press F4 to see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and starts or prevents journaling from starting as required.
5. If you want to use batch processing, specify *YES for the Submit to batch prompt.
6. To start journaling for the physical file associated with the selected data group, press Enter.
The system returns a message to confirm the operation was successful.

Ending journaling for physical files
Use this procedure to end journaling for a physical file associated with a data group file entry. Once journaling for a file is ended, any changes to that file are not captured and are not replicated.
You may need to end journaling if a file no longer needs to be replicated, to prepare for upgrading MIMIX software, or to correct an error.
This procedure invokes the End Journaling File Entry (ENDJRNFE) command. The command can also be entered from a command line.
To end journaling, do the following:
1. Access the journaled view of the Work with DG File Entries display as described in “Displaying journaling status for physical files” on page 235.
2. From the Work with DG File Entries display, type a 10 (End journaling) next to the file entry you want and do one of the following:
• To end journaling using command defaults, press Enter. Journaling is ended.
• To modify additional prompts for the command, press F4 (Prompt) and continue with the next step.
Note: MIMIX cannot end journaling on a file that is journaled to the wrong journal, for example, a file whose journal does not match the journal definition for that data group. If you want to end journaling outside of MIMIX, use the ENDJRNPF command.
3. The End Journal File Entry (ENDJRNFE) display appears. If you want to end journaling for all files in the library, specify *ALL at the System 1 file prompt.
4. Specify the value you want for the End journaling on system prompt. Press F4 to see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and ends or prevents journaling from ending as required.
5. If you want to use batch processing, specify *YES for the Submit to batch prompt.
6. To end journaling, press Enter.

Verifying journaling for physical files
Use this procedure to verify if a physical file defined by a data group file entry is journaled correctly. This procedure invokes the Verify Journaling File Entry (VFYJRNFE) command to determine whether the file is journaled and whether it is journaled to the journal defined in the journal definition.
When these conditions are met, the journal status on the Work with DG File Entries display is set to *YES.
The command can also be entered from a command line.
To verify journaling for a physical file, do the following:
1. Access the journaled view of the Work with DG File Entries display as described in “Displaying journaling status for physical files” on page 235.
2. From the Work with DG File Entries display, type an 11 (Verify journaling) next to the file entry you want and do one of the following:
• To verify journaling using command defaults, press Enter.
• To modify additional prompts for the command, press F4 (Prompt) and continue with the next step.
3. The Verify Journaling File Entry (VFYJRNFE) display appears. The Data group definition prompts and the System 1 file prompts identify your selection. Accept these values or specify the values you want.
4. Specify the value you want for the Verify journaling on system prompt. When *DGDFN is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) when determining where to verify journaling.
5. If you want to use batch processing, specify *YES for the Submit to batch prompt.
6. Press Enter.

Journaling for IFS objects
IFS tracking entries are loaded for a data group after the data group IFS entries have been configured for replication through the user journal (advanced journaling). However, loading IFS tracking entries does not automatically start journaling on the IFS objects they identify. In order for replication to occur, journaling must be started on the source system for the IFS objects identified by IFS tracking entries.
This topic includes procedures to display journaling status, and to start, end, or verify journaling for IFS objects identified for replication through the user journal.
You should be aware of the information in “Considerations for working with long IFS path names” on page 262.

Displaying journaling status for IFS objects
Use this procedure to display journaling status for IFS objects identified by IFS tracking entries. Do the following:
1. From the MIMIX Intermediate Main Menu, type 1 and press Enter to access the Work with Data Groups display.
2. On the Work with Data Groups display, type 50 (IFS trk entries) next to the data group you want and press Enter.
3. The Work with DG IFS Trk. Entries display appears. The initial view shows the object type and status at the right of the display. Press F10 (Journaled view).
At the right side of the display, the Journaled System 1 and System 2 columns indicate whether the IFS object identified by the tracking entry is journaled on each system.

Starting journaling for IFS objects
Use this procedure to start journaling for IFS objects identified by IFS tracking entries.
This procedure invokes the Start Journaling IFS Entries (STRJRNIFSE) command. The command can also be entered from a command line.
To start journaling for IFS objects, do the following:
1. If you have not already done so, load the IFS tracking entries for the data group. For more information see the MIMIX Administrator Reference book.
2. Access the journaled view of the Work with DG IFS Trk. Entries display as described in “Displaying journaling status for IFS objects” on page 238.
3. From the Work with DG IFS Trk. Entries display, type a 9 (Start journaling) next to the IFS tracking entries you want. Then do one of the following:
• To start journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next step.
4. The Start Journaling IFS Entries (STRJRNIFSE) display appears. The Data group definition and IFS objects prompts identify the IFS object associated with the tracking entry you selected.
You cannot change the values shown for the IFS objects prompts.¹
5. Specify the value you want for the Start journaling on system prompt. Press F4 to see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and starts or prevents journaling from starting as required.
6. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.
7. The System 1 file identifier and System 2 file identifier prompts identify the file identifier (FID) of the IFS object on each system. You cannot change the values.²
8. To start journaling on the IFS objects specified, press Enter.

Ending journaling for IFS objects
Use this procedure to end journaling for IFS objects identified by IFS tracking entries.
This procedure invokes the End Journaling IFS Entries (ENDJRNIFSE) command. The command can also be entered from a command line.
To end journaling for IFS objects, do the following:
1. Access the journaled view of the Work with DG IFS Trk. Entries display as described in “Displaying journaling status for IFS objects” on page 238.
2. From the Work with DG IFS Trk. Entries display, type a 10 (End journaling) next to the IFS tracking entries you want. Then do one of the following:
• To end journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next step.
3. The End Journaling IFS Entries (ENDJRNIFSE) display appears. The Data group definition and IFS objects prompts identify the IFS object associated with the tracking entry you selected. You cannot change the values shown for the IFS objects prompts.¹
4. Specify the value you want for the End journaling on system prompt. Press F4 to see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and ends or prevents journaling from ending as required.
5. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.
6. The System 1 file identifier and System 2 file identifier identify the file identifier (FID) of the IFS object on each system. You cannot change the values shown.²
7. To end journaling on the IFS objects specified, press Enter.
¹ When the command is invoked from a command line, you can change values specified for the IFS objects prompts. Also, you can specify as many as 300 object selectors by using the + for more values prompt.
² When the command is invoked from a command line, use F10 to see the FID prompts. Then you can optionally specify the unique FID for the IFS object on either system. The FID values can be used alone or in combination with the IFS object path name.

Verifying journaling for IFS objects
Use this procedure to verify if an IFS object identified by an IFS tracking entry is journaled correctly. This procedure invokes the Verify Journaling IFS Entries (VFYJRNIFSE) command to determine whether the IFS object is journaled, whether it is journaled to the journal defined in the data group definition, and whether it is journaled with the attributes defined in the data group definition. The command can also be entered from a command line.
To verify journaling for IFS objects, do the following:
1. Access the journaled view of the Work with DG IFS Trk. Entries display as described in “Displaying journaling status for IFS objects” on page 238.
2. From the Work with DG IFS Trk. Entries display, type an 11 (Verify journaling) next to the IFS tracking entries you want.
Then do one of the following:
• To verify journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next step.
3. The Verify Journaling IFS Entries (VFYJRNIFSE) display appears. The Data group definition and IFS objects prompts identify the IFS object associated with the tracking entry you selected. You cannot change the values shown for the IFS objects prompts.¹
4. Specify the value you want for the Verify journaling on system prompt. Press F4 to see a list of valid values.
When *DGDFN is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and verifies journaling on the appropriate systems as required.
5. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.
6. The System 1 file identifier and System 2 file identifier identify the file identifier (FID) of the IFS object on each system. You cannot change the values shown.²
7. To verify journaling on the IFS objects specified, press Enter.
See “Using file identifiers (FIDs) for IFS objects” on page 273.

Journaling for data areas and data queues
Object tracking entries are loaded for a data group after the data group object entries have been configured for replication through the user journal (advanced journaling). However, loading object tracking entries does not automatically start journaling on the objects they identify. In order for replication to occur, journaling must be started on the source system for the objects identified by object tracking entries.
This topic includes procedures to display journaling status, and to start, end, or verify journaling for data areas and data queues identified for replication through the user journal.

Displaying journaling status for data areas and data queues
Use this procedure to display journaling status for data areas and data queues identified by object tracking entries. Do the following:
1. From the MIMIX Intermediate Main Menu, type 1 and press Enter to access the Work with Data Groups display.
2. On the Work with Data Groups display, type 52 (Obj trk entries) next to the data group you want and press Enter.
3. The Work with DG Obj. Trk. Entries display appears. The initial view shows the object type and status at the right of the display. Press F10 (Journaled view).
At the right side of the display, the Journaled System 1 and System 2 columns indicate whether the object identified by the tracking entry is journaled on each system.

Starting journaling for data areas and data queues
Use this procedure to start journaling for data areas and data queues identified by object tracking entries.
This procedure invokes the Start Journaling Obj Entries (STRJRNOBJE) command. The command can also be entered from a command line.
To start journaling for data areas and data queues, do the following:
1. If you have not already done so, load the object tracking entries for the data group. For more information see the MIMIX Administrator Reference book.
2. Access the journaled view of the Work with DG Obj. Trk. Entries display as described in “Displaying journaling status for data areas and data queues” on page 241.
3. From the Work with DG Obj. Trk. Entries display, type a 9 (Start journaling) next to the object tracking entries you want. Then do one of the following:
• To start journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next step.
4. The Start Journaling Obj Entries (STRJRNOBJE) display appears. The Data group definition and Objects prompts identify the object associated with the tracking entry you selected.
Although you can change the values shown for these prompts, it is not recommended unless the command was invoked from a command line.
5. Specify the value you want for the Start journaling on system prompt. Press F4 to see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and starts or prevents journaling from starting as required.
6. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.
7. To start journaling on the objects specified, press Enter.

Ending journaling for data areas and data queues
Use this procedure to end journaling for data areas and data queues identified by object tracking entries.
This procedure invokes the End Journaling Obj Entries (ENDJRNOBJE) command. The command can also be entered from a command line.
To end journaling for data areas and data queues, do the following:
1. Access the journaled view of the Work with DG Obj. Trk. Entries display as described in “Displaying journaling status for data areas and data queues” on page 241.
2. From the Work with DG Obj. Trk. Entries display, type a 10 (End journaling) next to the object tracking entries you want. Then do one of the following:
• To end journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next step.
3. The End Journaling Obj Entries (ENDJRNOBJE) display appears. The Data group definition and Objects prompts identify the object associated with the tracking entry you selected. Although you can change the values shown for these prompts, it is not recommended unless the command was invoked from a command line.
4. Specify the value you want for the End journaling on system prompt. Press F4 to see a list of valid values.
When *DGDFN, *SRC, or *TGT is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and ends or prevents journaling from ending as required.
5. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.
6. To end journaling on the objects specified, press Enter.

Verifying journaling for data areas and data queues
Use this procedure to verify whether an object identified by an object tracking entry is journaled correctly. This procedure invokes the Verify Journaling Obj Entries (VFYJRNOBJE) command to determine whether the object is journaled, whether it is journaled to the journal defined in the data group definition, and whether it is journaled with the attributes defined in the data group definition. The command can also be entered from a command line.
To verify journaling for objects, do the following:
1. Access the journaled view of the Work with DG Obj. Trk. Entries display as described in “Displaying journaling status for data areas and data queues” on page 241.
2. From the Work with DG Obj. Trk. Entries display, type 11 (Verify journaling) next to the object tracking entries you want. Then do one of the following:
• To verify journaling using the command defaults, press Enter.
• To modify the command defaults, press F4 (Prompt) and continue with the next step.
3. The Verify Journaling Obj Entries (VFYJRNOBJE) display appears. The Data group definition and Objects prompts identify the object associated with the tracking entry you selected. Although you can change the values shown for these prompts, it is not recommended unless the command was invoked from a command line.
4. Specify the value you want for the Verify journaling on system prompt. Press F4 to see a list of valid values.
When *DGDFN is specified, MIMIX considers whether the data group is configured for journaling on the target system (JRNTGT) and verifies journaling on the appropriate systems as required.
5. To use batch processing, specify *YES for the Submit to batch prompt and press Enter. Additional prompts for Job description and Job name appear. Either accept the default values or specify other values.
6. To verify journaling on the objects specified, press Enter.

CHAPTER 13 Switching
Switching temporarily reverses the roles of the systems. The original source system (production) becomes the temporary target system and the original target system (backup) becomes the temporary source system. When the scenario that required you to switch directions is resolved, you typically switch again to return the systems to their original roles.
This chapter provides information and procedures to support switching. The following topics are included:
• “About switching” on page 244 provides information about switching with MIMIX, including best practice and reasons why a switch should be performed. Subtopics describe:
– What is a planned switch and requirements for a planned switch
– What is an unplanned switch and actions to be completed after the failed source system is recovered
– The role of procedures for switching environments that use application groups
– The role of MIMIX Model Switch Framework for switching environments that do not use application groups
• “Switching an application group” on page 250 describes how to run a procedure to switch an application group.
• “Switching a data group-only environment” on page 251 describes how to switch from a 5250 emulator.
• “Determining when the last switch was performed” on page 253 describes how to check the Last switch field, which indicates the switch compliance status and provides the date when the last switch was performed.
• “Problems checking switch compliance” on page 254 describes problems that can occur with data for the Last switch field.
• “Performing a data group switch” on page 255 describes how to switch a single data group using the SWTDG command.
• “Switch Data Group (SWTDG) command” on page 257 provides background information about the SWTDG command, which is used in all switch interfaces.

About switching
Replication environments rarely remain static. Therefore, best practice is to perform regular switches to ensure that you are prepared should you need to perform one during an emergency.
MIMIX supports two methods for switching the direction in which replication occurs for a data group. These methods are known as a planned switch and an unplanned switch.
You may need to perform a switch for any of the following reasons:
• The production system becomes unavailable due to an unplanned outage. A switch in this scenario is unplanned.
• You need to perform hardware or software maintenance on the production system. Typically, you can schedule this in advance so the switch is planned.
• You need to test your recovery plan. This activity is also a planned switch.
Historically, the concept of switching consists of three phases: switch to the backup system, synchronize the systems when the production system is ready to use, and switch back to the production system. This round-trip view of switching assumes your goal is to return to your original production system as quickly as possible. However, this view overlooks the fact that some customers may have an extended time pass between phase one and the other phases, or may even view a switch as a one-way trip. MIMIX supports both conceptual views of switching.
Switching data groups is only a part of performing a switch. MIMIX provides robust support for customizing switching activity to include all the needs of your environment.
Best practice for switching includes performing regular switches.
Best practice also includes performing all audits with the audit level set at level 30 immediately prior to a planned switch to the backup system and before switching back to the production system.
Best practice for performing a switch in an environment that uses application groups is to use option 4 (Switch all application groups) from the MIMIX Basic Main Menu.
Best practice for performing a switch in an environment using only data groups is to use option 5 (Start or complete switch using Switch Asst.) from the MIMIX Basic Main Menu.

Planned switch
You can start a planned switch from either system. In a planned switch, MIMIX initiates a controlled shutdown of the data group. Both systems and the communications between them must be active.
Before you start a planned switch of a data group, you should ensure that the following actions have been completed. Your enterprise may have additional requirements.
• Perform a full set of audits with the audit level policy set to level 30. Running the #FILDTA audit at this audit level checks 100 percent of file member data for the data group for synchronization between source and target systems and is strongly recommended.
• Shut down any applications that use database files or objects defined to the data group. If any users or other non-MIMIX processes remain active while the switch is being performed, the data can become unsynchronized between systems and orphaned data may result.
• Ensure that there are no jobs other than MIMIX currently active on the source system. This may require ending all interactive and batch subsystems other than MIMIX and ending communications.
• Users should be prevented from accessing either system until after the switch is complete and the data group is restarted.
• If you use user journal replication processes, you should address any files, IFS tracking entries, or object tracking entries in error for your critical database files.
If you use system journal replication processes, you should address any object errors. You are not required to run journal analysis after a planned switch. MIMIX retains information about where activity ended so that when you restart the data group, it is started at the correct point. When the data group is started, the temporary target system (the production system) is now being updated with user changes that are being replicated from the temporary source system (the backup system). Do not allow users onto the production system until after the production system is caught up with these transactions and you run the switch process again to revert to the normal roles. Unplanned switch In an unplanned switch, the source system is assumed to be unavailable. An unplanned switch is generally required when the source system fails and, in order to continue normal operations, you must switch users to a backup system. (Typically MIMIX is configured so that the target for replication is your backup system.) You must run an unplanned switch from the target system. MIMIX performs a controlled shutdown of replication processes on the target system. The controlled shutdown allows all apply processing to catch up before the apply processes are ended. There are default (*DFT) values for several parameters on the SWTDG command that allow the switch operation to continue without intervention from the user. See “Planned switch” on page 245 for additional details about these default values. In an unplanned switch of a data group that uses remote journaling, the default behavior is to end the RJ link. Once the failed source system is recovered, the following actions should be completed: • You should perform journal analysis on that system before restarting the data group or user applications. Journal analysis helps identify any possible loss of data that may have occurred when the source system failed. 
Journal analysis relies on status information on the source system about the last entry that was applied. This information will be cleared when the data group is restarted.
• Communication between the systems must be active before you restart the data group. The switch process is complete when you restart the data group. When the data group is restarted, MIMIX notifies the source system that it is now the temporary target system.
• New transactions are created on the temporary source system (the backup system) while the production system (the temporary target system) is unavailable for replication. After you have completed journal analysis, you can send these new transactions to the production system to synchronize the databases. Once the databases are synchronized, you must run the switch process again to revert to the normal roles before allowing users onto the production system.
When the data group is started after a switch, any pending transactions are cleared. The journal receiver is already changed by the switch process and the new journal receiver and first sequence number are used.

Switching application group environments with procedures
Application groups can only be switched using procedures. Procedures and steps are a highly customizable means of performing operations for application groups. Each application group has a set of default procedures that include procedures for performing pre-check activity for switching and switching. Each operation is performed by a procedure that consists of a sequence of steps and multiple jobs. Each step calls a predetermined step program to perform a specific sub-task of the larger operation.
The following paragraphs describe the behavior of the switch (SWTAG) command for application groups that do not participate in a cluster controlled by the IBM i operating system (*NONCLU application groups).

What is the scope of the request?
The following parameters identify the scope of the requested operation: Application group definition (AGDFN) - Specifies the requested application group. You can either specify a name or the value *ALL. Resource groups (TYPE) - Specifies the types of resource groups to be processed for the requested application group. Data resource group entry (DTARSCGRP) - Specifies the data resource groups to include in the request. The default is *ALL or you can specify a name. This parameter is ignored when TYPE is *ALL or *APP. What is the requested switch behavior? The following parameters on the SWTAG command define the expected behavior: Switch type (SWTTYP) - This specifies the reason the application group is being switched. The procedure called to perform the switch and the actions performed during the switch differ based on whether the current primary node (data source) is available at the start of the switch procedure. The default value, *PLANNED, indicates that the primary node is still available and the switch is being performed for normal business processes (such as to perform maintenance on the current source system or as part of a standard switch procedure). The value *UNPLANNED indicates that the switch is an unplanned activity and the data source system may not be available. Node roles (ROLE) - This specifies which set of node roles will determine the node that becomes the new primary node as a result of the switch. The default value *CURRENT uses the current order of node roles. If the application group participates in a cluster, the current roles defined within the CRGs will be used. If *CONFIG is specified, the configured primary node will become the new primary node and the new role of other nodes in the recovery domain will be determined from their current roles. 
If you specify the name of a node within the recovery domain for the application group, the node will be made the new primary node and the new role of other nodes in the recovery domain will be determined from their current roles.

What procedure will be used?
The following parameters identify the procedure to use and its starting point:
Begin at step (STEP) - Specifies where the request will start within the specified procedure. This parameter is described in detail below.
Procedure (PROC) - Specifies the name of the procedure to run to perform the requested operation when starting from its first step. The value *DFT will use the procedure designated as the default for the application group. The value *LASTRUN uses the same procedure used for the previous run of the command. You can also specify the name of a procedure that is valid for the specified application group and type of request.

Where should the procedure begin?
The value specified for the Begin at step (STEP) parameter on the request to run the procedure determines the step at which the procedure will start. The status of the last run of the procedure determines which values are valid.
The default value, *FIRST, will start the specified procedure at its first step. This value can be used when the procedure has never been run, when its previous run completed (*COMPLETED or *COMPERR), or when a user acknowledged the status of its previous run which failed, was canceled, or completed with errors (*ACKFAILED, *ACKCANCEL, or *ACKERR respectively).
Other values are for resolving problems with a failed or canceled procedure. When a procedure fails or is canceled, subsequent attempts to run the same procedure will fail until user action is taken. You will need to determine the best course of action for your environment based on the implications of the canceled or failed steps and any steps which completed.
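As an illustration of the rule for *FIRST described above, the following Python sketch checks whether a procedure can be started at its first step, given the status of its last run. It is not MIMIX code: the function name and the use of None to represent a procedure that has never been run are assumptions made for this example.

```python
from typing import Optional

# Statuses of the previous run for which STEP(*FIRST) is valid,
# as listed in the text above.
FIRST_ALLOWED = {"*COMPLETED", "*COMPERR", "*ACKFAILED", "*ACKCANCEL", "*ACKERR"}


def can_begin_at_first(last_run_status: Optional[str]) -> bool:
    """Return True if *FIRST is a valid Begin at step value.

    last_run_status is None when the procedure has never been run
    (an assumption of this sketch, not a MIMIX convention).
    """
    return last_run_status is None or last_run_status in FIRST_ALLOWED


# Print the result for a few representative statuses.
for status in (None, "*COMPLETED", "*FAILED", "*ACKFAILED"):
    print(status, can_begin_at_first(status))
```

A run that is still in *FAILED or *CANCELED status is rejected by this check, which mirrors the statement that subsequent attempts fail until user action is taken.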
The value *RESUME will start the last run of the procedure beginning with the step at which it failed, the step that was canceled in response to an error, or the step following where the procedure was canceled. The value *RESUME may be appropriate after you have investigated and resolved the problem which caused the procedure to end. Optionally, if the problem cannot be resolved and you want to resume the procedure anyway, you can override the attributes of a step before resuming the procedure.
The value *OVERRIDE will override the status of all runs of the specified procedure that did not complete. The *FAILED or *CANCELED status of these procedures is changed to acknowledged (*ACKFAILED or *ACKCANCEL) and a new run of the procedure begins at the first step.
For more information about starting a procedure with the step at which it failed, see “Resuming a procedure” on page 91.
For more information about customizing procedures, see the MIMIX Administrator Reference book.

Switching data group environments with MIMIX Model Switch Framework
Note: MIMIX Model Switch Framework does not support switching application groups. Only data groups that are not associated with application groups should be switched with MIMIX Model Switch Framework.
MIMIX provides a customized implementation of MIMIX Model Switch Framework to perform a switch. MIMIX Model Switch Framework is ideally suited for customizing a switching solution that detects the need for an unplanned switch, switches the direction of data group replication, and switches users to the backup system. Typically, if you have a runbook, it will direct you when to use your MIMIX Model Switch Framework implementation for both planned and unplanned switches.
The MIMIX Model Switch Framework calls the Switch Data Group (SWTDG) command.
The SWTDG command only switches the direction in which replication occurs for a single data group; it does not switch users or any other facets of your normal operating environment to the backup system. However, MIMIX Model Switch Framework can be configured to address these additional facets of your environment for multiple data groups. If you choose to use the SWTDG command either by invoking it from a command line or by using the options for switching on the Work with Data Groups display, you must take action to switch users to the backup system and address other requirements for operating there.
The switching options on the MIMIX Basic Main Menu are implementations of MIMIX Model Switch Framework. The implementation is identified within policies. Instructions for switching using MIMIX Model Switch Framework are described in “Switching a data group-only environment” on page 251.
For additional information, see the chapter “Using the MIMIX Model Switch Framework” in the Using MIMIX Monitor book.

Switching an application group
For an application group, a procedure for only one operation (start, end, or switch) can run at a time. For details about parameters and behavior of the SWTAG command, see “Switching application group environments with procedures” on page 247.
To switch an application group, do the following:
1. From the Work with Application Groups display, type 15 (Switch) next to the application group you want and press Enter. The Switch Application Group (SWTAG) display appears.
2. Verify that the values you want are specified for Resource groups and Data resource group entry.
3. Specify the type of switch to perform at the Switch type prompt.
4. Verify that the default value *CURRENT for the Node roles prompt is valid for the switch you need to perform. If necessary, specify a different value.
5.
If you are starting the procedure after addressing problems with the previous switch request, specify the value you want for Begin at step. Be certain that you understand the effect the value you specify will have on your environment.
6. Press Enter.
7. The Procedure prompt appears. Do one of the following:
• To use the default switch procedure for the specified switch type, press Enter.
• To use a different switch procedure for the application group, specify its name. Then press Enter.
8. A switch confirmation panel appears. To perform the switch, press F16.

Switching a data group-only environment
In environments that do not use application groups, option 5 (Start or complete switch using Switch Asst.) on the MIMIX Basic Main Menu is designed to simplify switching by using a default MIMIX Model Switch Framework implementation. When you use this option, MIMIX keeps track of which phase of the switch process you are in. You will see a confirmation display that is appropriate for each phase. Each phase will prompt the Run Switch Framework command (RUNSWTFWK) with your default switch framework and appropriate values for the phase.
To change the default switch framework to a different implementation, see “Policies for switching with model switch framework” on page 48.

Switching to the backup system
This procedure switches operations to the backup system. Before using this procedure, consult your runbook for any additional procedures that must be performed when switching to the backup system.
1. If this is a planned switch, Vision Solutions strongly recommends that you perform a full set of audits with the audit level policy set to level 30. Running the #FILDTA audit at this audit level checks 100 percent of file member data for the data group for synchronization between source and target systems.
2. Shut down all active applications that are reading or updating replicated objects from the production and backup systems.
Do the following from the backup system:
3. Ensure that all transactions have been applied to the backup system by doing the following:
a. Select option 6 (Work with data groups) from the MIMIX Basic Main Menu and press Enter.
b. For each data group, select option 8 (Display status) and ensure that the Unprocessed entry counts for both database and object apply have no values.
4. From the MIMIX Basic Main Menu, select option 5 (Start or complete switch using Switch Asst.).
5. You will see the Confirm Switch to Backup confirmation display. Press F16 to confirm your choice to switch MIMIX and specify switching options.
6. The Run Switch Framework (RUNSWTFWK) command appears. The default Switch framework and the value *BCKUP for the Switch framework process are preselected and cannot be changed. Do the following:
a. You must specify the type of switch to perform, *PLANNED or *UNPLANNED, at the Switch type prompt.
b. You can change values for other parameters as needed.
c. To start the switch, press Enter.
7. Consult your runbook to determine if any additional steps are needed.
After you complete this phase of the switch, you must wait until the original production system is available again. Then perform the steps in “Synchronizing data and starting MIMIX on the original production system” on page 252.

Synchronizing data and starting MIMIX on the original production system
This procedure synchronizes data and starts replication from the backup system to the original production system. Synchronizing the data ensures that the data on both systems is equivalent before replication is started. Before using this procedure, consult your runbook for any additional procedures that must be performed when synchronizing and starting replication from the backup system to the original production system.
Do the following from the backup system:
1. Ensure the original production system is available again.
2.
From the MIMIX Basic Main Menu, select option 5 (Start or complete switch using Switch Asst.).
3. You will see the Confirm Synchronize and Start confirmation display. Press F16 to confirm your choice and specify switching options.
4. The Run Switch Framework (RUNSWTFWK) command appears. The default Switch framework and the value *SYNC for the Switch framework process are preselected and cannot be changed. Do the following:
a. Optionally, you can change the value of the Set object auditing level prompt.
b. To synchronize and start, press Enter.
5. Once replication has caught up, Vision Solutions strongly recommends that you perform a full set of audits with the audit level policy set to level 30. Running the #FILDTA audit at this audit level checks 100 percent of file member data for the data group for synchronization between source and target systems.
6. Consult your runbook to determine if any additional steps are needed.
When you are ready to switch back to the original production system, use “Switching to the production system” on page 252.

Switching to the production system
This procedure returns operations to the original production system. Before using this procedure, consult your runbook for any additional procedures that must be performed when switching to the production system.
1. Shut down all active applications that are reading or updating replicated objects from the production and backup systems.
Do the following from the original production system:
2. Ensure that all transactions have been applied by doing the following:
a. Select option 6 (Work with data groups) from the MIMIX Basic Main Menu and press Enter.
b. For each data group, select option 8 (Display status) and ensure that the Unprocessed entry counts for both database and object apply have no values.
3. From the MIMIX Basic Main Menu, select option 5 (Start or complete switch using Switch Asst.).
4.
You will see the Confirm Switch to Production confirmation display. Press F16 to confirm your choice to switch MIMIX and specify switching options. 5. The Run Switch Framework (RUNSWTFWK) command appears. The default Switch framework and the value *PROD for the Switch framework process are preselected and cannot be changed. Do the following: a. You can change values for other parameters as needed. b. To start the switch, press Enter. 6. Consult your runbook to determine if any additional steps are needed. Determining when the last switch was performed Replication environments rarely remain static. Therefore, best practice is to perform regular switches to ensure that you are prepared should you need to perform one during an emergency. The Last switch field indicates compliance with best practices. The status of the field is highlighted to indicate the following: Yellow - The number of days since the last switch is at the limit of what is considered to be best practice. This threshold is determined by the Switch warning threshold policy. Red - The number of days since the last switch is beyond what is considered to be best practice. This threshold is determined by the Switch action threshold policy. Checking the last switch date A 5250 emulator session provides information on the last switch date for an installation from the Last switch field on the MIMIX Availability Status display. This field is only displayed when a value is specified for the Default model switch framework policy. The date indicates when the last completed switch was performed using the switch framework specified in the policy. To check the last switch date from a 5250 emulator, do the following: 1. Access the MIMIX Basic Main Menu. See “Accessing the MIMIX Main Menu” on page 24. 2. From the MIMIX Basic Main Menu, select option 10 (Availability status) and press Enter. The MIMIX Availability Status display appears. The last switch date is located in the upper right corner of the display. 
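The highlighting rules for the Last switch field can be summarized in a short sketch. The following Python is illustrative only and is not part of MIMIX: the function name, the example thresholds of 90 and 180 days, and the exact boundary comparisons are assumptions. In a real installation the thresholds come from the Switch warning threshold and Switch action threshold policies.

```python
from datetime import date


def last_switch_status(last_switch: date, today: date,
                       warning_days: int, action_days: int) -> str:
    """Classify switch compliance from the days since the last switch.

    Hypothetical helper: the boundary handling (>= for the warning
    threshold, > for the action threshold) is an assumption.
    """
    days_since = (today - last_switch).days
    if days_since > action_days:
        return "red"     # beyond best practice (Switch action threshold)
    if days_since >= warning_days:
        return "yellow"  # at the limit of best practice (Switch warning threshold)
    return "normal"      # within best practice, no highlighting


# Example call with assumed thresholds of 90 and 180 days.
print(last_switch_status(date(2014, 1, 1), date(2014, 2, 1), 90, 180))
```

A scheduled job could use a check of this kind to remind operators to plan a switch before the field turns red.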
Problems checking switch compliance
The Last switch field indicates the switch compliance status and provides the date when the last switch was performed. This field is displayed correctly when certain requirements have been met. The following problems can occur:
• Approaching or out of compliance - The status of the field is highlighted to indicate that the number of days since the last switch is at or beyond the limit of what is considered to be best practice. Schedule and perform a switch to resolve this problem.
• No Last switch field - This field is only displayed when there is a value specified for the Default model switch framework policy. The date indicates when the last completed switch was performed using the switch framework specified in the policy. Specify the name of the model switch framework you use for switching in policies. See “Policies for switching with model switch framework” on page 48.

Performing a data group switch
Performing a data group switch changes the direction of replication for a data group through the Switch Data Group (SWTDG) command. Only replication for the selected data group is switched.
You may want to perform a data group switch if you are having problems with an application that only affects a specific data group or if you need to manually load balance because of heavily used applications.
Note: You cannot switch a disabled data group. For more information, see “Disabling and enabling data groups” on page 269.
To perform a data group switch, do the following:
1. If you will be performing a planned switch, do the following:
a. Shut down any applications that have database files or objects defined to the data group.
b. Ensure that you have addressed any critical database files that are held due to error or held for other reasons.
c.
Ensure there are no pending object activity entries by entering:
WRKDGACTE STATUS(*ACTIVE)
2. From the Work with Data Groups display, type the option for the type of switch you want next to the data group you want to switch and press Enter.
• Use option 15 for a planned switch
• Use option 16 for an unplanned switch
3. Some of the parameter values that you may want to consider when the Switch Data Group display appears are:
• If you specified a Switch type of *PLANNED and have specified a number for the Wait time (seconds) parameter, you can specify a value for the Timeout option parameter to specify what action you want the SWTDG command to perform if the time specified in the Wait time (seconds) parameter is exceeded. When you are performing a planned switch, you may want to specify the number of seconds to wait before all the active data group processes end. If you specify *NOMAX, the switch process will wait until all data group processes are ended. This could delay the switch process.
• You can use the Conditions that end switch parameter to specify the types of errors that you want to end the switch operation. To ensure that the most comprehensive checking options are used, choose *ALL. For a planned switch, the default value, *DFT, is the same as *ALL. For an unplanned switch, *DFT will prevent the switch only when database apply backlogs exist.
• Verify that the value for the Start journaling on new source prompt is what you want. If necessary, change the value.
4. After the confirmation screen, press F16 to continue.
5. Press Enter. Messages appear indicating the status of the switch request. When you see a message indicating that the switch is complete, users can begin processing as usual on the temporary source system.
6. If you performed an unplanned switch, perform journal analysis on the original source system as soon as it is available, to determine if any transactions were missed.
See “Performing journal analysis” on page 295.
7. Start the data group, clearing pending entries, using the procedure in “Starting selected data group processes” on page 181. This starts replication in the new temporary direction.

Switch Data Group (SWTDG) command
The Switch Data Group (SWTDG) command provides the following parameters to control how you want your switch operation handled:
• The Wait time (seconds) parameter (WAIT) is used to specify the number of seconds to wait for all of the active data group processes to end. The function of the default value *DFT is different for planned switches than it is for unplanned switches. For a planned switch, the value *DFT is equivalent to the value *NOMAX. For an unplanned switch, the value *DFT is set to wait 300 seconds (5 minutes) for all of the active data group processes to end.
• If you specify a value for the WAIT parameter, you can use the Timeout option parameter (TIMOUTOPT) to specify what action to take when the wait time you specified is reached. The function of the default value *DFT is different for planned switches than it is for unplanned switches. For a planned switch, the value *DFT is equivalent to the value *QUIT. When the value specified for the WAIT parameter is reached, the current process quits and returns control to the caller. For an unplanned switch, the value *DFT is equivalent to the value *NOTIFY. When the value specified for the WAIT parameter is reached, an inquiry message is sent to notify the operator of a possible error condition.
• The Conditions that end switch (ENDSWT) parameter is used to specify which conditions should end the switch process. The function of the default value *DFT is different for planned switches than it is for unplanned switches.
– For a planned switch, the value *DFT is equivalent to the value *ALL.
The value *ALL provides the most comprehensive checking for conditions that are not compatible with best practices for switching. Additionally, the value *ALL ensures that your programs will automatically include any future ENDSWT parameter values that may be added to maintain a conservative approach to the switching operation.
– For an unplanned switch, the value *DFT ends the process if there are any backlogs for the database apply process. However, backlogs on other user journal processes are not checked, and switch processing is not ended even though conditions may exist that are not compatible with best practices for switching and that may result in the loss of data.
• The Start journaling on new source (STRJRNSRC) parameter specifies whether you want to start journaling for the data group on the new source system.
• The End journaling on new target (ENDJRNTGT) parameter specifies whether you want to end journaling of the data group on the new target system.
• The End remote journaling (ENDRJLNK) parameter is used in a planned switch of a data group that uses remote journaling. This parameter specifies whether you want to end remote journaling for the data group. The default behavior is to leave the RJ link running. You need to consider whether to keep the RJ link active after a planned switch of a data group. For more information, see “When to end the RJ link” on page 188.
• The Change user journal receiver (CHGUSRRCV) parameter specifies whether you want MIMIX to create and attach a new user (database) journal receiver during the switch operation. If you have applications that are dependent on the receiver name for recovery purposes, it is recommended that you choose CHGUSRRCV(*NO) to prevent a new journal receiver from being created during a data group switch.
• The Change system journal receiver (CHGSYSRCV) parameter specifies whether you want MIMIX to create and attach a new journal receiver to the system (audit) journal (QAUDJRN) during the switch operation. If you have applications that are dependent on the receiver name for recovery purposes, it is recommended that you choose CHGSYSRCV(*NO) to prevent a new journal receiver from being created during a data group switch.
• The End if database errors (ENDDBERR) parameter is obsolete and has been replaced by the Conditions that end switch (ENDSWT) parameter. Previously, the ENDDBERR parameter was used to specify whether to switch the data group when data replication errors exist. Use the ENDSWT parameter and specify *DBERR to produce the equivalent of ENDDBERR(*YES), or *NONE to produce the equivalent of ENDDBERR(*NO).
• The Confirm (CONFIRM) parameter specifies whether a confirmation panel is displayed. The default is *NO (the confirmation panel is not displayed). Note that the options for switching on the Work with Data Groups display call the SWTDG command with *YES specified so that the confirmation panel is automatically displayed and the user must press F16 to continue.

CHAPTER 14 Less common operations

This chapter describes how to perform infrequently used operations that help keep your MIMIX environment running. The following topics are included:
• “Starting the TCP/IP server” on page 260 contains the procedure for starting the TCP/IP server.
• “Ending the TCP/IP server” on page 261 contains the procedure for ending the TCP/IP server.
• “Working with objects” on page 262 contains tips for working with long object and IFS path names.
• “Viewing status for active file operations” on page 263 describes how to check status when replicating database files that you are reorganizing or copying with MIMIX Promoter.
• “Displaying a remote journal link” on page 264 describes how to display information about the link between a source journal definition and a target journal definition.
• “Displaying status of a remote journal link” on page 265 includes procedures for determining whether a data group uses remote journaling and for checking the status of a remote journal link.
• “Identifying data groups that use an RJ link” on page 267 includes the procedure to determine which data groups use a remote journal link.
• “Identifying journal definitions used with RJ” on page 268 describes how to determine whether a journal definition is defined to one or more remote journal links.
• “Disabling and enabling data groups” on page 269 describes when it can be beneficial to disable and enable data groups. Procedures for these processes are included in this topic.
• “Determining if non-file objects are configured for user journal replication” on page 271 provides procedures for determining whether IFS objects, data areas, and data queues are configured to be cooperatively processed through the user journal.
• “Using file identifiers (FIDs) for IFS objects” on page 273 describes file identifiers (FIDs), which are used by commands to uniquely identify the correct IFS tracking entries to process.
• “Operating a remote journal link independently” on page 274 describes how to configure, start, and end a remote journal link without defining data to be replicated by MIMIX processes.

Starting the TCP/IP server

Use this procedure if you need to manually start the TCP/IP server. Once the TCP communication connections have been defined in a transfer definition, the TCP server must be started on each of the systems identified by the transfer definition. You can also start the TCP/IP server automatically through an autostart job entry.
You can either change the transfer definition to allow MIMIX to create and manage the autostart job entry for the TCP/IP server, or add your own autostart job entry. MIMIX only manages entries for the server when they are created by transfer definitions.
When configuring a new installation, transfer definitions and MIMIX-added autostart job entries do not exist on other systems until after the first time the MIMIX managers are started. Therefore, during initial configuration you may need to manually start the TCP server on the other systems using the STRSVR command.
Note: Use the host name and port number (or port alias) defined in the transfer definition for the system on which you are running this command.
Do the following on the system on which you want to start the TCP server:
1. From the MIMIX Intermediate Main Menu, select option 13 (Utilities menu) and press Enter.
2. The Utilities Menu appears. Select option 51 (Start TCP server) and press Enter.
3. The Start Lakeview TCP Server display appears. At the Host name or address prompt, specify the host name or address for the local system as defined in the transfer definition.
4. At the Port number or alias prompt, specify the port number or alias as defined in the transfer definition for the local system.
Note: If you specify an alias, you must have an entry in the service table on this system that equates the alias to the port number.
5. Press Enter.
6. Verify that the server job is running under the MIMIX subsystem on that system. You can use the Work with Active Jobs (WRKACTJOB) command to look for a job under the MIMIXSBS subsystem with a function of PGM-LVSERVER.

Ending the TCP/IP server

To end the TCP server, do the following on both systems defined by the transfer definition. One example of why you might end the TCP server is when you are preparing to upgrade the MIMIX products in a product library.
Note: Use the host name and port number (or port alias) defined in the transfer definition for the system on which you are running this command.
To end the TCP server on a system, do the following:
1. From the MIMIX Intermediate Main Menu, select option 13 (Utilities menu) and press Enter.
2. The Utilities Menu appears. Select option 52 (End TCP server) and press Enter.
3. The End Lakeview TCP Server display appears. At the Host name or address prompt, specify the host name for the local system as specified in the transfer definition.
4. At the Port number or alias prompt, verify that the value shown is what you want. If necessary, change the value.
Note: If the configuration uses port aliases, specify the alias for the local system. Otherwise, specify the port number for the local system.
5. Press Enter.

Working with objects

When working with objects, these tips may be helpful.

Displaying long object names

The names of some IFS entries cannot be fully displayed in the limited space on a "Work with" display. These entries are shown with a ‘>’ character in the right-most column of the Object field. You can display long object names from the following displays:
• Work with Data Group IFS Entries display
• Work with Data Group Activity
• Work with Data Group Activity Entries
To display the entire object name from any of these displays, position the cursor on an entry that indicates a long name and press F22 (Display entire field).

Considerations for working with long IFS path names

MIMIX currently replicates IFS path names of up to 512 characters. However, any MIMIX command that takes an IFS path name as input may be subject to a 506-character limit. This limit is reduced further if the IFS path name contains embedded apostrophes ('): the supported IFS path name length is reduced by four characters for every apostrophe the path name contains.
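The path name limits described above can be checked ahead of time with a small calculation. The following Python sketch is illustrative only and is not part of MIMIX. The function names are hypothetical, and it assumes that the 506-character limit is inclusive and that each apostrophe also counts toward the path length.

```python
# Illustrative sketch only; these helpers are hypothetical and not a MIMIX API.
COMMAND_PATH_LIMIT = 506  # limit for path names passed to a MIMIX command (from the text)
APOSTROPHE_PENALTY = 4    # characters lost per embedded apostrophe (from the text)


def effective_path_limit(path: str) -> int:
    """Return the apostrophe-adjusted length limit for this path name."""
    return COMMAND_PATH_LIMIT - APOSTROPHE_PENALTY * path.count("'")


def fits_command_limit(path: str) -> bool:
    """True when the path is no longer than its adjusted limit.

    Assumption: the limit is inclusive and apostrophes count toward the length.
    """
    return len(path) <= effective_path_limit(path)
```

For example, a path name with two embedded apostrophes would be limited to 498 characters under these assumptions.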
For information about IFS path name naming conventions, refer to the IBM book, Integrated File System Introduction V5R4.

Displaying data group spooled file information

If spooled files are created as a result of MIMIX replication, you can access the spooled file and the associated data group entry from the Work with Data Group Activity display.
To access the spooled file information, do the following:
1. From the MIMIX Basic Main Menu, select option 6 (Work with data groups) and press Enter.
2. The Work with Data Groups display appears. Select option 14 (Active objects) for the data group you want to view and press Enter. The Work with Data Group Activity display appears.
3. From this display, press F16 (Spooled Files) to access the Display Data Group Spooled Files display. This display lists all of the current spooled files and shows the mapping of their names between the source and target systems.

Viewing status for active file operations

If you are replicating database files that you are reorganizing or copying with MIMIX Promoter, you can check on the status of these operations. Do the following:
1. From the MIMIX Basic Main Menu, use F21 (Assistance level) to access the intermediate menu.
2. From the MIMIX Intermediate Main Menu, select option 13 (Utilities menu) and press Enter.
3. From the MIMIX Utilities Menu, select option 63 (Work with copy status) and press Enter.
4. The Work with Copy Status display appears. From this display you can track the status of active copy or reorganize operations, including the replication of physical file data as specified by METHOD(*DATA) on the Synchronize Data Group File Entry (SYNCDGFE) command.
Note: You can only see status for the system on which you are working.

Displaying a remote journal link

To display information about the link between a source journal definition and a target journal definition, do the following:
1.
From the Work with RJ Links display, type a 5 (Display) next to the entry you want and press Enter.
2. The Display Remote Journal Link (DSPRJLNK) display appears, showing the current values defined for the link.

Displaying status of a remote journal link

To check the status of a remote journal link, do the following:
1. Type the command WRKRJLNK and press Enter.
2. The Work with RJ Links display appears with a list of defined links.
The Dlvry column indicates the configured value for how the IBM i remote journal function sends the journal entries from the source journal to the target journal. The possible values for delivery are asynchronous (*ASYNC) and synchronous (*SYNC).
*ASYNC - Journal entries are replicated asynchronously, independent of the applications that create the journal entries. The applications continue processing while an independent system task delivers the journal entries. If a failure occurs on the source system, journal entries on the source system may become trapped because they have not been delivered to the target system.
*SYNC - Journal entries are replicated synchronously. The applications do not continue processing until after the journal entries are sent to the target journal. If a failure occurs on the source system, the target system contains the journal entries that have been generated by the applications.
The State column represents the composite view of the state of the remote journal link. Because the RJ link has both a source and a target component, the state shown is that of the component that has the most severe state. Table 49 shows the possible states of an RJ link, listed in order from most severe to least severe.
Table 49. Possible states for RJ links, shown in order starting with most severe.
State - Description
The following states are considered to be inactive:
*UNKNOWN - Neither journal defined to the remote journal link resides on the local system so the state of the link cannot be checked.
*NOTAVAIL - The ASP where the journal is located is varied off.
*NOTBUILT - The remote journal link is defined to MIMIX but one of the associated journal environments has not been built.
*SRCNOTBLT - The remote journal link is defined to MIMIX but the associated source journal environment has not been built.
*TGTNOTBLT - The remote journal link is defined to MIMIX but the associated target journal environment has not been built.
*FAILED - The remote journal cannot receive journal entries from the source journal due to an error condition.
*CTLINACT - The remote journal link is processing a request for a controlled end.
*INACTIVE - The remote journal link is not active.
The following states are considered to be active:
*INACTPEND - An active remote journal link is in the process of becoming inactive. For asynchronous delivery, this is a transient state that will resolve automatically. For synchronous delivery, one system is inactive while the other system is inactive with pending unconfirmed entries.
*SYNCPEND - An active remote journal link is connected using synchronous delivery and is running in catch-up mode. The state will become *SYNC when catch-up mode ends.
*ASYNCPEND - An active remote journal link is connected using asynchronous delivery and is running in catch-up mode. The state will become *ASYNC when catch-up mode ends.
*SYNC - An active remote journal link is connected using synchronous delivery mode.
*ASYNC - An active remote journal link is connected using asynchronous delivery mode.
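The composite State value can be modeled as picking the more severe of the two component states, using the order given in Table 49. The following Python sketch is illustrative only. The function names are hypothetical, and the guide does not describe how MIMIX computes the value internally, so this only mirrors the documented ordering and the active/inactive grouping.

```python
# Illustrative sketch only; models the ordering in Table 49, not MIMIX internals.
# States are listed from most severe to least severe, as in the table.
RJ_STATES_BY_SEVERITY = [
    "*UNKNOWN", "*NOTAVAIL", "*NOTBUILT", "*SRCNOTBLT", "*TGTNOTBLT",
    "*FAILED", "*CTLINACT", "*INACTIVE",                          # considered inactive
    "*INACTPEND", "*SYNCPEND", "*ASYNCPEND", "*SYNC", "*ASYNC",   # considered active
]
INACTIVE_STATES = set(RJ_STATES_BY_SEVERITY[:8])


def composite_state(source_state: str, target_state: str) -> str:
    """Return the more severe of the source and target component states."""
    # A lower index in the list means a more severe state.
    return min((source_state, target_state), key=RJ_STATES_BY_SEVERITY.index)


def is_active(state: str) -> bool:
    """True when Table 49 groups the state under the active states."""
    return state not in INACTIVE_STATES
```

Under this model, a link whose source component is *ASYNC and whose target component is *FAILED would be shown as *FAILED.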
Identifying data groups that use an RJ link

Use this procedure to determine which data groups use a remote journal link before you end a remote journal link or remove a remote journaling environment.
1. Enter the command WRKRJLNK and press Enter.
2. Make a note of the name indicated in the Source Jrn Def column for the RJ Link you want.
3. From the command line, type WRKDGDFN and press Enter.
4. For all data groups listed on the Work with DG Definitions display, check the Journal Definition column for the name of the source journal definition you recorded in Step 2.
• If you do not find the name from Step 2, the RJ link is not used by any data group. The RJ link can be safely ended or can have its remote journaling environment removed without affecting existing data groups.
• If you find the name from Step 2 associated with any data groups, those data groups may be adversely affected if you end the RJ link. A request to remove the remote journaling environment removes configuration elements and system objects that need to be created again before the data group can be used. Continue with the next step.
5. Press F10 (View RJ links). Consider the following and contact your MIMIX administrator before taking action that will end the RJ link or remove the remote journaling environment.
• When *NO appears in the Use RJ Link column, the data group will not be affected by a request to end the RJ link or to end the remote journaling environment.
Note: If you allow applications other than MIMIX to use the RJ link, they will be affected if you end the RJ link or remove the remote journaling environment.
• When *YES appears in the Use RJ Link column, the data group may be affected by a request to end the RJ link.
If you use the procedure for ending a remote journal link independently in topic “Ending a remote journal link independently” on page 274, ensure that any data groups that use the RJ link are inactive before ending the RJ link.

Identifying journal definitions used with RJ

To see whether a journal definition is defined to one or more remote journal links, do the following:
1. From the MIMIX Basic Main Menu, select option 11 (Configuration menu) and press Enter.
2. The MIMIX Configuration menu appears. Select option 3 (Work with journal definitions) and press Enter.
3. The Work with Journal Definitions display appears. The RJ Link column indicates whether or not the journal definition is used by a remote journal link. A blank value indicates the journal definition is not associated with a remote journal link. Values that indicate the definition is used by a remote journal link are as follows:
*SOURCE - The journal definition is a source journal definition in a remote journal link.
*TARGET - The journal definition is the target journal definition in a remote journal environment.
*BOTH - The journal definition is the source journal definition for one remote journal link and is also a target journal definition for another remote journal link in a cascading environment.
*NONE - The journal definition is not used with the MIMIX RJ support.
4. To see the remote journal links associated with a journal definition, type 12 (Work with RJ Links) and press Enter.

Disabling and enabling data groups

MIMIX supports the concept of disabled data groups in a replication environment. The ability to disable a data group, and enable it later as desired, can be beneficial in a variety of configuration scenarios. The ability to disable a data group is particularly helpful in advanced cluster scenarios, where inactive data groups may be a necessary component of the replication environment.
Because these data groups are inactive as part of the design, the user does not need to be notified when the data groups are in error.
Disabling a data group is also useful in non-cluster situations. If you create a data group for testing purposes, for example, you no longer have to delete the data group in order to clean up your environment when testing is complete. Instead, you can simply disable the data group until it is needed again. This provides the benefit of retaining your object, file, IFS, and DLO entries while the data group is not needed. Additionally, the journal manager does not retain journal receivers that have not been processed by a disabled data group, which allows you to save storage space on your system.
With support for disabled data groups, you also avoid having to start each data group individually when an installation has data groups configured to replicate in different directions. Let us assume you have two sets of data groups: one set configured to replicate from System A to System B, and another set configured to replicate from System B to System A. To start only those data groups replicating from System A to System B, it was previously necessary to start them individually in order to prevent those replicating from System B to System A from starting as well. Now you can disable the data groups you do not want to start and simply start the remaining data groups using the Start MIMIX (STRMMX) command.
Customers with many systems and data groups across varying time zones may find support for disabled data groups useful when performing upgrades. Disabling data groups allows you to stagger upgrades, causing minimal impact to your replication environment. In this situation, you install a new installation and copy the configuration data from the old installation using the Copy Configuration Data (CPYCFGDTA) command.
Over a convenient period of time, you can end and disable each data group on the old (original) installation, then enable and start each data group on the new installation. Once all data groups in the old installation are disabled and all data groups in the new installation are enabled, the old installation can be deleted.
A data group is disabled only at a user's request, and a disabled data group has a state of *DISABLE. An enabled data group can be active or inactive. The Change Data Group (CHGDG) command can be used to change the state of a data group. Only inactive data groups and data groups that do not have processes suspended at a recovery point can be disabled. To make a data group inactive, you must end the data group. The request to end the data group will clear any recovery point.
Disabled data groups are indicated by a status of -D (in green) on the Work with Data Groups (WRKDG) display. You can optionally not display disabled data groups by specifying a different value on the STATE parameter of the WRKDG command.
Once a data group that is not part of an application group is disabled, it cannot be started, ended, or switched.
Note: If the data group is part of an application group, the Switch Application Group (SWTAG) procedure may change its state so that it gets enabled and switched. In this case, if you do not want the data group to be switched, change the Allow to be switched (ALWSWT) parameter to *NO in the Data Group Definition (DGDFN).
When a disabled data group is enabled, any pending entries must be cleared when the data group is started. Specify CLRPND(*YES) on the Start Data Group (STRDG) command.

Procedures for disabling and enabling data groups

The Change Data Group (CHGDG) command allows you to disable or enable a data group by changing its state. This command requires that the system manager is active and communication with the remote system is active.
To disable or enable an individual data group, do the following:
1.
On a command line, type CHGDG and press Enter. The Change Data Group display appears.
2. At the Data group definition prompts, fill in the values you want or press F4 for a valid list.
3. At the State prompt, do one of the following:
• To keep the state of the data group the same, specify the default, *SAME.
• To disable an enabled data group, specify *DISABLE. To change the state of an active data group, you must first end the data group by running the End Data Group (ENDDG) command. See “Ending selected data group processes” on page 198. When the state of the data group is changed to disabled, the status of the data group changes from *INACTIVE to *DISABLED.
• To enable a disabled data group, specify *ENABLE. When the state of the data group is changed to enabled, the status of the data group changes from *DISABLED to *INACTIVE.
4. Press Enter to confirm your changes.
Note: To start an enabled data group, you must specify *YES for the Clear pending entries prompt on the Start Data Group (STRDG) command.

Determining if non-file objects are configured for user journal replication

MIMIX can take advantage of IBM i journaling functions that provide change-level details in journal entries in a user journal for object types other than files (*FILE). When properly configured, MIMIX can cooperatively process IFS stream files, data areas, and data queues between system journal and user journal replication processes. This enables changes to data or attributes to be replicated through the user journal instead of replicating the entire object through the system journal every time a change occurs.

Determining how IFS objects are configured

In order for IFS objects to be replicated from the user journal, one or more data group IFS entries must be configured to process cooperatively with the user journal. Also, IFS tracking entries must exist for the object identified by the data group IFS entries.
To determine if a data group has any IFS objects that are configured for user journal replication and has any corresponding IFS tracking entries, do the following:
1. From the MIMIX Basic Main Menu, select option 6 (Work with data groups) and press Enter.
2. The Work with Data Groups display appears. Type 22 (IFS entries) next to the data group you want and press Enter. The Work with DG IFS Entries display appears, showing the IFS entries configured for the data group.
3. Press F10 twice to access the CPD view.
4. The values shown in the Coop with DB column indicate how objects identified by the data group IFS entries will be replicated.
• Entries with the value *YES are configured for user journal replication. Continue with the next step to ensure that IFS tracking entries exist for the IFS objects. Replication cannot occur without tracking entries.
• Entries with the value *NO are configured for system journal replication.
To view additional information for a data group IFS entry, type 5 (Display) next to the entry and press Enter.
5. Press F12 (Cancel) to return to the Work with Data Groups display. Then type 50 (IFS trk entries) next to the data group you want and press Enter.
6. The Work with DG IFS Trk. Entries display appears with a list of tracking entries for the IFS objects identified for replication by the data group. If there are no tracking entries listed but Step 4 indicates that properly configured data group IFS entries exist, the tracking entries must be loaded. For more information about loading tracking entries, see the MIMIX Administrator Reference book.

Determining how data areas or data queues are configured

In order for data area and data queue objects to be replicated from the user journal, one or more data group object entries must be configured to process cooperatively with the user journal.
Also, object tracking entries must exist for the object identified by the data group object entries.
To determine if a data group has any data area or data queue objects that are configured for user journal replication and has any corresponding object tracking entries, do the following:
1. From the MIMIX Basic Main Menu, select option 6 (Work with data groups) and press Enter.
2. The Work with Data Groups display appears. Type 20 (Object entries) next to the data group you want and press Enter. The Work with DG Object Entries display appears, showing the object entries configured for the data group.
3. For each entry in the list, do the following:
a. Type a 5 (Display) next to the entry and press Enter.
b. The object entry must have the following values specified in the fields indicated:
• The Object type field must be *ALL, *DTAARA, or *DTAQ.
• The Cooperate with database field must be *YES.
• The Cooperating object types field must specify *DTAARA to replicate data areas and *DTAQ to replicate data queues.
4. Press F12 (Cancel) to return to the Work with Data Groups display. Then type 52 (Obj trk entries) next to the data group you want and press Enter.
5. The Work with DG Obj. Trk. Entries display appears with a list of tracking entries for the data area and data queue objects identified for replication by the data group. If there are no tracking entries listed but Step 3 indicates that properly configured data group object entries exist, the tracking entries must be loaded. For more information about loading tracking entries, see the MIMIX Administrator Reference book.

Using file identifiers (FIDs) for IFS objects

Commands used for user journal replication of IFS objects use file identifiers (FIDs) to uniquely identify the correct IFS tracking entries to process. The System 1 file identifier and System 2 file identifier prompts ensure that IFS tracking entries are accurately identified during processing.
These prompts can be used alone or in combination with the System 1 object prompt. These prompts enable the following combinations:
• Processing by object path: A value is specified for the System 1 object prompt and no value is specified for the System 1 file identifier or System 2 file identifier prompts. When processing by object path, a tracking entry is required for all commands with the exception of the SYNCIFS command. If no tracking entry exists, the command cannot continue processing. If a tracking entry exists, a query is performed using the specified object path name.
• Processing by object path and FIDs: A value is specified for the System 1 object prompt and a value is specified for either or both of the System 1 file identifier or System 2 file identifier prompts. When processing by object path and FIDs, a tracking entry is required for all commands. If no tracking entry exists, the command cannot continue processing. If a tracking entry exists, a query is performed using the specified FID values. If the specified object path name does not match the object path name in the tracking entry, the command cannot continue processing.
• Processing by FIDs: A value is specified for either or both of the System 1 file identifier or System 2 file identifier prompts and, with the exception of the SYNCIFS command, no value is specified for the System 1 object prompt. In the case of SYNCIFS, the default value *ALL is specified for the System 1 object prompt. When processing by FIDs, a tracking entry is required for all commands. If no tracking entry exists, the command cannot continue processing. If a tracking entry exists, a query is performed using the specified FID values.

Operating a remote journal link independently

You can configure, start, and end a remote journal link without defining data to be replicated by MIMIX processes.
For example, you might have a need to use remote journals without performing data replication. The Start Remote Journal Link (STRRJLNK) and End Remote Journal Link (ENDRJLNK) commands provide this capability.
Note: These commands should only be used by personnel with experience using the IBM i remote journal function. For most needs, use the RJ link support that is integrated in the commands that start and end replication processes (STRMMX, STRDG, ENDMMX, and ENDDG).

Starting a remote journal link independently

To start a remote journal link separately from other MIMIX processes, do the following:
1. To access the Work with RJ Links display, type the command WRKRJLNK and press Enter.
2. From the Work with RJ Links display, type a 9 (Start) next to the link in the list that you want to start and press Enter.
3. The Start Remote Journal Link (STRRJLNK) display appears. Specify the value you want for the Starting journal receiver prompt.
4. To start remote journaling for the specified link, press Enter.

Ending a remote journal link independently

Default values for the End Remote Journal Link (ENDRJLNK) command perform an immediate end for the specified link. Be aware that the actions taken by the ENDOPT parameter on this command are different from the actions taken when you perform an immediate or controlled end of a MIMIX data group. For more information about the differences between this command and the End Data Group (ENDDG) command, see the MIMIX Reference book.
For the following situations, an immediate end is always performed (the value specified for the ENDOPT parameter is ignored):
• The remote journal function is running in synchronous mode (DELIVERY(*SYNC)).
• The remote journal function is performing catch-up processing.
To end a remote journal link separately from other MIMIX processes, do the following:
1. To access the Work with RJ Links display, type the command WRKRJLNK and press Enter.
2.
From the Work with Remote Journal Links display, type a 10 (End) next to the link in the list that you want to end. 3. Do one of the following: • To perform an immediate end from the source system, press Enter. This completes the procedure for an immediate end. 274 Operating a remote journal link independently • To perform a controlled end or to end from the target system, press F4 (Prompt), then continue with the next step. 4. The End Remote Journal Link (ENDRJLNK) display appears. Press F10 (Additional parameters). 5. To perform a controlled end, specify *CNTRLD at the End remote journal link prompt. If you need to end from the target system, specify *TGT at the End RJ link on system prompt. To process the request, press Enter. 275 CHAPTER 15 Troubleshooting - where to start Occasionally, a situation may occur that requires user intervention. This section provides information to help you troubleshoot problems that can occur in a MIMIX environment. You can also consult our website at www.mimix.com for the latest information and updates for MIMIX products. The following topics are included in this chapter: • “Gathering information before reporting a problem” on page 278 describes the information you should gather before you report a problem. A procedure is included to help you gather this information. • “Reducing contention between MIMIX and user applications” on page 279 describes a processing timing issue that may be resolved by specifying an Object retrieval delay value on the commands for creating or changing data group entries. • “Data groups cannot be ended” on page 280 describes possible causes for a data group that is taking too long to end. • “Verifying a communications link for system definitions” on page 281 describes the process to verify that the communications link defined for each system definition is operational. 
• “Verifying the communications link for a data group” on page 282 includes a process to use before synchronizing data to ensure that the communications link for the data group is active.
• “Checking file entry configuration manually” on page 283 includes the process for checking that correct data group file entries exist with respect to the data group object entries. This process uses the Check DG File Entries (CHKDGFE) command.
• “Data groups cannot be started” on page 285 describes some common reasons why a data group may not be starting.
• “Cannot start or end an RJ link” on page 286 describes possible reasons that can prevent you from starting or ending an RJ link. This topic includes a procedure for removing unconfirmed entries to free an RJ link.
• “RJ link active but data not transferring” on page 287 describes why an RJ link may not be transferring data and how to resolve this problem.
• “Errors using target journal defined by RJ link” on page 288 describes why errors can occur when using a target journal defined by an RJ link and how to resolve them.
• “Verifying data group file entries” on page 289 includes a procedure for verifying data group file entries using the Verify Data Group File Entries (VFYDGFE) command.
• “Verifying data group data area entries” on page 289 includes a procedure for verifying data group data area entries using the Verify Data Group Data Area Entries (VFYDGDAE) command. Data area entries are only used when data areas are replicated by the data area poller process, which is not preferred.
• “Verifying key attributes” on page 289 includes a procedure for verifying key attributes using the VFYKEYATR (Verify Key Attributes) command.
• “Working with data group timestamps” on page 291 describes timestamps and includes information for creating, deleting, displaying, and printing them.
• “Removing journaled changes” on page 294 describes the configuration conditions that must be met in order to use the Remove Journaled Changes (RMVJRNCHG) journal entry.
• “Performing journal analysis” on page 295 describes and includes the procedure for performing journal analysis of the source system.

Gathering information before reporting a problem

Before you report a problem, you should gather the following information:
• The MIMIX product, library, installed version, and IBM i operating system level on the system you are using. To determine this information, follow the procedure “Obtaining MIMIX and IBM i information from your system” on page 278.
• The Message ID number for any error messages associated with the problem. If you receive error messages, record the message number, any replacement text (such as “Process X failed for file Y”), and the to and from program information, if available. Since many messages have similar text, this information is much more helpful to us and enables us to handle your call more efficiently.
• The specific operation you were attempting to perform when the error condition occurred. It is important that we understand what you were trying to do when you encountered the problem. Try to write down the specific sequence of events that led up to the error condition, such as the commands entered, the display you were working from, or the program that was running.

Obtaining MIMIX and IBM i information from your system

To obtain the necessary MIMIX and IBM i information before reporting a problem, do the following:
1. Do one of the following to access the Lakeview Technology Installed Products display:
• If you are configured for a MIMIX replication environment, select option 31 (Product management menu). Then select option 2 (Work with products).
• From a command line, enter LAKEVIEW/WRKPRD
2. Next to the product you want, type a 6 (About version) and press Enter.
The About pop-up appears, showing the Product, Library, Installed version, and the OS/400 level on this system.
3. Press F9 (Fixes) to see the Work with Installed Fixes display. From this display you can determine the latest level of the MIMIX cumulative fix package that is installed.
Note: You should know the version and release level (VnRnMn) of the IBM i operating system that is on each system with which you are working. Use the process above on each system.

Reducing contention between MIMIX and user applications

If your applications are failing in an unexpected manner, it may be caused by MIMIX locking your objects for object retrieval processing while your applications are trying to access the object. This is a processing timing issue and can be significantly reduced, or eliminated, by specifying an appropriate delay value for the Object retrieval delay element under the Object processing (OBJPRC) parameter on the change or create data group definition commands.
Although you can specify this value at the data group level, you can override the data group value at the object level by specifying an Object retrieval delay value on the commands for creating or changing data group entries. For more information see “Selecting an object retrieval delay” in the MIMIX Administrator Reference book.
You should use care when choosing the object retrieval delay. A long delay may impact the ability of MIMIX system journal replication processes to move data from a system in a timely manner. Too short a delay may allow MIMIX to retrieve an object before an application is finished with it. You should make the value large enough to reduce or eliminate contention between MIMIX and applications, but small enough to allow MIMIX to maintain a suitable high availability environment.
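The trade-off in choosing an object retrieval delay can be illustrated with a small sketch. This is not a MIMIX interface. It is a hypothetical helper that assumes you have measured, by your own means, how long your applications keep an object busy after changing it (hold_times_sec) and that you have decided on the longest delay your availability goals can tolerate (max_delay_sec). Both names and the selection rule are assumptions made for illustration.

```python
import math


def suggest_retrieval_delay(hold_times_sec, max_delay_sec):
    """Suggest a whole-second delay for an assumed set of measurements.

    hold_times_sec: hypothetical observations of how long applications
        kept an object busy after a change (seconds).
    max_delay_sec: hypothetical upper limit chosen so that replication
        is not held back longer than is acceptable.
    """
    if not hold_times_sec:
        # Nothing observed, so no delay is suggested.
        return 0
    # Large enough to cover the longest observed hold time ...
    needed = math.ceil(max(hold_times_sec))
    # ... but never larger than the limit chosen for availability.
    return min(needed, max_delay_sec)


short_holds = suggest_retrieval_delay([0.4, 1.2, 2.7], max_delay_sec=10)
long_holds = suggest_retrieval_delay([12.5], max_delay_sec=10)
```

When the observed hold times exceed the limit, as in the second call, the sketch returns the limit. In that situation some contention is likely to remain and the limit itself may need to be reconsidered.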
Data groups cannot be ended

A controlled end for a data group may take some time if there is a backlog of files to process or if there are a number of errors that MIMIX is attempting to resolve before ending. If you think that a data group is taking too long to end, check the following for possible causes:
• Check to see how many transactions are backlogged for the apply process. Use option 8 (Display status) on the Work with Data Groups display to access the detailed status. A number in the Unprocessed Entry Count column indicates a backlog. Use F7 and F8 to see additional information.
• Determine which replication process is not ending. Use the command WRKSBSJOB SBS(MIMIXSBS) to see the jobs in the MIMIXSBS subsystem. Look for jobs for replication processes that have not changed to a status of END. For example, abc_OBJRTV, where abc is a 3-character prefix.
• Check the QSYSOPR message log to see if there is a message that requires a reply.
• You can use the WRKDGACTE STATUS(*ACTIVE) command to ensure all data group activity entries are completed. If a controlled end was issued, all activity entries must be processed before the object processes are ended.

Verifying a communications link for system definitions

Do the following to verify that the communications link defined for each system definition is operational:
1. From the MIMIX Basic Main Menu, type an 11 (Configuration menu) and press Enter.
2. From the MIMIX Configuration Menu, type a 1 (Work with system definitions) and press Enter.
3. From the Work with System Definitions display, type an 11 (Verify communications link) next to the system definition you want and press Enter. You should see a message indicating the link has been verified.
Note: If the system manager is not active, this process will only verify that communications to the remote system is successful.
You will also see a message in the job log indicating that “communications link failed after 1 request.” This indicates that the remote system could not return communications to the local system.
4. Repeat this procedure for all system definitions. If the communications link defined for a system definition uses SNA protocol, do not check the link from the local system.
Note: If your transfer definition uses the *TCP communications protocol, then MIMIX uses the Verify Communications Link command to validate the information that has been specified for the Relational database (RDB) parameter. MIMIX also uses VFYCMNLNK to verify that the System 1 and System 2 relational database names exist and are available on each system.

Verifying the communications link for a data group

Before you synchronize data between systems, ensure that the communications link for the data group is active. This procedure verifies the primary transfer definition used by the data group. If your configuration requires multiple data groups, be sure to check communications for each data group definition.
Do the following:
1. From the MIMIX Basic Main Menu, type an 11 (Configuration menu) and press Enter.
2. From the MIMIX Configuration Menu, type a 4 (Work with data group definitions) and press Enter.
3. From the Work with Data Group Definitions display, type an 11 (Verify communications link) next to the data group you want and press F4.
4. The Verify Communications Link display appears. Ensure that the values shown for the prompts are what you want.
5. To start the check, press Enter.
6. You should see a message "VFYCMNLNK command completed successfully."
If your data group definition specifies a secondary transfer definition, use the following procedure to check all communications links.
Verifying all communications links

The Verify Communications Link (VFYCMNLNK) command requires specific system names to verify communications between systems. When the command is called from option 11 on the Work with System Definitions display or option 11 on the Work with Data Groups display, MIMIX identifies the specific system names.
For transfer definitions using TCP protocol: MIMIX uses the Verify Communications Link (VFYCMNLNK) command to validate the values specified for the Relational database (RDB) parameter. MIMIX also uses VFYCMNLNK to verify that the System 1 and System 2 relational database names exist and are available on each system.
When the command is called from option 11 on the Work with Transfer Definitions display or when entered from a command line, you will receive an error message if the transfer definition specifies the value *ANY for either system 1 or system 2.
1. From the Work with Transfer Definitions display, type an 11 (Verify communications link) next to all transfer definitions and press Enter.
2. The Verify Communications Link display appears. If you are checking a Transfer definition with the value of *ALL, you need to specify a value for the System 1 or System 2 prompt. Ensure that the values shown for the prompts are what you want and then press Enter. You will see the Verify Communications Link display for each transfer definition you selected.
3. You should see a message "VFYCMNLNK command completed successfully."

Checking file entry configuration manually

The Check DG File Entries (CHKDGFE) command provides a means to detect whether the correct data group file entries exist with respect to the data group object entries configured for a specified data group in your MIMIX configuration. When file entries and object entries are not properly matched, your replication results can be affected.
Note: The preferred method of checking is to use MIMIX AutoGuard to automatically schedule the #DGFE audit, which calls the CHKDGFE command and can automatically correct detected problems. For additional information, see “Interpreting results for configuration data - #DGFE audit” on page 300.
To check your file entry configuration manually, do the following:
1. On a command line, type CHKDGFE and press Enter. The Check Data Group File Entries (CHKDGFE) command appears.
2. At the Data group definition prompts, select *ALL to check all data groups or specify the three-part name of the data group.
3. At the Options prompt, you can specify that the command be run with special options. The default, *NONE, uses no special options. If you do not want an error to be reported if a file specified in a data group file entry does not exist, specify *NOFILECHK.
4. At the Output prompt, specify where the output from the command should be sent: to print, to an outfile, or to both. See Step 6.
5. At the User data prompt, you can assign your own 10-character name to the spooled file or choose not to assign a name to the spooled file. The default, *CMD, uses the CHKDGFE command name to identify the spooled file.
6. At the File to receive output prompts, you can direct the output of the command to the name and library of a specific database file. If the database file does not exist, it will be created in the specified library with the name MXCDGFE.
7. At the Output member options prompts, you can direct the output of the command to the name of a specific database file member. You can also specify how to handle new records if the member already exists. Do the following:
a. At the Member to receive output prompt, accept the default *FIRST to direct the output to the first member in the file. If it does not exist, a new member is created with the name of the file specified in Step 6. Otherwise, specify a member name.
b. At the Replace or add records prompt, accept the default *REPLACE if you want to clear the existing records in the file member before adding new records. To add new records to the end of existing records in the file member, specify *ADD.
8. At the Submit to batch prompt, do one of the following:
• If you do not want to submit the job for batch processing, specify *NO and press Enter to check data group file entries.
• To submit the job for batch processing, accept *YES. Press Enter and continue with the next step.
9. At the Job description prompts, specify the name and library of the job description used to submit the batch request. Accept MXAUDIT to submit the request using the default job description, MXAUDIT.
10. At the Job name prompt, accept *CMD to use the command name to identify the job or specify a simple name.
11. To start the data group file entry check, press Enter.

Data groups cannot be started

Common reasons why a data group cannot be started are as follows:
• The communications link between systems defined to the data group is not active. Use the procedure “Verifying a communications link for system definitions” on page 281.
• The journaling environment for the data group has not been built. Verify that the journaling environment defined in the journal definition exists. If necessary, use the appropriate procedure in the MIMIX Administrator Reference book.
• The journal receiver has been deleted from the system. You can use WRKJRNA to determine if the journal receiver exists on the source system.

Cannot start or end an RJ link

In normal operations, unconfirmed entries are automatically handled by the RJ link monitors. In the event of a switch, the unconfirmed entries are processed, ensuring that you have the latest updates to your data.
However, there is a scenario where you may end up with a backlog of unconfirmed entries that can prevent you from starting or ending an RJ link. This problem can occur when all of the following are true:
• The data group is not switchable or you do not want to switch it
• A link failure on an RJ link that is configured for synchronous delivery leaves unconfirmed entries
• The RJ link monitors are not active, either because you are not using them or they failed as a result of a bad link
To recover from this situation, you should run the Verify Communications Link (VFYCMNLNK) command to assist you in determining what may be wrong and why the RJ link will not start. If you are using an independent ASP, check the transfer definition to ensure the correct database name has been specified. You also need to end the remote journal link from the target system. Ending the link from the target system is a restriction of the IBM remote journal function.

Removing unconfirmed entries to free an RJ link

Note: You should never remove unconfirmed entries from a switchable data group unless you are directed to by your MIMIX administrator or a CustomerCare representative.
If you need to remove a backlog of unconfirmed entries, do the following:
1. Use the WRKRJLNK command to display the status of the RJ link. The status shown on the Work with RJ Links display is the status of the link on the system where you entered the command. (This system is identified at the upper right corner of the display.) An RJ link with unconfirmed entries will have a state of *INACTPEND.
Note: You may need to access this display from the other system defined by the RJ link.
2. Ending the remote journal link on the system with unconfirmed entries will cause them to be deleted. Do the following:
a. Type 10 (End) next to the link and press F4 (Prompt).
b. The End Remote Journal Link (ENDRJLNK) display appears. Default values on this command end the link from the source system.
If there are unconfirmed entries on the target system, press F10 (Additional parameters). Then specify *TGT at the End RJ link on system prompt.
c. To process the request, press Enter.

RJ link active but data not transferring

Following an initial program load (IPL), the RJ link may appear to be active when data cannot actually flow from the source system to the target system journal receiver. This is an operating system restriction. MIMIX does not receive notification of a failure. To recover, end the RJ link and restart it following an IPL. This can be included in automation programs.

Errors using target journal defined by RJ link

If you receive errors when using a target journal defined by an RJ link, you may need to change the journal definition and journaling environment. This situation is caused when the target journal definition is created as a result of adding an RJ link based on a source journal definition which specified QSYSOPR as the threshold message queue.
If you receive errors when using the target journal, do the following:
1. On the Work with Journal Definitions display, locate the target journal definition that is identified by the errors.
2. Type a 5 (Display) next to the target journal definition and press Enter.
3. Page down to see the value of the Threshold message queue.
• If the value is QSYSOPR, press F12 and continue with the next step.
• For any other value, the cause of the problem needs further isolation beyond this procedure.
4. Type a 2 (Change) next to the target journal definition and press Enter.
5. Press F9 (All parameters), then page down to locate the Threshold message queue and Library prompts.
6. Change the Threshold message queue prompt to *JRNDFN and the Library prompt to *JRNLIB, or to other acceptable values.
7. To accept the change, press Enter.
Verifying data group file entries

The Verify Data Group File Entries (VFYDGFE) command allows you to verify files from a specific library by verifying the current state of the file on the system identified in the data group as the source of data.
This procedure generates a report in a spooled file named MXVFYDGFE. The information in the report includes whether each member for the specified search criteria is defined to MIMIX, the journal and library to which it is journaled, whether it uses after-image journaling or before- and after-image journaling, and the apply session used. This information can help you verify that you have all the files you need from a library properly defined to MIMIX DB Replicator.
To verify data group file entries, do the following:
1. On a command line, type VFYDGFE (Verify Data Group File Entries). The Verify DG File Entries display appears.
2. Specify the name of the data group at the Data group definition prompt.
3. At the System 1 file and Library prompts, specify the value you want and the library in which the files are located.
4. If you want to create a spooled file that can be printed, specify *PRINT at the Output prompt. Then press Enter.

Verifying data group data area entries

The Verify Data Group Data Area Entries (VFYDGDAE) command allows you to verify the data areas in a specific library defined to a data group definition. The audit report determines the data source for the data group and retrieves the appropriate information.
This procedure generates a report in a spooled file named MXVFYDAE. The information in the report includes whether each data area for the specified search criteria is defined to MIMIX and the length of each data area. This information can help you verify that you have all the data areas you need from a library defined to MIMIX DB Replicator.
To verify data group data area entries, do the following:
1. On a command line, type VFYDGDAE (Verify Data Group Data Area Entries). The Verify DG Data Area Entries (VFYDGDAE) display appears.
2. Specify the name of the data group at the Data group definition prompt.
3. At the System 1 data area and Library prompts, specify the value you want and the library in which the data areas are located and press Enter.

Verifying key attributes

Before you configure for keyed replication, verify that the file or files for which you want to use keyed replication are actually eligible.
Do the following to verify that the attributes of a file are appropriate for keyed replication:
1. On a command line, type VFYKEYATR (Verify Key Attributes). The Verify Key Attributes display appears.
2. Do one of the following:
• To verify a file in a library, specify a file name and a library.
• To verify all files in a library, specify *ALL and a library.
• To verify files associated with the file entries for a data group, specify *MIMIXDFN for the File prompt and press Enter. Prompts for the Data group definition appear. Specify the name of the data group that you want to check.
3. Press Enter.
4. A spooled file is created that indicates whether you can use keyed replication for the files in the library or data group you specified. Display the spooled file (WRKSPLF command) or use your standard process for printing. You can use keyed replication for the file if *BOTH appears in the Replication Type Allowed column. If a value appears in the Replication Type Defined column, the file is already defined to the data group with the replication type shown.

Working with data group timestamps

Timestamps allow you to view the performance of the database send, receive, and apply processes for a data group to identify potential problem areas, such as a slow send process, inadequate communications capacity, or excessive overhead on the target system.
Although they can assist you in identifying problem areas, timestamps are not intended as an accurate means of calculating the performance of MIMIX.
A timestamp is a single record that is passed between all replication processes. The timestamp originates on the source system as a journal entry, is sent to the target system, and then processed by the associated apply session. The timestamp record is updated with the date and time at each of the following areas during the replication process:
• Created - Date and time the journal entry is created
• Sent - Date and time when the journal entry is sent to the target system
• Received - Date and time when the journal entry is received
• Applied - Date and time when the journal entry is applied
Note: For data groups that use remote journaling, the created and sent timestamps will be set to the same value. The received timestamp will be set to the time when the record was read on the target system by the database reader process.
After all four timestamps have been added, the journal entry is converted and placed into a file for viewing or printing. You can view timestamps only from the management system. The system manager must be active to return the timestamps to the management system.

Automatically creating timestamps

The data group definition includes a parameter for automatically creating timestamps. MIMIX automatically creates a timestamp after the number of journal entries specified in the Timestamp interval (TSPITV) has passed. The timestamp entry created is placed at the end of all current entries in the journal receiver. You specify this value when you create or change a data group definition. You can change this value at any time.
Note: Data groups configured for remote journaling will not automatically generate timestamps. To generate timestamps in this case, refer to “Creating timestamps for remote journaling processing” on page 292.
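As a rough illustration of how the four values in a timestamp record can point to a problem area, the following sketch computes the elapsed time between consecutive stages. It is not part of MIMIX. The function names, the stage labels, and the sample times are hypothetical, and the values would have to be copied by hand from a displayed or printed timestamp.

```python
from datetime import datetime


def stage_latencies(created, sent, received, applied):
    """Return the seconds elapsed between consecutive timestamp stages."""
    return {
        "send": (sent - created).total_seconds(),
        "transfer": (received - sent).total_seconds(),
        "apply": (applied - received).total_seconds(),
        "total": (applied - created).total_seconds(),
    }


def slowest_stage(latencies):
    """Name the stage (other than the total) with the largest elapsed time."""
    stages = {name: secs for name, secs in latencies.items() if name != "total"}
    return max(stages, key=stages.get)


# Hypothetical values for a single timestamp record.
example = stage_latencies(
    created=datetime(2014, 1, 15, 9, 0, 0),
    sent=datetime(2014, 1, 15, 9, 0, 2),
    received=datetime(2014, 1, 15, 9, 0, 3),
    applied=datetime(2014, 1, 15, 9, 0, 10),
)
bottleneck = slowest_stage(example)
```

For a data group that uses remote journaling, the created and sent values are the same, so the "send" figure in this sketch would be zero. As noted above, timestamps are not intended as an accurate measure of performance, so results from a single record should be treated as a hint only.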
Creating additional timestamps

Note: By using the Create Data Group Timestamps (CRTDGTSP) command in a batch job, you can use timestamps to monitor performance at critical times in your daily processing.
To create one or more timestamps, do the following:
1. From the Work with Data Groups display, type 41 (Timestamps) next to the data group you want and press Enter.
2. The Work with DG Timestamps display appears. Type a 1 (Create) next to the blank line at the top of the display and press Enter.
3. The Create Data Group Timestamps display appears. Specify the name of the data group and the number of timestamps you want to create and press Enter.
Note: You should generate multiple timestamps to receive a more accurate view of replication process performance.

Creating timestamps for remote journaling processing

If you need to generate timestamps to monitor replication performance, you can set up automation to create them for remote journaling (RJ) data groups that you wish to monitor.
In this procedure, you will create an interval monitor using the Create Monitor Object (CRTMONOBJ) command. This is accomplished by specifying *CMD for the interface exit program on the monitor object, and then Create Data Group Timestamps (CRTDGTSP) for the command (*CMD). You can also run CRTDGTSP manually or schedule a job to run the command in batch. For more information, see “Creating an interval monitor” in the MIMIX Monitor book.
Do the following to create an interval monitor:
1. From the Work with Monitors display, type a 1 (Create) in the Opt column next to the blank line at the top of the list and press Enter.
2. The Create Monitor Object (CRTMONOBJ) display appears. Do the following:
a. At the Monitor prompt, provide a unique name for the monitor.
b. At the Event class prompt, specify *INTERVAL.
c. At the Interface exit program prompt, specify *CMD.
d. At the Time interval (sec.) prompt, specify how often the interval monitor should run and press Enter. By default, this monitor runs every 15 seconds. Use your data group timestamp interval (default is every 20,000 entries) to estimate how many entries you process a day. From there, determine how often you need to run the monitor in order to provide an adequate sample.
3. The Add Monitor Information (ADDMONINF) display appears. Do the following:
a. At the Command prompt, type CRTDGTSP.
b. At the Library prompt, type the name of your installation library and press F4 (Prompt).
4. The Create Data Group Timestamps (CRTDGTSP) display appears. Do the following:
a. At the Data group definition prompts, specify the name of the RJ data group.
b. At the Number of stamps to create prompt, specify the number of timestamps you want to create and press Enter.
5. From the Work with Monitors display, type a 9 (Start) next to the interval monitor you created. This allows you to start generating timestamps. For information about viewing timestamps, see “Displaying or printing timestamps” on page 293.
Repeat this procedure for each RJ data group for which you want to generate timestamps.

Deleting timestamps

You can delete all timestamps or you can select a group of one or more timestamps to delete.
To delete timestamps for a data group, do the following:
1. From the Work with Data Groups display, type 41 (Timestamps) next to the data group you want and press Enter.
2. The Work with DG Timestamps display appears. Type a 4 (Delete) next to the timestamps you want to delete and press Enter.
3. A confirmation screen appears. Press Enter.
To selectively delete a range of timestamps, do the following:
1. Type the command DLTDGTSP and press F4 (Prompt).
2. The Delete Data Group Timestamps display appears. Specify the values you want for the Data group definition prompt.
3. Specify the values you want for the Starting date and time prompt and for the Ending date and time prompt, then press Enter.

Displaying or printing timestamps

To display or print data group timestamps, do the following:
1. From the Work with Data Groups display, type 41 (Timestamps) next to the data group you want and press Enter.
2. The Work with DG Timestamps display appears. Do one of the following:
• To display the timestamp information, type a 5 (Display) next to the data group you want.
• To print the timestamp information, type a 6 (Print) next to the data group you want.
3. Press Enter.
4. If you selected to display, the Display Data Group Timestamps display appears. If you selected to print, a spooled file is created that you can print using your standard printing procedures.

Removing journaled changes

If the necessary environment is available, MIMIX can support the Remove Journaled Changes (RMVJRNCHG) journal entry by simulating the Remove Journaled Changes process on the backup system.
Note: This is a long-running procedure and will affect your existing journal changes. Ensure that performing this procedure is appropriate for your environment.
In order to use the Remove Journaled Changes journal entry, you must meet the following criteria:
• You must be configured for both before and after image journaling. This can be defined as a default file entry option at the data group level or it can be defined for individual data group file entries.
• You must be configured with *SEND as the value of the Before images element of the DB journal entry processing (DBJRNPRC) parameter of the data group definition. This permits the database apply process to roll back certain types of journal entries.
• If you have large objects (LOBs), *YES must be the value for the Use remote journal link (RJLNK) parameter of the data group definition.
• The target system (where replicated changes are applied) must have the log spaces that contain the original transactions. To ensure that the appropriate log spaces are retained, you can do one of the following:
– Calculate how many log spaces need to be retained using the log space size and the size and number of the receivers containing the appropriate journal transactions. Then, set elements of the database apply processing (DBAPYPRC) parameter in the data group definition.
– Use the Hold Data Group Log (HLDDGLOG) command to place a hold on the delete operation of all log spaces for all apply sessions defined to the specified data group. The log spaces are held until a request to release them with the Release Data Group Log (RLSDGLOG) command is received.

If you are changing an existing data group to have these values, you must end and restart the data group before you are able to use the RMVJRNCHG command.

Performing journal analysis

When a source system fails before MIMIX has sent all journal entries to the target system, unprocessed transactions occur. Unprocessed transactions can also occur if journal entries are in the communications buffer being sent to the target system when the sending system fails. Following an unplanned switch, unprocessed transactions on the original source system must be addressed in order to prevent data loss before synchronizing data and starting data groups.

The journal analysis process finds any missing transactions that were not sent to the target system when the source system went down and an unplanned switch to the backup was performed. Once unprocessed transactions are located, users must analyze the journal entries and take appropriate actions to resolve them.

Perform journal analysis after the original source system has been brought back up and before performing the synchronization phase of the switch (which synchronizes data and starts data groups).
Analyze all data groups that were not disabled at the time of the unplanned switch.

Note: The journal analysis tool is limited to database files replicated from a user journal. The tool does not identify unprocessed transactions for data areas, data queues, or IFS objects replicated through a user journal, or database files configured for replication from the system journal.

From the original source system, do the following:
1. Ensure the following are started:
a. The port communications jobs (PORTxxxxx)
b. The MIMIX system managers using STRMMXMGR SYSDFN(*ALL) MGR(*SYS) TGTJRNINSP(*NO)
IMPORTANT! Only the system managers should be started at this time. Do not start journal managers. Also, do not start data groups at this time! Doing so will delete the data required to perform the journal analysis.
2. From the Work with Data Groups display on the original source system, enter 43 (Journal analysis) next to the data group to be analyzed. The Journal Analysis of Files display appears.
3. Check the list area for a pop-up window and do the following:
• If a pop-up window with the message “Journal analysis information not collected” is displayed, press Enter to collect journal analysis information, then go to Step 5.
• If you do not see a pop-up window, information about files from a previous run of journal analysis exists. Go to Step 4.
4. If you did not see a pop-up window in Step 3 and you need to clear data from a previous run of journal analysis and collect new information, do the following:
a. If you want to keep information from a previous run, make a copy of file DM6500P located in the installation library.
b. Press F9 (Update display) to clear the screen and collect the new information. A pop-up confirmation window with the following message is displayed: “WARNING! The journal analysis journal entries file will be cleared!”
c. Press Enter to submit the update request.
5.
The request to collect journal analysis information is submitted by job RTVFILANZ using the job description MIMIXQGPL/MIMIXDFT. When the job completes, “LVI3855 Retrieval of affected files for journal analysis completed normally” appears in the message log. Press F5 (Refresh) to see the collected information. It may take a short time to collect the information.
6. Retrieve journal entries. The journal entries for the files identified on the display must be retrieved before you can use options to display or print statistics (5 and 6) or display journal entries (11). Do one of the following:
• Press F14 (Retrieve all entries). A pop-up window stating “Confirm retrieval of ALL analysis journal entries” appears. Press Enter. (The retrieved information is placed in an internal file. This does not produce a spool file.)
• If there are a large number of files listed on the display, you may want to retrieve entries for only one selected file at a time. Type option 9 (Retrieve journal entries) next to the file for which you want to retrieve journal entries and press Enter. The retrieved journal entries are placed in a spool file named MXJEANZL.
The message “LVI3856 Retrieval of journal entries for journal analysis completed normally” appears in the message log.
7. Review the collected information using the following:
• Use option 11 (Display journal entries) to view the entries for each file.
• Use F21 (Print list) to print all entries for a file.
• You can use options 5 (Display statistics) and 6 (Print statistics) to see the statistical breakdown of journal entries for a selected file member identified by journal analysis. The statistics include the number of adds, deletes, and updates, along with the related file transactions and dates of the first and last journal entries.
Figure 36 shows an example of the information displayed by option 11 for one journal entry.

Figure 36.
Sample of one journal entry

Data group definition: <DGDFN> <SYS1> <SYS2>
Journal definition: <JRNDFN> <SYSDFN>

File identification
File . . . . : <FILE>
Library . : <LIB>
Member . . . : <MBR>

Journal header information
Journal code . . . . . : R
Journal type . . . . . : DL
Generated date . . . . : 9/08/09
Generated time . . . . : 10:36:31
Job name . . . . . . . : <JOB NAME>
User name . . . . . . : <USER>
Job number . . . . . . : <JOB NBR>
Program name . . . . . : <PROGRAM>

Record-level information
Delete record

Journal header information (continued)
Record length . . . . : 607
Record number . . . . : 838
Operation indicator . : 0
Commit cycle ID . . . : 0

Journal identification
Journal name . . . . . : <JOURNAL>
Library . . . . . . : <JRNLIB>

Receiver identification
Receiver name . . . . : <RCVR>
Library . . . . . . : <RCVRLIB>
Sequence number . . . : <JOURNAL SEQUENCE>

8. Determine what action you need to take for each unprocessed entry. For example:
• You may need to run the original job again on the current source system to reproduce the entries.
• If a file has already been updated on the current source system (manually or otherwise), you may need to merge data from both files. If this is the case, do not synchronize the files.
• If there are write changes (R-PT entries), these changes should be made on the current source system before running the synchronization phase of the switch or starting data groups in order to maintain Relative Record Number consistency within the file. If this is done after the data group has been started, the relative record numbers could become unsynchronized between the two systems.
Note: It is the customer’s responsibility to fix the files.

Removing journal analysis entries for a selected file

You can use option 4 (Remove journal entries) to remove all journal analysis journal entries for a selected file member. A confirmation display appears to confirm your choices.
When you continue with the confirmation, the journal entries for the selected file member are immediately removed from the journal analysis information that is displayed. This does not delete any other information contained in other MIMIX files.

APPENDIX A
Interpreting audit results supporting information

Audits use commands that compare and synchronize data. The results of the audits are placed in output files associated with the commands. The following topics provide supporting information for interpreting data returned in the output files.
• “When the difference is “not found”” on page 302 provides additional considerations for interpreting results of not found in priority audits.
• “Interpreting results for configuration data - #DGFE audit” on page 300 describes the #DGFE audit, which verifies the configuration data defined to your configuration using the Check Data Group File Entries (CHKDGFE) command.
• “Interpreting results of audits for record counts and file data” on page 303 describes the audits and commands that compare file data or record counts.
• “Interpreting results of audits that compare attributes” on page 306 describes the Compare Attributes commands and their results.

Interpreting results for configuration data - #DGFE audit

The #DGFE audit verifies the configuration data that is defined for replication in your configuration. This audit invokes the Check Data Group File Entries (CHKDGFE) command for the audit’s comparison phase. The CHKDGFE command collects data on the source system and generates a report in a spooled file or an outfile.

The report is available on the system where the command ran. The values in the Result column of the report indicate detected problems and the result of any attempted automatic recovery actions. Table 50 shows the possible Result values and describes the action to take to resolve any reported problems.

Table 50.
CHKDGFE - possible results and actions for resolving errors

Result - Recovery Actions
*NODGFE - No file entry exists. Create the DGFE or change the DGOBJE to COOPDB(*NO).
Note: Changing the object entry affects all objects using the object entry. If you do not want all objects changed to this value, copy the existing DGOBJE to a new, specific DGOBJE with the appropriate COOPDB value.
*EXTRADGFE - An extra file entry exists. Delete the DGFE or change the DGOBJE to COOPDB(*YES).
Note: Changing the object entry affects all objects using the object entry. If you do not want all objects changed to this value, copy the existing DGOBJE to a new, specific DGOBJE with the appropriate COOPDB value.
*NOFILE - No file exists for the existing file entry. Delete the DGFE, re-create the missing file, or restore the missing file.
*NOMBR - No file member exists for the existing file entry. Delete the DGFE for the member or add the member to the file.
*RCYFAILED - Automatic audit recovery actions were attempted but failed to correct the detected error. Run the audit again.
*RECOVERED - Recovered by automatic recovery actions. No action is needed.
*UA - File entries are in transition and cannot be compared. Run the audit again.

The Option column of the report provides supplemental information about the comparison. Possible values are:
*NONE - No options were specified on the comparison request.
*NOFILECHK - The comparison request included an option that prevented an error from being reported when a file specified in a data group file entry does not exist.
*DGFESYNC - The data group file entry was not synchronized between the source and target systems. This may have been resolved by automatic recovery actions for the audit.

One possible reason why actual configuration data in your environment may not match what is defined to your configuration is that a file was deleted but the associated data group file entries were left intact.
Another reason is that a data group file entry was specified with a member name, but a member is no longer defined to that file. If you use the automatic scheduling and automatic audit recovery functions of MIMIX AutoGuard, these configuration problems can be automatically detected and recovered for you. Table 51 provides examples of when various configuration errors might occur.

Table 51. CHKDGFE - possible error conditions

Result        File exists   Member exists   DGFE exists   DGOBJE exists
*NODGFE       Yes           Yes             No            COOPDB(*YES)
*EXTRADGFE    Yes           Yes             Yes           COOPDB(*NO)
*NOFILE       No            No              Yes           Exclude
*NOMBR        Yes           No              Yes           No entry

When the difference is “not found”

For audits that compare replicated data, a difference indicating the object was not found requires additional explanation. This difference can be returned for these audits:
• For the #FILDTA and #MBRRCDCNT audits, a value of *NF1 or *NF2 for the difference indicator (DIFIND) indicates the object was not found on one of the systems in the data group. The 1 and 2 in these values refer to the system as identified in the three-part name of the data group.
• For the #FILATR, #FILATRMBR, #IFSATR, #OBJATR, and #DLOATR audits, a not found condition is indicated by a value of *NOTFOUND in either the system 1 indicator (SYS1IND) or system 2 indicator (SYS2IND) fields. Typically, the DIFIND field result is *NE.

Audits can report not found conditions for objects that have been deleted from the source system. A not found condition is reported when a delete transaction is in progress for an object eligible for selection when the audit runs. This is more likely to occur when there are replication errors or backlogs, and when policy settings do not prevent audits from comparing when a data group is inactive or in a threshold condition.
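As an illustration of the two reporting styles described in this topic, the following Python sketch checks one output file record for a not found condition. It is only a sketch under stated assumptions: the record is assumed to have already been read into a dictionary keyed by the field names DIFIND, SYS1IND, and SYS2IND, and the function name not_found_system is hypothetical and not part of MIMIX.

```python
from typing import Mapping, Optional


def not_found_system(row: Mapping[str, str]) -> Optional[str]:
    """Report where an object or member was not found, or None.

    Covers both styles described in the text:
    - #FILDTA and #MBRRCDCNT: DIFIND is *NF1 or *NF2.
    - Attribute audits: SYS1IND or SYS2IND is *NOTFOUND.
    The dictionary layout is an assumption made for this example.
    """
    difind = row.get("DIFIND", "").strip()
    if difind == "*NF1":
        return "system 1"
    if difind == "*NF2":
        return "system 2"

    missing_1 = row.get("SYS1IND", "").strip() == "*NOTFOUND"
    missing_2 = row.get("SYS2IND", "").strip() == "*NOTFOUND"
    if missing_1 and missing_2:
        return "both systems"
    if missing_1:
        return "system 1"
    if missing_2:
        return "system 2"
    return None
```

The system numbers returned refer to the systems as identified in the three-part name of the data group, not to the source or target role.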
A scheduled audit will not identify a not found condition for an object that does not exist on either system because it selects existing objects based on whether they are configured for replication by the data group. This is true regardless of whether the audit is automatically submitted or run immediately.

Because a priority audit selects already replicated objects, it will not audit objects for which a create transaction is in progress.

Prioritized audits will not identify a not found condition when the object is not found on the target system because prioritized auditing selects objects based on the replicated objects database. Only objects that have been replicated to the target system are identified in the database. Priority audits can be more likely to report not found conditions when replication errors or backlogs exist.

Interpreting results of audits for record counts and file data

The audits and commands that compare file data or record counts are as follows:
• #FILDTA audit or Compare File Data (CMPFILDTA) command
• #MBRRCDCNT audit or Compare Record Count (CMPRCDCNT) command

Each record in the output files for these audits or commands identifies a file member that has been compared and indicates whether a difference was detected for that member.

What differences were detected by #FILDTA

The Difference Indicator (DIFIND) field identifies the result of the comparison. Table 52 identifies values for the Compare File Data command that can appear in this field.

Table 52. Possible values for Compare File Data (CMPFILDTA) output file field Difference Indicator (DIFIND)

Values - Description
*APY - The database apply (DBAPY) job encountered a problem processing a U-MX journal entry for this member.
*CMT - Commit cycle activity on the source system prevents active processing from comparing records or record counts in the selected member.
*CO - Unable to process selected member. Cannot open file.
*CO (LOB) - Unable to process selected member containing a large object (LOB). The file or the MIMIX-created SQL view cannot be opened.
*DT - Unable to process selected member. The file uses an unsupported data type.
*EQ - Data matches. No differences were detected within the data compared. Global difference indicator.
*EQ (DATE) - Member excluded from comparison because it was not changed or restored after the timestamp specified for the CHGDATE parameter.
*EQ (OMIT) - No difference was detected. However, fields with unsupported types were omitted.
*FF - The file feature is not supported for comparison. Examples of file features include materialized query tables.
*FMC - Matching entry not found in database apply table.
*FMT - Unable to process selected member. File formats differ between source and target files. Either the record length or the null capability is different.
*HLD - Indicates that a member is held or an inactive state was detected.
*IOERR - Unable to complete processing on selected member. Messages preceding LVE0101 may be helpful.
*NE - Indicates a difference was detected.
*NF1 - Member not found on system 1.
*NF2 - Member not found on system 2.
*REP - The file member is being processed for repair by another job running the Compare File Data (CMPFILDTA) command.
*SJ - The source file is not journaled, or is journaled to the wrong journal.
*SP - Unable to process selected member. See messages preceding message LVE3D42 in job log.
*SYNC - The file or member is being processed by the Synchronize DG File Entry (SYNCDGFE) command.
*UE - Unable to process selected member. Reason unknown. Messages preceding message LVE3D42 in job log may be helpful.
*UN - Indicates that the member’s synchronization status is unknown.

See “When the difference is “not found”” on page 302 for additional information.
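When an output file contains many members, it can help to tally the Difference Indicator values and list only the members that need review. The following Python sketch shows one way to do that. It assumes the output file records have already been exported into dictionaries; DIFIND is the field named in Table 52, while the MBR key and the function name summarize_difind are hypothetical. Treating every value that begins with *EQ as needing no review is also an assumption of this example, not a documented MIMIX rule.

```python
from collections import Counter
from typing import Iterable, List, Mapping, Tuple


def summarize_difind(
    rows: Iterable[Mapping[str, str]],
) -> Tuple[Counter, List[Tuple[str, str]]]:
    """Count DIFIND values and collect members that are not *EQ.

    Returns (counts, attention), where attention is a list of
    (member, DIFIND) pairs in the order the rows were read.
    """
    counts: Counter = Counter()
    attention: List[Tuple[str, str]] = []
    for row in rows:
        difind = row["DIFIND"].strip()
        counts[difind] += 1
        # *EQ, *EQ (DATE), and *EQ (OMIT) report no detected difference.
        if not difind.startswith("*EQ"):
            attention.append((row.get("MBR", ""), difind))
    return counts, attention
```

For example, records with the values *EQ, *NE, *EQ (DATE), and *NF2 would produce an attention list containing only the *NE and *NF2 members.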
What differences were detected by #MBRRCDCNT

Table 53 identifies values for the Compare Record Count command that can appear in the Difference Indicator (DIFIND) field.

Table 53. Possible values for Compare Record Count (CMPRCDCNT) output file field Difference Indicator (DIFIND)

Values - Description
*APY - The database apply (DBAPY) job encountered a problem processing a U-MX journal entry for this member.
*CMT - Commit cycle activity on the source system prevents active processing from comparing records or record counts in the selected member.
*EC - The attribute compared is equal to configuration.
*EQ - Record counts match. No difference was detected within the record counts compared. Global difference indicator.
*FF - The file feature is not supported for comparison. Examples of file features include materialized query tables.
*FMC - Matching entry not found in database apply table.
*HLD - Indicates that a member is held or an inactive state was detected.
*LCK - Lock prevented access to member.
*NE - Indicates a difference was detected.
*NF1 - Member not found on system 1.
*NF2 - Member not found on system 2.
*SJ - The source file is not journaled, or is journaled to the wrong journal.
*UE - Unable to process selected member. Reason unknown. Messages preceding LVE3D42 in job log may be helpful.
*UN - Indicates that the member’s synchronization status is unknown.

See “When the difference is “not found”” on page 302 for additional information.

Interpreting results of audits that compare attributes

Each audit that compares attributes does so by calling a Compare Attributes command and places the results in an output file.
Each row in an output file for a Compare Attributes command can contain either a summary record format or a detailed record format. Each summary row identifies a compared object and includes a prioritized object-level summary of whether differences were detected. Each detail row identifies a specific attribute compared for an object and the comparison results.

For example, an authorization list can contain a variable number of entries. When comparing authorization lists, the CMPOBJA command first determines whether both lists have the same number of entries. If they do, it then determines whether both lists contain the same entries. If differences in the number of entries are found or if the entries within the authorization list are not equal, the report indicates that differences were detected. The report does not provide the list of entries. It only indicates that they are not equal in terms of count or content.

You can see the full set of fields in the output file by viewing it from a 5250 emulator.

What attribute differences were detected

The Difference Indicator (DIFIND) field identifies the result of the comparison. Table 54 identifies values that can appear in this field. Not all values may be valid for every Compare command.

When the output file is viewed from a 5250 emulator, the summary row is the first record for each compared object and is indicated by an asterisk (*) in the Compared Attribute (CMPATR) field. The summary row’s Difference Indicator value is the prioritized summary of the status of all attributes checked for the object. When included, detail rows appear below the summary row for the object compared and show the actual result for the attributes compared.

The Priority column in Table 54 indicates the order of precedence MIMIX uses when determining the prioritized summary value for the compared object.

Table 54.
Possible values for output file field Difference Indicator (DIFIND)

Values - Description
*EC - The values are based on the MIMIX configuration settings. The actual values may or may not be equal. Summary record priority: 5
*EQ - Record counts match. No differences were detected. Global difference indicator. Summary record priority: 5
*NA - The values are not compared. The actual values may or may not be equal. Summary record priority: 5
*NC - The values are not equal based on the MIMIX configuration settings. The actual values may or may not be equal. Summary record priority: 3
*NE - Indicates differences were detected. Summary record priority: 2
*NS - Indicates that the attribute is not supported on one of the systems. Will not cause a global not equal condition. Summary record priority: 5
*RCYSBM - Indicates that MIMIX AutoGuard submitted an automatic audit recovery action that must be processed through the user journal replication processes. The database apply (DBAPY) will attempt the recovery and send an *ERROR or *INFO notification to indicate the outcome of the recovery attempt.
*RCYFAILED - Used to indicate that automatic recovery attempts via MIMIX AutoGuard failed to recover the detected difference.
*RECOVERED - Indicates that recovery for this object was successful. Summary record priority: 1
*SJ - Unable to process selected member. The source file is not journaled. Summary record priority: 1
*SP - Unable to process selected member. See messages preceding message LVE3D42 in job log. Summary record priority: 1

Note: The Compare Attribute commands are: Compare File Attributes (CMPFILA), Compare Object Attributes (CMPOBJA), Compare IFS Attributes (CMPIFSA), and Compare DLO Attributes (CMPDLOA).

*UA - Object status is unknown due to object activity. If an object difference is found and the comparison has a value specified on the Maximum replication lag prompt, the difference is seen as unknown due to object activity. This status is only displayed in the summary record.
Summary record priority: 2
Note: The Maximum replication lag prompt is only valid when a data group is specified on the command.
*UN - Indicates that the object’s synchronization status is unknown. Summary record priority: 4

Notes for Table 54:
1. Not all values may be possible for every Compare command.
2. Priorities are used to determine the value shown in output files for Compare Attribute commands.

For most attributes, when the outfile is viewed from a 5250 emulator and a detailed row contains blanks in either the System 1 Indicator or System 2 Indicator field, MIMIX determines the value of the Difference Indicator field according to Table 55. For example, if the System 1 Indicator is *NOTFOUND and the System 2 Indicator is blank (Object found), the resultant Difference Indicator is *NE.

Table 55. Difference Indicator values that are derived from System Indicator values. Columns show the System 1 Indicator, rows show the System 2 Indicator, and each cell shows the resulting Difference Indicator.

System 2 Indicator     Object found (blank)          *NOTCMPD    *NOTFOUND   *NOTSPT     *RTVFAILED  *DAMAGED
Object found (blank)   *EQ / *NE / *UA / *EC / *NC   *NA         *NE         *NS         *UN         *NE
*NOTCMPD               *NA                           *NA         *NE         *NS         *UN         *NE
*NOTFOUND              *NE / *UA                     *NE / *UA   *EQ         *NE / *UA   *NE / *UA   *NE
*NOTSPT                *NS                           *NS         *NE         *NS         *UN         *NE
*RTVFAILED             *UN                           *UN         *NE         *UN         *UN         *NE
*DAMAGED               *NE                           *NE         *NE         *NE         *NE         *NE

When viewed through Vision Solutions Portal, data group directionality is automatically resolved so that differences are viewed as Source and Target instead of System1 and System2.

For a small number of specific attributes, the comparison is more complex. The results returned vary according to parameters specified on the compare request and MIMIX configuration values. For more information about comparison results for journal status and other journal attributes, auxiliary storage pool ID (*ASP), user profile status (*USRPRFSTS), and user profile password (*PRFPWDIND), see the MIMIX Administrator Reference book.
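The prioritized summary described for Table 54 can be pictured as choosing, from the Difference Indicator values of an object's detail rows, the value with the highest precedence. The following Python sketch illustrates that idea only. It assumes that a lower priority number means higher precedence, which the table implies but does not state outright. It omits *RCYSBM, *RCYFAILED, and *RECOVERED, and it does not model how *UA is assigned, since *UA is displayed only in the summary record. The function name summary_difind and the tie-breaking rule (first value encountered) are hypothetical.

```python
from typing import Iterable, Optional

# Summary record priorities as listed in Table 54.
# Assumption: a lower number takes precedence over a higher one.
PRIORITY = {
    "*SJ": 1,
    "*SP": 1,
    "*NE": 2,
    "*UA": 2,
    "*NC": 3,
    "*UN": 4,
    "*EC": 5,
    "*EQ": 5,
    "*NA": 5,
    "*NS": 5,
}


def summary_difind(detail_values: Iterable[str]) -> Optional[str]:
    """Pick the detail-row value that would dominate the summary row.

    Values without a priority in PRIORITY are ignored. When several
    values share the best priority, the first one encountered wins;
    that tie-break is a choice made for this sketch.
    """
    known = [v.strip() for v in detail_values if v.strip() in PRIORITY]
    if not known:
        return None
    return min(known, key=PRIORITY.__getitem__)
```

With this ordering, an object whose detail rows contain *EQ, *NE, and *UN would be summarized as *NE, because priority 2 outranks priorities 4 and 5.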
Where was the difference detected

The System 1 Indicator (SYS1IND) and System 2 Indicator (SYS2IND) fields show the status of the attribute on each system as determined by the compare request. Table 56 identifies the possible values. These fields are available in both summary and detail rows in the output file.

Table 56. Possible values for output file fields SYS1IND and SYS2IND

Value - Description
<blank> - No special conditions exist for this object. Summary record priority: 5
*DAMAGED - Object damaged condition. Summary record priority: 3
*MBRNOTFND - Member not found. Summary record priority: 2
*NOTCMPD - Attribute not compared. Due to MIMIX configuration settings, this attribute cannot be compared. Summary record priority: N/A
*NOTFOUND - Object not found. Summary record priority: 1
*NOTSPT - Attribute not supported. Not all attributes are supported on all IBM i releases. This is the value that is used to indicate an unsupported attribute has been specified. Summary record priority: N/A
*RTVFAILED - Unable to retrieve the attributes of the object. Reason for failure may be a lock condition. Summary record priority: 4

Notes for Table 56:
1. The priority indicates the order of precedence MIMIX uses when setting the system indicator fields in the summary record.
2. A value with a priority of N/A is not used in determining the priority of summary level records.

For comparisons that include a data group, the Data Source (DTASRC) field identifies which system is configured as the source for replication.

What attributes were compared

In each detailed row, the Compared Attribute (CMPATR) field identifies a compared attribute. For more information about identifying attributes that can be compared by each command and the possible values returned, see the MIMIX Administrator Reference book.
“Attributes compared and expected results - #FILATR, #FILATRMBR audits” on page 677

APPENDIX B
IBM Power™ Systems operations that affect MIMIX

The following topics describe how to protect the integrity of your MIMIX environment when you perform operations such as IPLs and IBM i operating system upgrades.

Only basic procedures for a standard one-to-one MIMIX installation are covered. If you are operating in a complex environment (for example, if you have cluster, SAP R/3, IBM WebSphere MQ, or other application considerations), contact your Certified MIMIX Consultant. Ultimately, you must tailor these procedures to suit the needs of your particular environment.

These topics describe MIMIX-specific steps only. Refer to the user manuals that correspond to any additional applications installed in your environment. For instructions on performing IBM Power™ Systems operations, consult your IBM manuals or the IBM Information Center at http://publib.boulder.ibm.com/pubs/html/as400/infocenter.html.

The following topics are included:
• “MIMIX procedures when performing an initial program load (IPL)” on page 310 includes the MIMIX-specific steps for performing an initial program load (IPL) to help ensure the integrity of your MIMIX environment is not compromised.
• “MIMIX procedures when performing an operating system upgrade” on page 311 describes when and how to perform recommended MIMIX-specific steps while performing a standard upgrade of IBM i.
• “MIMIX procedures when upgrading hardware without a disk image change” on page 318 describes MIMIX prerequisites and procedures for performing a hardware upgrade without a disk image change.
• “MIMIX procedures when performing a hardware upgrade with a disk image change” on page 321 describes prerequisites for saving and restoring MIMIX software when upgrading from one system to another.
• “Handling MIMIX during a system restore” on page 325 includes prerequisites for restoring MIMIX software within a MIMIX system pair, to one system from a save of the other system when an environment meets the conditions specified.

MIMIX procedures when performing an initial program load (IPL)

An initial program load (IPL) loads the operating system and prepares the system for user operations. Performing the recommended MIMIX-specific steps can help ensure that objects are not damaged during the IPL and that the integrity of your MIMIX environment is not compromised.

Notes:
• This procedure describes an IPL performed under normal circumstances. It does not address IPL considerations for system switching environments.
• Before beginning this procedure, review your startup procedures to determine whether subsystems will start after the IPL. This startup program is defined in the QSTRUPPGM system value.

To perform an IPL in a MIMIX environment, do the following:
1. End MIMIX from either the source or target system. The End MIMIX (ENDMMX) command attempts to end the MIMIX processes for the installation, including the MIMIX managers and the data groups. Data groups can be ended in an immediate (*IMMED) or controlled (*CNTRLD) manner.
ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)
Note: For more information about the ENDMMX command, see “Commands for ending replication” on page 184.
2. Ensure that all MIMIX jobs are ended before performing this step. End the MIMIX subsystems on both the source and target system. On each system, type the following on a command line and press Enter:
ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)
3. Perform the IPL.
4. If your subsystems do not start during the startup procedures defined in the QSTRUPPGM system value, start the MIMIX subsystems on both the source and target systems. On each system, type the following on a command line and press Enter:
STRSBS SBSD(MIMIXQGPL/MIMIXSBS)
5.
Verify that the communication links start by using the Verify Communications Link (VFYCMNLNK) command.
Note: For more information about the VFYCMNLNK command, see “Verifying a communications link for system definitions” on page 281.
6. Start MIMIX from either the source or target system. The Start MIMIX (STRMMX) command starts the MIMIX processes for the installation, including the MIMIX managers and the data groups.
Note: For more information about the STRMMX command, see “Starting MIMIX” on page 179.

MIMIX procedures when performing an operating system upgrade

This topic describes when and how to perform recommended MIMIX-specific steps while performing a standard upgrade of the IBM i operating system (slip-install, where the IBM i release is upgraded without a restore of the user libraries). Performing these recommended steps can help ensure that MIMIX products start properly once the operating system upgrade is complete.

Table 57 indicates which procedures are needed for different upgrade scenarios. Use these instructions in conjunction with the instructions provided by IBM for upgrading from one IBM i release to another IBM i release.

Table 57. IBM i operating system upgrade scenarios and recommended processes for handling MIMIX during the upgrade

To upgrade - Perform these procedures

Backup system only
1. Perform the preparation steps described in “Prerequisites for performing an OS upgrade on either system” on page 312.
2. Follow the procedure in “MIMIX-specific steps for an OS upgrade on the backup system” on page 313.

Production system only
1. Perform the preparation steps described in “Prerequisites for performing an OS upgrade on either system” on page 312.
2.
Perform one of the following procedures:
• If you need to maintain user access to production applications during the upgrade, perform a planned switch as described in “MIMIX-specific steps for an OS upgrade on the production system with switching” on page 315. Your production operations will be temporarily running on the backup system.
• If you have more flexibility with scheduling downtime, you can perform the upgrade without switching as described in “MIMIX-specific steps for an OS upgrade on the production system without switching” on page 316.

To upgrade: Both backup and production systems
Perform these procedures:
1. Perform the preparation steps described in “Prerequisites for performing an OS upgrade on either system” on page 312.
2. Upgrade the backup system first following the “MIMIX-specific steps for an OS upgrade on the backup system” on page 313. By doing this first, you can ensure that the backup system supports all the capabilities of the production system and you can work through problems or custom operations before affecting your production environment.
3. Once you have verified that the backup system is upgraded and operating as desired, perform one of the following procedures to upgrade IBM i on the production system:
• If you need to maintain user access to production applications during the upgrade, perform a planned switch as described in “MIMIX-specific steps for an OS upgrade on the production system with switching” on page 315. Your production operations will be temporarily running on the backup system.
• If you have more flexibility with scheduling downtime, you can perform the upgrade without switching as described in “MIMIX-specific steps for an OS upgrade on the production system without switching” on page 316.

Prerequisites for performing an OS upgrade on either system

Before you start an upgrade of the IBM i operating system on either system, do the following:
1.
Access Support information on the web as you perform the following steps to ensure that the system is ready to upgrade:
a. Check the compatibility of the operating systems on the production and backup systems, ensuring the systems will meet the requirements of a MIMIX-supported environment once the IBM i operating system upgrade has occurred.
b. Ensure the recommended IBM i PTFs have been applied according to your IBM i version.
c. Ensure the recommended MIMIX service packs have been applied according to your MIMIX version. Review the Read Me document that corresponds to the MIMIX service pack, and check the website for relevant Technical Alerts and FAQs.
2. Review your startup procedures to understand how your environment is configured to start after an IPL. The startup program is defined in the QSTRUPPGM system value. An IBM i upgrade may include rebuilding access paths, converting formats, or performing other operations that must be complete before MIMIX or other applications are started. The upgrade may not complete successfully if your QSTRUPPGM procedures start MIMIX or other applications during an IPL. Ensure that these processes are disabled before continuing with the IBM i upgrade.

MIMIX-specific steps for an OS upgrade on the backup system

Use this procedure to upgrade the operating system on the backup system.

Notes:
• If you plan to upgrade both the production and backup systems during the same scheduled maintenance period, upgrade the backup system first.
• In the following steps, the terms production and backup always refer to the original roles of the systems before upgrading the operating system on either system. The icons at the beginning of some steps show the state of the systems and replication as a result of the action in the step. The arrow in the icon indicates the direction and state of replication for a classic production to backup environment.
• MIMIX Model Switch Framework commands, such as RUNSWTFWK, are typically run from the backup system.

To perform an operating system upgrade of the backup system in a MIMIX environment, do the following:
1. Ensure that you have completed any prerequisite tasks for your upgrade scenario. See Table 57 for a list of required tasks for different upgrade scenarios.
2. End all user applications, user interfaces, and operations actively running on the backup system. Be sure to address the following:
• Disarm any monitors, such as MIMIX Monitor, robot jobs, or other job schedulers.
• Make sure all users are off the system.
Note: For more information, refer to your Runbook, Using MIMIX Monitor, and your applications’ user manuals.
3. End the data groups from either system using the command:
ENDDG DGDFN(*ALL) ENDOPT(*CNTRLD)
Note: For more information about ending data groups, see “Commands for ending replication” on page 184.
4. Wait until the status of each data group becomes inactive (red) by monitoring the status on the Work with Data Groups (WRKDG) display.
Note: For more information about the WRKDG display, see “The Work with Data Groups display” on page 99.
5. If you have applications that use commitment control, ensure there are no open commit cycles. For more information, see “Checking for open commit cycles” on page 183.
a. If an open commit cycle exists, restart the data group and repeat Step 3, Step 4, and Step 5 until there is no open commit cycle for any apply session.
6. Use the following command to end other MIMIX products in the installation library, end the MIMIX managers, and end the RJ link:
ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)
7. Use the following command on the production and backup systems to end the MIMIX subsystems:
ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)
8. Complete the operating system upgrade.
Allow any upgrade conversions and access path rebuilds to complete before continuing with the next step.
Note: During the IBM i upgrade, make sure you perform a system save on the system being upgraded. This step will provide you with a backup of existing data.
9. Ensure the names of the journal receivers match the journal definitions:
a. From the backup system, specify the command:
installation-name/WRKJRNDFN JRNDFN(QAUDJRN *LOCAL)
b. Next to the JRNDFN(QAUDJRN *LOCAL) journal definition, specify 14 (Build) and press F4. Type *JRNDFN for the Source for values parameter and press Enter.
10. Start the MIMIX subsystems on both the production and backup systems using the following command from each system:
STRSBS SBSD(MIMIXQGPL/MIMIXSBS)
11. Perform a normal start of the data groups from either system using the STRMMX command. This step also starts the MIMIX managers.
12. Perform your normal process for validating the IBM i release upgrade.
Notes:
• At your convenience, schedule a switch to verify that your applications function on the new operating system on the backup system.
• After the IBM i upgrade, you may receive object errors for program object types (*PGM) if your source IBM i version is higher than your target IBM i version. MIMIX is unable to save/restore *PGM objects in this case. To avoid these errors, compile the *PGM objects on the source system using the Create Program (CRTPGM) command. On the Target release prompt, specify the IBM i version of your target system.

MIMIX-specific steps for an OS upgrade on the production system with switching

Use this procedure if you need to maintain user access to production applications during the production system upgrade. This procedure temporarily switches production activity to the backup system before the upgrade and switches back to normal operations after the production system upgrade is complete.
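As a planning aid, the steps of the procedure that follows can be collected into a checklist that shows which system each step names. The sketch below is illustrative only and is not part of MIMIX: the Step class, the commands_for and checklist helpers, and the "runbook" and "not stated" labels are hypothetical names chosen for this example. The command strings are copied from the steps in this topic; the action text is a paraphrase.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    number: int             # step number in the procedure
    where: str              # system named in the step text, "runbook", or "not stated"
    action: str             # short paraphrase of the step
    command: Optional[str]  # command text quoted from the step, if any


# Paraphrased from the "with switching" procedure; commands are copied as written.
SWITCH_UPGRADE_STEPS: List[Step] = [
    Step(2, "runbook", "Planned switch to the backup system, without the synchronize phase", None),
    Step(3, "not stated", "End MIMIX products and the RJ link",
         "ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)"),
    Step(4, "production", "End the MIMIX subsystems",
         "ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)"),
    Step(5, "backup", "Start applications and allow users to access them", None),
    Step(6, "production", "Complete the operating system upgrade", None),
    Step(7, "production", "Build journal receivers to match the journal definitions",
         "installation-name/WRKJRNDFN JRNDFN(QAUDJRN *LOCAL)"),
    Step(8, "runbook", "Synchronize: start MIMIX subsystems and data groups", None),
    Step(9, "runbook", "Planned switch back to the production system", None),
    Step(10, "not stated", "Validate the IBM i release upgrade", None),
]


def commands_for(where: str, steps: List[Step] = SWITCH_UPGRADE_STEPS) -> List[str]:
    """Return the quoted commands for steps that name the given system."""
    return [s.command for s in steps if s.where == where and s.command]


def checklist(steps: List[Step] = SWITCH_UPGRADE_STEPS) -> List[str]:
    """Render one printable line per step."""
    lines = []
    for s in steps:
        suffix = f" -> {s.command}" if s.command else ""
        lines.append(f"{s.number:>2}. [{s.where}] {s.action}{suffix}")
    return lines


if __name__ == "__main__":
    print("\n".join(checklist()))
```

Listing the commands per system, for example with commands_for("production"), gives a quick cross-check against your Runbook before the maintenance window. It does not replace the Runbook or the detailed steps in this topic.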
Notes:
• In the following steps, the terms production and backup always refer to the original roles of the systems before upgrading the operating system on either system. The icons at the beginning of some steps show the state of the systems and replication as a result of the action in the step. The arrow in the icon indicates the direction and state of replication for a classic production to backup environment.
• MIMIX Model Switch Framework commands, such as RUNSWTFWK, are typically run from the backup system. You can find more information about using the RUNSWTFWK command in the Using MIMIX Monitor book.

To perform an operating system upgrade of the production system in a MIMIX environment while maintaining availability, do the following:
1. Ensure that you have completed any prerequisite tasks for your upgrade scenario. See Table 57 for a list of required tasks for different upgrade scenarios.
2. Use the procedures in your Runbook to perform a planned switch to the backup system.
Note: Do not perform the synchronize phase of your switch procedures.
If you do not have a Runbook, you need to follow your processes for the following:
• End all user applications, user interfaces, and operations actively running on the production system. Disarm any monitors, robot jobs, or other job schedulers and make sure all users are off the system.
• Resolve any errors in MIMIX and perform a controlled end of the data groups.
• Perform a planned switch to the backup system.
3. End MIMIX products in the installation library and end the RJ link using the command:
ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)
Note: For more information about the ENDMMX command, see “Commands for ending replication” on page 184.
4. From the production system, use the following command to end the MIMIX subsystems:
ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)
5.
Start applications on the backup system and allow users to access their applications from the backup system.
6. On the production system, complete the operating system upgrade. Allow any upgrade conversions and access path rebuilds to complete before continuing with the next step.
Note: During the IBM i upgrade, make sure you perform a system save on the system being upgraded. This step will provide you with a backup of existing data.
7. Ensure the names of the journal receivers match the journal definitions:
a. From the original production system, specify the command:
installation-name/WRKJRNDFN JRNDFN(QAUDJRN *LOCAL)
b. Next to the JRNDFN(QAUDJRN *LOCAL) journal definition, specify 14 (Build) and press F4. Type *JRNDFN for the Source for values parameter and press Enter.
8. Follow your Runbook procedures to perform a synchronization. If you do not have a Runbook, you need to follow your processes for the following:
• Starting MIMIX subsystems
• Starting data groups
9. Follow your Runbook procedures to perform a planned switch back to the production system and start replication. If you do not have a Runbook, you need to follow your processes to switch replication so that you return to your normal replication environment.
10. Perform your normal process for validating the IBM i release upgrade.
Note: After the IBM i upgrade, you may receive object errors for program object types (*PGM) if your source IBM i version is higher than your target IBM i version. MIMIX is unable to save/restore *PGM objects in this case. To avoid these errors, compile the *PGM objects on the source system using the Create Program (CRTPGM) command. On the Target release prompt, specify the IBM i version of your target system.

MIMIX-specific steps for an OS upgrade on the production system without switching

Use this procedure if you have more flexibility with scheduling downtime and can perform the upgrade without switching.
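For environments that use commitment control, Steps 3 through 5 of the procedure that follows form a loop: end the data groups, wait for them to become inactive, check for open commit cycles, and restart and repeat if any remain. The sketch below models only that control flow. It is not MIMIX code: the four callables are hypothetical stand-ins for the operator actions (or for automation you write yourself), and max_attempts is an assumed safety limit that this guide does not specify.

```python
from typing import Callable


def end_until_no_open_commits(
    end_data_groups: Callable[[], None],         # stands in for ENDDG DGDFN(*ALL) ENDOPT(*CNTRLD)
    wait_until_inactive: Callable[[], None],     # stands in for watching the WRKDG display
    has_open_commit_cycles: Callable[[], bool],  # stands in for the open commit cycle check
    restart_data_groups: Callable[[], None],     # stands in for restarting the data group
    max_attempts: int = 5,                       # assumed limit, not from the guide
) -> int:
    """Repeat Steps 3-5 until no open commit cycle remains.

    Returns the number of passes that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        end_data_groups()                   # Step 3
        wait_until_inactive()               # Step 4
        if not has_open_commit_cycles():    # Step 5
            return attempt
        restart_data_groups()               # Step 5a, then repeat from Step 3
    raise RuntimeError(f"open commit cycles remain after {max_attempts} attempts")


if __name__ == "__main__":
    # Simulated run: the first check finds an open commit cycle, the second does not.
    answers = iter([True, False])
    passes = end_until_no_open_commits(
        lambda: print("end data groups"),
        lambda: print("wait for inactive status"),
        lambda: next(answers),
        lambda: print("restart data groups"),
    )
    print("passes needed:", passes)
```

Passing the actions in as callables keeps the loop independent of how each action is carried out, so the same outline applies whether an operator performs the steps by hand or a site-specific script does.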
Notes:
• In the following steps, the terms production and backup always refer to the original roles of the systems before upgrading the operating system on either system. The icons at the beginning of some steps show the state of the systems and replication as a result of the action in the step. The arrow in the icon indicates the direction and state of replication for a classic production to backup environment.
• MIMIX Model Switch Framework commands, such as RUNSWTFWK, are typically run from the backup system. You can find more information about using the RUNSWTFWK command in the Using MIMIX Monitor book.

To perform an operating system upgrade of the production system in a MIMIX environment without switching, do the following:
1. Ensure that you have completed any prerequisite tasks for your upgrade scenario. See Table 57 for a list of required tasks for different upgrade scenarios.
2. End all user applications, user interfaces, and operations actively running on the production system. Be sure to address the following:
• Disarm any monitors, such as MIMIX Monitor, robot jobs, or other job schedulers.
• Make sure all users are off the system.
Note: For more information, refer to your Runbook, Using MIMIX Monitor, and your applications’ user manuals.
3. End the data groups from either system using the command:
ENDDG DGDFN(*ALL) ENDOPT(*CNTRLD)
Note: For more information about ending data groups, see “Commands for ending replication” on page 184.
4. Wait until the status of each data group becomes inactive (red) by monitoring the status on the Work with Data Groups (WRKDG) display.
Note: For more information about the WRKDG display, see “The Work with Data Groups display” on page 99.
5. If you have applications that use commitment control, ensure there are no open commit cycles. For more information, see “Checking for open commit cycles” on page 183.
a.
If an open commit cycle exists, restart the data group and repeat Step 3, Step 4, and Step 5 until there is no open commit cycle for any apply session.
6. Use the following command to end other MIMIX products in the installation library, end the MIMIX managers, and end the RJ link:
ENDMMX ENDOPT(*CNTRLD) ENDRJLNK(*YES)
7. End the MIMIX subsystems on the production system and on the backup system. On each system, type the following on a command line and press Enter:
ENDSBS SBS(MIMIXSBS) OPTION(*IMMED)
8. Complete the operating system upgrade. Allow any upgrade conversions and access path rebuilds to complete before continuing with the next step.
Note: During the IBM i upgrade, make sure you perform a system save on the system being upgraded. This step will provide you with a backup of existing data.
9. Start the MIMIX subsystems on the production system and the backup system as you would during the synchronization phase of a switch. From each system, type the following on a command line and press Enter:
STRSBS SBSD(MIMIXQGPL/MIMIXSBS)
10. Ensure the names of the journal receivers match the journal definitions:
a. From the production system, specify the command:
installation-name/WRKJRNDFN JRNDFN(QAUDJRN *LOCAL)
b. Next to the JRNDFN(QAUDJRN *LOCAL) journal definition, specify 14 (Build) and press F4. Type *JRNDFN for the Source for values parameter and press Enter.
c. Record the newly attached journal receiver name by placing the cursor on the posted message and pressing F1 or Help.
11. Using the information you gathered in Step 10, start each data group as follows. (This step also starts the MIMIX managers.)
a. From the WRKDG display, type a 9 (Start DG) next to the data group and press Enter. The Start Data Group display appears.
b. At the Object journal receiver prompt, specify the receiver name recorded in Step 10c.
c. At the Object large sequence number prompt, specify *FIRST.
d.
At the Clear pending prompt, specify *YES.
12. Start any applications that you disabled prior to completing the IBM i upgrade according to your Runbook instructions. These applications are normally started in the program defined in the QSTRUPPGM system value. Allow users back on the production system.
Note: After the IBM i upgrade, you may receive object errors for program object types (*PGM) if your source IBM i version is higher than your target IBM i version. MIMIX is unable to save/restore *PGM objects in this case. To avoid these errors, compile the *PGM objects on the source system using the Create Program (CRTPGM) command. On the Target release prompt, specify the IBM i version of your target system.

MIMIX procedures when upgrading hardware without a disk image change

This topic describes MIMIX prerequisites and procedures for a hardware upgrade without a disk image change that will change a model, feature, or serial number and require a new license key. Performing these steps can ensure that MIMIX products start properly once the hardware upgrade is complete.

Considerations for performing a hardware system upgrade without a disk image change

Before you start a hardware upgrade on either system, consider the following:
• Ensure the new system is compatible with and meets the requirements for a MIMIX-supported environment. For more information, see the Supported Environments Matrix in the Technical Documents section of Support Central.
• Apply the latest MIMIX fixes on both systems. The fixes are available by product in the Downloads section of Support Central.
• Obtain new MIMIX product license keys. These codes are required for products when a model, feature, or serial number changes. For more information, see “Working with license keys” in the License and Availability Manager book.
• Determine whether a planned switch is required prior to the hardware upgrade.
For example, a switch would be necessary if the source system is being upgraded and users need to continue working while the upgrade takes place. To perform a switch, follow the steps in your runbook. For more information, see “Switching” on page 244.
• Determine if the transfer definitions need to be changed. For example, transfer definitions would need to be changed if the IP addresses or host names change. For more information, see “Configuring transfer definitions” in the MIMIX Administrator Reference book.

MIMIX-specific steps for a hardware upgrade without a disk image change

Use this procedure to restart your MIMIX installation when updating your hardware. If you have special considerations, contact your Certified MIMIX Consultant for assistance.

Before you begin, ensure that the considerations in “Considerations for performing a hardware system upgrade without a disk image change” on page 318 have been reviewed and completed where applicable.

Hardware upgrade without a disk image change - preliminary steps

To perform this portion of the upgrade process, do the following on the system prior to the upgrade:
1. Ensure MIMIX is operating normally before performing the upgrade. There should be no files or objects in error and all transactions should be caught up. See “Resolving common replication problems” on page 207 for more information about resolving problems.
2. Ensure users are logged off the system and perform a controlled shutdown of all MIMIX data groups. For more information, see “Ending a data group in a controlled manner” on page 195.
3. End all MIMIX products. For more information, see “Ending MIMIX” on page 179.
4. Print the status information for each data group by doing the following:
a. From the Work with Data Groups display, type 8 (Display detail status) next to each data group.
b. Press Enter.
c. Press F7 for object status and print the display. Keep the information for later use.
d.
Press F8 for database status and print the display. Keep the information for later use.
5. Optional step: Save the MIMIX software by doing a full system save or by saving all MIMIX installation libraries:
• LAKEVIEW
• MIMIXQGPL
• MIMIX-installation-library
• MIMIX-installation-library_0
• MIMIX-installation-library_1
• /LakeviewTech (directory tree)
6. Optional step: If upgrading the source system and performing a switch, follow the steps in your runbook. For more information, see “Switching” on page 244.

Hardware upgrade without a disk image change - subsequent steps

To perform this portion of the upgrade process, do the following on the system after the upgrade has been completed:
1. Optional step: Update any transfer definitions that require changes. For more information, see “Configuring transfer definitions” in the MIMIX Administrator Reference book.
2. Enter the new product license key on the system. Do the following:
a. From the MIMIX main menu select option 31 (Product Management Menu). The License Manager Main Menu appears.
b. Select option 1 (Update license key). The Update License Keys (UPDLICKEY) command appears. Follow the instructions displayed for obtaining license keys. For more information, see “Obtaining license keys using UPDLICKEY command” in the License and Availability Manager book.
3. Confirm that communications work between the new system and other systems in the MIMIX environment. For more information, see “Verifying a communications link for system definitions” on page 281.
4. Optional step: Perform a data group switch by following the steps in your runbook, then skip to Step 6. See “Considerations for performing a hardware system upgrade without a disk image change” on page 318 to determine if a switch is required.
5. Use the Start Data Group (STRDG) command to start all data groups. For more information, see “Starting and ending replication” on page 169.
6. Run your MIMIX audits to verify the systems are synchronized.
See “Running an audit immediately” on page 131 for more information about running audits.

MIMIX procedures when performing a hardware upgrade with a disk image change

When a hardware upgrade is being performed on a system, MIMIX software may need to be saved from the system being replaced and then restored to the system that is its replacement. The saved MIMIX information must be restored on a system that performs the same role within MIMIX operations. For example, if the network system is being replaced, MIMIX software must be saved from the network system and restored on the new network system. A network system cannot be restored to a new management system.

This topic describes steps to consider prior to saving and restoring MIMIX software when upgrading from one system to another. Performing these steps can ensure that MIMIX products start properly once the hardware upgrade is complete.

IMPORTANT! To ensure the integrity of your data, contact your Certified MIMIX Consultant for assistance performing a hardware upgrade.

Considerations for performing a hardware system upgrade with a disk image change

Before you start a hardware upgrade on either system, consider the following:
• Contact your Certified MIMIX Consultant prior to performing the upgrade for instructions that may be specific to your environment.
• Ensure the new system is compatible with and meets the requirements for a MIMIX-supported environment. For more information, see the Supported Environments Matrix in the Technical Documents section of Support Central.
• Apply the latest MIMIX fixes on both systems. The fixes are available by product in the Downloads section of Support Central.
• Obtain new MIMIX product license keys. These codes are required for products when a model, feature, or serial number changes. For more information, see “Working with license keys” in the License and Availability Manager book.
• Determine whether a planned switch is required prior to the hardware upgrade. For example, a switch would be necessary if the source system is being upgraded and users need to continue working while the upgrade takes place. To perform a switch, follow the steps in your runbook. For more information, see “Switching” on page 244.
• Determine if the transfer definitions need to be changed. For example, transfer definitions would need to be changed if the IP addresses or host names change. For more information, see “Configuring transfer definitions” in the MIMIX Administrator Reference book.
• Copy all automation for MIMIX to the new machine, including exit programs.
• Transfer any modifications of programs such as QSTARTUP to the new system. Modifications may be needed to start the MIMIX subsystem after an IPL. Refer to your Runbook for an overview of the required automation changes that need to be performed on the system.

MIMIX-specific steps for a hardware upgrade with a disk image change

Use this procedure to save and restore your MIMIX installation when updating your hardware with a disk image change. If you have special considerations, contact your Certified MIMIX Consultant for assistance.

Before you begin, ensure that the considerations in “Considerations for performing a hardware system upgrade with a disk image change” on page 321 have been reviewed and completed where applicable.

Hardware upgrade with a disk image change - preliminary steps

To perform the save portion of the upgrade process, do the following on the old system prior to the upgrade:
1. Ensure MIMIX is operating normally before performing the upgrade. There should be no files or objects in error and all transactions should be caught up. See “Resolving common replication problems” on page 207 for more information about resolving problems.
2. Ensure users are logged off the system and all applications have ended.
Perform a controlled shutdown of all MIMIX data groups. For more information, see “Ending a data group in a controlled manner” on page 195.
3. Optional step: Perform a data group switch by following the steps in your runbook. See “Considerations for performing a hardware system upgrade with a disk image change” on page 321 to determine if a switch is required.
4. End all MIMIX products. For more information, see “Ending MIMIX” on page 179.
5. Ensure there are no open commit cycles. For more information, see “Checking for open commit cycles” on page 183.
a. If open commit cycles exist, restart the data group and repeat Step 4 to end all MIMIX products.
6. Print the status information for each data group by doing the following:
a. From the Work with Data Groups display, type 8 (Display detail status) next to each data group.
b. Press Enter.
c. Press F7 for object status and print the display. Keep the information for later use.
d. Press F8 for database status and print the display. Keep the information for later use.
7. Print the list of system values. Type the following on a command line and press Enter:
WRKSYSVAL SYSVAL(*ALL) OUTPUT(*PRINT)
8. Save the MIMIX software from the old system by doing a full system save or by saving all MIMIX installation libraries:
• LAKEVIEW
• MIMIXQGPL
• MIMIX-installation-library
• MIMIX-installation-library_0
• MIMIX-installation-library_1
• /LakeviewTech (directory tree)

Hardware upgrade with a disk image change - subsequent steps

To perform this portion of the upgrade process, do the following after you have upgraded and restored all user data, including all MIMIX libraries:
Note: To ensure that journaling is properly started, restore journals and journal receivers before restoring user data.
1.
Ensure the following system values are set the same way on the new system as they were on the old system: QAUDCTL, QAUDLVL, QALWOBRST, QALWUSRDMN, and QLIBLCKLVL.
2. On a command line, type LAKEVIEW/UPDINSPRD and press Enter.
3. Enter the new product license key on the system. Do the following:
a. From the MIMIX main menu select option 31 (Product Management Menu). The License Manager Main Menu appears.
b. Select option 1 (Update license key). The Update License Key (UPDLICKEY) command appears. Follow the instructions displayed for obtaining license keys. For more information, see “Obtaining license keys using UPDLICKEY command” in the License and Availability Manager book.
4. On a command line, type CALL MXXPREG and press Enter to register the MIMIX exit points in the system registry.
5. Update any transfer definitions that require changes. For more information, see “Considerations for performing a hardware system upgrade with a disk image change” on page 321.
6. Confirm that communications work between the new system and other systems in the MIMIX environment. For more information, see “Verifying a communications link for system definitions” on page 281.
7. Ensure all automation, including MIMIX exit programs, for MIMIX is available and configured on the new system.
8. Make any necessary modifications to the QSTARTUP program. This may need to be modified to start the MIMIX subsystem after an IPL. For more information, see “Considerations for performing a hardware system upgrade with a disk image change” on page 321.
9. Start the MIMIX subsystem with the following command: STRSBS SBSD(MIMIXQGPL/MIMIXSBS).
10. Optional step: Perform a data group switch by following the steps in your runbook, then skip to Step 13. See “Considerations for performing a hardware system upgrade with a disk image change” on page 321 to determine if a switch is required.
11. Start the system manager with the following command: STRMMXMGR SYSDFN(*ALL) MGR(*SYS).
12. Start MIMIX with the following:

If the source system was upgraded
a. On the source system, type WRKJRNDFN JRNDFN(*ALL *LOCAL) on a command line, and press Enter.
b. Press F10 to verify the Receiver Prefix, Library, and all other parameters (option 5) are correct. Make any necessary changes from the MIMIX management system before continuing.
c. For each journal definition that has an RJ Link parameter value of *SRC or *NONE do the following:
• Type option 14 and press F4=PROMPT.
• Type *JRNDFN for the Source for values parameter and press Enter.
• Record the newly attached journal receiver name by placing the cursor on the posted message and pressing F1 or Help.
d. For each data group, run the following Verify Journaling File Entry (VFYJRNFE) command to ensure that the file entries for that data group are journaled to the correct journal and that the journal options are the same as those configured for the data group: VFYJRNFE DGDFN(DGNAME) FILE1(*ALL).
e. Start the data groups with a clear pending start from the receivers recorded in Step c of this procedure:
STRDG DGDFN(data-group-name) DBJRNRCV(user-journal-receiver) DBSEQNBR2(*FIRST) OBJJRNRCV(security-journal-receiver) OBJSEQNBR2(*FIRST) CLRPND(*YES)
f. Delete any old receivers with different library or prefix names.
g. User and application activity can be resumed on the system.

If the target system was upgraded
a. On the target system, type WRKJRNDFN JRNDFN(*ALL *LOCAL) on a command line, and press Enter.
b. Press F10 to verify the Receiver Prefix, Library, and all other parameters (option 5) are correct. Make any necessary changes from the MIMIX management system before continuing.
c. Type option 14 for each journal definition that has an RJ Link parameter value of *SRC or *NONE. Do not press Enter.
d. On the command line, type JRNVAL(*JRNDFN) and press Enter to build a new journal receiver for the journal definitions.
e.
Type WRKJRNDFN JRNDFN(*ALL *LOCAL) RJLNK(*TGT) on a command line, and press Enter.
f. For each journal definition listed, do the following:
• Type option 17 (Work with jrn attributes) and press Enter.
• Type option 15 (Work with receiver directory) and press Enter.
• Type option 4 (Delete) for all receivers in the list. If message CPA7025 is issued, reply with an “I”.
g. For each data group, run the following Verify Journaling File Entry (VFYJRNFE) command to ensure that the file entries for that data group are journaled to the correct journal and that the journal options are the same as those configured for the data group: VFYJRNFE DGDFN(DGNAME) FILE1(*ALL).
h. Use this Start Data Group (STRDG) command to start all data groups with the information collected in Step 6 of “Hardware upgrade with a disk image change - preliminary steps” on page 322:
STRDG DGDFN(data-group-name) DBJRNRCV(last-processed-data-base-journal-receiver) DBSEQNBR2(last-processed-data-base-sequence-number) OBJJRNRCV(last-processed-object-journal-receiver) OBJSEQNBR2(last-processed-object-sequence-number) CLRPND(*YES)
i. Delete any old receivers with different library or prefix names.
13. Run your MIMIX audits to verify the systems are synchronized. See “Running an audit immediately” on page 131 for more information about running audits.

Handling MIMIX during a system restore

Occasionally, an entire system may need to be restored because of a system failure, such as a processor or operating system (OS) failure. This topic includes prerequisites for restoring MIMIX software within a MIMIX system pair (two systems using the same MIMIX installation) to one system from a save of the other system.

A system restore may need to be performed when the following conditions exist:
• The original production system, including the Licensed Internal Code and the OS, has been recovered from the backup system by tape.
• The IBM installed release level is the same on each system.

IMPORTANT! To ensure the integrity of your data, contact your Certified MIMIX Consultant for assistance performing a system restore.

For information about MIMIX-supported environments, see the Supported Environments Matrix in the Technical Documents section of Support Central.

Prerequisites for performing a restore of MIMIX

Before you restore MIMIX on a system, consider the following steps, which help ensure that MIMIX products start properly once the restore is complete:
• Contact your Certified MIMIX Consultant prior to performing the restore for instructions that may be specific to your environment.
• Locate your MIMIX product license keys. These keys may be required after the restore. For more information, see “Working with license keys” in the Using License Manager book.

Index

Symbols *ATTN application group 65 managers for node 67 monitors 62 replication 69 *CANCEL, step status of 88 *CANCELED, procedure status of 82, 91 *FAILED activity entry 224, 227 *FAILED status procedure 82, 91 step 88 *HLD file entry 210 tracking entry 219 *HLDERR file entry 210 tracking entry 219 *HLDRLTD file entry 210 *INACTIVE application group 66 node managers 67 replication 69 *MSGW status procedure 81 step 87 *UNKNOWN 66 A accessing MIMIX Availability Status display 93 MIMIX Main Menu 24 activity entries, object confirm delay/retry cycle 228 failed, resolving 224 remove history 229 retrying 227 additional resources 13 application group resolving reported problems 64 status of 60 application group definition 17, 20 application node status 65 applications, reducing contention with 279 audit #DGFE considerations 127 #DLOATR considerations 127 #FILDTA considerations 127 #IFSATR considerations 127 #MBRRCDCNT considerations 127 after a configuration change 126 authority level to run 23 automatic starting of 124 before switching 126 best practice 23, 126, 145 bi-directional environment considerations 27
change history retention criteria 43 changing schedule 41 compare phase 123 comparison levels 53 compliance 144 compliance threshold 52, 53 definition of 17 differences, resolving 133 displaying compliance status 145 displaying history 137 displaying runtime status 129 displaying schedule 147 displaying time of next scheduled run 147 displaying when automatic audits run 147 ending 136 history 137 job log 135 last performed 145 last successful run 144 no objects selected 139 objects compared 139 policies which affect 36 policies, runtime behavior 36 policies, submitting automatically 37 prevent from running 45 priority selection example 139 priority, default settings of 37 problems reported in installation 99 recovery phase 123 results 133 results recommendations 127 retain history of 54 rule name 39 running immediately 131 schedule 147 schedule, changing 41 scheduled, default settings of 37 status from 5250 emulator 96 status, compliance 144 status, runtime 129 summary 129 three or more node considerations 27 when not to audit 28 326 audit history change criteria 43 audit level best practice 53 changing before switch 43 audit results 133 #DGFE rule 300 #FILDTA rule 303 #MBRRCDCNT rule 303 interpreting, attribute comparisons 306 interpreting, file data comparisons 303 resolving problems 133, 300 troubleshooting 135 auditing level, object set when starting a data group 174 used for replication 231 authority level for product access 23 AutoGuard, MIMIX 17 automatic error recovery replication, policies for 32 system journal replication 34 user journal replication 33 automatic recovery audits 50 concept 17 system journal replication 50 user journal replication 50 AutoNotify feature, MIMIX 163 Availability Status display, MIMIX 93 B backlog starting shared object send job 175 system manager 151 backlog, identifying a 115 backup node sequence changing 71 examples of changing 73 verifying 70 backup system 18 best practice audit frequency 145 audit level 53, 126 audit level 
before switch 43, 53, 126 audit threshold 52, 53 switch frequency 244, 253 switch threshold 56 switching 245 best practices auditing 126 bi-directional environment policy considerations 27 C cancel procedure 92 clear error entries processing 175 when to 181 clear pending entries check for open commits 183 open commit cycle prevents 183 processing 175 resolving open commits before 183 when to 181 cluster services 21 cold start, replacement for 175 collector services 21 ending 153 starting 152 status 149 collector services status 67 command, by name Work with Audit History 137 commands, by mnemonic CHGDG 270 CHGPROCSTS 89 CHKDGFE 283, 300 CNLPROC 92 CRTDGTSP 291 DLTDGTSP 293 DSPDGSTS 105 DSPDGTSP 293 DSPMMXMSGQ 208 DSPRJLNK 264 ENDAG 169 ENDDG 169, 184, 190 ENDJRNFE 236 ENDJRNIFSE 239 ENDJRNOBJE 242 ENDJRNPF 236 ENDMMX 169, 184, 192 ENDRJLNK 274 ENDSVR 261 HLDDGLOG 294 MIMIX 24 RLSDGLOG 294 327 RUNPROC 90 STRAG 169, 171 STRDG 169, 171, 174 STRJRNFE 235 STRJRNIFSE 238 STRJRNOBJE 241 STRMMX 169, 171, 179 STRRJLNK 274 STRSVR 260 SWTDG 255, 257 VFYCMNLNK 281, 282 VFYJRNFE 237 VFYJRNIFSE 240 VFYJRNOBJE 243 VFYKEYATR 289 WRKAG 60 WRKAUDHST 137 WRKAUDOBJ 139 WRKAUDOBJH 142 WRKCPYSTS 263 WRKDG 99 WRKDGACT 224 WRKDGACTE 225 WRKDGFE 210 WRKDGIFSTE 219 WRKDGOBJTE 219 WRKDGTSP 291 WRKDTARGE 68 WRKMMXSTS 93, 164 WRKMSGLOG 209 WRKNFY 160 WRKNODE 66 WRKPROCSTS 78 WRKRJLNK 265, 267 WRKSTEPSTS 83 commands, by name Cancel Procedure 92 Change Data Group 270 Change Procedure Status 89 Check Data Group File Entries 283, 300 Create Data Group Timestamps 291 Delete DG Timestamps 293 Display Data Group Status 105 Display Data Group Timestamps 293 Display MIMIX Message Queue 208 Display RJ Link 264 End Application Group 169 End Data Group 169, 184, 190 End Journal Physical File 236 End Journaling File Entry 236 End Journaling IFS Entries 239 End Journaling Obj Entries 242 End Lakeview TCP Server 261 End MIMIX 169, 184, 192 End RJ Link 274 Hold Data Group Log 294 MIMIX 24 MIMIX Availability 
Status 93 Release Data Group Log 294 Run Procedure 90 Start Application Group 169, 171 Start Data Group 169, 171, 174 Start Journaling File Entry 235 Start Journaling IFS Entries 238 Start Journaling Obj Entries 241 Start Lakeview TCP Server 260 Start MIMIX 169, 171, 179 Start RJ Link 274 Switch Data Group 255, 257 Verify Communications Link 281, 282 Verify Journaling File Entry 237 Verify Journaling IFS Entries 240 Verify Journaling Obj Entries 243 Verify Key Attributes 289 Work with Application Groups 60 Work with Audited Obj. History 142 Work with Audited Objects 139 Work with Copy Status 263 Work with Data Group Activity 224 Work with Data Groups 99 Work with Data Rsc. Grp. Ent. 68 Work with DG Activity Entries 225 Work with DG File Entries 210 Work with DG IFS Tracking Ent. 219 Work with DG Obj Tracking Ent. 219 Work with DG Timestamps 291 Work with Message Log 209 Work with MIMIX Availability Status 164 Work with Node Entries 66 Work with Notifications 160 Work with Procedure Status 78 Work with RJ Links 265, 267 Work with Step Status 83 commit cycles effect on audit comparison 303, 304 commit cycles, open checking for 183 checking for after a controlled end 196 preventing problems with 187 preventing STRDG request 183 commit mode change prevents starting with open commits 183 communications ending TCP server 261 starting TCP server 260 compare phase 123 compliance audit 144 concept 125 switch 253 switch, policies for 49 compliance status switch 253 concepts auditing 122 MIMIX 17 configuration audit after changing 126 determining data areas and data queues 272 determining, IFS objects 271 results of #DGFE audit after changing 300 configuration changes deployed 174 contacting Vision Solutions 14 contention with applications, reducing 279 controlled end confirm end 196 description 186 procedure 195 wait time 187 cooperative processing 20 copying active files 263 correcting file-level errors 216 record-level errors 217 CustomerCare 14 D data areas and data
queues determining configuration of 272 holding user journal entries for 221 resolving problems 220 tracking entries 219 verifying journaling 243 data group 17 backlogs 115 controlled vs. immediate end 186 definition 20 determining if RJ link used 267 disabling 270 enabling 270 ending considerations 190 ending controlled 195 ending immediately 198 ending selected processes 198 indication of disabled state 269 recovery point cleared 190 starting selected processes 181 state, disabled or enabled 269 status from 5250 emulator 95 status, database view 112 status, detailed 105 status, merged view 106 status, object view 110 status, summary 99 switching 249, 255 timestamps 291 when to exclude from auditing 28 data group entry description 20 data resource group 68 replication status summary 68 database apply (DBAPY) status 113 database apply cache policy 51 database error recovery, automatic 33 definition application group 20 data group 20 journal 20 remote journal (RJ) link 20 system 20 transfer 20 definitions application group 17 delay/retry cycle, confirm object in a 228 differences, resolving audit 133 disabled data group 269 displaying data group spooled file information 262 data group status details 105 long IFS object names 262 RJ link 264 RJ link status 265 status 93 documents, MIMIX 11 E ending audit 136 329 collector services 153 MIMIX managers 152 MIMIXSBS subsystem 193 system and journal managers 152 target journal inspection 154 TCP server 261 ending data group clears recovery point 190 considerations when ending 190 controlled end 195 controlled end wait time 187 controlled vs. immediate 186 how to confirm end 196 immediate end 198 processes 187 processes, effect of 203 processes, specifying selected 198 when to end RJ link 188 ending journaling data areas and data queues 242 files 236 IFS objects 239 IFS tracking entry 239 object tracking entry 242 ending MIMIX 192 controlled vs. 
immediate 186 end subsystem, when to also 193 follow up after 193 included processes 188 using default values 192 using specified values 192 when to end RJ link 188 ending replication 169 choices 184 controlled vs. immediate 186 ending RJ link independently from data group 274 when to end 188 errors file level 216 record level 217 system journal replicated objects 224 target journal of RJ link 288 user journal replicated files 210 user journal replicated objects 219 example priority audit object selection 139 examples changing backup node sequence 73 F file file-level errors 216 hold journal entries 214 new 231 not journaled 102 record-level errors 217 replicated 210 file identifiers (FIDs) 273 file in error examine held journal entries 213 resolving 210 file on hold release and apply held entries 215 release and clear entries 216 release at synchronization point 215 H hardware upgrade MIMIX-specific steps 319 no disk image change 318 prerequisites 318 with a disk image change 321 held error (*HLDERR) file entry 210 preferred action for entry 211, 220 tracking entry 219 history audited object 142 completed audits 137 displaying audit 137 history log, removing completed entries 229 history of, retaining procedures 31 hold (*HLD) preferred action for held entry 211, 220 put file entry on hold 214 put tracking entry on hold 221 release a held file entry 215 release a held tracking entry 223 hold ignore (*HLDIGN) preferred action for ignored entry 211, 220 put file entry on hold ignore 214 put tracking entry on hold ignore 222 hold related (*HLDRLTD) 211 hot backup 15 I i5/OS upgrade 311 IFS objects determining configuration 271 file IDs (FIDs) 273 hold user journal entries for 221 path names 262 resolving problems 220 tracking entries for 219 verifying journaling 240 immediate end description 186 incomplete tracking entry 186 information and additional resources 13 inspection target journal 22 installation, status of from 5250 emulator 93 IPL 310 J job log for audit 135 jobs used by procedures 77 used by procedures, status of 83 journal 19 inspection on target system 22 journal at create requirements 231 requirements and restrictions 232 journal cache or state resolving problems 103, 119 status 117 journal definition 20 defined to RJ Link 268 journal entry description 19 unconfirmed 286 journal manager 21 ending 152 resolving problems 149 starting 152 status 149 journal receiver 19 journaling 19 cannot end 236 data group problem with 101 ending for data areas and data queues 242 ending for IFS objects 239 ending for physical files 236 implicitly started 231 requirements for starting 231 starting for data areas and data queues 241 starting for IFS objects 238 starting for physical files 235 starting, ending, and verifying 230 verifying for data areas and data queues 243 verifying for IFS objects 240 verifying for physical files 237 journaling status data areas and data queues 241 files 235 IFS objects 238 K keyed replication verifying file attributes 289 L last audit performed 144 last switch performed 253 log space 22 long IFS path names 262 M management system 19 manager journal 21 system 21 manager status 67 menu MIMIX Main 24 message queue, primary and secondary 208 messages ENDMMX 172 STRMMX 172 MIMIX AutoGuard 17 MIMIX CDP feature exclude from audit 28 recovery point cleared 190 MIMIX installation 17 MIMIX managers checking for a backlog 151 ending 152 resolving problems 149, 151 starting 152 MIMIX Model Switch Framework 22, 249 policy default 56 MIMIX rules 122 MIMIX subsystem (MIMIXSBS) starting 179 when to end 193 MIMIX Switch Assistant 22 setting default switch framework 48 setting switch compliance policies 49 MMNFYNEWE monitor 163 monitor for newly created objects 163 monitors nodes where needed 62 status of 61 N names, displaying long 262 network system 19 new hardware upgrade 321 MIMIX-specific steps 322 prerequisites 321 new objects IFS object journal at create requirements 231 journal at create
selection criteria 232 newly created objects, notification of 163 node entries 66 node status application group 65 data resource group 68 nodes, policy considerations for multiple 27 notification status 63 notifications definition 18, 159 displaying 160, 164 new problems in installation 99 severity level 125, 161 status 160 O object audited history 142 object auditing concept 19 setting level with STRDG 174 used for replication 231 object error recovery, automatic 34 object send process considerations for starting a shared 175 objects audited object list 139 configuration of non-file 271 displaying long IFS names 262 displaying objects in error 108 displaying objects with active entries 108 in error, resolving 224 new 231 reducing contention 279 tracking entries for data areas and data queues 219 open commit cycles audit results 303, 304 prevent problems with 187 resolving before starting replication 183 shown in status 196 when starting a data group 183 operations common, where to start 98 less common 259 orphaned recoveries 167 output file fields Difference Indicator 303, 306 System 1 Indicator field 308 System 2 Indicator field 308 P path names, IFS 262 planned switch 245 policies 18 audit, automatically submitting 37 audit, runtime behavior of 36 changing values 29 for auditing 36 for replication 32 for switching 48 installation-level only 31 introduction 26 multi-node and bi-directional environment considerations 27 policy action for running audits 54 audit action threshold 53 audit history retention 54 audit level 53 audit notify on success 50 audit rule 50 audit schedule 57 audit warning threshold 52 automatic audit recovery 50 automatic database recovery 50 automatic object recovery 50 CMPRCDCNT commit threshold 56 332 data group definition 50 database apply cache 51 default model switch framework 56 independent ASP library ratio 56 journaling attribute difference action 51 maximum rule runtime 52 notification severity 50 object only on target 51 
prioritized audit in effect 147 procedure history retention 57 run rule on system 54 switch action threshold 56 switch warning threshold 56 synchronize threshold size 55 system journal recovery success 50 third delay retry interval 56 third delay retry interval, number of 55 user journal apply threshold 51 user journal recovery success 50 PPRC replication status 65, 68 problems reporting a problem 278 troubleshoot 276 problems, journaling data areas and data queues 241 files 235 IFS objects 238 problems, resolving audit results 133, 300 common errors 207 common system level errors 149 data group cannot end 280 data group cannot start 285 files in error 210 files not journaled 102 journal cache or state 103, 119 objects in error 224 open commits when starting data group 183 RJ link cannot end 286 RJ link cannot start 286 switch compliance 254 system level processes 149 tracking entries 219 procedure acknowledging failed or canceled 89 begin at step 90, 173, 248 canceling 92 defined 22 displaying status 78 history retention 57 how to run 90 last run of all 78 multiple jobs 77 multiple jobs, status of 83 overriding step attributes 91 resolve problems 80 resuming canceled or failed 91 run type *USER 90 run type other than *USER 90 status 80 status history of a 79 step status 83 procedure history change criteria 31 procedures 77 change history retention criteria 31 history retention 31 processes system level 149 production system 18 publications, IBM 13 Q QDFTJRN data area restrictions 232 role in processing new objects 232 QSTRUPPGM system value 311, 313 R recommendations auditing 126 before planned switch 245 checking audit results 127 policies in bi-directional environment 27 policies in three or more node environment 27 starting shared object send 175 recoveries active in installation 99 definition 18, 160 detected database errors 33 displaying details 164 occurring in installation 164 orphaned 167 orphaned, identifying 167 orphaned,
removing 168 recovery domain changing backup sequence 71 verifying sequence 70 333 recovery phase 123 recovery point cleared by ENDDG 190 release (*RLS) held file entry 215 held tracking entry 223 release clear (*RLSCLR) file entry 216 tracking entry 223 release wait (*RLSWAIT) file entry 215 tracking entry 222 remote journal i5/OS function 19 remote journal (RJ) link 20 remote journal environment processes ended by ENDDG 203 processes started by STRDG 199 unconfirmed journal entry 286 removing activity history entries 229 duplicate tracking entries 223 unconfirmed entries 286 reorganizing, active files 263 replication automatic error recovery 32 backlogs, identifying 115 before starting 171 commands for ending 184 commands for starting 171 direction of 18 ending 169 policies that affect 32 resolve replication errors 207 starting 169 status from 5250 emulator 95 status summary 65, 68 supported paths 15 switching 244 system journal 15, 21 user journal 15, 21 replication path 21 replication, problems troubleshoot 276 where to start 207 requirements audits 126 journal at create 231 journaling 231 resolving problems application group 64 application group *ATTN status 65 application group other problem status values 66 common replication errors 207 data resource group status 68 node entry status 66 system level jobs 149 troubleshooting 276 resource group, data 68 status 68 restore MIMIX prerequisites 325 restrictions journal at create 232 QDFTJRN data area 232 retry objects in error 227 retrying, data group activity entries 227 RJ link 20 displaying 264 ending independently 274 errors for target journal of 288 identifying data groups that use 267 journal definitions by an 268 operating without a data group 274 removing unconfirmed entries 286 status 265 when to end 188 rule #DGFE 39 #DLOATR 39 #FILATR 39 #FILATRMBR 39 #FILDTA 39 #IFSATR 39 #OBJATR 39 rules MIMIX 122 rules, MIMIX descriptions 39 run procedure 90 running audits immediately 131 S schedule automatically 
submitted audits 37 changing audit 41 334 scheduler auditing 124 servers ending TCP 261 starting TCP 260 service cluster 21 status collector 21 services collector, ending 153 collector, starting 152 status from 5250 emulator 96 severity level, notification 161 source system 18 spooled files, displaying MIMIX-created 262 standby journaling IBM i5/OS option 42 117 overview 117 starting collector services 152 MIMIX managers 152 procedure at step 90, 173, 248 RJ link independently 274 system and journal managers 152 target journal inspection 153 TCP server 260 starting data group at specified journal location 175 deploy configuration 174 prevented by open commit cycles 183 procedure 181 processes, effect of 199 set object auditing level 174 when to clear entries 181 starting journaling data areas and data queues 241 file entry 235 files 235 IFS objects 238 IFS tracking entry 238 object tracking entry 241 starting MIMIX included processes 171 procedure 179 starting replication 169 before 171 choices 171 status 60 active file operations 263 application group 60 audit compliance 145 audits 129 audits (runtime) 96 checking from 5250 emulator 93 collector services 67 data group detail 105 data group summary 95 database apply (DBAPY) 113 installation summary 93 journal cache or state 117 journaling data areas and data queues 241 journaling files 235 journaling IFS objects 238 journaling tracking entries 238, 241 monitors 61 node entries 66 notification 160 notification new in installation 96 notifications 63 procedures 78 recoveries active in installation 164 replication 95 replication, application group level 65 replication, data resource group level 68 replication, logical 68 replication, PPRC 68 RJ link 265 services 96 steps in a procedure 83 switch compliance 253 switching 104 system and journal managers 67 system-level processes 149 target journal inspection processes 155 Work with Data Groups display 99 step begin procedure at 90, 173, 248 defined 22 resolve problems 
85 status 83, 85 subsystem, MIMIXSBS ended 217 ending 193 starting 179 Switch Assistant, MIMIX 22 switch framework disable policy when not used 49 specify a default 48 switching 244 application group 250 335 best practice 23, 244, 245, 253 change audit level before 43, 53 compliance 253 conditions that end 257 description, planned switch 245 description, unplanned switch 246 journal analysis after unplanned switch 295 last switch field 253 phases of a 245 policies for 48 problems checking compliance 254 reasons for 245 setting switch compliance policies 49 setting switch framework policy 48 switch framework vs. SWTDG command 249 SWTDG command details 257 unplanned, actions to complete an 246 using option 6 on MIMIX Basic Main Menu 251 using STRDG command 255 synchronize file entry 211 objects, system journal replicated 226 tracking entries 221 system definition 20 system journal replication 15, 21 detailed status 110 errors automatically recovered 34 journaling requirements 231 system level processes 149 system manager 21 backlog 151 ending 152 resolving problems 149, 151 starting 152 status 149 system roles management or network 19 production or backup 18 source or target 18 T target journal inspection 22 last entry inspected 158 results 156 starting 153 status 149, 155 target system 18 threshold audit action 53 audit warning 52 CMPRCDCNT open commit 56 switch action 56 switch warning 56 synchronize size 55 user journal apply 51 timestamps 291 automatically created 291 creating additional 291 deleting 293 displaying 293 printing 293 tips displaying data group spooled files 262 displaying long IFS object names 262 removing journaled changes 294 working with active file operations 263 tracking entry 21 file identifiers (FIDs) 273 IFS 219 incomplete 186 not journaled 102 object 219 removing duplicate 223 transfer definition 20 U unconfirmed journal entries, removing 286 unplanned switch 246 performing journal analysis 295 unprocessed entries 196 upgrade hardware, no 
disk image change 318 hardware, with a disk image change 321 new hardware 321 OS/400 311 user journal replication 15, 21 detailed status 112 errors automatically recovered 33 journaling requirements 231 non-file objects 271 tracking entries 219 tracking entry 21 V verifying communications link 281, 282 journaling, IFS tracking entries 240 journaling, object tracking entries 243 journaling, physical files 237 key attributes 289 viewing status, active file operations 263 W wait time, data group controlled end 187 wait time, data group controlled end during switch 257