Witness Recording Solution: Maintenance, Monitoring, and Backup Recommendations

Maintenance & Monitoring Best Practices & Recommendations v2.3
All Rights Reserved. Proprietary and Confidential
Witness Systems Global Services Program Office

Table of Contents

Maintenance and Monitoring Recommendations
    CSCM
    Balance
    Viewer
    Archive (CAM)
    EWare and Balance SQL Databases
    Unify
    All Servers
    Monitoring Parameters
    Application Specific Monitoring Points
    Resource Management Matrix

Maintenance and Monitoring Recommendations

These recommendations are primarily directed at a Contact Store+ system. They can also be extrapolated to stand-alone systems.

NOTE: It is the responsibility of the customer to provide complete backup and restore capabilities. Even when the recommendations below are followed, a complete server crash may still require a Certified engineer to re-install the application, which would be billable.

CSCM

Pulled from the Witness ContactStore for Communication Manager Planning, Installation and Administration Guide 7.x.

Preventative Maintenance

This section highlights a number of administrative tasks that should be performed on a regular basis to ensure the system continues to operate smoothly.

Daily

Unless you have fully automated alerting of these conditions, carry out the following procedures at the start of each day:

Alarms
Check the Alarms page for new problems.

Disk capacity
Check the available disk space. The disk where recordings are stored will appear to be at or near capacity. However, the system consistently maintains 1 GB of free space by deleting older files. This maximizes the number of recordings that are available online to you. The Witness ContactStore for Communication Manager's disk manager thread deletes files on a FIFO (First In First Out) basis.

Check the contents of the log files as described in Troubleshooting on page 189 of the Witness ContactStore for Communication Manager Planning, Installation and Administration Guide 7.x and examine any errors logged since the previous check. Look at all error and warning messages, not just those generated by the Witness ContactStore for Communication Manager services.

System Status
It is difficult to detect some problems automatically. Check the system status regularly via the Status > System Overview page and verify that all figures are in line with expectations as described in System Overview on page 148.

Confirm channel status
Use the Status > Channels page of the Administration application to confirm that the recording channels are in the appropriate states.

Confirm recording and replay
To confirm recording and replay:
- Verify that calls are being uploaded into the database.
- Use the Replay page to select the most recent calls and verify that they are accessible.
- Confirm that the start time of these calls matches expectations. Verify that the start time corresponds to the most recent calls made on the extensions being recorded.
- Confirm that these calls are playable and that audio quality is good.

Archive
If using DVD+RW archive, check the current disk's available capacity. Change the disk when it fills.

Weekly

As you become comfortable with the normal operation of your recorder, you can reduce the frequency of the daily tasks. For example, if you know that your disk will not fill the available space for several months at its current rate, you can check it weekly.

Perform the following tasks each week:

Disk capacity: main recording store
When your recorder is first installed, the disk is almost empty. As it gradually fills, note the rate at which it is being used (at least weekly) and extrapolate to estimate when the disk will be full. At that point, the Witness ContactStore for Communication Manager will begin deleting the oldest calls to make room for new ones. If this happens to calls that are younger than planned, check the configuration of the recorder to ensure that only the anticipated calls are being recorded. Add disk capacity to the partition before it fills.

Disk capacity: other partitions
Check the available space on any other disk partitions and verify that these drives have sufficient space. The recorder will warn you if they fall below 500 MB of free space. Accumulated temporary files or log files can account for this drop in available space. You may need to purge them manually.

Important: When you are purging files, remember that files you delete go to the Recycle Bin and that the space they occupy is not freed until you empty it.

Call detail database purging
If you have enabled automatic purging of aged call detail records, you should still monitor the size of the calls database during the first few months of use. You can then predict how large the database will get by the time old records begin to be purged. Many customers plan never to purge call detail records, but choose instead to add disk capacity every year or two as the database grows. If you do this, you should upgrade your server every few years to compensate for the increasing size of the database and the reduction in search and update speed.
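The weekly extrapolation of disk and database growth described above can be sketched in Python. This is a minimal illustration and not part of the Witness product: the function name estimate_full_date, the sample readings, and the assumption of linear growth between the first and last observation are all hypothetical.

```python
import math
from datetime import date, timedelta


def estimate_full_date(samples, capacity_gb):
    """Estimate the date on which a partition reaches capacity.

    samples: list of (date, used_gb) observations, oldest first.
    Returns None when usage is flat or shrinking.
    Hypothetical helper: assumes linear growth between the first
    and last sample, which is only a rough approximation.
    """
    (first_day, first_used), (last_day, last_used) = samples[0], samples[-1]
    elapsed_days = (last_day - first_day).days
    growth_gb = last_used - first_used
    if elapsed_days <= 0 or growth_gb <= 0:
        return None
    rate_gb_per_day = growth_gb / elapsed_days
    days_left = math.ceil((capacity_gb - last_used) / rate_gb_per_day)
    return last_day + timedelta(days=max(days_left, 0))


if __name__ == "__main__":
    # Made-up weekly readings (GB used on the recording partition).
    readings = [(date(2024, 1, 1), 100.0), (date(2024, 1, 8), 107.0)]
    print(estimate_full_date(readings, capacity_gb=200.0))
```

The same projection can be applied to the size of the calls database to predict how large it will be when purging begins.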
Configuration Backup
Changes to system configuration that affect user access rights are stored in the PostgreSQL database. This means that the system configuration is backed up whenever the call detail records are. See Backing up the Database below.

Monthly

Check the following aspects of the system on a monthly basis:

Loading trends
Note the total call volumes recorded every month to be aware of gradually increasing traffic trends. To do this:
- Note the number of calls recorded at the end of each month and compare with the previous month's accumulated total.
- Note the age of the oldest call on the disk (only applicable once the disk has filled for the first time).
- Note the CPU load during busy hour.
If it appears that the load is increasing, consider purchasing extra licenses if required and/or increasing server specification or disk space.

Every Six Months

The recorder must perform a full vacuum of the database approximately once every six months. As this interval is reached, the recorder issues a daily warning message. This tells you that it will do a full vacuum on next restart, unless you postpone it by clearing the checkbox on the System Settings > Server page. You must restart the recorder and allow it to perform this database maintenance task within one month of being warned about it.

Backup/Restore

Due to the huge volume of new files created every day, a voice recorder is not backed up in the same way as most application servers. This section guides you through the issues around backing up the application, the call details database and the recordings.

Application
The recorder's configuration is stored in its database (using PostgreSQL), alongside the details of the call recordings. To preserve the configuration of the server, back up the database frequently as described below.
If you have not installed other applications on the server, there is no need to back up the operating system or the recorder software. It is faster to reinstall these server components in the event of disk failure. You should therefore retain the installation media and license key that you used.

Backing up the Database
You can back up your recorder's database using a command line procedure. The procedure uses the PostgreSQL pg_dump command to extract data from the database. It must be executed while the database is running. Do not stop the Witness ContactStore for Communication Manager service or the PostgreSQL service before proceeding.

To back up your PostgreSQL database:
1. Log on as root.
2. Become the database owner by typing:
       su - postgres
3. Create a backup file by entering the command:
       pg_dump --format=c --compress=5 eware > backupfile

Please observe the following guidelines concerning the compression factor:
- 5 is a modest compression factor.
- Using a higher number (maximum is 9) makes the backup slower and uses more resources. However, it results in a smaller backup file.
- Using a smaller number makes the backup faster and uses fewer resources. However, it results in a larger backup file.

Restoring data to a new PostgreSQL database
Note: You can only restore data to the server from which you dumped it, because the dump file stores the software serial number and license key information. These are tied to a MAC address on the recorder. Unless you can move the original NIC into the new server, you will need to obtain a new license key if you wish to restore to different hardware.

The following process erases the default database that exists after a complete re-installation and replaces it with the database that you have backed up.

To restore the database:
1. Re-install the operating system.
2. Log on as root and install the recorder as described in Installing Witness ContactStore for Communication Manager on page 79.
3. Stop the Witness ContactStore for Communication Manager service.
4. Become the database owner by typing:
       su - postgres
5. Drop the existing database by entering the following command:
       dropdb eware
6. Create an empty copy of the PostgreSQL database by entering the following command:
       createdb eware
7. Restore the data by entering the following command:
       pg_restore --dbname=eware --use-set-session-authorization backupfile

Backing up Voice Recordings
The Witness ContactStore for Communication Manager stores voice recordings in the /calls partition. This partition quickly fills up with thousands of directories and millions of files. When the partition is nearly full, the recorder maintains only a tiny amount of free space on the partition by deleting batches of 100 recordings (and the directory that catalogued them) at a time, as it requires space for new recordings. This causes a huge churn of files every day.

Limitations of full and incremental backup procedures
On a Witness Contact Recorder server, two issues make it difficult to back up voice files:
- the file size
- the rate of change of the voice recording files

Together these issues make most traditional backup strategies for the voice recordings ineffective. Traditional full backups are required more frequently than normal, which wastes backup media, and incremental backups are larger than expected because of the large churn of creations and deletions.

For a backup strategy to be successful, it must be easy to restore the data if necessary. Traditional "full plus incremental" backup solutions are ineffective because they cannot complete fully. In the event of a complete disk failure, the process restores the full backup, then the increments in chronological order.
This procedure immediately overflows the disk when the restore program tries to create the increments, because the partition holding the calls is almost at capacity to begin with. The restore of a full plus incremental backup fails because it runs out of disk space before it has processed the "removals" part of the procedure.

Traditional restore procedures are also ineffective. If you use this solution to review a recording that has been deleted because of age, the recorder immediately deletes any restored file as part of its disk maintenance.

Finally, traditional backup solutions often require locks on the disk while they work. This can seriously disrupt the working of the recorder.

Two suitable strategies for audio backup

DVD+RW archive
The simplest and cheapest strategy is to use the built-in DVD+RW archive mechanism. This is not only fully integrated with the workings of the recorder and its search and replay mechanism, but is also well suited to the incremental recording required for a recorder. As recordings are added to the calls path they are copied to DVD in an efficient manner. Even when they have been deleted from the hard disk, the recorder is still able to play them because it knows which DVD they are on and can replay directly from DVD, without an intervening "restoration" step. Each DVD holds about 4 GB, which means it can hold about 150 channel-days' worth of recordings from a busy system. For less than a dollar a day, even a busy system can have limitless backup.

Archive Server
The second most effective strategy is to implement the Archive system. This is a rules-based system. It copies audio files from the Witness ContactStore for Communication Manager onto different, centralized disks. The data on these centralized disks is organized in a more permanent way and is subject to less "churn".

It is possible to pause the Archive Manager when required, so if a backup process requires a disk lock, the downtime does not cause a problem with the server's operation. This pause feature, together with the way Archive organizes the audio on disk, makes this data much more appropriate for traditional full/incremental backup solutions.

Balance

Daily Maintenance
- Daily automated restart of the BDR, Command and Tomcat services. Configure one Windows scheduled task to stop these three services and another scheduled task to start them 10 minutes later. For example, shut down the services at 4:30am and start them at 4:40am. This allows sufficient time for the services to shut down completely and be restarted gracefully.
- Monitoring of performance parameters referenced in the section Application Specific Monitoring Points.

Shutdownservices.cmd
    NET STOP "eQuality BDR Service"
    NET STOP "Apache-Tomcat"
    NET STOP "eQuality Command Server"

Startservices.cmd
    NET START "eQuality BDR Service"
    NET START "Apache-Tomcat"
    NET START "eQuality Command Server"

Weekly Maintenance
- Actuate: schedule runacdefrag.bat, which rebuilds and restarts the Actuate services.

Monthly Maintenance
- Reboot

Folders to back up
    <Installdrive>:\Program Files\Witness\QM
    <Installdrive>:\Tomcat
    <Installdrive>:\Tomcat5025
    <Installdrive>:\eCorder (may be on a separate server)
    Complete registry backup
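The daily stop/start schedule described for the Balance services could be generated with a short Python sketch such as the one below. The service names come from the command files above; the task names, the C:\Scripts paths, and the helper names are hypothetical. The script only prints commands and does not run them, and the schtasks time format should be checked against the Windows version in use (some older releases expect HH:MM:SS).

```python
# Service names as they appear in Shutdownservices.cmd / Startservices.cmd.
SERVICES = ["eQuality BDR Service", "Apache-Tomcat", "eQuality Command Server"]


def service_commands(action):
    """Return the NET STOP or NET START lines for the three Balance services."""
    if action not in ("STOP", "START"):
        raise ValueError("action must be STOP or START")
    # Straight double quotes are required by cmd.exe for names with spaces.
    return [f'NET {action} "{name}"' for name in SERVICES]


def schedule_command(task_name, script_path, start_time):
    """Build a schtasks command that runs script_path daily at start_time."""
    return (
        f'schtasks /Create /TN "{task_name}" /TR "{script_path}" '
        f"/SC DAILY /ST {start_time}"
    )


if __name__ == "__main__":
    # Contents of the two command files.
    print("\n".join(service_commands("STOP")))
    print("\n".join(service_commands("START")))
    # Hypothetical task names and script locations; stop at 4:30am, start at 4:40am.
    print(schedule_command("BalanceStop", r"C:\Scripts\Shutdownservices.cmd", "04:30"))
    print(schedule_command("BalanceStart", r"C:\Scripts\Startservices.cmd", "04:40"))
```

Keeping the service list in one place means the stop and start files cannot drift apart if a service is added later.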
Contact Viewer

Daily Maintenance
No scheduled maintenance is required over and above the monitoring of performance parameters referenced in the section Application Specific Monitoring Points.

Weekly Maintenance
No scheduled maintenance is required over and above the monitoring of performance parameters.

Monthly Maintenance
- Reboot

Folders to back up
    <Installdrive>:\Program Files\ComPlus Applications
    <Installdrive>:\Program Files\Common Files\Avaya
    <Installdrive>:\Program Files\Common Files\Witness Systems
    <Installdrive>:\Program Files\Avaya\Viewer
    Complete registry backup

Archive (CAM)

Daily Maintenance
No scheduled maintenance is required over and above the monitoring of performance parameters referenced in the section Application Specific Monitoring Points.

Weekly Maintenance
- Check Archive Manager Campaign Status:
    Administration -> Systems -> Archive Systems -> Management -> Campaign Configuration -> Campaigns Status
- Check Archive Engineering Configuration:
    Archive Engineering -> Configuration -> Configuration Checker
    Archive Engineering -> Status -> Recordings being fetched / Failure Counts / Last 50 failures / Graphs

Monthly Maintenance
- Reboot

Folder to back up
    <Installdrive>:\Program Files\Witness Systems

EWare and Balance SQL Databases
(May reside on the Contact Viewer and Balance servers)

One Time Reconfiguration
The following is a list of recommended changes that should be applied to all systems to bring them up to a baseline.
1. SQL Server should be configured to dynamically configure its own memory. The maximum amount of memory should not exceed 85% of total available system memory.
2. Minimum query memory should be set to 1024 kilobytes.
3. Nested triggers should be enabled.
4. We recommend the databases be in Simple recovery mode.
5. We recommend that performance condition alerts be set up on the databases to monitor transaction log utilization. Once log utilization reaches 70%, a transaction log backup and truncate should occur.
6. We recommend the default size of the transaction log be set to 20% of the database data size.
7. Statistics should be set to auto update.
8. Statistics should be set to auto create.
9. DB jobs should start and complete during non-production times. System performance will be affected by jobs that run during production times.

Daily Maintenance
1. We recommend a full database backup twice a week, on Sunday AM (Saturday night) and Wednesday AM (Tuesday night), for all system and Witness application databases, including the eWare databases (Audit, Dictionary, EWareCalls, EWareConfig, EyretelSite, License, Media, NGA_SC and UnifyClient) and the Balance database (Witness).
2. We recommend that differential backups occur once a day or every other day of the week for all Witness databases.
3. Explicitly update the statistics on all Witness databases.
4. Monitoring of performance parameters referenced in the section Application Specific Monitoring Points.

Weekly Maintenance
1. Remove unused space from the database only. Leave all log files as they are. Only reduce space if log space grows beyond 40% of data size.
2. Explicitly re-organize index data pages on all Witness databases.
3. Check the database integrity on all Witness databases.

Monthly Maintenance
- Reboot
- Defragment all system hard drives.
- Explicitly rebuild all indexes on all Witness databases.
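The numeric thresholds in the SQL database recommendations above (85% memory cap, 20% default log size, 70% log utilization, 40% log-to-data ratio) can be expressed as a small Python sketch. The function names and action labels are hypothetical, and reading the weekly 40% rule as "reduce log space only when the log exceeds 40% of the data size" is an interpretation of the text rather than something the source states explicitly.

```python
def max_sql_memory_mb(total_system_memory_mb):
    """Upper bound for SQL Server memory: 85% of total system memory."""
    return total_system_memory_mb * 85 // 100


def default_log_size_mb(data_size_mb):
    """Recommended default transaction log size: 20% of the data size."""
    return data_size_mb * 20 // 100


def log_actions(data_size_mb, log_size_mb, log_used_mb):
    """Return the maintenance actions suggested by the thresholds.

    Hypothetical labels:
      backup_and_truncate_log - log utilization has reached 70%
      reduce_log_space        - log has grown beyond 40% of data size
    """
    actions = []
    # Integer comparisons avoid floating-point rounding at the boundaries.
    if log_size_mb > 0 and log_used_mb * 100 >= 70 * log_size_mb:
        actions.append("backup_and_truncate_log")
    if log_size_mb * 100 > 40 * data_size_mb:
        actions.append("reduce_log_space")
    return actions


if __name__ == "__main__":
    print(max_sql_memory_mb(16384))
    print(default_log_size_mb(10000))
    print(log_actions(data_size_mb=10000, log_size_mb=2000, log_used_mb=1500))
```

A check like this would normally be fed by the performance condition alerts mentioned in the reconfiguration list rather than by hand-entered figures.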
Files to back up
    All databases and logs (.mdf and .ldf files)

Unify

Daily Maintenance
No scheduled maintenance is required over and above the monitoring of performance parameters referenced in the section Application Specific Monitoring Points.

Weekly Maintenance
No scheduled maintenance is required over and above the monitoring of performance parameters.

Monthly Maintenance
- Reboot

Folder to back up
    <Installdrive>:\Program Files\Eyretel\Unify

All Servers

- 14 day reboot: Balance, Database, Unify, Viewer
    o Use the command "tsshutdn 1 /reboot" from a scheduled job
- Monthly reboot: CSCM Server
- Witness services on all systems can be set with service recovery options set to Restart the Service on the first, second and subsequent failures after 1 minute.
- All services can be set to automatic
    o Balance servers without voice cards in their eRecorders: OK
    o Any Balance server with voice cards should NOT have services set to automatic and should follow the steps below.

Startup order after the server is logged into as the witness account:
1. eQuality eRecorder
2. eQuality eRecorder Audio (this may be set to disabled in most environments; leave as disabled)
3. eQuality eRecorder Video
   Then wait approximately 1 minute for a single-box solution, or approximately 2 minutes for a multi-box solution.
4. eQuality BDR Service

Shutdown is in the reverse order.

Monitoring Parameters - Common Across All Systems

Automate metric collection and notification through IP Sentry or a similar product.
- CPU utilization should not remain over 70% on average.
- Disk space on every drive on every system must stay below 90% full.
- Monitor for NETLOGON Error #5783 in the Windows NT System Event Log on the Witness servers (it often shows up every two hours). This indicates that an Active Directory, DNS or NT Domain authentication issue exists in the network. Viewer relies on integrated Windows authentication, so Domain Controller connectivity or Active Directory DNS configuration problems will cause application outages. The NETLOGON error message is an indication that this condition exists.
- Run DB queries to monitor system usage patterns over time for potential sizing or usage trends:

Metric: Check Viewer audit database query for # of replays in date range grouped by hour and username.
SQL:
    SELECT loginname, datepart(hour,eventtime) as hour, datepart(month,eventtime) as month,
           datepart(day,eventtime) as day, datepart(year,eventtime) as year, count(machine) as count
    FROM EventLogView
    WHERE eventtime > '2005-01-01' and eventtime < '2007-01-01'
      and eventid = 1073741909 and parameter5 = 'audio'
    GROUP BY loginname, datepart(hour,eventtime), datepart(month,eventtime),
             datepart(day,eventtime), datepart(year,eventtime)
Use: To identify trends in replay use on the system and compare to monthly baselines. Loginname values of "eyr_contact7k" represent Balance replays. Other loginname values represent Viewer replays. A large change in volume should trigger a more in-depth resizing.

Metric: Check Viewer calls database query to monitor # of new calls recorded.
SQL:
    SELECT count(*) FROM tblcalls
    WHERE startedat > '2005-12-01' and startedat < '2006-01-01'
Use: Monitor the total recording volume of the system and compare to monthly baselines. A large change in volume should trigger a more in-depth resizing.

Metric: Check total number of calls in the Balance database.
SQL:
    SELECT count(*) FROM cust_cont
Use: Monitor the volume of the Balance database.

Metric: Check Balance database query for # of calls per folder.
SQL:
    SELECT c.name as "Folder Name", count(*) as "# of contacts",
           c.expiration_days as "Purge Days", c.max_unrev_cont as "Max Unreviewed",
           c.I__created as "Creation Date", c.I__modified as "Last Modified"
    FROM cust_cont a, cont_fold_ass b, cont_cat_fold c
    WHERE a.cust_cont_pk = b.cust_cont_pk and b.cont_cat_fold_pk = c.cont_cat_fold_pk
    GROUP BY b.cont_cat_fold_pk, c.name, c.expiration_days, c.max_unrev_cont,
             c.I__created, c.I__modified
    ORDER BY "# of contacts" desc
Use: Balance database.

Application Specific Monitoring Points

Balance
    o Call Manager Connection Event Queue Size
    o Call Manager Connection Events processed/sec
    o Call Manager Session Active Contacts
    o BRE Outstanding events to process
    o BRE Events in/sec
    o WEPS Database Command Processor Outstanding Commands
    o LMPS Number of playback sessions
    o LMPS Number of live monitor sessions

EWare Database
    o See queries above.

Unify
    o Unify incoming stream
    o Unify debug queue (CTI Studio Link): should be used to alert very quickly if CTI Studio is in use and the queue is backing up.
    o Recorder queue: should be used to alert if any recorder queues are backing up.
    o Failed and rejected messages to the recorder: will help you identify network issues to the recorders and potential overloading.
    o Memory used by Unify: a dramatic increase signals a queue build-up somewhere, suggesting something has gone wrong.

Viewer
    o Perfmon alert if Active Server Pages\Request Execution Time is over 200,000. This value is in milliseconds and indicates a hung Viewer COM+ request if it is this high.
    o Active Requests counter in the ASP counters set to 8

Resource Management Matrix Recommendations

Tech: Updates, Support Triage, Installs & Upgrade
Admin: MAC, Rules, Reporting

Number of    Agents      Number of   Number of Business    Tech   Admin
Servers                  Sites       Types / Customers
1-3          < 2,000     < 10        < 10                  1      1
4-8          < 4,000     < 20        < 20                  1.5    1.5
9-12         < 8,000     < 25        < 25                  2      2
13-16        < 12,000    < 30        < 30                  2.5    2
17-20        < 17,000    < 35        < 35                  3      3
21-25        < 22,000    < 40        < 40                  3.5    3
26-30        > 25,000    40+         45+                   4      4

Information is intended as a management guideline. Numbers will vary based on operations and organizational structure.
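One way to read the Resource Management Matrix above is as a lookup from server count to recommended staffing. The Python sketch below assumes that the Tech and Admin figures pair up row by row with the server bands, in the order they appear in the matrix; that pairing is an interpretation, and the names STAFFING_BANDS and recommended_staff are hypothetical.

```python
# (min_servers, max_servers, tech_staff, admin_staff)
# Assumption: Tech/Admin values are paired with server bands in listed order.
STAFFING_BANDS = [
    (1, 3, 1.0, 1.0),
    (4, 8, 1.5, 1.5),
    (9, 12, 2.0, 2.0),
    (13, 16, 2.5, 2.0),
    (17, 20, 3.0, 3.0),
    (21, 25, 3.5, 3.0),
    (26, 30, 4.0, 4.0),
]


def recommended_staff(servers):
    """Return (tech, admin) staffing for a server count, or None if out of range."""
    for low, high, tech, admin in STAFFING_BANDS:
        if low <= servers <= high:
            return tech, admin
    # The matrix does not cover fewer than 1 or more than 30 servers.
    return None


if __name__ == "__main__":
    print(recommended_staff(14))
```

As the matrix itself notes, these figures are a management guideline only, so the lookup is a starting point rather than a sizing rule.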