HP Storage Essentials 9.6 Operations Best Practices Guide

A document created to provide detailed operational instructions.

About this guide

This document contains technical operational details about the Storage Essentials software and its product features. It contains instructions and directions for day-to-day operations that ensure the consistency and quality of the storage management service. It also provides solutions for common problems, so that each problem receives the same response every time.

This guide is intended to be used by the Operations team, who maintain the Storage Essentials management server.

This guide is based on Storage Essentials version 9.6. Please refer also to the SE 9.6 support matrix for further details on supported devices.

This guide will describe common operational tasks, such as:

Maintenance activities

User management

Event management

Device management

Discovery management

Monitoring discovery

Remote agent management

Monitoring log files

Maintenance activities

Daily activities

Verified Daily activity

The HP Storage Essentials application is running

Scheduled Tasks completed successfully – GAED discoveries (SE), Report generation (SE), Event clean up (SE)

Investigate any devices that have been moved to a Missing or Quarantined state (SE)

Verify report infrastructure

Verify that the HP Storage Essentials product is running:

Check that the AppStorManager service is running. Ensure that it is still possible to launch the application and successfully log in.

Verify that any scheduled tasks completed and view the results.

4AA3-xxxxENW, Month 20XX Page 1


The GAEDSummary.log file contains results for the Get All Element Details (GAED) discoveries that have occurred. This log file is a good place to check each day. It can be e-mailed out at the end of each GAED that occurs. See the section on Discovery Management for more information.

Look for any devices that have been placed in a “Missing” or “Quarantined” state via the Step 3 discovery, as the devices may need to be investigated further.

Daily Maintenance Checklist

Verify Tasks and Events with focus on GAED and Report Refresh completion.

Go to Discovery->System Tasks to verify GAED, Garbage Collection, Data Rollup and Report Refresh.

Check daily start and completion times to establish a benchmark for GAED consistency (you can use Event Manager or System Tasks for this). See GAED note below.

Open the Event Manager module and review all messages for errors, alerts and indications.

Check Event Manager or the GAEDSummary log for replication errors, GAED abort messages, exception messages and missing or quarantined elements.

Use the GAED troubleshooting guide for a more detailed approach to GAED problem isolation.

Investigate missing or quarantined elements.

Ensure that all elements on the discovery list are contacted and managed by SE.

Run hbatest –v on individual hosts to determine if a failing host element is configured properly for SE discovery.

Run the WBEMDISCO utility from the management server to verify that array and switch elements are configured properly for SE discovery.

If an element is missing, then go to Discovery->Step 1 and press the TEST button for the missing element.

Be sure the element is ‘pingable’ and that credentials are set properly.

Verify that the CIM Extension is installed and working properly and that there are no port or security conflicts.

If an element is quarantined, then check for hung or zombie processes, file system issues, stale volume mounts, malfunctioning or misconfigured multi-pathing software, Windows WMI issues, or other OS security issues.

GAED NOTE: A GAED should take no longer than 5 hours to complete. GAED times vary greatly depending on a number of factors, such as the activity level of storage devices during the GAED, miscellaneous job streams that take a higher priority if running at the same time as the GAED, the number of mapped volumes per host, the number of providers, etc. If the GAED is taking more than 5 hours, then several steps can be taken to identify the SAN elements and associated CIM classes that are contributing to the problem. Contact HP support for assistance.
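The benchmark check described in this checklist can be sketched in a few lines of Python. This is only an illustration: the function names and the sample timestamps are assumptions, and the start and completion times are expected to be copied by hand from Event Manager or System Tasks rather than read from any product interface.

```python
from datetime import datetime, timedelta

# The 5-hour ceiling comes from the GAED note above.
GAED_MAX_DURATION = timedelta(hours=5)


def gaed_duration(start: datetime, end: datetime) -> timedelta:
    """Return how long a GAED run took; reject an end before the start."""
    if end < start:
        raise ValueError("GAED end time is earlier than its start time")
    return end - start


def gaed_needs_investigation(start: datetime, end: datetime) -> bool:
    """True when the run exceeded the 5-hour guideline."""
    return gaed_duration(start, end) > GAED_MAX_DURATION


if __name__ == "__main__":
    # Hypothetical times noted down from System Tasks.
    start = datetime(2012, 3, 5, 22, 0)
    end = datetime(2012, 3, 6, 3, 30)
    print(gaed_duration(start, end), gaed_needs_investigation(start, end))
```

Recording the computed duration each day gives the benchmark series that the checklist asks for; a run that crosses the threshold is a prompt to start the GAED troubleshooting steps.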

To find quarantined elements in the database:

SQLPLUS appiq_system/password@appiq

SQL> SELECT Address ip, elementtype, elementname, discoverygroup FROM MVC_DISCOVERYDETAILSVW WHERE status <> 0 AND enabled = 0;

For a stable report infrastructure, verify all data collection tasks

Stable reports begin with stable data collection (collectors and GAED). Always verify GAED and collectors first.

Check Event Manager or appstorm logs for collector errors, replication errors or various exception messages.

Check the GAEDSummary log for errors, exceptions or abort messages.

Ensure that the collectors scheduled to run are necessary for your reporting requirements. Unnecessary collectors should be disabled.

Check Report Cache Refresh start and completion times

Verify report refresh start and completion times. A refresh should take from about 30 minutes up to 1 hour and 30 minutes, depending on the size of the environment. Longer Report Refresh times are possible and completely normal. Report Refresh times depend on the number of elements and on the state of the database. A poorly performing database will likely affect the performance of the refresh and the overall performance of the product.


Check database Materialized Views (these are the database tables that contain report data) in the appstorm logs or via the SQL script below:

1) Sqlplus appiq_system/password@appiq

2) SQL> spool mview_status.text

3) SQL> SELECT siteide,db_link FROM MVIEW_RELATION;

4) SQL> SELECT * FROM mviewcore_status;

5) SQL> SELECT * FROM mview_status;

6) SQL> spool off

Be sure that the Materialized Views (MVs) are being updated.

To clear Materialized Views, run the TRUNCATEVIEW.sql script:

1) SQLPLUS /nolog

2) connect appiq_system/password@appiq

3) SQL>@truncateview.sql

4) Exit SQLPLUS and run a Report Refresh from the UI Configuration->Reports->Report Cache->Refresh Now

5) Reports should be populated with data

Always check custom and base reports for data.

Weekly activities

Verified Weekly activity

Check the number of events that are present in the HP Storage Essentials product.

Validate that there is still ample disk space for growth of the HP Storage Essentials database.

Investigate any devices that have been moved to a Missing or Quarantined state (SE).

Ensure events present in HP Storage Essentials are not excessive, as this can lead to long load times for the Event view panels, as well as impact the overall database performance. The event management functionality can be manipulated to control the overall size of the event database tables.

It has been found that a large number of events (several hundred thousand) will make deletion of existing events consume large amounts of time, as well as have an impact on the GUI performance for the event tables.

Ensure event maintenance schedule for HP Storage Essentials is configured and that the quantity of events in the database is not excessive:

Navigate to the HP Storage Essentials event viewer (Tools -> Storage Essentials -> Home -> Event Manager).

Use the “Show Element Type” filter to select “All” events.

Notice the “Total” value directly over the “Element Type” column and verify that it is not excessive.

Check the available disk space for the location of the Oracle database and the HP Storage Essentials application.

Navigate to the Product Health page for HP Storage Essentials (Options -> Storage Essentials -> Manage Product Health) and select the results tab to see the results of any scheduled scans. This will monitor the disk space that is consumed by the Oracle database application.

The HP Storage Essentials product, by default, will keep the last 10 copies of the appstorm<time stamp>.log (100 MB each) and the last three copies of the cimom<time stamp>.log (30 MB each for each discovery group).

There should exist at least 2 GB of free space on the drive that has the HP Storage Essentials application to allow for log file growth.
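As a rough worked example of the arithmetic behind these defaults, the sketch below adds up the worst-case log footprint and compares free space with the 2 GB guideline. The function names are illustrative assumptions, and the path passed to the free-space check is a placeholder for wherever the application is actually installed.

```python
import shutil

# Retention defaults quoted in the text above.
APPSTORM_COPIES, APPSTORM_MB = 10, 100   # appstorm logs
CIMOM_COPIES, CIMOM_MB = 3, 30           # cimom logs, per discovery group
MIN_FREE_MB = 2 * 1024                   # 2 GB free-space guideline


def worst_case_log_mb(discovery_groups: int) -> int:
    """Upper bound on retained log size, in MB, for a number of discovery groups."""
    if discovery_groups < 0:
        raise ValueError("discovery_groups must not be negative")
    return APPSTORM_COPIES * APPSTORM_MB + CIMOM_COPIES * CIMOM_MB * discovery_groups


def has_enough_free_space(path: str, min_free_mb: int = MIN_FREE_MB) -> bool:
    """Check free space on the drive holding `path` against the guideline."""
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    return free_mb >= min_free_mb


if __name__ == "__main__":
    # Four discovery groups is an arbitrary example value.
    print(worst_case_log_mb(4))
    # "." stands in for the application's install drive.
    print(has_enough_free_space("."))
```

With one discovery group the retained logs can reach 1000 MB + 90 MB = 1090 MB, which is why the 2 GB guideline leaves only modest headroom once several discovery groups are in use.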


NOTE: The management station itself must be discovered within HP Storage Essentials, and Disk Space monitoring must be enabled (it is enabled by default), for this data to be present. In addition, the management station must be discovered by adding the management server as a discovered device via the “Monitoring Product Health” link, as described in the user manual under “Configuring the Management Server”. The Monitoring Product Health link is only available from the HP-SE Discovery Setup page. This page can be accessed via Tools -> Storage Essentials -> Home -> Discovery -> Setup.

Clicking the Monitoring Product Health link will cause a second dialog box to open, and this dialog box will have the option to “Add” the management station.

Once this method is used to add the HP-SE management station as a device within SE to manage, the HP-SE management station will show on the Discovery Setup and Discovery Get All Details pages of HP-SE. The device will be represented as a “local host” and the credentials that will be used are for a built-in user that the HP-SE product has defined for this. The management station is monitored via an “embedded” host CIM Extension that is installed upon the management station at installation time for HP-SE.

Weekly Maintenance Checklist

SAN Changes, Upgrades or Moves

Identify planned or unplanned SAN changes and the potential or visible impact on SE. Changes to SAN elements will greatly affect how Storage Essentials can successfully communicate with and collect data from a device or devices that have been changed or moved.

Ensure that all elements on the discovery list are contacted and managed by SE.

Spot check accuracy of reports.

Check report.log for errors after running selected reports.

Always check the report refresh status and Materialized Views first if reports do not contain data.

Spot check selected reports data with actual provider data or host element data.

Perform basic database maintenance and performance tuning

Check the Oracle alert_appiq.log file for errors (..\oracle\admin\APPIQ\bdump)

Check the database consistency:

1) SQLPLUS /nolog

2) connect sys/change_on_install as sysdba

3) startup force;

4) shutdown immediate;

5) startup

6) exit

The database is unstable if ORA errors occur after the “startup force;” command.

Run Oracle performance sql scripts (at the end of this document) to identify possible bottlenecks.

Perform the following performance analysis as well:

1. Check the cimom.log file for "PRODUCER" and see how long it is timing out. Anything > 1000 ms may indicate a potential performance problem in the database.

2. Run the following ADDMRPT SQL script to capture performance data:

cd \Oracle\Ora10\RDBMS\ADMIN

SQLPLUS /nolog

connect appiq_system/password@appiq

SQL> @addmrpt.sql

You will see a column of data like this:

appiq APPIQ 2430 03 Jun 2008 00:00 1

2431 03 Jun 2008 01:01 1

2432 03 Jun 2008 02:00 1

2433 03 Jun 2008 03:00 1

You will be asked for a begin time for the snapshot.

Specify the time JUST BEFORE THE GAED by selecting the value (using the list that is displayed) from the left hand column that corresponds to that time. (example: a GAED start after 2:00 would require a value of 2432...see table above)

You will then be asked for an end time for the snapshot.


Specify the time JUST AFTER THE GAED completes. Again, use the value in the left hand column that corresponds to that time.
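The snapshot selection rule can be illustrated with a small Python sketch. It reuses the snapshot IDs and times from the listing above; the GAED completion time and the function name are assumptions made for the example, and the script does not talk to the database.

```python
from datetime import datetime
from typing import List, Tuple

Snapshot = Tuple[int, datetime]  # (snapshot ID, snapshot time)


def pick_snapshots(snapshots: List[Snapshot],
                   gaed_start: datetime,
                   gaed_end: datetime) -> Tuple[int, int]:
    """Return (begin_id, end_id): the last snapshot at or before the GAED
    start and the first snapshot at or after the GAED completion."""
    ordered = sorted(snapshots, key=lambda snap: snap[1])
    before = [sid for sid, taken in ordered if taken <= gaed_start]
    after = [sid for sid, taken in ordered if taken >= gaed_end]
    if not before or not after:
        raise ValueError("no snapshots bracket the GAED window")
    return before[-1], after[0]


if __name__ == "__main__":
    listing = [
        (2430, datetime(2008, 6, 3, 0, 0)),
        (2431, datetime(2008, 6, 3, 1, 1)),
        (2432, datetime(2008, 6, 3, 2, 0)),
        (2433, datetime(2008, 6, 3, 3, 0)),
    ]
    # GAED starting shortly after 2:00; the 2:50 completion is hypothetical.
    print(pick_snapshots(listing,
                         datetime(2008, 6, 3, 2, 10),
                         datetime(2008, 6, 3, 2, 50)))
```

For a GAED that starts after 2:00, the begin value is 2432, matching the example in the text; the end value is whichever snapshot first follows the completion time.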

Reset temp tablespace at least monthly (see below) or bi-monthly as best practice

Export the database weekly

Re-boot Windows based Storage Essentials management server at least bi-monthly

Monthly activities

Verified Monthly activity

Clean up the temporary table space consumed by the Oracle database.

Generate a cold backup of the HP Storage Essentials database in case of need.

Clean up temp data table and RMAN backup space from the Oracle database:

1. Stop the HP Storage Essentials service.

2. Utilize the dbAdmin utility to verify that the database is in an open state and that the listener is running.

NOTE: Default password for SYS in dbadmin is change_on_install.

3. Reset the temporary table space.

4. Export the database and save to a zip file. Be sure to include both the HP Storage Essentials schema. The default behavior is to clear the Report Cache, which will save size in the exported zip file, as well as time if the database is imported.

NOTE: Perform the following steps ONLY IF THE DATABASE is currently running in ARCHIVE mode. If the database is not in Archive mode, the following clean-up steps are NOT necessary. The default behavior for a fresh installation of HP Storage Essentials is to NOT be running in Archive Mode (so it is in a no-archive mode).

5. If the database is running in “Archive Mode”, set the database to “No-Archive Mode” so that the archive files can be cleaned up. This will delete the “APPIQ_##.ARC” files that are located at oracle\oradata\APPIQ\archive, where “oracle” is the location of the “ORA_HOME” environmental variable.

6. Return the database to “Archive Mode” by selecting the “Change Archive Mode and RMAN Backup” radio button.

7. Restart the HP Storage Essentials services.

8. Force an RMAN backup to occur, which will place the most current backup in the oracle\ora92\rman\current directory, where “oracle” is the location of the “ORA_HOME” environmental variable.

9. Delete the backups that are present in the oracle\ora10\rman\backup1 and backup2 directories.

If you suspect that the wrong .arc files were deleted, then use the following procedure to reset RMAN:

a. Shut down AppStorManager

b. Delete ALL the archive files (*.arc)

c. Run SQLPLUS /nolog

d. connect appiq_system/password@appiq

e. SQL> @resetlogs.sql

User Management

Default Admin user for HP SE

The HP Storage Essentials product maintains its own list of user information. This is necessary because the roles that a given user has with regard to HP Storage Essentials functionality are managed completely within the HP Storage Essentials product.

The default “admin” user within HP Storage Essentials has different capabilities than a regular user with the “Domain Admin” role.


Event Management

If you have changed SNMP parameters or ports in jboss.properties and CIMOMconfig.xml, those files will be overwritten when you upgrade to a new release or service pack. Be sure to make backups of those files if you make, or have made, changes to SNMP port assignments.

Event Deletions

The HP Storage Essentials product has a specific configuration for management of all events that are present within the system. It is possible to set the interval at which events of all categories are cleared as well as deleted. It will be necessary to monitor the number and type of the events that are present within the HP Storage Essentials database (see Weekly Maintenance Tasks), and to adjust these configuration settings so that events of the desired category are kept long enough without the database becoming overfull of event data. The HP SE event configuration page can be accessed via Tools -> Storage Essentials -> Home -> Configuration -> Events.

Device Management

Cluster Discovery

Constituents are discovered individually.

Clusters of Windows hosts using Microsoft Cluster Server are automatically inferred from their constituents.

The SE CIM Extension is “cluster safe” and/or “cluster aware”, but not “clustered” today, meaning that it is not installed on the cluster hostname or cluster IP. “Full clustering” support would mean installing the agent on the cluster name or IP and handling cluster failover scenarios.

A “Cluster Builder” in the UI allows users to define additional clusters, assisted by storage pathing information.

Some things that are not supported:

Failover of CIM Extension in a cluster is not supported.

“One-click” deployment of CIM Extension to all cluster members is not supported. (I.e., CIM Extension needs to be installed on each cluster member in the usual way.)

Provisioning to a cluster name is not supported; only the nodes can be used for provisioning.

Device Deletion

Devices need to be manually deleted from the product. There is a difference between deleting an “Access Point” and deleting a single device. Devices need to be removed from any discovery lists maintained by the product.

When deleting a device, care must be taken to ensure the device is properly deleted from the HP Storage Essentials product. In addition, the device should be removed from any scheduled discovery task lists to ensure that it does not get re-discovered. Deletion of the device from the HP Storage Essentials product will remove it from the Step 2 Topology Discovery and Step 3 Details Discovery lists, but will not remove it from the Step 1 Setup Discovery list.

The concept of “Access Point” can cause some confusion as to what is actually being deleted. Within HP Storage Essentials a “Proxy” device is discovered and through this proxy device many other devices are also discovered. A good example of this is the discovery of the Brocade switches within the environment. With the Brocade SMI-S method of discovery, the HP Storage Essentials product is typically pointed at a single switch entity for an entire fabric. Through this single switch entity all the other Brocade switches in the fabric are discovered. Within the topology view (System Manager) of HP Storage Essentials, it is possible to right click on any device and select an option to delete it. If you are deleting something that was discovered using one of the proxy discovery methodologies, but is not the proxy device itself, then it is possible that the device will be re-discovered the next time a discovery operation is executed against the proxy device. If you delete a proxy device, also known as the “Access Point”, then it will also effectively delete all the devices that were discovered via this access point.
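The difference between deleting a single device and deleting an access point can be pictured with a toy model. The sketch below is not how the product stores its data; the class, its methods and the switch names are all hypothetical and exist only to show the behaviour described above.

```python
from typing import Dict, Iterable, Set


class Inventory:
    """Toy model of devices that were discovered through access points."""

    def __init__(self) -> None:
        # Devices each access point (proxy) reports, including itself.
        self.by_access_point: Dict[str, Set[str]] = {}
        # Devices currently shown as managed.
        self.managed: Set[str] = set()

    def discover(self, access_point: str, devices: Iterable[str]) -> None:
        """First discovery against an access point."""
        self.by_access_point[access_point] = set(devices)
        self.managed |= self.by_access_point[access_point]

    def rediscover(self, access_point: str) -> None:
        """A later discovery brings back anything the proxy still reports."""
        self.managed |= self.by_access_point[access_point]

    def delete_device(self, device: str) -> None:
        """Delete one device; the access point still knows about it."""
        self.managed.discard(device)

    def delete_access_point(self, access_point: str) -> None:
        """Delete the proxy and everything discovered through it."""
        self.managed -= self.by_access_point.pop(access_point)


if __name__ == "__main__":
    inv = Inventory()
    inv.discover("switch-a", ["switch-a", "switch-b", "switch-c"])
    inv.delete_device("switch-b")
    inv.rediscover("switch-a")           # switch-b comes back
    print(sorted(inv.managed))
    inv.delete_access_point("switch-a")  # the whole fabric is removed
    print(sorted(inv.managed))
```

The model makes the practical point visible: a device deleted on its own reappears at the next discovery against its proxy, while deleting the proxy removes every device that was reached through it.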

To delete the device from HP Storage Essentials:

a. Go to the Step 3 Details Discovery page (Options -> Storage Essentials -> Discovery -> Run Discovery Data Collection).

b. Locate the device to be deleted and click the Trash can icon. If this device is an “Access Point”, this will also delete all the individual devices that were discovered via this access point. NOTE: If the intent is to delete just a single device from a list that came via an access point (with the knowledge that the device will be rediscovered once the “Access Point” is re-discovered), then the single device can only be deleted from the topology view (System Manager) with a right click on the device.

c. Navigate to the topology view (Tools -> Storage Essentials -> System Manager). Verify that the device does not show up in the left hand navigational tree (expand the “All Elements” branch, and then search within the appropriate device type to validate that it is no longer present).

d. From the topology view, navigate to the Step 1 Setup Discovery page (Discovery -> Setup from the tool bar heading). Locate the device in this list (it might be represented by IP address only), and use the Trash can icon on the far left to delete the device.

Save & restore

Independent of the physical removal of the device, the representation of the device on any list that may allow it to be rediscovered must also be maintained. Within the HP Storage Essentials product, this list is found on the Step 1 Setup Discovery page.

The device will be on this list, but it should NOT have a check mark next to it. Data collection for the device will not be executed. However, in order to ensure that this device is not accidentally re-discovered within HP Storage Essentials it is a best practice to remove this device from the Step 1 Setup Discovery list.

Generic Devices

HP Storage Essentials attempts to collect as much information as it possibly can from a given device, which can at some point lead to unintended results. In the case of Brocade switch discovery, the HP Storage Essentials product is able to recognize that there is a storage array connected to a switch port. Unfortunately, HP Storage Essentials only gets enough information to recognize the device as a storage array, and not enough to identify it fully. In this case, the HP Storage Essentials product still creates a representation of the device, with the general description “Generic Storage System”.

Example: The first picture shows the topology view (System Explorer) within HP Storage Essentials where just the Brocade fabric has been discovered. Within this discovery of the Brocade fabric, the HP Storage Essentials product was also able to discover three “generic” storage devices. The reason that these devices are seen as generic devices is that the proper “Access Point” or credentials have not been added to the discovery process, and therefore HP Storage Essentials does not know how to properly communicate with the device. Within the topology view of HP Storage Essentials, it is possible to tell that this is a “generic” device representation by the question mark “?” above the device. This indicates that HP Storage Essentials does not have the complete details to identify and manage the device.

With a subsequent discovery operation, the proper “Access Point” and credentials have been added for the EVA disk array. This will allow HP Storage Essentials to correctly identify the array. During the HP Storage Essentials identification phase (discovery using the proper Access Point and credentials) for this device, HP Storage Essentials will be able to rectify the previously created “generic” representation of the EVA disk array. When this occurs, there will only be one representation of the storage array within HP Storage Essentials. The following picture shows the same Brocade fabric topology, but now the Access Point for the EVA storage array (WIN1) has been discovered, so HP Storage Essentials is able to properly identify the storage device as an HP array with the name “SELab_EVA8000”. Within the HP Storage Essentials environment, the running product has deleted the old “generic” representation of this storage device and replaced it with one that contains much more information.

Discovery Management

Methods to Initiate Discovery

Automatic discovery: Default mechanism to use. Automated discovery operations. Multiple devices to be discovered at once. Global Credentials can be used.

Manual discovery: Only need to discover a single device. Device Credentials set prior to discovery.

Host file: When you have the saved discovery list from a previous installation.

Automatic Discovery is the preferred way to discover devices:

Provides visual feedback into the state of the discovery process

Can be scheduled on a repeating interval or only run once

Allows for multiple devices to be discovered at once, as well as scanning on a range of IP addresses.

Allows for repeated discovery attempts against a given IP address (necessary for updates to credential information)

Only allows for discovery by IP address

When using the Automatic Discovery, it is not possible to assign device specific credentials prior to the discovery of the device (so it is a two step discovery process to get the credentials associated with the device). However, the Automatic Discovery will attempt to use any Global Credentials that have been defined, and associate them with the device if possible. If HP Storage Essentials is able to identify the device with the use of the Global Credentials, then a two step discovery process is not necessary with the Automatic Discovery process. This type of discovery only requires a two step process if the device to be discovered MUST have individual System Protocol credentials set prior to successful identification by HP Storage Essentials.

Manual discovery steps

Manual device discovery is a way to add a single device to HP Storage Essentials. This technique cannot be used if the device has previously been discovered by HP Storage Essentials. It does provide the ability to define the System Protocol Credentials that should be used to successfully identify and communicate with the device, as well as to establish some “default” properties for the device. It does not provide real time feedback to the end user.

Only need to discover a single device.

Device Credentials set prior to discovery.

Can discover a device using either the system's DNS name or IP address

Steps involved in an Initial Discovery of devices:

SE attempts discovery of an element.

System identification (IP/DNS): addresses and credentials are entered.

Basic discovery is done by HP SE (ping, system type, etc).

If the device is discovered in SE, it is put in the list of known systems, and added to the SE discovery map.

If the system type is in the filtered list, such as server, switch, unknown, unmanaged, etc, SE proceeds with further discovery steps.

Hosts File Discovery

Host File discovery can be used to import a list of information concerning a set of devices, and have a discovery operation run against that list. The list must be in a specific format, and is typically a list that has been exported from a previous installation. This device list contains the IP address for the device, both the short and fully qualified domain name for the device, and an indication as to what sort of device it is (server, storage, switch, etc).

Three types of discovery within HP Storage Essentials

The product relies primarily on the Step 3 Details Discovery.

Test button functionality (to verify credentials) can be accessed via the Step 1 Setup Discovery page.


Step 1 Setup discovery must be accomplished in order for HP Storage Essentials to manage a device

Only the devices that are present on the Step 2 Topology Discovery or Step 3 Details Discovery lists are being truly managed by HP Storage Essentials. The Step 1 Setup Discovery must be able to successfully identify the device and pass it to these two other lists prior to any true management of the device by HP Storage Essentials.

For the HP Storage Essentials product there are three distinct phases to the device discovery operation. First, the device must be identified as to what type of device it is. This is accomplished with the “Step 1 Setup Discovery” operation, which validates that, for a given IP address (or DNS name) and a set of credentials, HP Storage Essentials is able to successfully communicate with the device and determine what it is. The second operation is to determine just enough information about the device to allow for successful representation within the topology map (System Explorer). The basic network connectivity for the device is determined. This is accomplished with the “Step 2 Topology Discovery” operation. Finally, it is necessary for HP Storage Essentials to determine all the information it can for a given device. This is accomplished with the “Step 3 Details Discovery”, and it is not until this type of discovery is executed that a true representation of the device will be present within HP Storage Essentials.

Step 1 Setup Discovery

Determines what type of device is at the end of a given IP address, based on the set of Credentials that are available for use

Can be used to “test” a given set of credentials for validity.

Can be accessed from any “Out of Frame” HP Storage Essentials page.

Can not be scheduled

The Step 1 Setup Discovery is utilized by HP Storage Essentials to gather basic information for a given IP address (or DNS name) and set of credentials for that IP address. If there are any Global Credentials defined, and there are no known system protocol credentials for a given IP address, then each of the Global Protocol Credentials is tested against the given IP address.

Once a device has been processed on this Step 1 Setup Discovery list, and HP Storage Essentials is able to successfully discover the device that resides at the given IP address, then that device is passed to the Step 2 Topology Discovery and Step 3 Get All Element Details Discovery lists. The Step 1 Setup Discovery list is never cleared of devices that have already been processed. Instead, there is a check box associated with each device. If a device has been successfully processed in this Step 1 Setup Discovery list, the check box will be cleared. When you manually execute a discovery for this list, HP Storage Essentials will attempt to identify all devices present on this page that still have a “Check Mark” next to them.
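The Step 1 behaviour described above (device credentials first, Global Credentials as a fallback, and a check box that is cleared on success) can be summarised in a short Python sketch. Everything here is illustrative: `SetupEntry`, `run_setup_discovery` and the `probe` callback are hypothetical names, and `probe` merely stands in for whatever test the product performs against a device.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SetupEntry:
    address: str
    credential: Optional[str] = None  # device-specific credential, if known
    checked: bool = True              # checked = still waiting to be processed


def run_setup_discovery(entries: List[SetupEntry],
                        global_credentials: List[str],
                        probe: Callable[[str, str], bool]) -> List[str]:
    """Process checked entries; return addresses handed on to Steps 2 and 3."""
    passed = []
    for entry in entries:
        if not entry.checked:
            continue  # already processed successfully, so it is skipped
        # Known credentials win; otherwise each global credential is tried.
        candidates = [entry.credential] if entry.credential else list(global_credentials)
        for credential in candidates:
            if probe(entry.address, credential):
                entry.credential = credential  # associate it with the device
                entry.checked = False          # clear the check box
                passed.append(entry.address)
                break
    return passed


if __name__ == "__main__":
    # Hypothetical addresses and credential labels.
    valid = {("10.0.0.1", "global-2"), ("10.0.0.2", "device-cred")}
    entries = [SetupEntry("10.0.0.1"),
               SetupEntry("10.0.0.2", credential="device-cred"),
               SetupEntry("10.0.0.4")]
    print(run_setup_discovery(entries, ["global-1", "global-2"],
                              lambda addr, cred: (addr, cred) in valid))
    print([(e.address, e.checked) for e in entries])
```

An entry that no credential can identify keeps its check mark, so it is retried the next time the list is processed, which mirrors the repeated discovery attempts that are needed after credential updates.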

Step 2 Topology Discovery

Provides element linking (Fiber channel linking)

Can be accessed via any “Out Of Frame” HP Storage Essentials page.

Can not be scheduled

The Step 2 Topology Discovery is utilized to realize the topology view of the managed SAN without the additional data collection steps that are necessary for a complete Step 3 Detail Discovery to occur. This type of discovery will only gather enough information to determine which device is connected where. This allows for rapid representation of the managed SAN within the System Explorer view, but the more comprehensive Step 3 Detail Discovery will need to be run for HP Storage Essentials to completely manage the SAN environment. This type of discovery within HP Storage Essentials can be accessed from any “Out of Frame” HP Storage Essentials page.

Step 3 Detail Discovery

Comprehensive discovery of device information

Can be scheduled


Step 3 Details Discovery is often referred to as a GAED operation (Get All Element Details). This is the process whereby HP Storage Essentials gathers the comprehensive set of data from all devices that are present within the targeted discovery group. The frequency at which this type of discovery operation is run within the HP Storage Essentials product will have a direct impact upon the accuracy of data presented by HP Storage Essentials. Changes to the managed SAN environment will only be discovered via this Step 3 Details Discovery operation. Hence it is recommended that this discovery operation be scheduled to occur at a frequency that is adequate for the managed SAN environment, to ensure the data presented by HP Storage Essentials is as accurate as possible. Ideally, this type of discovery operation is scheduled during a time when there is the least amount of activity on both the managed SAN environment and the HP Storage Essentials management station itself (not a time of active use of the HP Storage Essentials product).

This Step 3 Details Discovery encompasses all discovery operations that occur with the Step 2 Topology discovery as well as gathering of the additional discovery information. This type of discovery operation has the ability to place a given device into a “Missing” or “quarantined” state. The GAED operation will place a device into a missing state if, at the time it is attempting to gather data from that device, it is not able to successfully contact the device. This could be due to a number of reasons such as network issues, the device itself is down, the CIM Extension (in case of a server) is not running, etc. If the GAED operation is able to connect to the device, but finds that a given device is not responding correctly, it will place it into a “quarantined” state. This is done to prevent future discovery operations from running into the exact same issues with the device.
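The rule that separates the two states can be written as a small decision function. This is a sketch of the logic as described in the paragraph above, not the product's implementation; the enum, the `MANAGED` label for the healthy case and the function name are assumptions.

```python
from enum import Enum


class ElementState(Enum):
    MANAGED = "managed"          # label assumed here for the healthy case
    MISSING = "missing"
    QUARANTINED = "quarantined"


def classify(contacted: bool, responded_correctly: bool) -> ElementState:
    """Apply the GAED rule: unreachable devices are missing, reachable but
    misbehaving devices are quarantined."""
    if not contacted:
        return ElementState.MISSING
    if not responded_correctly:
        return ElementState.QUARANTINED
    return ElementState.MANAGED


if __name__ == "__main__":
    print(classify(contacted=False, responded_correctly=False))  # e.g. host is down
    print(classify(contacted=True, responded_correctly=False))   # e.g. hung process
    print(classify(contacted=True, responded_correctly=True))
```

The distinction matters when investigating: a missing element points toward network, power or CIM Extension availability, while a quarantined element points toward a problem on the device itself.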

Controlling the devices involved in Step 3 Details Discovery

On the Step 1 Setup Discovery page there is a check box associated with each device on the Step 1 discovery list. If this box is checked when the “Start Discovery” button is pushed, that device is included in the discovery operation.

On the Step 3 Details Discovery pages there are also check boxes associated with each device, but these check boxes do not control which devices are included in a discovery operation as they do on the Step 1 Setup Discovery page. A Step 3 Details discovery operation (Get Details) is started by pushing the appropriate button, and the operation is then executed against the Discovery Groups that are selected. The default behavior is to have all Discovery Groups selected, which means that clicking the “Get Details” button attempts a comprehensive discovery against all the devices in all the discovery groups. Before running the discovery operation, be sure to select the desired discovery group in the “Get details from one or more discovery groups” dialog window. When the “Get Details” button is clicked, a dialog box is presented to confirm the discovery group or groups against which Step 3 Details Discovery is to occur.

The check boxes on this page are used solely to select devices for operations other than discovery. The available operations include “Move to Discovery Group”, “Set Quarantine”, “Clear Quarantine” and “Delete”.

Discovery Lists within SE

Discovery lists is a term used to define which devices are available for processing and management by HP Storage Essentials. These lists are found on each of the respective HP Storage Essentials discovery pages.

The devices that are present on the “Step 1 Setup Discovery” list (accessed via Tools -> Storage Essentials -> Home -> Discovery -> Setup) are the devices that have been presented to HP Storage Essentials as being “eligible” for management. The presence of a device on this list is not enough to indicate that HP Storage Essentials is actively managing the device. Basically, this list is used to validate that the credential information is correct for a given device. If a discovery operation has been executed against a device and that device does not show up on either the Step 2 Topology or Step 3 Details discovery lists, then the Step 1 Setup Discovery list is the place to check which credentials are being used for the device. It is also possible to run a test against the device from this list.

The devices that are present on the “Step 2 Topology Discovery” and “Step 3 Details Discovery” lists should be exactly the same. These are the devices that have been successfully identified by HP Storage Essentials and that HP Storage Essentials recognizes it can manage. If a device is expected to be managed by HP Storage Essentials but is not showing up in the product, check one of these two lists to verify that it is included.

Only the devices that are present on these lists will show up in other areas of the HP Storage Essentials product.

Discovery Groups


Devices managed by the HP Storage Essentials internal CIMOM are eligible to be placed into specific discovery groups.

Devices managed by external CIMOMs (all SMI-S supported devices) are represented within HP Storage Essentials as belonging to a Discovery Group that contains only them.

Hosts are discovered individually via the CXWS provider and are no longer part of a discovery group.

Each Discovery Group that is utilized within HP Storage Essentials is a separate Java process operating as its own entity, and competing with the management server processes for resources. Discovery group processes are named storCIMOMdefault, storCIMOM1, storCIMOM2, etc.

Discovered devices, if not managed via an external SMI-S provider, are placed into the Default Discovery Group.

Movement of devices from one discovery group to another can take some time to complete.

Discovery Groups is a concept that indicates how the HP Storage Essentials product communicates with a given device. It is also a way to logically separate devices from each other and to provide some control over the Get All Element Details (GAED) discovery operations against those devices. The Default Discovery Group is always instantiated by the running HP Storage Essentials product and will, by default, hold all devices that are managed via the internal HP Storage Essentials CIMOM. There are 11 other discovery groups that can be instantiated, named Discovery Group 1 through Discovery Group 11. While these discovery groups exist and are available within the product, they are not created until they are needed. Until a device is placed into a given discovery group, that group does not use any additional memory resources within the SE management server. Conversely, when all devices have been removed from a given discovery group, upon the next restart of the HP Storage Essentials service (AppStorManager service) that discovery group will not be instantiated and will not compete for memory resources within the SE management server.

When to use Discovery Groups

Separate discovery operations between physical devices

Isolation of potential problematic devices for further analysis

The comprehensive discovery operation (GAED) can be scheduled to occur at a fixed interval. This is necessary to ensure that any changes that occur within the monitored SAN environment are correctly represented within the HP Storage Essentials product. It may be desirable or even necessary for this comprehensive discovery to occur nightly for the switch fabric environment or the server environment, while it is not so critical that the array information be collected that frequently. In this case, it is possible to use one of the additional discovery groups and move the arrays into that discovery group. This leaves just the switches and the servers within the Default discovery group. The schedule of comprehensive discovery can then be set such that the devices in the Default discovery group are discovered nightly, whereas the arrays only have this comprehensive discovery performed twice a week.

Another effective use of discovery groups within HP Storage Essentials is the isolation of devices that appear to have issues with the comprehensive discovery operations. When a GAED discovery occurs, all the devices that are present within the targeted discovery group are queried for the comprehensive information. The discovery technique is to gather a single piece of data from all devices that report that type of data, then move on to the next piece of data. A single device that is not behaving properly can therefore cause the discovery against all other devices to fail. To ensure that valid data is collected at the desired interval, the device that is perceived as causing the issue can be moved into a separate discovery group, where it is no longer able to impact the discovery operations against the other devices. Moreover, once the suspected device has been isolated, additional discovery operations and analysis can be performed on that device alone, again without impacting the rest of the devices in the environment.

Discovery Group Management

The management of the discovery groups within HP Storage Essentials is performed on the Step 3 Details Discovery page (Options -> Storage Essentials -> Discovery -> Run Discovery Data Collection). This page lists all the devices that are being actively managed by HP Storage Essentials, as well as the discovery group each device is currently a member of. On this page, the “Discovery Group” column indicates which group the device is currently part of. Devices that list something in this column like “https://129.176.185.192:5989” are being managed by an SMI-S provider that is external to the HP Storage Essentials installation, and hence are not eligible to move to a different discovery group. Only the devices that are currently in one of the 12 available internal discovery groups (Default, Discovery Group 1 – 11) can be moved between discovery groups.
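As a rough illustration of the eligibility rule, the following Python sketch decides whether a device can be moved based on the text in its “Discovery Group” column. The exact strings shown in that column (for example “Default”) are assumptions made for this example, and `can_move` is a hypothetical helper, not a product API.

```python
# The 12 internal discovery groups named in this guide. The exact label
# text used by the UI is an assumption for this sketch.
INTERNAL_GROUPS = ["Default"] + [f"Discovery Group {n}" for n in range(1, 12)]


def can_move(discovery_group_column: str) -> bool:
    """True if the column value names an internal discovery group.

    Values such as an SMI-S provider URL indicate an external CIMOM,
    so the device cannot be moved to another group.
    """
    return discovery_group_column.strip() in INTERNAL_GROUPS
```

For instance, a device whose column reads “https://129.176.185.192:5989” would be rejected, while one in “Discovery Group 3” would be accepted.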

There are two ways to move devices from one discovery group to another. If you are moving multiple devices, it is possible to select all the desired devices up front and then move them in a single operation. This is done via the check boxes on the left-hand side of the Step 3 Details Discovery list. When the “Move to Discovery Group” button is selected, all devices that have a check mark in their check box are moved to the indicated discovery group. In the following example, two devices have a check mark in their check box and will therefore be moved into Discovery Group 3, as indicated by the “Select Discovery Group” pop-up window. When you select the “Ok” button on the “Select Discovery Group” window, a follow-on message explains that the operation might take some time to complete; you must also acknowledge this message. Once you select OK on that screen, you are brought back to the Step 3 Details Discovery page.

When only a single device needs to be moved from one discovery group to another, this can be accomplished using the “Edit” functionality for that device on the Step 3 Details Discovery page. In fact, the only aspect of the device that can be edited via this button is the discovery group in which the device resides. In this example, the edit button was selected for the host sbmaui. This brought up the “Edit Discovered Element” dialog window, and the device is being moved from the Default discovery group to Discovery Group 2.

Discovery Group Considerations

There are several things that should be taken into consideration when making the decision to separate discovered devices into a subset of individual discovery groups.

a. The HP Storage Essentials management station must have enough resources to support multiple discovery groups running simultaneously. Each discovery group will attempt to use up to 1.5 GB of real memory as well as processing time on the server’s CPU(s).

b. Is the environment somewhat stable in terms of credential information? If the credentials of the devices that are monitored via HP Storage Essentials change frequently, it will be necessary to move the devices back to the Default discovery group whenever the credential information needs to be updated.

c. Is there a true need to separate the devices into multiple discovery groups? If the environment managed by HP Storage Essentials is functional with all devices present within the Default discovery group, then there is no reason to incur the additional resource overhead and complexity of multiple discovery groups.
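To make the memory consideration concrete, here is a minimal Python sketch that estimates the worst-case memory that discovery group processes could request. It only multiplies the 1.5 GB figure quoted in this guide by the number of instantiated groups; the function name is hypothetical, and memory used by the management server, Oracle, and the operating system is deliberately left out.

```python
GB_PER_DISCOVERY_GROUP = 1.5  # upper bound per group quoted in this guide
MAX_INTERNAL_GROUPS = 12      # Default plus Discovery Group 1 through 11


def discovery_group_memory_gb(active_groups: int) -> float:
    """Worst-case real memory (GB) requested by the discovery group processes."""
    # The Default group is always instantiated, so at least one is active.
    if not 1 <= active_groups <= MAX_INTERNAL_GROUPS:
        raise ValueError("active_groups must be between 1 and 12")
    return active_groups * GB_PER_DISCOVERY_GROUP
```

With the Default group plus two additional groups in use, the estimate is 3 x 1.5 GB = 4.5 GB that must be available on top of the other management server processes.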

Monitoring Discovery Operations

Set up automatic e-mail notification of GAED operation completion

Visual representation via Discovery Icon in the User Interface

Real time log view showing on-going discovery operations.

Setting up automatic e-mail notification of GAED operation completion

It is often desirable to schedule the comprehensive Get All Element Details (GAED) discovery to occur at times when operational use of the HP Storage Essentials product is at a minimum. These times do not often coincide with normal business hours. To monitor GAED operations that occur off hours (or even during normal hours), it is possible to configure the HP Storage Essentials product to send e-mail alerts to a defined list of individuals at the completion of every GAED operation. The e-mail contains a summary of the last GAED that occurred: the discovery groups that were queried for data, the length of time that the GAED operation took, and any issues that were recorded during the operation.

This capability is configured within the “Product Health” area of HP Storage Essentials (Options -> Storage Essentials -> Manage Product Health). By default, the property that makes HP Storage Essentials send an e-mail when a GAED operation completes is set to true. A second configuration parameter, “gaedemail=”, establishes the list of recipients. It accepts a semicolon (;) separated list of e-mail addresses to which the GAED summary results are sent. The steps to configure e-mail notification of GAED completion are:

1. Navigate to the “Manage Product Health” page via Options -> Storage Essentials -> Manage Product Health


2. Click on the “Advanced” tab in the left-hand navigation pane.

3. Select the “Show Default Properties” button.

4. In the resulting web page, search for the string “gaedemail”.

5. Copy that line (it should read #gaedemail=adim@company.com;xyz@company.com).

6. Paste this line into the “Custom Properties” dialog box.

7. Edit the line by removing the “#” sign in front so that it is seen as an active parameter, and enter the semicolon-separated list of desired recipients of the GAED operation information.

8. Restart the HP Storage Essentials product (the AppStorManager service).

The following example has already covered steps 1 – 6. It shows the result within the “Custom Properties” dialog box of a successful setup of a list of e-mail addresses that will receive the GAED completion notification.
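The active property line produced by step 7 can also be assembled programmatically. The sketch below is a hypothetical helper rather than anything shipped with the product. It assumes only what this guide states, namely that the parameter is named gaedemail and takes a semicolon-separated list. The basic address check is an extra safeguard added for this example.

```python
def gaedemail_property(recipients):
    """Build the active 'gaedemail=' line for the Custom Properties box."""
    cleaned = [address.strip() for address in recipients if address.strip()]
    if not cleaned:
        raise ValueError("at least one recipient is required")
    for address in cleaned:
        # A ';' inside an address would be read as a list separator.
        if ";" in address or "@" not in address:
            raise ValueError(f"not a usable e-mail address: {address!r}")
    # No leading '#': the line must be uncommented to take effect.
    return "gaedemail=" + ";".join(cleaned)
```

Calling `gaedemail_property(["admin@company.com", "xyz@company.com"])` yields `gaedemail=admin@company.com;xyz@company.com`, which is the form expected in the “Custom Properties” dialog box before the AppStorManager service is restarted.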

Discovery Status Icon

The discovery status icon shows the current state of discovery operations within HP Storage Essentials.

When HP Storage Essentials is performing one of the possible discovery operations, some other aspects of the product functionality may be impacted, hence the need to be able to monitor when HP Storage Essentials is performing discovery. The process of gathering discovery information from the devices pre-empts any other discovery-type operations that may be scheduled to occur for those devices. For example, during a Get All Element Details (GAED) discovery the scheduled Performance and Report data collectors are not able to run against the devices that are involved in the GAED operation. In addition, while a GAED discovery is in operation, the Report Cache Refresh, if it is scheduled to run, is paused until the GAED has completed.

HP Storage Essentials provides two ways to visually verify which discovery operation the product is currently involved in. There is a “Discovery Status Icon” within the GUI which indicates the various states of discovery. There is also the ability to view the “Discovery Logs” as discovery is occurring; these refresh on a fixed schedule to show the progress of a given GAED operation.

The status icon is green (Normal), yellow (Discovery / Getting Topology), or red (Getting Details).

Discovery Log

Provides “real time” tracking of current discovery operations within HP Storage Essentials

A filtered view of the appstorm*.log file, which maintains the comprehensive results of the discovery operation

(appstorm* refers to the new appiq.log file name convention. The * is a wildcard that stands for the rest of the file name, which takes the form appstorm20080820-074637.log: that is, appstorm followed by the date and time stamp of when the file was created.)

When a discovery operation is begun within HP Storage Essentials, the user can monitor the operation via the “Discovery Logs” screen of the UI. Whenever a discovery operation is started from within an HP Storage Essentials page, whether it is a Step 1 Setup, Step 2 Topology or Step 3 Details discovery, the user is automatically redirected to the Discovery Log page. This page tracks the progress of the discovery task within HP Storage Essentials. It lists the operations that are underway and provides notification when the discovery operation has completed. The page refreshes every 60 or 90 seconds, depending on the type of discovery operation that is occurring, and can also be refreshed on demand by pushing the “Get Latest Messages” button. This log is a filtered view into the appstorm*.log file, which maintains the comprehensive results of all discovery operations.

Events Associated with Discovery Operations

Every time a comprehensive discovery (GAED) is run, there will be messages sent to the event manager detailing this operation. There is an event associated with both the start and completion of the GAED, and making use of this information within the event manager can be an effective way to analyze the results of a given GAED operation. The steps for using the Event Manager to monitor a discovery operation are:

1. Navigate to the HP Storage Essentials Event Manager (Tools -> Storage Essentials -> Home, then select the Event Manager icon).

2. In the “Show Element Type” selection box, select “HP Storage Essentials”.

3. The start of a Step 3 Details Discovery (GAED) operation has an event associated with it that has a Summary Text field value of “Getting All Details started”. The completion of the GAED operation has an event with a Summary Text field value of “Getting All Details completed”. Search within the Event Manager for the latest occurrence of these two events.

4. The start of a Step 1 Setup Discovery (identification) operation has an event associated with it that has a Summary Text field value of “Discovery Started”. The completion of the identification operation has an event with a Summary Text field value of “Discovery Completed”.

5. Any errors that occurred while attempting to collect specific device information are located within the Event Manager between the start and completion events for the GAED (the events are in time order by default).
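The search described in steps 3 and 5 amounts to taking the events that fall between the most recent start marker and the completion marker that follows it. A minimal Python sketch of that logic follows. It assumes the events have been exported as (timestamp, summary text) pairs in ascending time order, which is a hypothetical format chosen for this example; only the two Summary Text strings come from this guide.

```python
GAED_START = "Getting All Details started"
GAED_END = "Getting All Details completed"


def events_in_last_gaed(events):
    """Return the events recorded during the most recent GAED operation.

    events: list of (timestamp, summary_text) tuples, oldest first.
    If the last GAED has not completed yet, everything after its start
    event is returned.
    """
    start_index = None
    for index, (_, summary) in enumerate(events):
        if summary == GAED_START:
            start_index = index  # remember the latest start event
    if start_index is None:
        return []
    window = []
    for timestamp, summary in events[start_index + 1:]:
        if summary == GAED_END:
            break
        window.append((timestamp, summary))
    return window
```

Any entries returned by this function are the candidates to review as device-level errors for that GAED run.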

GAED (Get All Element Details) Common Errors

GAED errors can be located in several places within the product. Two of these are the Event Manager, as mentioned above, and the GAEDSummary.log file. The GAEDSummary.log file is found in \HP\StorageEssentials\Logs. A thorough understanding of the data collection process (as outlined earlier in this document) is needed to interpret the common error messages that appear day to day. The following is a list of errors, with their type, cause and actions:

Replication Errors (CIMOM error): Data collected from a managed element could not be written (replicated) to the Storage Essentials Oracle database. Usually this is because the SE Managed Object Format (MOF) is unable to filter and interpret the query results from the managed device. Always check the support matrix to verify that the managed device is fully supported. Check the device firmware, drivers, credentials, missing or hung CIMOM or cxws agent (CIM Extension), supported SNIA library, access point provider (SMI-S), and SAN security/port availability.

Partial Replication Errors (CIMOM error): Partial data collected from a managed element could not be written (replicated) to the Storage Essentials Oracle database. This is usually a benign error that is due to the fact that some of the objects (CIM instances) could not be managed. In other words, SE doesn’t know what to do with them because there are no data structures within the SE MOF to interpret and store them.

SQL Exceptions (Database error): SQL Exceptions are Oracle related. Look at the alert_appiq.log file first. Often this type of error will have to be investigated by an HP designated DBA.

Java Exceptions (JBOSS error): There are literally hundreds of these messages in the Java error framework. SE, for example, will send ConnectExceptions, NullPointerExceptions, and Socket exceptions (as well as many other types) from time to time. ConnectExceptions are usually due to a corrupt Oracle REDO log. Socket exceptions are caused by timeouts, or by nulls that SE cannot handle. Most will have to be investigated on a case-by-case basis. See the troubleshooting section towards the end of the SE Installation Guide for more details.

Out of Memory (Database or Java error): This is a critical error condition caused by Oracle or JBOSS threads that are consuming more memory resources than are available or have been allocated. There is an OOM whitepaper available that explains how to identify and troubleshoot this condition. These errors must be escalated to HP Support.

GAED aborted (CIMOM error): Any time you receive a GAED abort message, there is a critical failure within the product. You will need to escalate this to HP Support for resolution.

NO_CIMOM (CIMOM error): The NO_CIMOM message indicates that the managed device can no longer be contacted for data collection. This is most likely due to a hung CIM Extension, a missing CIM Extension, a missing SMI-S access point, a failure within the SE CIMOM that is preventing a complete data collection (you will see replication errors with this error as well), changed credentials, hardware that has been updated or is not supported, or a change of security within the SAN.

CIM_ERR_NOT_FOUND (CIMOM error): This error means that a specific requested CIM object could not be found. This error has been seen when something in the SAN has changed, an element has been updated or modified, or an element has been moved.

CIM_ERR_FAILED (CIMOM error): Similar to the NO_CIMOM error, and may be seen along with that error.
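For a daily review, the error names and types listed above can be turned into a small lookup that tags each message with its type and whether this guide asks for escalation to HP Support. The Python sketch below is an assumption-laden illustration: it matches on the error names as plain substrings, and the real wording of messages in GAEDSummary.log or the Event Manager may differ. Treating Out of Memory as an escalation case reflects how the error list above is read here.

```python
# Error name -> error type, as listed in this guide.
ERROR_TYPES = {
    "Replication Errors": "CIMOM error",
    "Partial Replication Errors": "CIMOM error",
    "SQL Exceptions": "Database error",
    "Java Exceptions": "JBOSS error",
    "Out of Memory": "Database or Java error",
    "GAED aborted": "CIMOM error",
    "NO_CIMOM": "CIMOM error",
    "CIM_ERR_NOT_FOUND": "CIMOM error",
    "CIM_ERR_FAILED": "CIMOM error",
}

# Conditions this sketch treats as "escalate to HP Support".
ESCALATE = {"Out of Memory", "GAED aborted"}


def triage(message):
    """Return (error name, error type, escalate?) or None if nothing matches."""
    lowered = message.lower()
    # Longest names first so "Partial Replication Errors" is not
    # reported as plain "Replication Errors".
    for name in sorted(ERROR_TYPES, key=len, reverse=True):
        if name.lower() in lowered:
            return name, ERROR_TYPES[name], name in ESCALATE
    return None
```

A message containing “GAED aborted”, for example, would be returned with the escalation flag set, while an unrecognized message returns None and is left for manual review.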

Monitoring log files


On the management server, the following log files roll over on startup, at the start of a new day (midnight), and by size:

appstorm.<timestamp>.log

AppstormProvisioning.log

AppstormRemoteConsole.log

Discovery.log

GAEDSummary.log

LicenseChanges.log

userAudit.log

All CIMOM log files

The GAEDSummary.log file contains an overview of all GAED operations that have occurred. A filtered version of this log file is mailed to all recipients on the GAED mailing list.

The following provides information about log file timestamps, sort criteria, configurable parameters, and adding a trace for the XML received from a CIMOM:

Log file timestamp - A timestamp (YYYYMMDD-HHMMSS) is inserted into the filename at its creation, making its origin more quickly identifiable (e.g., appstorm.20071012-122025.log).

Log file sort criteria - Log files sort in order of their creation, based upon the timestamp in their filename.

Log file configurable parameters - Configurable parameters for all log files are these:

Maximum size of the log file before it rolls over (MaxFileSize). This parameter resides in log4j.xml and is used to limit the size of an individual log file.

Maximum total amount of space the log file can use (MaxTotalSize). This parameter resides in log4j.xml and is used to limit the total size of a set of log files (e.g., all appstorm logs).

Log file “appenders” manage the log file rollover when the MaxFileSize and MaxTotalSize parameters are reached.

These parameters can be changed for any log file by editing its appender (for example, the appstorm.<timestamp>.log appender) in the following file:

<management_server_install_directory>/JBossandJetty/server/appiq/conf/log4j.xml

In the log4j.xml file indicated above, change the appender values to the new desired values:

<param name="MaxFileSize" value="100MB"/> <!-- Max size of a file before it is rolled over -->

<param name="MaxTotalSize" value="900MB"/> <!-- Max total size of all these log files; the oldest is deleted when the size is exceeded -->
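Before and after editing, it can be useful to read the two values back out of an appender definition. The following Python sketch does this with the standard library XML parser. The sample appender, including its name and class attributes, is a made-up fragment for illustration, and the sketch assumes the standard log4j `<param name="..." value="..."/>` form with straight quotes.

```python
import xml.etree.ElementTree as ET

# Hypothetical appender fragment; the real log4j.xml contains more detail.
SAMPLE_APPENDER = """
<appender name="EXAMPLE" class="example.RollingAppender">
  <param name="MaxFileSize" value="100MB"/>
  <param name="MaxTotalSize" value="900MB"/>
</appender>
"""


def rollover_limits(appender_xml):
    """Return (MaxFileSize, MaxTotalSize) from one appender element.

    A value is None if the corresponding param is not present.
    """
    root = ET.fromstring(appender_xml.strip())
    params = {p.get("name"): p.get("value") for p in root.iter("param")}
    return params.get("MaxFileSize"), params.get("MaxTotalSize")
```

Running `rollover_limits(SAMPLE_APPENDER)` on the fragment above returns the pair `("100MB", "900MB")`. A parse error at this point is a hint that the file contains typographic quotes or a misspelled element name.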

Example 1 (Log file rollover based on size)

Assume that, for the appstorm.<timestamp>.log file, MaxFileSize=100MB and MaxTotalSize=900MB.

If the size of the current appstorm.<timestamp>.log file exceeds 100MB before the next day starts, a new appstorm.<timestamp>.log file is created.

If any rollover occurs, and the total size of all appstorm.<timestamp>.log files exceeds 900 MB, the oldest appstorm.<timestamp>.log files are deleted until the total size is below 900 MB.

Whenever a time-based or size-based rollover occurs, a footer is appended to the current file, and a header is placed on the new file. These headers and footers describe why the rollover occurred and the logfile to, or from, which it is being rolled.
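The size-based behavior in Example 1 can be simulated in a few lines. This Python sketch is a model of the described behavior only, not the appender implementation: the function names are hypothetical, sizes are plain numbers in MB, and "exceeds" is read as strictly greater than the limit.

```python
def needs_rollover(current_size_mb, max_file_size_mb=100):
    """True when the current log file has grown past MaxFileSize."""
    return current_size_mb > max_file_size_mb


def prune_oldest(sizes_mb, max_total_size_mb=900):
    """Drop the oldest files until the total no longer exceeds MaxTotalSize.

    sizes_mb: file sizes in MB, oldest first. Returns the sizes that remain.
    """
    kept = list(sizes_mb)
    while kept and sum(kept) > max_total_size_mb:
        kept.pop(0)  # delete the oldest file first
    return kept
```

For example, nine 100 MB files plus a new 40 MB file total 940 MB, so the oldest 100 MB file is removed and 840 MB remain.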

Example 2 (Log file rollover based on time)

Assume a new day has started.

The current logfile (appstorm.20071012-154625.log) would receive this footer:

****Log File Rollover due to Time****

*****Next Log File:C:/hp/StorageEssentials/logs/appstorm.20071013-000055.log****


The next logfile (appstorm.20071013-000055.log) would receive this header:

****Log File Rollover due to Time****


****Previous Log File:C:/hp/StorageEssentials/logs/appstorm.20071012-154625.log****
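Because the creation time is embedded in each file name, the rollover chain can be reconstructed from the names alone. The Python sketch below parses names of the form appstorm.YYYYMMDD-HHMMSS.log, as used in the examples in this section, and sorts them oldest first. The helper names are hypothetical.

```python
import re
from datetime import datetime

# Matches names such as appstorm.20071012-154625.log
_APPSTORM_NAME = re.compile(r"^appstorm\.(\d{8}-\d{6})\.log$")


def created_at(filename):
    """Return the creation time encoded in an appstorm log file name."""
    match = _APPSTORM_NAME.match(filename)
    if match is None:
        raise ValueError(f"not an appstorm log file name: {filename!r}")
    return datetime.strptime(match.group(1), "%Y%m%d-%H%M%S")


def oldest_first(filenames):
    """Sort appstorm log file names by their embedded creation time."""
    return sorted(filenames, key=created_at)
```

The last element of `oldest_first(...)` is the log currently being written, and the one before it is the file named in that log's "Previous Log File" header.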

Adding trace for XML received from CIMOM - Traces are normally very large files. For that reason, the trace is turned “off” by default.

To add a trace, go into the properties file in the following directory:

%JBOSS4_DIST%\server\appiq\conf

In the properties file, uncomment the line shown below by deleting the pound sign (#):

#wbem.debug.sml=1

After uncommenting the line, set the level to at least 3 for the XML traces to be written. After they are written, a user can go to the %JBOSS4_DIST%\bin directory to view them.
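The edit described above (remove the pound sign, then raise the level to 3 or more) can be sketched as a small text transformation. This Python example works on the contents of a properties file held in a string. The function name is hypothetical, and the property key is used exactly as it is printed in this guide.

```python
def enable_xml_trace(properties_text, level=3, key="wbem.debug.sml"):
    """Uncomment the trace property and set its level.

    The key defaults to the spelling printed in this guide.
    """
    if level < 3:
        raise ValueError("XML traces are only written at level 3 or higher")
    result = []
    for line in properties_text.splitlines():
        # Look at the line with any leading '#' removed.
        candidate = line.strip().lstrip("#").strip()
        name, separator, _ = candidate.partition("=")
        if separator and name.strip() == key:
            result.append(f"{key}={level}")  # active line with new level
        else:
            result.append(line)  # leave every other line untouched
    return "\n".join(result)
```

Because traces are normally very large, the reverse change (restoring the pound sign) is worth making as soon as the XML of interest has been captured.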

The HP Storage Essentials log files are maintained within the “logs” directory of the HP Storage Essentials installation path, HP\StorageEssentials\logs.

Management station operational logs

File name: ...
File content: ... Storage Essentials application itself. It will detail when discovery operations are occurring, hourly data collections, errors in communication, and database connectivity issues. It is one of the main logs used in debugging efforts.
File frequency: ... day by default.

File name: appiqProvisioning.log
File content: This log file details the provisioning tasks that are created and executed.

File name: boot.log
File content: This log tracks the boot of the Java Virtual Machine used by HP Storage Essentials and the basic start up of the application.
File frequency: A new boot.log file is created with each start of HP Storage Essentials.

File name: dbAdmin.log
File content: This log file tracks the operations that the dbAdmin tool performs on the database.
File frequency: This log file is appended to each time the dbAdmin tool is run (single instance of this log file).

File name: GAEDSummary.log
File content: This log file contains the basic summary information concerning all GAED operations that have occurred. It tracks when the GAED operation starts and the discovery groups that participate in the discovery operation, and breaks down the results of the GAED operation into three distinct error categories: Critical Events, Quarantined Events, and Other Error Events.
File frequency: This log is appended to each time a GAED operation occurs. The last GAED operation to occur will be the last entry in this file. The information sent as an e-mail to all recipients of the GAED mail comes from this file, for just that particular GAED operation.

File name: report.log
File content: This log file contains information concerning the running, generation and display of reports via the Report Manager aspect of HP Storage Essentials.

File name: userAudit.log
File content: This log file tracks the users that are logged into HP Storage Essentials and the specific UI pages that each user accessed. It also records any reports that the user caused to be created.

File name: wrapper.log
File content: This log file records operations that are conducted in support of the Java Virtual Machine that is used by HP Storage Essentials.

References

For more information, go to www.hp.com/go/storageessentials.
