Monitoring and Managing PowerEdge 1655MC High Performance Computing Clusters

Dell White Paper
By Scalable Systems Group
Dell Enterprise Product Group
April 2003

Contents

Introduction: Modular Computing in HPCC
PowerEdge 1655MC Overview
Dell’s Management Solution for PowerEdge 1655MC HPC Clusters
In-band Monitoring and Management
    IT Assistant (ITA)
    Ganglia
Out-of-Band Monitoring and Management
    ERA/MC
    Digital KVM
Conclusions
    References

Figures

Figure 1: PowerEdge 1655MC Chassis - Front View
Figure 2: PowerEdge 1655MC Chassis - Rear View
Figure 3: 66-blade PowerEdge 1655MC HPC Cluster Configuration
Figure 4: Summary of PowerEdge 1655MC Chassis Information
Figure 5: At-a-Glance View of Ganglia
Figure 6: Information about One Node
Figure 7: Web Based ERA/MC Console
Figure 8: OSCAR screen on 2161DS
Figure 9: ERA/MC and KVM Controller Card

Section 1 Introduction: Modular Computing in HPCC

Modular computing solutions target environments in which the servers are consolidated into one physical location, which is most commonly the case with clusters. Some elements (the power supply, the cabling, and the systems management) do not need to be replicated for every server and can be shared among the modular pieces.

The Dell™ PowerEdge™ 1655MC is the first product in Dell’s Modular Computing or “blade server” product line. Blade server architecture places several self-contained servers, known as blades, within a single server chassis. Each blade has its own processor(s), memory, I/O subsystem, a set of hard drives, an operating system, and other basic components. The chassis provides redundant infrastructure components, such as power supplies, fans, and switches.

The concept of modular computing has the potential to increase server density, improve manageability, lower power consumption, and enhance deployment and serviceability, all resulting in lower TCO (Total Cost of Ownership).
Furthermore, the PowerEdge 1655MC modular design offers the following advantages over integrated servers, which make it well suited as a building block for a high performance computing cluster:

- Low heat production
- Low power consumption
- Lower space requirements (0.5U/server)
- Easy deployment and simplified cable management
- Ease of service and replacement
- Ease of adding computing resources

Section 2 PowerEdge 1655MC Overview

The Dell PowerEdge 1655MC features up to six server blades in one chassis in a 3U form factor. Each blade functions as an individual server with its own memory, 2 CPUs and 2 internal SCSI hard drives. The chassis includes power supplies, a network module, fans, and a management module. The PowerEdge 1655MC optionally ships with a USB CD-ROM/floppy drive.

The chassis also contains two Gigabit Ethernet network switches, which connect internally to two network interface cards (NICs) embedded on each blade. Additionally, Dell embedded remote access (ERA) hardware and firmware are integrated in the chassis. The ERA module monitors all the shared infrastructure components of the chassis. Figures 1 and 2 show the PowerEdge 1655MC front view and rear view respectively. For detailed information regarding the Dell PowerEdge 1655MC, refer to http://www.dell.com/us/en/esg/topics/esg_pedge_rackmain_servers_1_pedge_1655mc.htm

Figure 1: PowerEdge 1655MC Chassis - Front View

Figure 2: PowerEdge 1655MC Chassis - Rear View

Section 3 Dell’s Management Solution for PowerEdge 1655MC HPC Clusters

Dell’s PowerEdge 1655MC HPCC solution provides four methods of managing and monitoring the cluster: Dell OpenManage™ IT Assistant, Ganglia, digital KVM and ERA. IT Assistant and Ganglia are the two in-band management tools that use the cluster fabric, or intra-cluster network, for monitoring and management traffic.
IT Assistant is Dell’s server management solution. It provides a centralized management console used to discover nodes on the network and examine hardware sensor data to prevent failures at the system level. Ganglia is an OS-level cluster monitor that can be used to look at resource usage, detect node failures, and troubleshoot performance problems. Both ITA and Ganglia require OS support and use the cluster fabric for communication.

Figure 3 shows a sample PowerEdge 1655MC HPC cluster configuration with 66 blades as the compute nodes. The Cluster Fabric in the diagram is built from three Dell PowerConnect™ 5224 Gigabit Ethernet switches. Four Gigabit Ethernet links are used as a network trunk from each PowerEdge 1655MC chassis to one of the PowerConnect 5224 switches. A dedicated IT Assistant node, a PowerEdge 1650 serving as the IT Assistant monitoring and management station, is connected to one of the switches as well as to the ERA Fabric.

The ERA Fabric is built from a PowerConnect 3024 Fast Ethernet switch. The ERA ports on the PowerEdge 1655MC chassis are connected to the PowerConnect 3024 switch. The master node, a PowerEdge 1650 server, is also connected to the 3024 switch, so that both the ITA node and the master node can perform out-of-band ERA monitoring and management functions.

The other out-of-band fabric, called the KVM Fabric, goes through a digital KVM switch, the Dell 2161DS Remote Console Switch. The KVM ports on the PowerEdge 1655MC chassis, the master node, and the ITA node are connected to the 2161DS switch. The Ethernet port on the 2161DS switch is connected to the LAN outside the cluster to form a complete out-of-band management network, independent of the cluster fabric and the ERA fabric. For detailed information about utilizing the 2161DS switch, refer to: http://www.dell.com/us/en/biz/topics/power_ps3q02-avocent.htm.
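The sizing behind the 66-blade configuration described above can be checked with a few lines of arithmetic. The Python sketch below uses only figures stated in this paper (six blades per 3U chassis, four trunk links per chassis); the function name `size_cluster` and the assumption of one ERA port per chassis are illustrative and not part of any Dell tool.

```python
import math

BLADES_PER_CHASSIS = 6        # from the text: up to six blades per chassis
CHASSIS_HEIGHT_U = 3          # from the text: 3U form factor
TRUNK_LINKS_PER_CHASSIS = 4   # from the text: four Gigabit Ethernet links per chassis


def size_cluster(blades: int) -> dict:
    """Return chassis count, rack space and uplink counts for a blade cluster."""
    chassis = math.ceil(blades / BLADES_PER_CHASSIS)
    return {
        "chassis": chassis,
        "rack_units": chassis * CHASSIS_HEIGHT_U,
        "rack_units_per_blade": CHASSIS_HEIGHT_U / BLADES_PER_CHASSIS,
        "cluster_fabric_trunk_links": chassis * TRUNK_LINKS_PER_CHASSIS,
        # Assumption for illustration: one ERA network port per chassis.
        "era_fabric_ports": chassis,
    }


if __name__ == "__main__":
    for key, value in size_cluster(66).items():
        print(f"{key}: {value}")
```

For 66 blades this gives 11 chassis occupying 33U, with 44 trunk links into the Cluster Fabric.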
For information regarding PowerEdge 1655MC HPC clusters, please visit the Dell HPCC web site at: http://www.dell.com/us/en/esg/topics/products_clstr_gb1655_pedge_configs_1655_cluster_hpcc.htm

Figure 3: 66-blade PowerEdge 1655MC HPC Cluster Configuration

Section 4 In-band Monitoring and Management

It is important for an HPCC system administrator to be able to monitor a cluster at the hardware level, especially in a large cluster environment. Dell’s HPC cluster solution offers two methods of in-band management: Dell OpenManage™ IT Assistant (ITA), a Web-based tool for managing Dell servers, and Ganglia, an open source monitoring tool developed at the University of California, Berkeley.

IT Assistant (ITA)

OpenManage IT Assistant is a web browser-based tool that supports all of the PowerEdge 1655MC components through the Simple Network Management Protocol (SNMP). It allows the cluster administrator to manage and monitor the hardware of an entire cluster, and to perform day-to-day cluster management tasks from a centralized location using a GUI. SNMP provides the communication between the management console and the nodes, with every system component running an SNMP agent.

IT Assistant provides the following functionality:

- Discovery of the chassis and chassis components (see Figure 4)
- Support for hot swapping blades
- Summary and status information for all chassis components and support for system inventory and search
- Launch of management applications for chassis components
- Management of events generated by chassis components
- Page/e-mail when an event occurs
- One-to-many centralized console

All of the functions mentioned above are crucial to the management of an HPC cluster.
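The event-handling functions in the list above (managing events and paging or e-mailing when one occurs) amount to a lookup from an event type to a set of actions. The Python sketch below illustrates that idea only. It is not IT Assistant's implementation, and the `EventRouter` class, the event names and the action functions are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# An action receives the event and returns a description of what it did.
Action = Callable[[dict], str]


class EventRouter:
    """Hypothetical mapping from event types to administrator-defined actions."""

    def __init__(self) -> None:
        self._actions: Dict[str, List[Action]] = defaultdict(list)

    def on(self, event_type: str, action: Action) -> None:
        """Associate one more action with an event type."""
        self._actions[event_type].append(action)

    def dispatch(self, event: dict) -> List[str]:
        """Run every action registered for this event's type, in order."""
        return [action(event) for action in self._actions.get(event["type"], [])]


def email_admin(event: dict) -> str:
    # Placeholder: a real action would send mail here.
    return f"email: {event['source']} reported {event['type']}"


def page_admin(event: dict) -> str:
    # Placeholder: a real action would page the on-call administrator.
    return f"page: {event['source']} reported {event['type']}"


if __name__ == "__main__":
    router = EventRouter()
    router.on("fan_failure", email_admin)
    router.on("fan_failure", page_admin)
    for line in router.dispatch({"type": "fan_failure", "source": "chassis04"}):
        print(line)
```

Events with no registered action simply produce an empty list, so unknown event types are ignored rather than treated as errors.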
One of the most basic system administration tasks, discovery and identification of nodes, is performed by IT Assistant, as is discovery of chassis components such as the embedded Ethernet switch and the ERA module. IT Assistant supports hot swapping any blade in the chassis without interrupting the other blades, so maintenance can be performed without shutting down every blade in the chassis. As the cluster grows in size, the node status information becomes even more important to monitor in order to simplify administration. IT Assistant provides such information as system name, IP address, MAC address, versions of components, memory size, chassis service tag, chassis asset tag, blade slot number and blade service tag.

IT Assistant provides one-to-many functions such as remote shutdown, BIOS flashing, and configuration of server alert functions, as well as inventory for all components. IT Assistant includes an event management system (ESM) for capturing any event that is generated by the modules through SNMP traps. Administrators can associate actions with specific events, including e-mail, paging or application launching.

Figure 4: Summary of PowerEdge 1655MC Chassis Information

Ganglia

Another in-band management tool available in a Dell PowerEdge 1655MC cluster offering is Ganglia, an open source OS-level cluster monitor. Out of the box, Ganglia monitors and automatically graphs over 20 metrics on every node of the cluster, such as the node’s load average, number of running processes, number of incoming and outgoing network packets, and total and free memory.

Ganglia provides several levels of cluster information. The at-a-glance view (Figure 5) shows the overall status of the cluster and summarizes total node count, number of nodes that are up, overall load average, and CPU and memory utilization for the cluster.
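The kind of roll-up shown in the at-a-glance view can be sketched in a few lines of Python. The sample document below is modeled loosely on the XML that Ganglia's monitoring daemon publishes, but the element and attribute names in the sample, the `summarize` function and the rule used to decide that a node is down (no report for more than four times its expected interval) are assumptions made for illustration, not a description of Ganglia's own code.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: TN is seconds since the last report, TMAX the expected interval.
SAMPLE_XML = """
<GANGLIA_XML>
  <CLUSTER NAME="blade-cluster">
    <HOST NAME="node01" TN="5" TMAX="20">
      <METRIC NAME="load_one" VAL="0.50"/>
    </HOST>
    <HOST NAME="node02" TN="8" TMAX="20">
      <METRIC NAME="load_one" VAL="1.50"/>
    </HOST>
    <HOST NAME="node03" TN="600" TMAX="20">
      <METRIC NAME="load_one" VAL="0.00"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>
"""


def summarize(xml_text: str, down_factor: float = 4.0) -> dict:
    """Count nodes, split them into up and down, and average the load of live nodes."""
    root = ET.fromstring(xml_text.strip())
    hosts = root.findall("./CLUSTER/HOST")
    down, loads = [], []
    for host in hosts:
        # Assumed rule: a node is down if it has been silent for too long.
        if float(host.get("TN")) > down_factor * float(host.get("TMAX")):
            down.append(host.get("NAME"))
            continue
        metrics = {m.get("NAME"): m.get("VAL") for m in host.findall("METRIC")}
        loads.append(float(metrics["load_one"]))
    return {
        "total": len(hosts),
        "up": len(hosts) - len(down),
        "down": down,
        "avg_load_one": sum(loads) / len(loads) if loads else 0.0,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_XML))
```

Only live nodes contribute to the load average here, so a dead node does not drag the cluster-wide figure toward zero.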
Color-coding is used to represent CPU utilization to enable quick identification of overloaded systems. A crossbones icon indicates a node is down. Selecting a different metric in this view redisplays the screen with the value of that metric for each node, and uses the metric as a sort index when displaying the nodes.

Figure 5: At-a-Glance View of Ganglia

Clicking on an individual node icon displays all available information for that node (Figure 6). This view summarizes static information such as the version of the OS, system usage, IP address and machine type, and graphs those metrics that change over time, such as memory and CPU usage, network traffic statistics, number of running processes and disk usage.

Figure 6: Information about One Node

Ganglia allows administrators to define and add other parameters in the cluster that they want to monitor. Ganglia’s GUI will automatically graph those values in addition to the pre-set metrics for every node.

Ganglia also simplifies cluster management by providing a remote execution environment. This feature is used for remote management, and to execute commands in parallel on multiple nodes.

Additionally, Ganglia provides the ability to monitor multiple clusters. This is especially useful in large compute centers where computational resources are grouped in smaller clusters for specialized use. The centralized console enables an administrator to monitor multiple clusters at once, while maintaining a high level of security by defining trust relationships.

Section 5 Out-of-Band Monitoring and Management

During heavy communication between application components or blade-server nodes, in-band management and monitoring can inaccurately report network or server problems, since they share the fabric with the applications.
In addition, any monitoring and management traffic consumes resources that are used by parallel applications. Finally, if a machine’s operating system (OS) is not responding, neither in-band method guarantees access to the node or the ability to fix the problem, since both rely on OS support. In these situations, system administrators can use the out-of-band network management methods to communicate with the cluster hardware and to diagnose or fix problems. Dell’s HPCC solution provides two out-of-band management routes: digital KVM and ERA.

ERA/MC

The Dell Embedded Remote Access/MC Controller for the Dell PowerEdge 1655MC provides remote systems management for the modular computing blades. ERA/MC provides an out-of-band management route by using its own dedicated processor, memory, bus and network port, without consuming the cluster’s computing or network resources. If the cluster blades become unresponsive, ERA/MC allows the administrator to view and access the nodes remotely to troubleshoot the system. ERA/MC provides the following functionality to the PowerEdge 1655MC system:

- Initial configuration of chassis and blades
- Scripting for automation
- Local and remote management of chassis and blades
- Configuration of blades, network switches and the digital KVM through console redirection
- Remote firmware updates
- Remote monitoring of fans and sensors
- Remote power cycle, power down and power up

The use of ERA/MC within an HPC cluster simplifies cluster management and allows a system administrator to monitor the hardware components remotely, either through a CLI (through the serial port) or through a web-based GUI console. The main utility used in ERA is racadm (remote access control and administrator), which provides the interface for monitoring and configuring the system.
The racadm utility can be used through a serial port using a communications program such as minicom or HyperTerminal, through a remote interface, or through a web-based console across the network.

Through the serial interface, the administrator can view or modify the configuration settings on the chassis or the blades. For instance, the administrator can change the IP configuration of the ERA/MC port in order to access the GUI available on the web console (Figure 7). Also through the serial interface, console redirection can be used for configuring the blades, switches and KVM. An administrator can use the automated scripting feature to run configuration commands on multiple nodes. This is a useful tool for making identical changes within large cluster configurations.

The remote interface is currently supported only from the MS-DOS command environment under Windows, where the racadm command can be used to manage the nodes. This CLI provides the only means of modifying the properties of the ERA/MC on the PowerEdge 1655MC, and automated scripting can be used here as well.

The web interface can be accessed through any supported web browser using the ERA/MC IP address, or through IT Assistant. It allows the user to utilize the features of the ERA/MC in a graphical interface.

Figure 7: Web Based ERA/MC Console

One of the main features of out-of-band management is the ability to control and monitor the hardware from a remote location. The racadm commands on the PowerEdge 1655MC allow the administrator to view the health status of the chassis and blades within the cluster. By allocating appropriate IP addresses to the ERA/MC ports of all the chassis within a cluster, the administrator can assign names to each system and to each blade, allowing access to individual blades in order to utilize specific resources. racadm offers multiple commands for troubleshooting the cause of a failure.
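Making the same query against every chassis, as described above, comes down to expanding one command template over a table of ERA/MC addresses. The Python sketch below only builds the command lines and does not run them. The function `build_commands`, the chassis names, the IP addresses and the `era-wrapper` command are hypothetical placeholders; the actual racadm syntax should be taken from the ERA/MC documentation and is deliberately not reproduced here.

```python
import shlex
from typing import Dict, List


def build_commands(era_addresses: Dict[str, str], template: str) -> List[List[str]]:
    """Expand one command template into an argument list per chassis.

    The template may use {name} and {address}; chassis are processed in
    name order so that repeated runs produce the same sequence.
    """
    commands = []
    for name, address in sorted(era_addresses.items()):
        line = template.format(name=name, address=address)
        commands.append(shlex.split(line))
    return commands


if __name__ == "__main__":
    # Hypothetical ERA/MC port addresses on the ERA fabric.
    chassis = {
        "chassis01": "192.168.1.11",
        "chassis02": "192.168.1.12",
    }
    # "era-wrapper" stands in for a site script around the real racadm call.
    for argv in build_commands(chassis, "era-wrapper {address} status"):
        print(" ".join(argv))
```

Each argument list could then be handed to `subprocess.run` or written into a batch file. The same loop applies to any of the troubleshooting queries that racadm offers.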
For example, the administrator can view information on the modules within the chassis (the blades, the network switches and the fans) as well as sensor information such as fan RPM, the status of the power supplies and much more. For the blades, administrators can power-cycle the nodes individually, reset configurations and cause LEDs to blink or glow to easily identify systems within a cluster.

Digital KVM

The PowerEdge 1655MC contains an embedded digital KVM switch, which allows video, keyboard and mouse access to each blade. All access to the blades goes through the management card on the chassis, either via the standard analog PS/2 keyboard, mouse and video ports or via the Analog Rack Interface port with a CAT5 cable. The Analog Rack Interface port can be connected directly to a port on the Dell 2161DS Remote Console Switch with a CAT5 cable, which cascades the switches and allows them to be accessed from one central place. In large cluster configurations with several PowerEdge 1655MC chassis, this can greatly reduce cabling and simplify cable management.

The 2161DS Remote Console Switch combines analog and digital technologies to provide a central point of access to an entire cluster. Each KVM switch has 16 ports to attach machines or other switches and can be networked over a LAN connection to provide remote access to these machines. Each machine must use a System Interface POD (SIP) for converting the keyboard, video and mouse signals to Ethernet. This considerably reduces the bundles of KVM cables that are usually associated with HPC clusters. The switch comes with cross-platform software that allows the administrator to manage the switch, install a new 2161DS switch or launch video sessions to a system server. The administrator can view multiple machines from this access point and use the keyboard and mouse on the individual machines.
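The 16-port figure above makes it easy to check whether a given cluster fits on a single 2161DS. The Python sketch below is a worked example under one stated assumption: each chassis uses one switch port for its Analog Rack Interface uplink and each standalone server uses one port for its SIP. The function name `kvm_port_budget` is illustrative.

```python
KVM_PORTS = 16           # from the text: each 2161DS has 16 ports
BLADES_PER_CHASSIS = 6   # from the text: up to six blades per chassis


def kvm_port_budget(chassis: int, standalone_servers: int,
                    ports: int = KVM_PORTS) -> dict:
    """Report how many KVM switch ports a configuration needs."""
    # Assumption: one port per chassis uplink and one per standalone server.
    used = chassis + standalone_servers
    return {
        "ports_used": used,
        "ports_free": ports - used,
        "fits_on_one_switch": used <= ports,
        "blades_reachable": chassis * BLADES_PER_CHASSIS,
    }


if __name__ == "__main__":
    # The sample cluster: 11 chassis plus the master node and the ITA node.
    print(kvm_port_budget(chassis=11, standalone_servers=2))
```

Under that assumption the 66-blade sample configuration uses 13 of the 16 ports, leaving three free.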
A 2161DS KVM switch with one or more chassis attached allows all the blades to be viewed from a centralized location. The KVM switches use OSCAR (On-Screen Configuration and Activity Reporting interface) to select the nodes; when multiple chassis are cascaded, the user is able to see all nodes on one interface.

Figure 8: OSCAR screen on 2161DS

Each PowerEdge 1655MC chassis contains the internal KVM switch, so it appears on the main 2161DS OSCAR interface as cascaded. In Figure 8, ‘Edmond’ is the third blade on the fourth chassis; each node appears with this numbering scheme.

Figure 9: ERA/MC and KVM Controller Card

Section 6 Conclusions

Dell’s blade cluster solution provides four management routes, two in-band and two out-of-band. The in-band management tools, ITA and Ganglia, provide easy access to cluster status information, such as load, utilization and number of dead nodes, as well as each machine’s hardware sensor data, including fan speed and voltage. In-band management routes share the cluster fabric with cluster applications. Out-of-band management routes include KVM and ERA, and are useful to administrators when the cluster is under heavy load, since both methods use a dedicated fabric and do not interfere with applications running on the cluster. These four methods provided in Dell’s HPCC solutions help to make a PowerEdge 1655MC cluster easier to manage and monitor, reduce the possibility of failure, and lower the TCO.

References

DellTech/Support
http://delltech.us.dell.com/support/

Ganglia: Distributed Monitoring and Execution System
http://ganglia.sourceforge.net/

Serial and Remote Execution of CLI Commands for Blade Server Management
http://www.dell.com/us/en/esg/topics/power_ps3q02-suniti.htm

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES.
THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND. Dell, PowerConnect, PowerVault and PowerEdge are trademarks of Dell Computer Corporation. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell disclaims proprietary interest in the marks and names of others. ©Copyright 2003 Dell Computer Corporation. All rights reserved. Reproduction in any manner whatsoever without the express written permission of Dell Computer Corporation is strictly forbidden. For more information, contact Dell. Information in this document is subject to change without notice.