Enterprise Survivable Servers (ESS): Architectural Overview

Written by Greg Weber and Aaron Miller

Section 1 - Introduction

1.1 What is ESS ?

Enterprise Survivable Servers (ESS) is an offering which takes an existing Avaya Communication Manager (CM) system to a higher level of availability and survivability. ESS achieves this by allowing media servers to be used as alternate controllers within a system; these alternate controllers leverage IP control of port network gateways and are completely independent of the main servers, both functionally and geographically. ESS protects the communication system against a catastrophic main server failure and, at the same time, provides service to port network gateways that have been fragmented away from their current controlling entity.

1.2 How is this paper organized ?

The Background section gives the technical background needed to fully understand and appreciate ESS by taking a deep dive into how ACM works today in a non-ESS environment. It provides detailed definitions of some basic infrastructure terms and goes into tremendous depth on how media servers control port network gateways. If the only objective of reading this paper is to obtain a high-level overview of ESS, then this section can be skipped. However, if the desire is to understand the "whys" of the ESS operation, then this section builds that foundation.

The Reliability – Single Cluster Environments section covers many of the methods by which service can be supplied to a port network gateway while still only having a single controlling cluster. The ESS offering does not replace these methods, but rather builds on top of them. The methods described in this section attempt to restore service if faults occur within the communication system. Only if all of these methods, including server interchanges, IP Server Interface (IPSI) interchanges, control fallback, and more, are unsuccessful in restoring service will the ESS offering begin to take effect.

The Reliability – Multiple Cluster Environments section covers some of the current methods of protecting the communication system against main server failures. All of these offerings, such as Survivable Remote Processors (SRP), ATM WAN Spare Processors (ATM WSP), and Manual Back-up Servers (MBS), are being replaced by ESS. This section provides a brief overview of the operation of each one of these offerings, making it clear that ESS provides at least the same level, and in most cases a higher level, of protection.

The ESS Overview section introduces the ESS offering by showing, at a high level, how ESS servers increase overall system availability by providing protection against catastrophic main server failures and extended network fragmentations. In addition, this section emphasizes that the ESS offering is one of last resort and that it works in conjunction with all existing recovery mechanisms by reviewing a number of failure scenarios which are resolved by methods other than ESS.

The How Does ESS Work section takes a deep technical dive into the operational steps required to make ESS operate. The section describes how ESS servers register with and acquire translation updates from the main cluster. Furthermore, it also covers, in detail, how IPSIs prioritize the various ESS clusters within a system and when they will use this ranking to get service. The information in this section gives the "hows" and "whys" of the ESS offering's operation.
The ESS in Control section examines what happens to existing calls when port networks fail over under ESS control. While there is no feature debt incurred or performance degradation when running under ESS control, there are subtle operational differences. The section concludes with a discussion of these differences, including possible call-flow changes and access to centralized resources and voicemail.

The ESS Variants Based on System Configuration section inspects the subtle differences that occur when a system needs to utilize ESS clusters for control based on various port network connectivity configurations. For each type of port network connectivity configuration available, failure examples, including catastrophic server outages and severe network fragmentations, are simulated and thoroughly explained.

The ESS in Action - A Virtual Demo section allows the reader to experience the ESS offering in a real-world environment. The demo is given on a communication system that consists of three geographically separate sites interconnected in various ways. After a thorough explanation of the setup and how the system operates in a non-faulted environment, a catastrophic server failure and network fragmentations are introduced. Each stage of the demo is clearly explained and the viewpoints of the system from the main servers and the ESS servers are shared through administration status screens. The demo concludes with an execution of the steps required to restore the PBX switch back to normal operation after the network is healed and the main cluster is fixed.

The More Thoughts and Frequently Asked Questions section covers various topics involving the ESS offering. These topics include details about license files, re-use of MBS servers as ESS servers, the differences in alarming between main servers and ESS servers, how non-controlling IPSIs work with ESS, and much more.

Section 2 - Background

2.1 What is an Avaya Media Server ?

The Switch Processing Element (SPE) is the controlling entity of an Avaya Communication System. The SPE's responsibilities range from making any and all intelligent decisions to controlling every endpoint within the switch. In a traditional DEFINITY PBX, the SPE software resides on proprietary hardware (UN/TN form factor circuit packs) and runs on a proprietary operating system called Oryx/Pecos. Avaya's next-generation Communication System needs to have the ability to efficiently integrate faster processors as they become available. One of the main objectives of the Avaya Communication Manager was to transport this powerful, feature-rich, reliable SPE onto an off-the-shelf server running an open operating system. This objective was realized by the migration of the SPE onto media servers running the Linux operating system. The following table shows the different types of Avaya media servers which are currently available.

Server Name   Chip Set                         Duplicated   Port Network Support
S8720         AMD Opteron – 2.8 GHz            Yes          Yes
S8710         Intel Xeon – 3.06 GHz            Yes          Yes
S8500         Intel Pentium-4 – 3.06 GHz       No           Yes
S8700         Intel Pentium-3 – 850 MHz        Yes          Yes
S8400         Intel Pentium-M – 600 MHz        No           Yes
S8300         Intel Mobile Celeron – 400 MHz   No           No

Table 1 – Types of Media Servers

In the table above, the server types are listed in descending order of processing power. This is a function of the chip set on which they run. Duplication is another distinguishing property of the server types.
The S87XX series of media servers are always duplicated, where each server in the server pair has an SPE running on it. The standby SPE, running on one of the servers, is backing up the active SPE on the other server in the pair. Refer to the What is a Server Interchange section for more information on server duplication.

The Avaya product line has many different types of gateways including H.248 gateways, H.323 gateways, traditional DEFINITY cabinetry, and 19” rack-mountable cabinets. Port networks (explained in the section below), which are made up of traditional DEFINITY cabinets and 19” rack-mountable cabinets, are not supported by all media servers. Since the ESS project is designed to provide survivability for port networks, the S8300 media server will be omitted throughout the rest of this paper, unless specifically noted, since it does not support the ESS offering. Also, the scope of the ESS product does not include support for the S8400 media server at this time.

2.2 What are Port Networks ?

All endpoints connect to the communication system through circuit packs when utilizing the S8720, S8710, S8700, or S8500 Media Servers. For example, digital phones are wired directly to digital line circuit packs (TN2124), PRI trunks terminate at DS-1 circuit packs (TN464), and IP endpoints utilize CLAN circuit packs (TN799) as their gatekeepers. All of these TN form factor circuit packs are housed in cabinets. These cabinets, listed in the table below, support circuit packs by providing them power, clocking, and bus resources. The bus resources, a time division multiplexed (TDM) bus and a packet bus, are used as both a control conduit to the circuit packs and a bearer interconnectivity medium.

Cabinet Type   Duplicated IPSI Support   EI Support   Rack Mountable
MCC/SCC        Yes                       Yes          No
G600           No                        No           Yes
G650           Yes                       Yes          Yes

Table 2 – Types of Cabinets

A port network is the grouping of cabinets that have physically connected buses or, in other words, a port network is a collection of cabinets that share the same TDM bus and the same packet bus. In addition, since all circuit packs within a port network share a TDM bus, they need to be synchronized. For that reason, each port network contains one active tone clock residing on either a dedicated tone clock board (TN2182) or an IP Server Interface (IPSI) card (TN2312, described later in this section).

2.3 How are Port Networks Interconnected ?

The interconnected buses, within a set of cabinets making up a port network, provide a direct communication medium between all endpoints associated with that port network. However, endpoints in dispersed port networks also need to be able to establish a communication medium when needed or, in other words, there needs to be support for Port Network Connectivity (PNC). The PNC provides an interconnection of port network buses when required. A variety of PNC types are supported by ACM; they are shown in the table below.

PNC Type                    Port Network Interconnectivity Device   Ability to Tunnel Control   Resource Manager   Max # of Port Networks
Center Stage Switch (CSS)   EI Board (TN570)                        Yes                         SPE                45
Direct Connect              EI Board (TN570)                        Yes                         SPE                3
ATM                         ATM EI Board (TN2305/6)                 Yes                         ATM Switch         64
IP                          MEDPRO (TN2302)                         No                          IP Network         64

Table 3 – Types of PNC

When a communication path is needed between two port networks, the SPE will set up the path.
For the PNC types which have the SPE as the resource manager, such as Center Stage Switch (CSS) and direct connect, the SPE will not only inform the Expansion Interface (EI) boards which path to use, but also create the path through the PNC. For example, to set up a communication path between two port networks using the CSS PNC, the SPE will inform the EI in port network #1 to use path A, set up path A through the CSS, and inform the EI in port network #2 to also use path A. When the SPE needs to create the actual communication path, the PNC is referred to as SPE managed. For the PNC types which do not have the SPE as the resource manager, such as ATM and IP, the SPE will only need to tell the ATM EI boards (or MEDPRO boards) to interface with their peer ATM EI board (or MEDPRO board), and the creation of the actual path is left to either the ATM switch (if ATM PNC) or the IP network (if IP PNC). For example, to set up a communication path between two port networks using the ATM PNC, the SPE will inform the ATM EI in port network #1 to communicate with the ATM EI in port network #2 and the ATM switch figures out how this is done. When the SPE does not need to create the actual communication path, the PNC is referred to as self-managed.

[Figure 1 – Connectivity of Different Types of PNCs]

2.4 How are Port Networks Controlled ?

As described above, a port network consists of a number of circuit packs and two transport buses, a TDM bus and a packet bus. Each circuit pack is locally controlled by resident firmware called an angel. Therefore, in order to utilize the functions of a circuit pack, the SPE needs to interface with that circuit pack's angel. Some circuit packs contain enhanced angels, specifically EI boards and IPSI boards (described later in this section), which have the capability of becoming an arch-angel when activated. An activated angel, or arch-angel, provides the SPE with the communication interface to all other angels in the port network over the TDM bus. Hence, in order for the SPE to control all the circuit packs and the TDM bus in a port network, the SPE must be able to establish a communication path to the port network's arch-angel.

Some other circuit packs in the system, such as CLAN boards for example, need additional control links for various reasons. These control links traverse the port network's packet bus in the form of Link Access Protocol Channel-D (LAPD) links to the circuit pack. The SPE utilizes a Packet Interface (PKTINT) to manage the port network's packet bus and terminate the other end of the LAPD links. Therefore, in order to control these types of circuit packs, a LAPD link needs to be established between the PKTINT and the circuit pack, and the SPE needs a communication path to the PKTINT.

The communication path between the SPE and the port network's arch-angel takes the form of a special LAPD link called an Expansion Arch-angel Link (EAL). An EAL begins in the PKTINT and terminates at the port network's arch-angel. Consequently, the SPE can control everything it needs to be in command of in a port network, all control links and both buses, if it has a communication path to a PKTINT that can serve that port network. In a traditional G3r DEFINITY system, the SPE and the PKTINT both reside in the Primary Port Network (PPN) and are connected via the carrier's backplane.
The PKTINT serves all of the Expansion Port Networks (EPN) by having EALs go through the PNC (either CSS, ATM, or direct connect) and terminate at arch-angels on each port network. EI cards contain these enhanced angels which can be activated into arch-angels. The figure below shows how the SPE located in a DEFINITY PPN can control its EPNs.

[Figure 2 – SPE Control of EPNs in a Traditional G3r DEFINITY System]

However, in the new CM configuration, the SPE no longer resides on the proprietary hardware in the PPN, but rather externally on the media servers. Therefore, the SPE cannot communicate with the PKTINT through the carrier's backplane. To solve this problem, the Server Interface Module (SIM) was created to provide a physical connection to the PKTINT and an IP interface to communicate with the SPE. The IP Server Interface (IPSI) card (TN2312) is a conglomeration of a number of components mentioned thus far: a PKTINT, an enhanced angel (which can be activated into an arch-angel), a SIM, and a tone clock. The figure below shows how the SPE located on a media server can control its EPNs.

[Figure 3 – SPE Control of EPNs in an ACM Configuration (IPSIs in all EPNs)]

While the above figure shows every EPN in the system having an IPSI, this does not necessarily have to be the case. Every control link in the system goes between a PKTINT and a circuit pack containing an arch-angel. If all of the control links went through a single PKTINT, as they do in the traditional G3r DEFINITY system, a bottleneck would be created. However, the PKTINT resident on the IPSI card has more than enough capacity to support its own port network control links and control links for other EPNs. If the PNC supports tunneled control, then not every port network is required to have an IPSI for control. The figure below shows the control of EPNs with some having IPSIs and others not. Based on IPSI processing capacity and reliability issues, which are discussed in the next section, it is suggested that there be one IPSI for every five port networks, with a minimum of two per system.

[Figure 4 – SPE Control of EPNs in an ACM Configuration (IPSIs in some EPNs)]

2.5 What are SPE Restarts ?

The SPE can go through many different types of restarts which can be grouped into three categories – cold restarts, warm restarts, and hot restarts. Each type of restart has different memory state requirements and different end-user effects.

The most drastic restart of an SPE is a cold restart. The cold restart clears all memory which deals with call state and re-initializes the entire system. The end-user effect of a cold restart is that every call in the system is dropped and all endpoints are reset. A cold restart has virtually no memory requirements.

A much more graceful SPE restart is a warm restart. This restart re-initializes some of the system, but keeps all memory which deals with call states intact. The end-user effect of a warm restart is much less severe than during a cold restart; all calls that are stable remain up, however, some stimuli (e.g. end-user actions or endpoint events) may be lost. This means that an end-user simply communicating on a call over an established bearer channel is unaffected, but an end-user in the process of dialing may lose a digit.
Therefore, a warm restart has a memory requirement that all the call state information be valid and up-to-date in order to preserve calls.

While a warm restart has some minor effects on end-users, a hot restart of the SPE does not. A hot restart does not alter or clear any memory and will only restart the SPE software processes using the intact memory. There is no end-user effect of a hot restart; all calls are preserved and all stimuli occurring during the restart are received by the SPE and appropriate actions are taken on them. To accomplish this, a hot restart has the requirement that all of its memory must remain completely intact through the restart. The following table summarizes all the different types of restarts.

Restart Type   End-user Effect   Stable Calls Survive ?   Approximate Time for Completion   Executable ACM Command
Hot            None              Yes                      <1 second
Warm           Minimal           Yes                      5-9 seconds                       reset system 1
Cool           Minimal           Yes                      45-60 seconds
Cold           High              No                       see below                         reset system 2
Reboot         High              No                       see below                         reset system 3

Table 4 – SPE Restart Summary

The cool restart introduced in the table above is a form of a warm restart that can be used across different versions of software. It allows the SPE to be actively running and get reset into a new software version with all stable calls remaining up through the restart. A cool restart is the mechanism to upgrade the SPE's release without losing any calls.

As previously stated, cold restarts clear all call state memory but, based on the level of the cold restart, may or may not reload the translations. A low-level cold restart clears all call state memory while keeping its translation memory intact. A high-level cold restart, or a reboot, clears the call state memory and reloads translations. This implies that if an administrative action is done on the switch, it will be preserved through a cold restart. However, if the same administrative action is done and the translations are not saved before a reboot occurs, the changes are lost since the translations are reloaded from the disk.

Restart times are measured from the beginning of the restart until all endpoints are operational and the SPE is again processing end-user stimuli. Hot restarts, warm restarts, and cool restarts never have any endpoints go out of service and need only wait for the SPE to begin processing once again. This has the implication that the restart time for these types of resets does not vary much with the size of the system or the number of endpoints, and therefore approximate restart times can easily be determined without any knowledge of the system. Cold restarts and reboots reset all endpoints and cannot be deemed complete until all of them are operational. Determining an approximate restart time mainly depends on three major factors – the types of endpoints (analog, digital, BRI/PRI, IP, etc.), how many of each are in the system, and the processing speed of the server. For very small systems running on S87XX media servers, the restart completion time is on the order of minutes, one to three. For larger systems with a low number of IP & PRI endpoints, the restart completion is approximately three to six minutes. For systems with a large number of IP & PRI endpoints, the system may not be fully operational for ten to fifteen minutes. During this time, all other endpoint types will be functioning while the remaining IP & PRI endpoints are coming back into service.

2.6 What are EPN restarts ?

The SPE restarts, described in the previous section, affect the entire system.
For example, a cold restart of the SPE causes all endpoints in the system to be reset and every call to be dropped. SPE restarts are used as a recovery method for a subset of failures, such as a server hardware malfunction or a software trap. The type of failure dictates which restart level is needed for recovery. However, there are other types of failures which need recovery where an entire system restart is not warranted. For example, if a single port network has an issue which needs a restart to return to normal operation, then only that PN should be reset as opposed to the entire system. These types of restarts are referred to as EPN restarts. While EPN restarts are implemented differently than system restarts, the same classifications exist for them – hot, warm, and cold.

An EPN cold restart is the most extreme PN restart. In an EPN cold restart all circuit packs within the PN are reset, which causes all associated endpoints to reset and all calls to be dropped.

The basic concept of an EPN warm restart is to resynchronize the SPE and the EPN after some event occurs that caused them to become slightly out of sync in terms of resource allocation. Accordingly, a warm restart of a PN preserves all established talk paths that it is supporting and does not cause its associated endpoints to be reset. However, any end-user actions that took place from the time of the fault through the completion of the EPN warm restart are lost. If, for example, an IPSI temporarily loses its connectivity with the SPE for less than 60 seconds, it requires an EPN warm restart to return to normal operation. Any calls in progress within and through that PN would be unaffected (unless, of course, the connectivity that was lost was also being used for the bearer traffic), but any end-user stimuli (e.g. dialing or feature activation) that happened during that time are lost.

A hot EPN restart is used to interchange IPSIs within a PN in a non-faulted environment. IPSI duplication and IPSI interchanges are described in the next section in detail, but it is important to understand that there are no adverse effects on any endpoints or any talk paths through a hot EPN restart.

Section 3 - Reliability – Single Cluster Environments

3.1 What is a Cluster ?

As described in the What is an Avaya Media Server section, some of the available servers can be duplicated. This duplication allows one server to run in active mode and the other server to run in standby mode. The What is a Server Interchange section below describes how the standby server could take over for the active server if needed. If the standby server is in a refreshed state immediately before the failure, it could take over without affecting any of the stable calls. In other words, the standby server shares call state with the active server via memory shadowing and can take over the system in a way that would preserve all stable calls. A cluster is defined to be a set of servers that share call state. In the case of a server which can only be simplex, the server itself is a cluster. In the case of a server which can be duplicated, the cluster is the combination of both of the servers.

3.2 How do single cluster environments achieve high availability ?

The basic structure of the Avaya Communication Manager consists of a decision maker (the SPE), an implementer (the port network gateways), and a communication pathway for the decision maker to inform the implementer what actions to carry out.
If the SPE ceases to operate or fails to communicate with its port networks, then the PBX will be inoperable since the SPE makes all the intelligent decisions for the PNs. For example, if the SPE were powered off, then all of the control links to the arch-angels would go down. If an arch-angel has no communication path to the SPE, then all messages sent to it from angels on other boards in the port network would be dropped. If the angels on the TN boards cannot send messages anywhere to be processed, then endpoint stimuli get ignored. If all endpoint stimuli are dropped, then service is not being supplied to the end user.

Higher system availability is achieved by providing backups to the system's basic structural components. In the case of the SPE, if the servers support duplication, then the standby SPE within a cluster can take over for its active SPE in fault situations. Once a stable active SPE is achieved, a control link is brought up to each port network. These control links can traverse different pathways to compensate for communication failures and can terminate at different locations to compensate for TN board failures. IPSI interchanges and EI control fallback are two methods of achieving this and are described in detail at the end of this section.

3.3 What is a Server Interchange ?

In the What is an Avaya Media Server section above, a number of server types were listed along with some of their characteristics. One of these characteristics was duplication. Understanding the catastrophic nature of an SPE failure, it is important to be able to provide SPE duplication to avoid a single point of failure. If the servers are duplicated, then one server runs in active mode and the other server runs in standby mode. The server in active mode is supporting the SPE which is currently controlling the Communication System. The SPE on the server in standby mode is backing up the active SPE. If there is a failure of the active SPE, then the standby SPE takes over the system.

One feature of duplicated servers is the ability to shadow memory from the active SPE to the standby SPE. One technique for achieving this is to have a dedicated memory board (DAJ or DAL board) in each server and interconnect them with a single-mode fiber. This method of memory shadowing is referred to as hardware memory duplication. Another means of shadowing memory between a pair of servers is to transmit the information over a high-speed, very reliable IP network. This method of memory shadowing is referred to as software memory duplication and is not available for all types of media servers.

For performance reasons and bandwidth considerations, only a percentage of the system's total memory is duplicated between the servers. During steady-state operation, any changes made to the active server's call state memory are transmitted to the standby server. After these transmitted changes are applied to the current state of the standby server's memory, the standby server's memory will be up-to-date with the active server's memory. The key, however, is that transmitted changes are applied to the current state of the standby server's memory and, for that reason, if the current state was not up-to-date with the active server before the changes were made, the transmitted changes have no context and therefore are useless. If a standby server's memory is up-to-date with the active server, the system is refreshed. If the standby server's memory is not up-to-date, the system is in a non-refreshed state.
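To make the refresh rule concrete, the sketch below (illustrative Python with hypothetical names; ACM's actual shadowing implementation is proprietary and not described in this paper) models the point just made: shadowed call-state changes only have meaning when they are applied to a standby memory image that is already up-to-date.

    # Illustrative sketch only: hypothetical names, not Avaya's implementation.
    # Shadowed call-state deltas are useful only when the standby's memory copy
    # is already refreshed (up-to-date with the active SPE).

    class StandbyCallState:
        def __init__(self):
            self.memory = {}          # shadowed call-state records
            self.refreshed = False    # is this copy up-to-date with the active SPE?

        def mark_unrefreshed(self):
            """Called on a shadowing fault or an SPE restart."""
            self.refreshed = False

        def full_refresh(self, active_image):
            """Coordinated refresh: copy the active SPE's call-state image."""
            self.memory = dict(active_image)
            self.refreshed = True

        def apply_delta(self, key, value):
            """Apply a shadowed change; without a refreshed base it has no context."""
            if not self.refreshed:
                return False          # delta discarded until the next full refresh
            self.memory[key] = value
            return True

In this model, any change that arrives while the copy is non-refreshed is simply discarded, which is why a coordinated refreshing process is required before the standby can resume tracking the active SPE.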
A standby server becomes non-refreshed if the communication between the servers has a fault or if either SPE goes through a restart. If the system is non-refreshed, the active and standby servers go through a coordinated refreshing process to get the standby server up-to-date and prepared to accept transmitted changes.

If the standby SPE is required to take over the system, it needs to transition to become the active SPE. The process of a standby SPE becoming active is done through an SPE restart. As described earlier, there are various types of restarts, each having different end-user effects. The ideal restart level for an SPE interchange is a hot restart, whereby the interchange would be completely non-service affecting. However, this type of restart requires that the standby server have 100% of the memory state of the active server before the failure. Unfortunately, the system does not achieve 100% memory duplication, which implies that a hot restart interchange is not supported between SPEs running on media servers. However, enough memory is shadowed to allow a warm restart during the interchange process. This implies that a standby SPE can take over for an active SPE through a system warm restart if the standby server's memory is current (or refreshed). However, if the standby server is not refreshed, the standby SPE can take over for an active SPE, but only via a cold restart. In other words, if a standby server is refreshed, then it can take over for an active server without affecting any stable calls in the system. If the standby server is not refreshed, it can take over the system, but the interchange process will cause all calls in the system to be dropped.

[Figure 5 – SPE Interchange]

3.4 What is Server Arbitration ?

The decision to determine which server should be active and which server should be standby within a cluster is the job of the Arbiter. An Arbiter is local to each server within a cluster and instructs its co-resident SPE which mode to run in based on a number of state-of-health factors. Arbitration heart-beats are messages passed between the two peer Arbiters via the IP duplication link and the control networks (for redundancy); these messages carry the state-of-health vector. The state-of-health vector is primarily composed of hardware state-of-health, software state-of-health, and control network state-of-health. The hardware state-of-health is an evaluation of the media server's hardware (e.g. fan failures, major UPS warnings, server voltage spikes, etc.) done by the Global Maintenance Manager (GMM) and reported to the Arbiter. The software state-of-health is an evaluation of the SPE software processes (e.g. process violations, process sanity failures, etc.) done by the Software Watchdog (WD) as well as the Process Manager and reported to the Arbiter. The control network state-of-health is an evaluation of how many IP connections to IPSI-connected port networks the media server has compared to how many IP connections it expects to have. This information is determined by the Packet Control Driver (PCD) and reported to the Arbiter.
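As an illustration only, the state-of-health vector just described can be pictured as a small record with one entry per reporting subsystem; the field names and the simple ordering below are assumptions of this sketch, not Avaya's actual data structures or comparison algorithm.

    # Illustrative only: hypothetical fields, and a simplistic comparison; the
    # real Arbiter also applies anti-thrashing and tie-breaking logic.

    from dataclasses import dataclass

    @dataclass
    class StateOfHealth:
        hardware_faults: int        # reported by the Global Maintenance Manager (GMM)
        software_faults: int        # reported by the Software Watchdog / Process Manager
        ipsi_links_up: int          # reported by the Packet Control Driver (PCD)
        ipsi_links_expected: int

        def score(self):
            """Higher is healthier: fewer faults, more of the expected IPSI links up."""
            return (-self.hardware_faults,
                    -self.software_faults,
                    self.ipsi_links_up - self.ipsi_links_expected)

    # The server whose Arbiter sees the better score would be the candidate to
    # run the active SPE, subject to the rules summarized below.
    local = StateOfHealth(hardware_faults=0, software_faults=0, ipsi_links_up=4, ipsi_links_expected=4)
    peer = StateOfHealth(hardware_faults=0, software_faults=1, ipsi_links_up=3, ipsi_links_expected=4)
    local_is_healthier = local.score() > peer.score()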
While the decision tree that an Arbiter goes through has many complex caveats (including anti-thrashing measures, comparison algorithms for state-of-health vectors, tie-breaker procedures, etc.), it can be summarized by three basic rules:

1) If an Arbiter and its peer determine that both SPEs in a cluster are in active mode, one Arbiter in the pair, the one which has been active the longest, will instruct its associated SPE to go into standby mode.

2) If an Arbiter cannot communicate with its peer, it will instruct its associated SPE to go into active mode.

3) If an Arbiter has a better state-of-health vector than its peer, it will instruct its associated SPE to go into active mode (if it is not already) and ensure that the peer SPE is no longer running in active mode.

3.5 What is an IPSI Interchange ?

In order for the SPE to be able to service a port network, a LAPD control link, an EAL, must be established and functioning. A port network's EAL begins at a PKTINT, which may or may not be present in its own carriers, and terminates at an arch-angel, which must reside within the port network itself. If a failure occurs on a PKTINT which supports the port network, another one must be found to support the EAL if the port network is to remain operable. There are two basic ways to find another supporting PKTINT for a port network: transition to another port network's PKTINT (which is referred to as fallback and is discussed in the next section) or interchange to another PKTINT resident in the port network (which is referred to as an IPSI interchange and is described below).

In the What are Port Networks section above, a number of cabinet types were listed along with some of their characteristics. One of the traits mentioned was the ability to support duplicated IPSIs, which implies the ability to support duplicated PKTINTs since the PKTINT is one of the components of an IPSI. The SPE chooses which IPSI will be active and which IPSI will be standby within a pair. The active PKTINT supports all of the LAPD links (including the EALs and other control links) that are needed to service the port networks that it supports. The standby PKTINT is available to take over support of the LAPD links if needed.

There are two types of IPSI interchanges, each of which has different end-user effects: a planned migration and a spontaneous interchange. A planned IPSI migration will not have any effect on the end users and can occur during periodic maintenance or via an administrator's command. A planned migration occurs in non-fault situations and carefully transitions link support from the active PKTINT to the standby PKTINT in a method whereby no messages are lost. The standby PKTINT is transitioned to an active PKTINT via a hot restart. A spontaneous IPSI interchange will have no effect on stable calls, but will possibly drop end-user actions during the switchover. A spontaneous IPSI interchange is an action that the SPE takes if an active PKTINT failure occurs or a network fault occurs between the SPE and the active IPSI. A spontaneous IPSI interchange will inform the standby PKTINT of all existing links and then transition it to the active PKTINT via a warm restart.

The left side of Figure 6 below shows the control links for port networks #1 and #2. Port network #1 (PN #1) has a resident IPSI and the controlling link for this port network, the EAL, goes from the IPSI's PKTINT to the IPSI's arch-angel.
The other port network does not have its own IPSI and therefore is utilizing another port network's PKTINT for service (the next section explains how this is possible). PN #2's EAL goes from port network #1's IPSI's PKTINT to the EI board residing in its own carrier. When a failure occurs on the active PKTINT, a communication fault in this example, the SPE will transition over to the standby IPSI's PKTINT in PN #1. The right side of Figure 6 shows the new instantiations of the control links after the spontaneous IPSI interchange. Notice that for the non-IPSI controlled port network, PN #2, the EAL's near-end termination point has shifted from the A-side PKTINT to the B-side PKTINT, but the far end remains at the same location, on the EI board. However, both termination points of the EAL for the IPSI controlled port network, PN #1, have shifted. Like the non-IPSI port network's EAL, the near end shifted from the A-side PKTINT to the B-side PKTINT, but unlike the non-IPSI port network, the far end has also shifted from the A-side IPSI to the B-side IPSI. For historical reasons, if both IPSIs and PKTINTs are healthy, SPE maintenance prefers to move back to the A-side IPSI (unless the pair is locked down to the B-side), even though there is no performance effect from being on one side or the other.

[Figure 6 – IPSI Interchange]

Also, the determination of which IPSI will be active and which will be standby is completely independent of which server is in active mode and which is in standby mode. Since there is no longer a tight association between the active server side and the active IPSI side (e.g. the B-side server can be in active mode and the A-side IPSI can be in active mode at the same time), the SPE can independently select which side (A or B) will be active for each IPSI pair (e.g. some IPSI pairs could have the A-side be active while others are concurrently active on the B-side).

3.6 What is EI Control Fallback ?

In the section above which describes how port networks are interconnected, different types of PNCs are discussed. Some of the PNC types not only support bearer traffic between port networks, but also provide a medium for tunneling control from one port network to another. This tunneling ability allows some port networks to be supported by a PKTINT which is located in another port network. In these cases, the port network has an EAL control link which goes from another port network's PKTINT to an arch-angel residing on an EI board within the port network itself. Port networks which are being serviced in this fashion are referred to as EI controlled. Port networks which are being serviced from a co-resident PKTINT are referred to as IPSI controlled.

EI fallback is a recovery method whereby a port network transitions from IPSI controlled to EI controlled due to a fault. If the port network has a simplex IPSI, a fault of that board will cause an EI fallback (provided the PNC type supports it). If the port network has duplex IPSIs, a fault of the active board will cause an IPSI interchange. However, if the newly activated board then has a fault of its own and the paired IPSI board has not returned to an error-free state, the port network will fall back to EI control.

The left side of Figure 7 shows three port networks being controlled before the introduced fault.
Port network #1 and port network #2 have simplex IPSIs and currently have their EALs go from their co-resident PKTINTs to their co-resident arch-angels (both of which are contained within the IPSI). Port network #3 does not have an IPSI and is getting its control indirectly from the SPE through PN #2. In this example, if a fault occurs in PN #2's PKTINT, then the EALs for PN #2 and PN #3 need to transition off of the faulted PKTINT. The near end of port network #3's EAL shifts from the PKTINT in PN #2 to the PKTINT in PN #1, but the far end remains at its EI board. Port network #2 moves from IPSI controlled to EI controlled by having the near end of the EAL shift from PN #2's PKTINT to PN #1's PKTINT and having the far end of the EAL shift from the IPSI's arch-angel to the EI board's arch-angel. In order to shift the port network's EAL from the IPSI to the EI board, a PN warm restart is required.

[Figure 7 – EI Fallback Recovery]

After the fault in the PKTINT is cleared, the SPE attempts to control that port network through its IPSI once again. The automated process of maintenance shifting a port network's EAL from the EI board back to the IPSI is called fall-up. Fall-up, like fallback, requires a port network warm restart to shift the far-end termination point of the EAL.

Section 4 - Reliability – Multiple Cluster Environments

Thus far, a complete failure of the controlling cluster would render the communication system inoperable. As shown in the Reliability - Single Cluster Environments section, there are recovery methods for many types of failures that keep the system operational and port networks functioning. For instance, a failure of an active SPE can be addressed by having the standby SPE, within its cluster, take over for it. This method works on the premise that failures of the active SPE and the standby SPE are independent. If there is a long mean time between failures of SPEs and one SPE is backing up the other one, then statistically there will always be a functional SPE. Therefore, given a functional SPE and a viable control path to the port networks, the system will be operational. However, true independence of SPE server failures within a cluster cannot usually be achieved due to the lack of geographical separation. For example, all the hardware and software reliability numbers calculated for the SPE servers are completely meaningless if the data room supporting them has an incident whereby everything in the room is destroyed.

There is a direct correlation between the availability of a port network and the number of different pathways that can be used by the SPE to control the PN. While there are recoveries which try different pathways for controlling purposes, there are situations when all of them fail. For example, if a port network is in a building and all communication pathways into that building are non-functional, then the system's controlling cluster is unable to support that isolated port network. In this case, the port network ceases to provide service to its endpoints and is useless until a control link is re-established. The ESS offering addresses these types of cataclysmic failures. This section covers the previous offerings which protect the Communication System from catastrophic main cluster failures and port network isolation using multiple clusters.
All of these offerings, including Survivable Remote Processors, ATM WAN Spare Processors, and Manual Back-Up Servers, are being replaced with ESS.

4.1 How long does it take to be operational ?

The first question always asked in reference to survivability is "How quickly does the PBX provide service to end users after a fault has occurred?" Before answering this question in terms of an exact time, which is done in each of the survivable options sections below, the question "What are the critical factors in determining a recovery timeline?" is addressed. A recovery timeline for any problem can be broken down into three logical segments – fault detection time, recovery option evaluation time, and recovery action time.

The fault detection time is the period of time it takes to determine that something is not operating normally within the system. Some of the survivability options have automated fault detection. For example, the ATM WAN Spare Processors are continually handshaking keep-alive messages with the main server complex. If an ATM WSP does not receive a keep-alive response message in an appropriate amount of time, then it determines there is a fault. Other survivability options, such as Manual Back-up Servers, require manual detection that a fault has occurred. For automated detections, the time frame for fault detection is usually on the small order of seconds; however, no such time frame can be easily characterized for manual detections.

After a fault is detected, a number of issues need to be resolved before taking any recovery actions. The first issue that needs to be addressed is determining whether the fault detection is a false positive. For example, if two entities are continually handshaking over an IP network, they could detect something is awry if an abnormal round-trip delay is detected. However, if this is a one-time occurrence (which is completely normal for IP networks), then the "fault" detected is not really a fault and nothing should be done.

The next issue to deal with is weighing recovery speed against the effect on end users. If a port network and its controlling server have a communication fault, then no service is supplied to end users supported by that PN. As previously discussed, if the fault is remedied within 60 seconds, then the server and the port network will continue operation without dropping any active calls. If a Survivable Remote Processor takes over a port network, it will cold restart that PN and, therefore, drop all calls supported by that PN. The question becomes: if a real fault is detected between the server and the port network, should the PN be immediately taken over by the Survivable Remote Processor, dropping all calls in the process, or should the main SPE be given the chance to re-establish connectivity with the PN, thereby possibly preserving all calls if accomplished fast enough? The general rule of thumb is that if the recovery process will not disrupt service any further, then the recovery option evaluation time can be relatively short, as it is with the recovery methods shown in the Reliability - Single Cluster Environments section. If the recovery process does not preserve calls, as is the case with ESS and the offerings below, then the recovery option evaluation time should be long enough to give the non-call-affecting recoveries every chance.

The last issue that needs to be addressed is the trade-off between recovery time and running as a fragmented system.
Running in a fragmented mode has some drawbacks, discussed at the end of the Reliability – Multiple Cluster Environments section. If a fault prevents the server from communicating with a PN for an extended period of time, then that PN goes out of service. All the recovery options at this point involve resetting the port network and tearing down all existing calls, so fragmentation becomes the only issue. If it were known that the fault would be fixed within two minutes, for example, it may be wise to prevent the PN from being fragmented and accept having no service during that period of time, because if the PN fragments it will be operating as its own island away from the rest of the PBX and may require another PN restart to agglomerate it back with the rest of the system. Unfortunately, there is no crystal ball that can be queried to determine the outage time in advance. At this point, the recovery option evaluation time becomes a no-service time since the PN is out of service longer than it needs to be in order to prevent fragmentation.

The last part of the recovery timeline is the recovery action time portion. After the fault is detected and the decision has been made to take a particular recovery, that recovery takes place. The period of time that the recovery action takes is dependent on what is being done. For example, if a port network goes into fallback because of an IPSI fault, only a single EPN warm restart is done, which is very fast. However, when a Manual Back-Up Server takes over a system, it requires a complete system reboot (a cold restart) of the entire SPE, which takes much longer.

4.2 What are Survivable Remote Processors (SRP) ?

A communication system configured to use a center-stage switch PNC has its port networks interface with the CSS through EI boards. Ideally, a point-to-point fiber connects the PN's EI board directly to a Switch Node Interface (SNI) board (TN571) in the CSS. However, direct fiber runs have distance limitations and geographically dispersed port networks may be desired. Therefore, a port network can utilize the PSTN to interface its EI board with its peer SNI. This type of EPN, called a DS1C remoted port network, has its EI board connected to a dedicated DS1 board which communicates with another DS1 board residing in the CSS. This second DS1 board is connected to the EI board's peer SNI. This DS1C remoted PN communicates with the rest of the system through this connection – both for bearer traffic and for tunneled control.

If the main processing complex ceases to operate or the connectivity between the sites has a fault, then the DS1C remoted port network has the potential to become a very expensive paperweight. An alternate controlling complex, called an SRP, can be used to control a DS1C remoted port network in these types of situations. The maintenance board in the remoted EPN continuously monitors the control connection link through the PSTN. If a fault is detected, it switches the fiber so that the EI board connects to the SRP instead of the DS1 board. The port network's arch-angel is still resident on the remoted EPN's EI board, but the other terminating side of the EAL is now on the SRP as opposed to the IPSI at the main controller's site. On the left side of the figure below, the DS1C remoted EPN is being controlled from the main site during normal operation. The right side of the figure shows a fragmentation fault and the failover of the DS1C remoted EPN to a local SRP.
[Figure 8 – SRP Control of DS1C Remoted EPN]

As discussed previously, the SRP will take over the DS1C remoted port network only after all other recovery methods have been exhausted (server interchanges, IPSI interchanges, PNC interchanges, etc.). There are two main reasons for ordering the recoveries this way – call preservation and port network isolation.

Before introducing the possibility of multiple controlling clusters, all control links to port networks came from the main active SPE. If another control pathway is needed, then the control link is transferred to a new path. If the logical EAL link is transferred onto another physical medium, the port network and the SPE become slightly out of sync. A port network warm reset is done to re-sync the EPN and the SPE. This resynchronization consists of the SPE verifying its view of resource allocation against the actual usage in the EPN. For example, if the SPE believes a call is using some resources on the TDM bus within the port network and an audit of those resources shows that they are still in use, then the SPE assumes that the call associated with those resources is still active and continues to allow it to proceed normally. However, if an audit shows that the resources are no longer in use, the SPE draws the conclusion that the call has been terminated (e.g. a user went on hook) and updates its resource usage maps.

The What is a Server Interchange section described how a standby SPE can take over for an active SPE without dropping any stable calls since call state is shadowed between the SPEs within the main cluster. Unfortunately, there is no call state shadowing between SPEs which are in different clusters (the main SPEs are in the main cluster and the SPE being used as an SRP is in another cluster). When an SRP assumes control of an EPN, it needs to synchronize with that EPN. However, since the SRP has no call state information, it has nothing against which to audit the EPN. Therefore, the only way to synchronize the SRP SPE and the EPN is via a cold restart, which tears down all existing calls.

Once an SRP assumes control of a DS1C remoted port network, that EPN becomes an isolated entity. In other words, an SRP controlling an EPN becomes a stand-alone PBX separated from the rest of the main system. The effects of this isolation can be summarized in the following call flow. Assume that user A is supported by some PN at the main site and that user B is supported by the DS1C remoted port network. In a non-faulted environment, if user A dials user B's number, the SPE will attempt to route the call. The SPE knows the status of user B, on hook in this example, since it controls the DS1C remoted EPN and therefore can successfully route the call to the end user through the port networks involved and the PNC. However, if the remoted EPN is currently isolated under the control of an SRP, the call flow changes dramatically. When user A dials user B's number, the SPE determines that it cannot route the call. The SPE unfortunately does not know the status of user B since the supporting EPN is not within its control and therefore routes the call to an appropriate coverage path.
In other words, when a user is not under the control of the SPE, the SPE assumes that the user's phone is out of service even if the phone is being supported by another controlling entity such as the SRP. In addition, the isolated system consisting of the SRP and the EPN no longer has access to centralized resources at the main location. For example, if the interface to the voicemail system is located on a different PN at the main site, then the isolated fragment will not have access to it. This will cause intercept tone to be played to any end user who is supposed to be routed to voicemail coverage. Also, any calls that are generated or received by the SRP-controlled EPN will not be part of the overall system's call accounting. This implies that a call made by a user off an EPN being controlled by an SRP will not appear in the main system's Call Detail Records (CDR).

The SPE on the SRP is always operational and is continuously attempting to bring a control link up to the remoted EPN. Under normal operation, there is no pathway from the SRP to the EI, where the control link terminates, since the maintenance board has chosen to have the fiber connection from the EI go to the supporting DS1 board. Determining the recovery time for this port network therefore breaks down into the following time periods – maintenance board detection of a control link failure from the main SPE, the maintenance board's no-service time waiting for a possible quick network recovery, and the SRP bringing up an EAL after the maintenance board switches over the fiber and cold restarting the port network. The maintenance board can detect an anomaly in the control link very quickly (within 2-3 seconds), but will wait in a no-service state for approximately two minutes. Once the fiber switchover takes place, the SRP will create an EAL within 180 seconds, since it is a periodic attempt and the pathway now exists. The EPN cold restart will complete within approximately 1-2 minutes. For these reasons, the complete recovery time for a DS1C remoted EPN from a catastrophic fault, either due to a network fault or a main server crash, is approximately 7-10 minutes.

As shown in Figure 8, the SRP controls an EPN via a point-to-point fiber which terminates at the EPN's EI board. Since there is no CSS PNC involved with this connection, the SRP can only control one port network. Therefore, it can be said that an SRP provides survivability for a port network, but not survivability for an entire system. In fact, since the SRP can only control one port network, it does not need translation information for the rest of the system. This has the advantage that SRPs can have distinct translations from the main system, which gives the administrator the ability to have the EPN operate differently in survivability mode if desired (e.g. completely different route patterns or call vectors). However, since the SRP uses distinct translations, it has the disadvantage of needing double administration in some cases. In order to add a user to the main system, the translations on the main server need to be edited. Since there is no auto-synchronization of translations between the main server and the SRP, the same translation edits have to be done manually on the SRP.

Once the connectivity between the main server and the DS1C remoted EPN has been restored, it is up to the system administrator to manually put the system back together. The main server has no knowledge that an SRP has taken over the remoted port network.
Therefore, it is continuously attempting to bring that EPN back into service by instantiating an EAL to it for control. During the network fragmentation, it is obvious that every EAL creation attempt will fail (no pathway available). After the network fragmentation has been repaired, the EAL creation attempts will still fail since the fiber connectivity to the EI has been physically shifted. When the system administrator determines it is time to agglomerate the system, they will issue a command on the SRP forcing the maintenance board to swing the EI board's fiber back from the SRP to the DS1 board. The next periodic EAL creation attempt will now succeed since there is a viable pathway. It is important to keep in mind that when the main server resumes control over the DS1C remoted port network, all current calls on that EPN will be dropped since the main server has no information about them. A cold restart of the EPN is required to bring it back online with the rest of the system.

4.3 What are ATM WAN Spare Processors (ATM WSP) ?

A communication system configured to use an ATM PNC has its port networks interface with an ATM backbone switch through ATM EI boards (TN2305/6). ATM WSP is an offering for DEFINITY G3r which provides alternate sources of control for EPNs that are unable to communicate with the main SPE. This communication failure can occur for two reasons – an ATM network fragmentation or a complete main server complex malfunction. In the first case, an ATM network fragmentation, the control links from the main server have no viable pathway to get to the port network. It is important to note, however, that if the ATM PNC is duplicated with the critical reliability offer, then both ATM networks need to fragment in order for PNs to become isolated. For example, if the A-side ATM network is fragmented, the main SPE would transfer the controlling EAL to traverse the B-side ATM network. The second case, a complete main server complex failure, also prevents EPNs from receiving service since the controlling entity is unable to perform its task. If the SPEs are duplicated with the high or critical reliability offer, then both SPEs need to have complete failures in order to leave the PN out of service. For example, if the A-side SPE has incurred a failure, then the B-side SPE would take over the system through a server interchange.

ATM WAN Spare Processors are added strategically throughout a system to provide alternate sources of control if needed. They consist of an SPE and an interface into the ATM network. All of the ATM WSPs in a system are priority ranked and are continuously heart-beating with each other. The figure below shows an ATM PNC PBX leveraging ATM WSP for higher system availability.

[Figure 9 – G3r with ATM WAN Spare Processors]

If an ATM WSP loses heart-beats with the main server for 15 minutes, it will then assume control of all EPNs it can reach, provided it is the highest-ranked spare processor on its fragment. Figure 10, shown below, depicts a catastrophic main server failure and the shift of control for all EPNs to ATM WSP #1. The other ATM WSPs in the system did not take over any port networks since there is a higher-ranking spare processor with which they can communicate. In addition, since all the port networks fail over to the same SPE, the system is providing 100% equivalent service, as it did before.
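The takeover rule just described can be restated as a short sketch (illustrative Python; the function name and the rank encoding, where a lower number means a higher priority rank, are assumptions of this sketch, not the actual spare-processor logic).

    # Illustrative recap of the ATM WSP takeover rule: assume control of the
    # reachable EPNs only after 15 minutes without main-SPE heart-beats, and
    # only if no higher-ranked spare processor is reachable on the fragment.

    HEARTBEAT_SILENCE_THRESHOLD_MIN = 15

    def wsp_should_take_over(my_rank, silence_minutes, reachable_wsp_ranks):
        """Return True if this spare processor should take over its fragment.

        Ranks use a hypothetical encoding where a lower number means a higher
        priority rank."""
        if silence_minutes < HEARTBEAT_SILENCE_THRESHOLD_MIN:
            return False
        return all(my_rank < other_rank for other_rank in reachable_wsp_ranks)

    # This mirrors the fragmentation example discussed with Figure 11 below:
    # a spare processor that can still reach a higher-ranked spare on its
    # fragment stays passive, while the highest-ranked one takes over.
    print(wsp_should_take_over(my_rank=2, silence_minutes=20, reachable_wsp_ranks=[3]))  # True
    print(wsp_should_take_over(my_rank=3, silence_minutes=20, reachable_wsp_ranks=[2]))  # False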
Figure 10 – G3r Failure with ATM WSP Takeover

Figure 11 shows an ATM network fragmentation and the failover of EPNs #3 and #4 to ATM WSP #2. ATM WSP #1 cannot take over those EPNs since no viable pathway exists to communicate with them, and ATM WSP #3 does not take over those EPNs since it can communicate with ATM WSP #2, which is ranked higher. While all port networks are receiving service from SPEs that have equivalent power to the main SPE, the system is running in a handicapped fashion. As discussed in the What are Survivable Remote Processors section above, some centralized resources may not be accessible to everyone and call flows between the fragments are disturbed. Each ATM WSP will provide its EPNs with service equivalent to what the main SPE provided, but there is no guarantee that the ATM WSPs will control all system resources.

Figure 11 – ATM Network Fragmentation with ATM WSP Takeover

ATM WSPs are not designed to protect the system against all types of failures, especially since an ATM WSP does not attempt to assume control of any port networks until it has lacked communication with the main SPE for 15 minutes. If the communication between the main server and an EPN is severed for less than 60 seconds, the main SPE should continue to control the EPN once the connectivity is restored. If the ATM WSP took over the EPN too quickly, then an unnecessary fragmentation and EPN cold restart would occur. ATM WAN Spare Processors are designed to protect the system against catastrophic failures only. They are not designed to shield against temporary outages or software glitches. With that said, upon a catastrophic failure, an EPN is taken over by an ATM WSP within 15 minutes provided connectivity exists, and then it goes through a cold restart which takes approximately 1-2 minutes. Therefore, an EPN which has its control transferred from a main server to an ATM WSP due to a fault will be operational within approximately 17 minutes.

At all times the main server is attempting to control all EPNs, regardless of whether the EPNs are being controlled by a spare processor. The control link that the main server is trying to instantiate will be blocked by the ATM EI board. The ATM EI board only allows one EAL link at a time to be terminated to it, and when an ATM WSP takes over a port network, an EAL is created. Therefore, in order to agglomerate the system back under main server control, the ATM WSPs need to be reset. If an ATM WSP is reset, then all EALs which it is using to control port networks are dropped. Once the EAL is dropped, the main server's cyclical attempt to terminate an EAL to that EPN will succeed and allow the main server to resume control.

4.4 What are Manual Backup Servers (MBS) ?

Communication Manager is the next generation of the DEFINITY PBX whereby the SPE is no longer resident within the system, but rather controls all EPNs either directly or indirectly over IP. Manual Backup Servers (MBS) is an offering that provides an alternate source of control to the PNs if a catastrophic event prevents the primary cluster from providing service, by leveraging this IP control paradigm. MBS was created as an interim offer until ESS became generally available.
The figure below shows how an MBS cluster can be positioned into a system to provide an alternate control source which is independent of the main cluster. This independence is crucial to increasing system availability because if the MBS is not independent, then the event which rendered the main servers useless may also affect the MBS servers. While the example in the figure below shows a CSS PNC, the MBS offering works in conjunction with any of the PNC options available (CSS PNC, ATM PNC, Direct Connect, or IP PNC).

Figure 12 – CM with MBS Servers (panels: Normal Operation; MBS Compensating for Main SPE Failure)

Upon a failure scenario which prevents the main servers from controlling the port networks, the end users of the Communication System will be out-of-service. There are many types of failures, described in the Reliability - Single Cluster Environments section, which can be resolved by the main servers without any intervention. However, if the failure is such that the main servers are not going to be operational for an extended period of time and all other recovery methods fail, then the MBS servers can be manually activated to take over the system. Under normal situations, the MBS servers are in a dormant state with translation sets synchronized to them on a demand basis. The activation of an MBS cluster is the process of taking the dormant SPE into an active state, or, in other words, doing a complete system reboot without any call state (no memory shadowing exists between the main cluster and the MBS cluster). This implies that when an MBS takes over control of the system, all EPNs go through a cold restart and any calls supported by these port networks are dropped.

While the concept of the MBS servers was the basis for the ESS offering, there are many pitfalls with it, most notably a non-deterministic recovery time. While the actual reboot time of the system (the method the MBS servers use to recover the system) can be assumed to be on the small order of minutes (see Table 4 in the What are SPE Restarts section), the other parts of the overall recovery time, detection time and no service time, are based on non-automated procedures and therefore have a tremendous amount of variance and no upper bound due to the human factor. There are some situations where the communication system is being tightly monitored at all times and, in this case, the complete system recovery time can hover around 10 minutes. However, if the system is not being closely monitored, it could take tens of minutes just to detect that there is a problem, followed by a time-consuming scramble to manually activate the MBS. Another pitfall is that the MBS servers are not continually proving their viability since MBS servers remain in a completely dormant state during normal operation. Ideally, the MBS would always be running diagnostics on its hardware and resident software and checking its connectivity to all of the port networks' IPSIs. Without this information, there is no way of knowing ahead of time if the MBS servers will be able to provide service to the port networks if needed, and no way of alarming the issues so they can be addressed before the MBS is ever activated. Furthermore, the MBS offering does not scale well and, in fact, there can only be one MBS for a main server.
This restriction limits the complexity of disaster recovery plans and the number of different catastrophic scenarios which can be protected against.

MBS servers also have operational pitfalls. The key to the MBS servers successfully taking over the system is that the main cluster is completely eliminated from the equation. The MBS servers are exact duplicates, from a software point of view, of the main servers and are unaware of the main servers' presence in the system. Therefore, an MBS cluster cannot operate in conjunction with the main servers for any length of time. For example, if a situation arose whereby the MBS servers were activated while the main servers were still operational, then the port networks would consistently thrash back and forth between getting control from the main cluster and the MBS cluster (this would render the port networks inoperable). Furthermore, suppose the MBS is activated at an appropriate time while the main cluster is inactive, but at a later time the main cluster is fixed and comes back on-line. This would cause the same thrashing problem.

This leads to the largest downfall of the MBS offering – it does not, although it appears to, provide protection against network fragmentation. If the network fragments as shown in the diagram below, the port networks unable to communicate with the main server will be out-of-service (fallback is unavailable since there is no CSS or ATM PNC in this example). If the MBS is activated, it will assume control of the port networks it can reach. During the time that the network is fragmented, both the main and MBS servers are attempting to control all port networks in the system but are blocked from doing so due to a lack of connectivity. However, this implies that the system is only stable while the fragmentation continues to exist, because when the network is healed, both clusters will be able to connect to all port networks and therefore fight over control of all of them. It cannot be stated strongly enough that MBS servers should never be used to protect against network fragmentation faults. The ESS offering provides all the advantages of MBS without any of the drawbacks.

Figure 13 – MBS Attempting to Assist in Network Fragmentation Fault (stable only while the network is fragmented; the fragmentation is preventing contention)

Section 5 - ESS Overview

5.1 What are Enterprise Survivable Servers (ESS) ?

ESS is an offering which simultaneously protects Avaya Communication Systems against both catastrophic main server failures and network fragmentations. This is achieved by having ESS servers placed strategically within an enterprise as an alternate source of control for port network gateways when control from the primary source is disrupted for an extended period of time. Unlike the MBS offering, the control transfer between the primary source and the alternate source is automatic in the ESS offering, and multiple ESS servers may be operational concurrently.

ESS servers consist of media servers executing the SPE software. While the SPE software running on ESS servers is identical to the SPE software running on the main server, it is licensed differently, which drastically affects its operational behavior. Some of these behavioral variations include different alarming rules, the capability to register with a main server, and the ability to receive translation synchronization updates.
Currently, ESS-licensed software can only execute on a subset of the media servers available – S87XX and S8500 Media Servers.

5.2 What is ESS designed to protect against ?

Upon a main cluster failure, port networks lose their control source. If this failure is catastrophic in nature, whereby the main servers will not be operational again for an extended period of time, port networks can search out and find ESS servers that will provide them service. Therefore, ESS protects port networks against extended failures of the main controlling cluster. For most scenarios, the desired failover behavior caused by a catastrophic main server failure is to have all surviving port networks find service from the same ESS server in order to keep the system in a non-fragmented state. The figure below shows a catastrophic main cluster failure with all of the port networks failing over to a single ESS cluster.

Figure 14 – ESS Protecting Against Catastrophic Server Failure

Upon a connectivity failure to the main cluster, port networks lose their control source. If this fragmentation continues to exist over an extended period of time, port networks can search out and find ESS servers on their side of the fragmentation that will provide them service. Therefore, ESS protects port networks against significantly long fragmentations away from the main controlling cluster. For most scenarios, the desired failover behavior caused by a network fragmentation is to have all port networks which can still communicate with each other agglomerate together, forming a single stand-alone system. The figure below shows a network fragmentation with all of the port networks on the right side of the fragment grouping together to form a single autonomous system.

Figure 15 – ESS Protecting Against Network Fragmentation

It cannot be emphasized enough that ESS increases the availability of a communication system when faced with catastrophic server failures or extended network outages. ESS builds upon all of the existing fault recovery mechanisms which are already integrated into the system. For example, when a network outage is detected, the PBX attempts to re-establish PN control by interchanging IPSIs (if the port network has duplicated IPSIs), falling back control through another PN (if the port network is leveraging CSS or ATM PNC), and reconnecting to the PN and warm restarting it (if the outage persists for less than 60 seconds) before ESS servers ever get involved. ESS is the last line of defense to provide service to port networks while in these severely handicapped situations.

5.3 What is ESS NOT designed to protect against ?

The ESS offering is designed to protect systems against major faults which exist over a prolonged period of time. ESS builds upon all of the existing fault recovery mechanisms which are already integrated into the communication system. This section will present a number of common faults which can occur and explain how they are resolved without ESS. Attempting to have ESS address these types of faults would actually decrease overall system reliability since race conditions between ESS and existing recoveries would be introduced.
Fault Example: Main Active Server Hardware Failure
Resolution: Server Interchange
Description: When the active server encounters a hardware fault, either the local Arbiter will be informed of the degraded hardware state of health (SOH) or the standby server's Arbiter will lose contact with its peer. Regardless of the detection mechanism, the decision will be made for the standby server to take over control of the system from the currently active server. In addition, resolving this problem by interchanging servers will preserve all calls within the system. ESS will not get involved with the recovery unless the main server is a simplex server or the standby server is in an inoperable state.

Fault Example: Extended Control Network A Outage
Resolution: IPSI Interchange or Control Fall-back (depending on system configuration)
Description: When the control network A outage is detected, the system will attempt to transition its PN control link from the active IPSI, which is connected through control network A, to the standby IPSI, which is connected through control network B, provided the port network has duplicated IPSIs. If the IPSI interchange fails or the standby IPSI does not exist, the system attempts to transition its PN control link from an IPSI local in the PN to an IPSI which it can communicate with in another PN, provided the PNC supports control link tunneling (CSS or ATM PNC). If the problem is resolved by either of these methods, all calls within affected port networks will be preserved. Only if both IPSIs experience extended network faults simultaneously (control network A and B) and fall-back control is not supported through the PNC (IP), would the system need ESS to take over control for that port network.

Fault Example: Short Control Network Outage
Resolution: EPN Warm Restart
Description: When a control network outage occurs and no other recovery mechanism can work around the problem (discussed in the fault example above), the controlling server continuously attempts to establish a new connection to the port networks. If the network outage is less than 60 seconds, the SPE will reconnect to the PN's IPSI and resynchronize resource allocation with the port network. This resynchronization process will preserve all of the calls (both call state and bearer connection) currently being supported by the PN. Only if the outage persists over an extended period of time would the ESS take over control of the port network. It is important to ensure that the call preserving recovery has a chance to complete before allowing ESS servers to assume control.

Fault Example: Software Fault
Resolution: System Restart or Server Interchange
Description: If a software fault occurs on the main active server, the CM application will attempt to initiate the system restart it deems necessary to resolve the software fault and, at the same time, inform the local Arbiter. If the system restart fails to resolve the issue, the Arbiter may choose to interchange servers or to initiate another system restart at a higher level. Depending on the restart level chosen to resolve the issue, calls may (warm restart) or may not (cold restart) be preserved. ESS would enter the recovery solution only if the software faults are severe enough to completely freeze both main servers, preventing them from communicating with IPSIs and issuing self reboot actions.

Fault Example: Complete IPSI Failure
Resolution: IPSI Interchange
Description: An IPSI is made up of a number of components including a PKTINT, an arch-angel, and a tone clock.
If a complete IPSI failure occurs causing everything on it to cease operation, then the system will attempt to transition the control link to the standby IPSI. However, since the tone clock for the PN has also failed, the IPSI interchange will not preserve calls. If the port network did not have duplicated IPSIs, then it would become inoperable. The IPSI is in charge of seeking out alternate ESS clusters when needed, and if the IPSI is completely down, then ESS clusters will not be requested to take over. Also, even if an ESS took over the IPSI, the tone clock is a critical component of a PN and without it the port network cannot operate.

Fault Example: Complete Port Network Power Outage
Resolution: None
Description: Without power to the PN, it cannot provide any services to its endpoints. Until power is restored, the resources provided to the system through the powerless port network are out of service. In situations like this, ESS does not help. In addition, since no calls could exist within a powered-off cabinet, no calls are preserved when the power is restored to the cabinet.

One of the greatest assets of an Avaya Communication System is the ability it has to resolve faults that occur. When the DEFINITY G3r system evolved into Avaya Communication Manager, additional recovery mechanisms were invented, but they work as add-ons to, not replacements for, the already existing ones. ESS provides a new layer of system availability without compromising the tried-and-true mechanisms that are already built into the system. If ESS is used to protect against the faults mentioned above, then unnecessary resets will occur along with the possibility of unwanted system fragmentation.

Section 6 - How does ESS Work ?

Except for some basic initial configuration of the ESS servers, all system administration is done on the main cluster. Once an ESS server informs the main server that it is present (via registration), the main server synchronizes the entire system translation set to it (via file synch). This synchronization process continues periodically through maintenance and/or on a demand basis to ensure that the ESS clusters have updated translations. With these full system translations, the ESS servers have the information required to connect to all IPSIs, such as IP addresses, ports, QOS parameters, and encryption algorithms. Once the ESS servers connect and authenticate with the IPSIs, they advertise administered factors which the IPSIs use to calculate a priority score for the ESS cluster. The IPSI is always maintaining an ordered failover preference list based on these priority scores. This priority list is dynamic based on network and server conditions and only contains currently connected viable servers. Upon an IPSI losing connectivity to its controlling cluster, it starts a no service timer. If a re-connection does not occur before the timer expires, the IPSI will request service from the highest-ranking viable ESS cluster in its preference list. This algorithm prevents knee-jerk and non-fault control shifts, but the IPSI service requesting process can be overridden by manual or scheduled agglomeration procedures. The rest of this section examines in detail each of the operational steps of the ESS offering.

6.1 What is ESS Registration ?

When an ESS is installed into a system, it is configured with an IP address (local IP interface), a remote registration IP address (CLAN gatekeeper), and a unique server ID (SVID).
In addition, a license file will be loaded which provides the ESS cluster with a unique module ID (otherwise known as a cluster ID or CLID) and a system ID (SID). Concurrently, the ESS is added to the system translations on the main server by creating a record with all of this information (IP address, CLID, SVID, and SID) along with associated factors which will be described later.

Upon an initial start-up or reset, an ESS will send a registration packet (an RRQ) containing its configured data to the remote registration IP address (port 1719). The first time an ESS attempts to register with the main server, it only tries the originally configured registration IP address. However, after the initial translation synchronization (discussed in the next section), the ESS first tries to register at the configured registration IP address but, upon a failure, tries all possible registration points, regardless of network region, administered within the system. Upon receiving the registration request, the main server attempts to validate the ESS server by ensuring that the IP address, CLID, and SVID match the administered data for that ESS and that the SID provided by the ESS matches its own system ID. If it authenticates, the main server will respond to the ESS server with a confirmation (an RCF) and continue handshaking with it. The handshaking takes the form of heartbeats (KA-RRQs, a slightly modified version of RRQs) and heartbeat responses (KA-RCFs, a slightly modified version of RCFs). Keep-alive handshaking is performed by every ESS server on a periodic basis of once per minute. Since the keep-alive packets are very small, occur only once per minute, and there are relatively few ESS clusters within a system (maximum of 63), the bandwidth required for this is negligible (approximately 2.5 Kbps per ESS cluster). Figure 16 below shows the registration link between the ESS clusters and the main cluster. If the ESS cluster consists of duplicated servers (S87XX Media Servers), then only the active server within the cluster will register.

Figure 16 – ESS Registration Pathway

Currently, as the figure shows, CLANs are utilized as the registration entry point to the main servers. It is important to note that CLANs in a port network controlled by an ESS cannot serve as a registration access point for another ESS cluster within the system (ESS servers do not register with each other).

6.2 How are ESS Translations Updated ?

Other than for alarming reasons and providing a medium for issuing some remote commands (discussed later), the main purpose of ESS registration is to allow the main server to know the translation status of each ESS. Administration changes made on the main server are not updated in real time to the ESS servers. Rather, the translation changes are synched to the ESS server either on an automated periodic basis once per day or on demand with a manually entered command. Since the ESS offering is designed to only protect against catastrophic failures, taking over port networks with slightly out-of-date translations is completely valid. For example, if an end user has added a new feature button to his/her phone and a failure occurs before periodic maintenance has updated the ESS server, then the phone operates as if no administration actions took place (e.g. no new feature button) when it is taken over by the ESS server.
Whenever synchronization between the main server and the ESS server is requested, the main server checks to see if it is necessary based on the current translation state of the ESS (which was provided in its registration heartbeats). If the ESS server has translations that are identical to the main server's, then the actual synchronization process is skipped. However, if the ESS translations are not up to date with the main server, the main server synchronizes the ESS server either by supplying the complete system translation file or by providing the changes since the last synchronization (incremental file synch). Incremental file synchs require the ESS cluster to be at most one translation synchronization behind since the changes need to be applied to identical images. This implies that full translation files need to be synched whenever initial startup occurs or if the ESS server has been un-registered for an extended period of time. Obviously the preferred method is to only transmit the changes, since the amount of data needed to be sent is relatively small (very dependent on the number of changes, but usually in the 100s of kilobyte range) as opposed to transmitting the entire system translation file, which is relatively large (very dependent on the size of the system, but usually ranging from 2 to 30 megabytes, and it can be compressed to approximately a 5:1 ratio). The bandwidth required for this process is dictated by the rule that the synch process must be completed within 5 minutes. Therefore, the bandwidth required in the worst cases (the largest systems synching full translation sets which are compressed) is approximately 200 Kbps over that 5 minute period.

ESS servers are operational at all times, proving their viability by running the CM software, monitoring system state-of-health, and handshaking with the IPSIs. After translation synchronization completes, the ESS server must stop running on the old translation set and begin executing on the new set. In order to achieve this, the ESS is required to perform a reboot (as described previously in the What are SPE Restarts section). Therefore, the entire synch process, under normal conditions, can be summarized by the following steps:

1. The main server creates a compressed file consisting of the admin changes since the last synch.
2. This file is transferred from the main server to the ESS server.
3. The ESS server applies the changes given to it onto its existing translation file.
4. The ESS initiates a self reboot in order to begin running on the new translations.

However, based on the current state of the system, some of these steps may be altered. The first step may require an entire system file transfer as opposed to just the changes, which causes the third step to swap out the files rather than applying changes. Rebooting an ESS server which is not currently controlling any port network gateways does not affect end-users whatsoever, except that the ESS is not available as an alternative source of control if needed during the reboot cycle (approximately 2-3 minutes). On the other hand, if the ESS is actively controlling port networks, then a reboot of an ESS server would cause an outage to all of the supported end-users. Therefore, the fourth step of the process will be delayed until the ESS server is no longer controlling any end-users or gateways. That being said, administration changes made on the main server will not be made available to an ESS which is actively controlling resources even if translation synchronization occurs.
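The decision flow just described can be summarized in a short sketch. This is illustrative only, not Avaya code; the function name, the use of integer translation "generations" to stand in for the translation state reported in the registration heartbeats, and the returned action strings are all assumptions made for the example.

    def plan_translation_synch(main_gen: int, ess_gen: int, ess_controlling: bool) -> list:
        """Return the synch actions for one ESS cluster, per the rules above.

        main_gen / ess_gen stand in for the translation state the ESS reports in
        its registration heartbeats; ess_controlling is True if the ESS is
        currently controlling port networks.
        """
        if ess_gen == main_gen:
            return ["skip: ESS translations already identical to the main server's"]

        if ess_gen == main_gen - 1:
            # At most one synch behind: only the changes are sent (100s of KB).
            actions = ["transfer incremental change file"]
        else:
            # Initial start-up or extended un-registration: full compressed set
            # (roughly 2-30 MB, compressible to approximately a 5:1 ratio).
            actions = ["transfer full compressed translation file"]

        # Step 4: reboot onto the new translations, deferred while controlling.
        if ess_controlling:
            actions.append("defer reboot until the ESS no longer controls any PNs")
        else:
            actions.append("self-reboot (unavailable as an alternate for ~2-3 minutes)")
        return actions

    # Example: an idle ESS that is exactly one synchronization behind
    print(plan_translation_synch(main_gen=42, ess_gen=41, ess_controlling=False))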
Another consequence of the fourth step is that an excessive number of demand synchronization requests will cause an ESS to go through many resets. While the resets themselves do not affect system operation or performance, the more time ESS servers spend resetting, the less time they are available as an alternate control source. Therefore, demand synchronizations, via the "save translations ess" command, should only be performed after completing critical administration which is absolutely needed in catastrophic failure modes and cannot wait for the automated synch process.

The license file installed on the ESS server gives the CM software running on it the ability to receive file synchronizations from the main server. However, making administration changes locally on the ESS is not prohibited, although there are three important caveats to take into account. The first caveat is that any administration changes made on the ESS are lost whenever translation synchronization takes place, followed by the subsequent reboot. Therefore, administration changes can be made to an ESS server while it is not in control of any resources and have them be effective if the ESS needs to take over any port networks, but all the changes are lost when periodic or demand translation synchronization takes place. The second caveat is that translation changes made on an ESS cannot be re-synchronized back to the main server. Hence, administration changes made locally to an ESS server which is controlling end-points will not be reflected in the overall system translations and will only be effective until the main server resumes control of the end-points again. The last caveat is that translation changes made locally to an ESS cannot be saved to disk. Since an ESS server is just as powerful as a main server and has identical RTUs, a customer could conceivably purchase an ESS server and illegally make it into its own stand-alone system at a substantial discount. To prevent this from happening, two major blockades are introduced: ESS servers enter "no license" mode whenever they control resources, which gives the server a limited operational lifespan, and the ability to save translations to disk on ESS servers is eliminated, which puts all local administration changes at risk if a system reboot occurs (which is what happens if the "no license" mode timer expires).

The figure below shows the network interconnectivity required to have ESS servers get synchronizations from the main server. It is important to note that the secure transfer of the translation file or translation changes file takes a drastically different path than the registration link. File synchs are done directly between media servers (utilizing port 21874) without going through CLANs. Also, the diagram shows two separate, distinct networks, the control LAN and the corporate LAN, but they can actually be implemented as one and the same.

Figure 17 – ESS Registration Pathway (showing both the registration pathway and the translation synchronization pathway)

Moreover, the diagram is leveraging a simplex S8500 Media Server as the main cluster and a simplex S8500 Media Server as the ESS cluster. If either of the clusters consisted of duplex S87XX Media Servers, then the translation synchronization would only happen between the active media servers. In a duplex media server cluster, it is the responsibility of the active SPE to update the standby SPE translations.
Unfortunately, there is a negative implication of synchronizing translations on a periodic basis as opposed to in real time – some phone state information will be lost. There is a class of features that end-users can execute which change the state of their phone and the associated phone's translation information. Two primary examples of these types of features are send-all-calls and EC500 activation. If an end-user changes the state of these features (either enabling or disabling) and a failure occurs causing an ESS to take over before the translation changes have been sent to the ESS server, the feature states will revert to their previous state. For example, imagine a scenario whereby a user hits their send-all-calls feature button in the morning and then goes on vacation. If a catastrophic failure occurs that afternoon before a translation synch occurs and the end-user is transferred to an ESS, then the send-all-calls feature will be deactivated (its previous state). Any calls that come to that user while being controlled by the ESS are not immediately sent to coverage.

6.3 How do Media Servers and IPSIs Communicate ?

Once an ESS server receives a translation set from the main server, it has all the information needed in order to connect to all the IPSIs within a system. In other words, once the ESS servers have each IPSI's IP address & listen port and each IPSI's encryption algorithm settings & QOS parameters, they will initiate a TCP socket connection to all IPSIs (utilizing port 5010). If the connection is unsuccessful, the servers will periodically attempt a new connection every 30 seconds; otherwise the authentication process commences. The authentication process consists of the validation of key messages based on an encryption method commonly known between the servers and the IPSIs. Once the server and IPSI can communicate securely over this established TCP link, the ESS server uniquely identifies itself (via cluster ID and server ID). Following that, the IPSI will request priority factors (discussed in the next section) from the connecting servers in order to calculate a priority score (discussed in the next section) for that server, which will be used to create the IPSI's self-generated priority list (discussed below). At a high level, with the details to follow, the IPSI receives pre-programmed factors from each ESS server that it will use to calculate a score for that ESS cluster. The higher the score an IPSI assigns to an ESS cluster, the higher the ESS cluster resides in the failover preference list. The following flow chart shows the initial message sequence that the server and the IPSI go through in order to establish a session: Open TCP Socket, TCP Socket Up Confirmation, Authentication & Encryption Setup, Authentication & Encryption Completion, ESS Indication Message, Factors Request Message, Factors Message, and finally a Notification Update Message (the notification update message gets sent to all media servers currently connected to the IPSI).

Figure 18 – Message Flow upon Server-IPSI Connection

After the socket has been brought up between the IPSI and a non-controlling ESS server, there are three tasks that occur. First, the ESS servers need to consistently monitor the health of the IP connection, and this is achieved by doing an application-level keep-alive once per second.
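A minimal sketch of this connect-and-monitor behavior, from a non-controlling server's point of view, is shown below. The port number (5010), the 30-second retry interval, and the once-per-second keep-alive come from the text; everything else (the function names and the injected callbacks standing in for the authenticated message exchange of Figure 18) is a hypothetical placeholder.

    import socket
    import time

    IPSI_PORT = 5010           # server-to-IPSI TCP control connection
    RETRY_INTERVAL = 30        # seconds between connection attempts after a failure
    KEEPALIVE_INTERVAL = 1     # application-level keep-alive, once per second

    def maintain_ipsi_session(ipsi_ip, send_keepalive, run_session_setup):
        """Keep one server-to-IPSI session alive, reconnecting as needed."""
        while True:
            try:
                sock = socket.create_connection((ipsi_ip, IPSI_PORT), timeout=5)
            except OSError:
                time.sleep(RETRY_INTERVAL)     # connection failed: retry in 30 s
                continue
            try:
                # Stand-in for authentication/encryption setup, the ESS
                # indication (CLID and SVID), and the factors exchange.
                run_session_setup(sock)
                while True:
                    send_keepalive(sock)       # ~8-byte heartbeat to the IPSI
                    time.sleep(KEEPALIVE_INTERVAL)
            except OSError:
                sock.close()                   # socket failure: back to the retry loop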
These heartbeat messages can easily be seen by monitoring traffic on the control network and have an average size of 10 bytes (8 bytes for server-to-IPSI heartbeats and 12 bytes for IPSI-to-server response heartbeats). This implies, since these messages are sent as minimally sized TCP packets, that there is a static bandwidth of approximately 1 Kbps required between the servers and IPSIs for these operations. Secondly, the IPSI is responsible for keeping every connected server up to date with respect to its own self-generated priority preference list. This is done by sending priority list notification messages whenever a change occurs (new server connecting, existing server disconnecting, or change in controlling cluster). An ESS server gets the current priority list upon its initial connection to the IPSI and then gets any subsequent updates. For example, if an ESS server disconnects from an IPSI for any reason, the IPSI sends updated priority lists, with that ESS server removed, to all servers which are still connected. If the disconnected ESS reconnects to the IPSI, the same procedure is followed whereby the reconnecting ESS is inserted appropriately back into the priority list and all connected servers receive this update. The notification message size varies based on the number of ESS clusters and the ESS cluster types (simplex or duplex), from a minimum of 8 bytes to a maximum of 68 bytes. Since priority preference lists do not change on a periodic basis, only on network failures, ESS server failures, or certain ESS maintenance operations, and the notifications are relatively small, virtually no static bandwidth is required for this operation. The third responsibility is to ensure there is always a socket up between servers and IPSIs. This implies that upon a connectivity failure, the server attempts to restore a TCP connection. If there is a socket failure, the server will periodically try to bring up a new socket every 30 seconds. As Figure 18 above shows, there are initial handshakes that occur when a socket is established. This exchange comprises approximately 512 bytes. However, since socket connections only go down during failure scenarios, which should not occur with any frequency, there is no static bandwidth required for this operation.

To summarize, the bandwidth required between ESS servers and IPSIs during normal operation (main server in control of the entire system) is only 1 Kbps. During failure scenarios there is a slight spike in traffic between servers and IPSIs. It is important to realize, however, that the 1 Kbps bandwidth requirement is for non-controlling servers only. The bandwidth required between a controlling server and an IPSI is approximately 64 Kbps per 1000 busy hour call completions supported by that IPSI. When designing a system, enough bandwidth must be available between ESS servers and port networks in case the ESS is required to take over port network control. In other words, even though only 1 Kbps is needed during normal operation between ESS servers and IPSIs, the network bandwidth requirements must be established assuming the ESS will be controlling port networks.
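The sizing rules just quoted can be captured in a small planning helper. This rough sketch uses only the figures from the text (about 1 Kbps per non-controlling session, and about 64 Kbps per 1000 busy hour call completions for a controlling server); the function name and interface are illustrative.

    def ipsi_control_bandwidth_kbps(busy_hour_call_completions, controlling):
        """Approximate control bandwidth between one server and one IPSI."""
        if not controlling:
            return 1.0                                        # keep-alives only
        return 64.0 * (busy_hour_call_completions / 1000.0)   # controlling server

    # Design rule from the text: size the links as if the ESS were controlling.
    print(ipsi_control_bandwidth_kbps(5000, controlling=False))   # 1.0 Kbps today
    print(ipsi_control_bandwidth_kbps(5000, controlling=True))    # 320.0 Kbps worst case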
6.4 What is a Priority Score ?

Any disaster recovery plan worth its salt provides survivability for critical components in a deterministic manner. ESS addresses this by having the critical components (the port network gateways) generate, via their resident IPSIs, a ranked ordering of alternate control sources (ESS clusters). The IPSI ranks connecting ESS servers using a priority score calculated from factors provided by the ESS clusters.

The most basic way to have a system react to catastrophic faults is to have all IPSIs share the same failover preference list. To achieve this, each ESS cluster can be assigned a base score value, between 1 and 100, which it will advertise to the IPSIs upon a connection. Suppose there are six port network gateways within a system distributed across three separate locations as shown in the figure below.

Figure 19 – Basic System Layout

If the disaster recovery plan states that all the port networks should attempt to get service from ESS #20, then from ESS #30, and then finally from ESS #40 in failure situations, the ESS clusters can be given the following base scores:

ESS CLID    Base Score
20          75
30          50
40          25
Table 5 – ESS Cluster Admin

The IPSIs will rank the ESS clusters based on their base scores (the higher the better) and in this example each IPSI will have identical priority lists as follows:

Port Network #:     PN #1   PN #2   PN #3   PN #4   PN #5   PN #6
1st Alternative:    #20     #20     #20     #20     #20     #20
2nd Alternative:    #30     #30     #30     #30     #30     #30
3rd Alternative:    #40     #40     #40     #40     #40     #40
Table 6 – IPSI Priority Lists

However, the network design may not allow this to be feasible. For example, suppose the WAN links interconnecting the locations are limited and attempts should be made to avoid sending control traffic over them unless absolutely necessary. This requirement can be realized if the IPSIs generate the following lists:

Port Network #:     PN #1   PN #2   PN #3   PN #4   PN #5   PN #6
1st Alternative:    #20     #20     #20     #30     #30     #40
2nd Alternative:    #30     #30     #30     #20     #20     #20
3rd Alternative:    #40     #40     #40     #40     #40     #30
Table 7 – IPSI Priority Lists

To achieve this, ESS introduces two new concepts – communities and local preference. Every ESS server in the system is assigned to a community (with a default of 1) and every IPSI is assigned to a community (with a default of 1). By definition, if an ESS server is assigned to the same community as an IPSI, then the ESS server and IPSI are in the same community, or local to each other. The local preference attribute can be assigned to an ESS server, which can boost the priority score calculated for it by an IPSI if it is local to it. The priority score (PS) calculation is written generically as:

PS = Σ (w_i × v_i), summed over i = 1 to N
Calculation 1 – Generic Priority Score Calculation

Index (i)   Factor Type               Factor Weight (w)   Factor Value (v)
1           base score                1                   Administered base score (1-100)
2           local preference boost    250                 0 – if not within same community or boost is not enabled
                                                          1 – if in the same community and boost is enabled
Table 8 – Factor Definitions

The 250 weight for the local preference boost guarantees that a local ESS with this attribute enabled will rank higher, with respect to that IPSI, than the ESS with the highest base score. If each ESS cluster in this example has this local preference boost enabled, as shown in Table 9 below, then the IPSIs will all generate preference lists as shown in Table 7 above.
ESS CLID    Base Score    Locally Preferred    Community
20          75            Yes                  1
30          50            Yes                  2
40          25            Yes                  3
Table 9 – ESS Cluster Admin

With this administration, it is important to realize that upon a catastrophic main server fault three autonomous systems will be created. All of the IPSIs in community #1 will fail over to ESS #20 which is in community #1, all of the IPSIs in community #2 will fail over to ESS #30 which is in community #2, and the IPSI in community #3 will fail over to ESS #40. If an additional disaster recovery plan requirement is added stating that the system should attempt to remain as one system as long as possible, then ESS #10 can be added to the system layout.

Figure 20 – System Layout with System Preferred ESS

However, even if ESS #10 is given a base score of 100, the IPSIs' priority lists will unfortunately be as follows:

Port Network #:     PN #1   PN #2   PN #3   PN #4   PN #5   PN #6
1st Alternative:    #20     #20     #20     #30     #30     #40
2nd Alternative:    #10     #10     #10     #10     #10     #10
3rd Alternative:    #30     #30     #30     #20     #20     #20
4th Alternative:    #40     #40     #40     #40     #40     #30
Table 10 – Undesired IPSI Priority Lists

Even though the desired IPSI priority lists should be:

Port Network #:     PN #1   PN #2   PN #3   PN #4   PN #5   PN #6
1st Alternative:    #10     #10     #10     #10     #10     #10
2nd Alternative:    #20     #20     #20     #30     #30     #40
3rd Alternative:    #30     #30     #30     #20     #20     #20
4th Alternative:    #40     #40     #40     #40     #40     #30
Table 11 – Desired IPSI Priority Lists

The desired IPSI priority lists cause the IPSIs to attempt to transfer over to ESS #10 in the case of a main cluster failure and then fail over to local service if ESS #10 is unable to provide adequate control. To achieve this, ESS #10 must be given the system preferred attribute. This attribute boosts the priority score calculated for a system preferred ESS above any locally preferred ESS clusters. This factor fits into the factor definitions as:

Index (i)   Factor Type               Factor Weight (w)   Factor Value (v)
1           base score                1                   Administered base score (1-100)
2           local preference boost    250                 0 – if not within same community or boost is not enabled
                                                          1 – if in the same community and boost is enabled
3           system preference boost   500                 0 – if boost is not enabled
                                                          1 – if boost is enabled
Table 12 – Factor Definitions

The 500 weight obviously allows system preferred ESS clusters to rank higher than all other clusters (even one that has a 100 base score and is locally preferred, since 350 < 500). It is important to understand that the only need for assigning the system preferred attribute is to compensate for local preference attribute usage. In other words, if no ESS servers are given the locally preferred setting, then there is no need to use the system preferred attribute.

For simplicity, the IPSI was designed to treat all servers, main and ESS, consistently when calculating preference scores. This leads to the last factor that needs to be discussed – the main server attribute. Since priority preference lists include the main cluster, and it should always be ranked the highest (under normal conditions, the IPSIs should be receiving their control from the main cluster), a main cluster boost is added to the factor definitions:
Index (i)   Factor Type               Factor Weight (w)   Factor Value (v)
1           base score                1                   Administered base score (1-100)
2           local preference boost    250                 0 – if not within same community or boost is not enabled
                                                          1 – if in the same community and boost is enabled
3           system preference boost   500                 0 – if boost is not enabled
                                                          1 – if boost is enabled
4           main cluster boost        1000                0 – if an ESS cluster
                                                          1 – if the main cluster
Table 13 – Factor Definitions

Therefore, the disaster requirements are achieved by the following administration (the main cluster is always assigned a CLID of 1):

ESS CLID    Base Score    Main    System Preferred    Locally Preferred    Community
1           x             Yes     x                   x                    1
10          100           No      Yes                 x                    1
20          75            No      No                  Yes                  1
30          50            No      No                  Yes                  2
40          25            No      No                  Yes                  3
Table 14 – ESS Cluster Admin

Which results in the following lists generated independently by each IPSI:

Port Network #:     PN #1   PN #2   PN #3   PN #4   PN #5   PN #6
1st Alternative:    #1      #1      #1      #1      #1      #1
2nd Alternative:    #10     #10     #10     #10     #10     #10
3rd Alternative:    #20     #20     #20     #30     #30     #40
4th Alternative:    #30     #30     #30     #20     #20     #20
5th Alternative:    #40     #40     #40     #40     #40     #30
Table 15 – IPSI Priority Lists

To throw a final wrench into the picture, suppose the WAN link between locations 2 and 3 in Figure 20 is extremely limited and there is only enough bandwidth for one control link to traverse it. This implies ESS #40 cannot be an alternate control source for any port network gateway except for port network #6. This requirement manifests itself in the following preference lists:

Port Network #:     PN #1   PN #2   PN #3   PN #4   PN #5   PN #6
1st Alternative:    #1      #1      #1      #1      #1      #1
2nd Alternative:    #10     #10     #10     #10     #10     #10
3rd Alternative:    #20     #20     #20     #30     #30     #40
4th Alternative:    #30     #30     #30     #20     #20     #20
5th Alternative:    x       x       x       x       x       #30
Table 16 – IPSI Priority Lists

ESS #40 must be administered to only offer its services to IPSIs within the same community or, stated differently, to provide service locally only, which is realized by giving ESS #40 the local only attribute. This attribute is unlike the other factors in that there is not a boost in score; rather, if this attribute is enabled for an ESS, that ESS server does not even attempt to contact IPSIs which are not within its own community and thereby never gets onto non-local IPSI priority lists. The administration to achieve the priority lists in the table above is:

ESS CLID    Base Score    Main    System Preferred    Locally Preferred    Community    Local Only
1           x             Yes     x                   x                    1            x
10          100           No      Yes                 x                    1            No
20          75            No      No                  Yes                  1            No
30          50            No      No                  Yes                  2            No
40          25            No      No                  Yes                  3            Yes
Table 17 – ESS Cluster Admin

It is the combination of these basic factors which gives the ESS offering the ability to achieve very complicated, deterministic failovers for IPSIs while remaining easily administered. While every IPSI independently generates its own priority list, some commonalities will exist among all of them. The IPSI's priority list can be broken down generically as follows:

Main Server
Common Among All IPSIs
Possibly Different Among All IPSIs
Figure 21 – Generic IPSI Priority Lists

The break line between "common among all IPSIs" and "possibly different among all IPSIs" is based on the number of system preferred ESS clusters administered and the number of ESS clusters which have the local preference attribute enabled. If the local preference attribute is not utilized, then the last section of the priority lists, "possibly different among all IPSIs," would always be empty.
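Putting the four factors together, the score calculation and the resulting per-IPSI ranking can be sketched as follows. Only the weights (1, 250, 500, and 1000), the local-only behavior, and the administered values from Table 17 come from the text; the data structures, function names, and the treatment of the main cluster's unused base score as zero are assumptions made for illustration.

    from dataclasses import dataclass

    # Factor weights from Table 13.
    W_BASE, W_LOCAL, W_SYSTEM, W_MAIN = 1, 250, 500, 1000

    @dataclass
    class Cluster:
        clid: int
        base_score: int                 # administered 1-100 (treated as 0 for the main cluster)
        community: int = 1
        locally_preferred: bool = False
        system_preferred: bool = False
        is_main: bool = False
        local_only: bool = False

    def priority_score(c, ipsi_community):
        """PS = sum of w_i * v_i, calculated independently by each IPSI."""
        local = c.locally_preferred and c.community == ipsi_community
        return (W_BASE * c.base_score
                + W_LOCAL * int(local)
                + W_SYSTEM * int(c.system_preferred)
                + W_MAIN * int(c.is_main))

    def failover_list(clusters, ipsi_community):
        """Ranked CLIDs for one IPSI. A local-only ESS never contacts IPSIs
        outside its own community, so it never appears on their lists."""
        reachable = [c for c in clusters
                     if not (c.local_only and c.community != ipsi_community)]
        ranked = sorted(reachable,
                        key=lambda c: priority_score(c, ipsi_community),
                        reverse=True)
        return [c.clid for c in ranked]

    # The administration of Table 17, seen by an IPSI in community 3 (PN #6):
    clusters = [
        Cluster(1, 0, 1, is_main=True),
        Cluster(10, 100, 1, system_preferred=True),
        Cluster(20, 75, 1, locally_preferred=True),
        Cluster(30, 50, 2, locally_preferred=True),
        Cluster(40, 25, 3, locally_preferred=True, local_only=True),
    ]
    print(failover_list(clusters, ipsi_community=3))    # -> [1, 10, 40, 20, 30]

Run for an IPSI in community 1 instead, the same administration reproduces the PN #1 column of Table 16, with ESS #40 absent since it never connects outside community 3.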
6.5 How do IPSIs Manage Priority Lists ?

The previous section discussed how IPSIs generate their priority failover lists in non-faulted environments (every ESS can connect to every IPSI). Whenever a new cluster connects to an IPSI, a priority score is calculated for that cluster based on the advertised factors. The IPSI then dynamically inserts the cluster into its priority list based on the priority score and informs all currently connected ESS servers about the update. Whenever an IPSI cannot communicate with a media server, due to either a server fault or network fragmentation, that cluster is dynamically removed from the priority list and all ESS servers still connected will receive updates.

Figure 22 – IPSI Interfacing with Multiple ESS Clusters

Using the diagram above, in a non-faulted environment, the IPSI's priority failover list will be:

                    PN #1
1st Alternative:    #1
2nd Alternative:    #10
3rd Alternative:    #20
4th Alternative:    #30
Table 18 – IPSI Priority Lists

If a communication failure is detected (discussed in the next section) between the IPSI and, for this example, ESS #20, then the IPSI would remove ESS #20 from its priority list.

                    PN #1
1st Alternative:    #1
2nd Alternative:    #10
3rd Alternative:    #30
4th Alternative:    none
Table 19 – IPSI Priority Lists

Once the communication between the IPSI and the ESS is restored, the IPSI will insert it back into the list appropriately.

                    PN #1
1st Alternative:    #1
2nd Alternative:    #10
3rd Alternative:    #20
4th Alternative:    #30
Table 20 – IPSI Priority Lists

It should be obvious that these concepts are easily extended to the addition/removal and enabling/disabling of ESS servers. For example, if an ESS server is added to the system (ESS #40 with a base score of 60), it will be inserted into the IPSI's list appropriately.

                    PN #1
1st Alternative:    #1
2nd Alternative:    #10
3rd Alternative:    #40
4th Alternative:    #20
5th Alternative:    #30
Table 21 – IPSI Priority Lists

Since the IPSIs do not keep any type of historical records of previously connected ESS clusters, the system survivability plans can be altered at any time (e.g. move ESS clusters around strategically) or implemented over time (e.g. no flash cuts are required).

While this list manipulation is very simple, complexity arises when port networks have duplicated IPSIs. It is absolutely critical that a pair of IPSIs in the same PN share identical priority lists. This is achieved by communication between the IPSI pairs over the carrier's backplane. In the stable state shown in Figure 23, the IPSIs will each have priority lists of:

                    A-side IPSI    B-side IPSI
1st Alternative:    #1             #1
2nd Alternative:    #10            #10
3rd Alternative:    #20            #20
Table 22 – IPSI Priority Lists

Figure 23 – Port Network with Duplicated IPSI and Multiple ESS Clusters

If a failure occurs which causes ESS #10 to be unable to keep a session up with the B-side IPSI, the following steps take place:

1. Socket failure is detected by the B-side IPSI.
2. The B-side IPSI informs the A-side IPSI and checks if ESS #10 is still connected to the A-side IPSI (which it is).
3. The B-side IPSI keeps ESS #10 in its list because the A-side IPSI is still connected to ESS #10.

                    A-side IPSI    B-side IPSI
1st Alternative:    #1             #1
2nd Alternative:    #10            #10
3rd Alternative:    #20            #20
Table 23 – IPSI Priority Lists

If the A-side IPSI then loses its connection to ESS #10, then the following procedural steps take place:

1. Socket failure is detected by the A-side IPSI.
2. The A-side IPSI informs the B-side IPSI and checks if ESS #10 is connected to the B-side IPSI (which it is not).
3. The A-side IPSI removes ESS #10 from its priority list because it is not connected to itself or its peer.
4. The B-side IPSI removes ESS #10 from its priority list for the same reasons.

                    A-side IPSI    B-side IPSI
1st Alternative:    #1             #1
2nd Alternative:    #20            #20
3rd Alternative:    none           none
Table 24 – IPSI Priority Lists

The conclusion of this should be that an ESS server does not get eliminated from either IPSI's list in a PN unless it cannot communicate with either of the IPSIs. This concept is further extended to cover the case where the ESS cluster has duplicated servers. As long as at least one server from an ESS cluster can communicate with at least one IPSI in a port network, it will remain on the priority lists of both IPSIs in the PN.

The final caveat surrounding the management of the IPSI's priority lists deals with the maximum size of these lists. If there were no cost associated with an ESS being on an IPSI's list, then it would make sense to have maximum list sizes of 64 clusters (a maximum of 63 ESS clusters along with a main cluster). However, there are a number of costs, albeit small, of having an ESS in an IPSI's priority list. The first cost is the number of heartbeat messages that are continuously being exchanged between the ESS cluster and the IPSI. While each message is small and only occurs once per second, in a mesh connectivity configuration (every ESS server connected to every IPSI) unnecessary traffic is introduced onto the network. Secondly, the IPSI needs to spend precious resource cycles and memory managing the ESS clusters on its list. And finally, since every connected server gets notified whenever any change occurs, a very large number of ESS clusters on a list increases the probability of changes occurring and increases the number of notification messages that are sent out when these changes happen. Therefore, the maximum list size of an IPSI for the ESS offering is 8 clusters. In other words, the IPSIs will maintain a priority list, under normal conditions, which contains the main cluster and the 7 best (based on calculated priority scores) ESS alternatives.

This leads to the question of what happens to all the other ESS clusters in the system with respect to that IPSI. The answer is that if an ESS's calculated priority score does not rank it high enough to be within the top 8, then it will be rejected. If an ESS is rejected by an IPSI, it will disconnect from it and start a 15-minute timer. Once the timer expires, the ESS will re-attempt to get onto the IPSI's priority list.

Figure 24 – IPSI Interfacing with More Than 8 ESS Clusters

The figure above shows a system with 9 ESS servers. While there is no correlation between CLID and priority ranking (except for the main cluster, which is always CLID #1), this example will assume administration which causes lower CLIDs to have higher priority scores. This gives the IPSI the following priority list:

                    PN #1
1st Alternative:    #1
2nd Alternative:    #10
3rd Alternative:    #20
4th Alternative:    #30
5th Alternative:    #40
6th Alternative:    #50
7th Alternative:    #60
8th Alternative:    #70
Table 25 – IPSI Priority Lists

Notice that ESS #80 and ESS #90 are not on the IPSI's list even though network connectivity exists; they are in a rejected state with respect to this IPSI.
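The capped-list behavior in this example can be sketched as a simple insertion routine. Only the maximum of 8 entries and the 15-minute reject timer come from the text; the data layout and function name are illustrative.

    MAX_LIST_SIZE = 8   # the main cluster plus, at most, the 7 best ESS alternatives

    def offer_cluster(priority_list, score, clid):
        """Handle a newly connecting cluster on one IPSI.

        priority_list holds (score, clid) pairs, highest score first. Returns
        the CLID the IPSI rejects (the newcomer or a displaced entry), or None.
        A rejected ESS disconnects and retries after its 15-minute reject timer.
        """
        priority_list.append((score, clid))
        priority_list.sort(reverse=True)        # highest priority score first
        if len(priority_list) <= MAX_LIST_SIZE:
            return None                         # accepted; lower entries shift down
        _, rejected = priority_list.pop()       # lowest-ranked alternative falls off
        return rejected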
After the reject timer (15 minutes) expires, ESS #80 and #90 will attempt to re-connect to the IPSI only to reach the same result – they get rejected again. However, suppose that some network fragmentation fault takes place which prevents ESS #30, #40, and #50 from communicating with the IPSI. This will leave the IPSI's list looking as follows:

                    PN #1
1st Alternative:    #1
2nd Alternative:    #10
3rd Alternative:    #20
4th Alternative:    #60
5th Alternative:    #70
6th Alternative:    none
7th Alternative:    none
8th Alternative:    none
Table 26 – IPSI Priority Lists

If the network connectivity is not restored within 15 minutes, the reject timers of ESS #80 and #90 will expire and they will attempt to reconnect to the IPSI. Unlike the previous time, however, the IPSI will insert them on the list as the 6th and 7th alternatives respectively.

                    PN #1
1st Alternative:    #1
2nd Alternative:    #10
3rd Alternative:    #20
4th Alternative:    #60
5th Alternative:    #70
6th Alternative:    #80
7th Alternative:    #90
8th Alternative:    none
Table 27 – IPSI Priority Lists

When the network faults are resolved, the isolated ESS clusters will reconnect with the IPSI and the IPSI will insert them onto its list. When the first ESS returns, it will be inserted as the 4th alternative and all the lower ranked alternatives will be shifted down. However, when the next ESS server reconnects, the IPSI will already have a full list. This is resolved by having the IPSI reject the lowest ranked alternative, in this case ESS #90, and then insert the returning ESS cluster appropriately. This procedure will continue until the priority list goes back to how it was originally.

6.6 How are Communication Faults Detected ?

After a media server connects to an IPSI and is inserted onto the priority list, heartbeat handshaking commences. These heartbeats are generated by the media server once per second and sent to the IPSI. Upon reception of a media server's heartbeat, the IPSI will generate a heartbeat response and send it back to the media server. An application keep-alive is required because quick communication failure detection is needed. Therefore, a communication failure can be detected if there is an explicit closure of the TCP socket or if a period of time goes by without receiving a heartbeat.

From the server's perspective, a heartbeat response is expected from the IPSI in response to the periodically generated heartbeats. Under normal conditions, this response is received within a few milliseconds of sending the heartbeat message. This time delay is a function of the network latency and IPSI processing time (which is extremely minimal). However, the media server only requires that a response be received before it needs to send out the next heartbeat message. If the response is not received within one second (the time interval between heartbeats), a sanity fault is encountered. If three consecutive sanity failures occur and no other data has been transferred from the IPSI to the server within that time, the server closes the socket, signifying a communication failure, and socket recovery (as described previously) begins. If there are connectivity problems which caused this communication failure, then the IPSI will probably not receive the TCP socket closure. In this case, the IPSI will have to detect the connectivity problem on its own.

The IPSI's perspective is very similar to the server's perspective. The IPSI expects to receive a heartbeat from the server once per second and if it does not get one, a sanity failure occurs.
6.7 Under What Conditions do IPSIs Request Service from ESS Clusters ?
Up to this point in the discussion, how the IPSIs create and manage their priority lists has been covered, but not what the lists are used for. The discussion now shifts gears to examine how IPSIs use these ranked ESS lists to request service when needed. In order to eliminate any ESS contention over controlling an IPSI, the power of selecting an IPSI's master rests with the IPSI itself. The IPSIs follow the basic rule that if no service is being supplied, they request an ESS cluster to take control. In the Background section, it was shown that there are two different ways that a port network can be controlled – either directly through a co-resident IPSI or indirectly through an EI board when leveraging an ATM or a CSS PNC. When a port network is being controlled, an arch-angel has been designated which is the master of all other angels within the PN. By definition, if an angel has an arch-angel as a master, the angel is referred to as being scanned. Using this definition, an IPSI is said to be getting service if it has a master cluster or is being scanned. There are a few implications to this. First and foremost, if an IPSI has a media server which has declared itself the IPSI's master and remains connected, it will not request service from another ESS cluster. This allows the current master server to attempt all other recoveries, as described in the Reliability - Single Cluster Environments section, without having the port network fail over to another cluster. This is critical, as failing over to an ESS server should only occur after all other recovery methods have been exhausted. For example, the recovery plan to address a failure may consist of IPSI interchanges or server interchanges, and if the IPSI requested service from another cluster it would prevent these operations. The second implication is that if the port network is being controlled indirectly through another PN, then the IPSI should not attempt to find its port network another master. One of the recoveries previously discussed is to have a port network go into fallback mode upon a connectivity failure if possible. In Figure 25 below, the IPSI in port network #2 cannot communicate with the main cluster and therefore shifts the ESS to the first alternative. However, the main cluster gained control of the port network through the PNC. Even though the IPSI in PN #2 does not have a master, it should not request service since the port network is up and functioning.

Figure 25 – Fallback Recovery is Preferred to ESS Takeover

Another topic covered in the Reliability – Single Cluster Environments section is the recovery methods used if an IPSI and server become disconnected.
It is stated that if the connectivity is restored within 60 seconds, then the server shall regain control of the port network via a warm restart (which does not affect stable calls). Therefore, an IPSI gives its previous master some time, referred to as the no service time, to re-connect after a connectivity failure before requesting service from another ESS cluster. If an IPSI loses its connection to its master cluster, then it starts a no service timer. If the cluster reconnects within the no service timer, the IPSI remains under its control. If the no service timer expires and the IPSI is not getting scanned, then the IPSI requests service from the highest ranked ESS cluster in its list. Since IPSIs all act independently with respect to requesting service, there are times that could cause a single system to fragment into smaller autonomous systems. In Figure 26 below, the IPSI in port network #3 requests service from the ESS which it can communicate with after the no service timer expires. If the network then heals itself, the main cluster reconnects to PN #3's IPSI. In this case, the IPSI will place the main cluster as the 1st alternative and shift the ESS, which is currently controlling the IPSI, to the 2nd alternative. It is very important to realize that a reconnection of a higher ranked ESS does not imply that the IPSI will shift control over to it. The IPSI in PN #3 will remain under the control of the ESS until another fault occurs or control is overridden (discussed in the next section). In this situation, every port network provides service to all of its end users, but does not get all the advantages of being one system, such as availability of centralized resources, simple inter-port network dialing, and efficient resource sharing. Therefore, there is a trade-off between waiting for a previous controlling cluster to return (during which time no service is being provided to the port network) and requesting service from another ESS cluster (which causes system fragmentation). If an outage is only going to last a few minutes, it may be worthwhile to be out-of-service during that time, but remain as one large system when the outage is rectified. Since no crystal balls exist which can be queried when an outage occurs to determine how long the outage will be, the IPSI relies on the no service timer. This no service timer is configurable by the customer with a range of 3 to 15 minutes. The minimum of three minutes was derived from waiting at least one minute to allow pre-ESS recoveries and then two additional minutes to avoid fragmentation.

Figure 26 – Autonomous Systems Created Upon ESS Takeover

The final piece of the requesting service discussion involves the duplication of IPSIs within a port network. The previous section covered how the priority lists between the peer IPSIs are guaranteed to be identical, but the service requests must also be synchronized. The IPSI is a conglomeration of a number of different components, one of which is the tone clock. When two IPSIs are within the same PN, one of the IPSI tone clocks drives the port network and the other IPSI's tone clock is in a standby mode. The IPSI which has the active tone clock, otherwise known as the selected IPSI, owns the responsibility of the no service timer and of deciding when the request for service will be sent.
Upon a connectivity failure from the master cluster to both IPSIs in the PN, the selected IPSI starts the no service timer. When the no service timer expires, the selected IPSI sends out a request for service and also instructs its peer IPSI to do the same via the backplane communication channel. It is important to note, however, that if the master cluster only loses connectivity to the selected IPSI and not the standby IPSI, the controlling server will attempt an IPSI interchange. This IPSI interchange, driven completely by the server, causes the previously standby IPSI to become the port network's active, or selected, IPSI. The following figure shows the handshaking that occurs after an IPSI decides to send a request for service to its highest ranked alternative ESS once the IPSI's no service timer expires. If the handshaking is not successful and the requested ESS does not take over the IPSI, the IPSI will attempt to get service from the next alternative in its priority list.

Figure 27 – Request For Service Message Flow (the IPSI sends a Service Request to the media server; the server replies with a Control Takeover Message; the IPSI then sends a Notification Update Message, with the takeover confirmation included, to all media servers currently connected to the IPSI)

6.8 How Does Overriding IPSI Service Requests Work ?
When catastrophic failure situations occur, ESS servers will assume control of all port network gateways which cannot be given service by the main servers. As shown, this can occur for two reasons – the main server goes out-of-service or the network prevents communication from the IPSI to the server. Eventually, the failure, of either type, will be fixed and the conditions whereby the IPSI shifts control over to the ESS server no longer exist. However, since shifting control between clusters is not connection preserving for all types of calls, the IPSI does not automatically go back to the main server. As covered in the last section, the IPSI does not request service unless it does not have a current master. Hence, unless there is an override to the master selection process, the port networks which are being controlled by ESS servers will never return to being part of the main system until another fault occurs. The ESS offering introduced a new command into the already large suite supported by CM to address this situation. The command "get ipserver-interface forced-takeover <PN # / all>" can be issued from the main cluster in order to have the IPSI revert control over to it. This command must be used with caution because it causes a service outage on the port network being forced over to its control. The IPSI, upon reception of the unsolicited takeover message (all other takeovers are in response to a service request) from the main cluster, informs its current master cluster that it is shifting control back to the main cluster. Along with the manual agglomeration command introduced here, there are methods to put the system back together on a scheduled basis. If a fault occurs that requires ESS servers to assume control of some or all of the port networks, the system can be administered to converge back to the main cluster at a specified time. However, unlike the manual command, all the port networks under control of ESS servers are shifted over, and therefore reset, at the same time. There is never a good time to have an outage, but controlling when it occurs is much better than the alternative – unplanned service outages.
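The service-request decision described in Sections 6.7 and 6.8 can be summarized in a small sketch. The Ipsi structure, the helper function, and the polling loop are hypothetical simplifications of the behavior described above, not actual IPSI firmware.

```python
import time

NO_SERVICE_TIMER_SECS = 3 * 60   # administrable between 3 and 15 minutes

def request_service(cluster_id):
    """Hypothetical stand-in for the Service Request / Control Takeover handshake."""
    return False

class Ipsi:
    def __init__(self, priority_list):
        self.priority_list = priority_list   # ranked cluster IDs, best first
        self.master = None                   # cluster currently controlling this IPSI
        self.scanned = False                 # True if the PN is controlled indirectly (arch-angel present)
        self.no_service_since = None

    def has_service(self):
        return self.master is not None or self.scanned

    def poll(self):
        """Run periodically: request service only after the no service timer expires."""
        if self.has_service():
            self.no_service_since = None
            return
        if self.no_service_since is None:
            self.no_service_since = time.monotonic()
        if time.monotonic() - self.no_service_since < NO_SERVICE_TIMER_SECS:
            return                           # give the previous master time to reconnect
        for cluster in self.priority_list:   # walk the ranked alternatives
            if request_service(cluster):
                self.master = cluster
                return
```

An unsolicited forced takeover from the main cluster would bypass this loop entirely, replacing the current master directly rather than waiting for a service request.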
Section 7 - ESS in Control
This section is broken into two topics – how existing calls are affected by failing over to ESS control and what the operational consequences are if a system becomes fragmented. ESS servers offer the same service (no feature debt and no performance degradation) to the resources they assume control of as the main servers provided. However, there are many conditions under which ESS servers only take control of some of the port networks in the system, and that has a number of caveats. These topics include how dialing patterns may change in fragmented situations and what happens to centralized resources such as voicemail and other adjuncts such as CMS and CDR.

7.1 What Happens to Non-IP Phone Calls During Failovers ?
If a failure occurs which requires an ESS to assume control of a port network, calls supported by that PN are affected drastically. As previously discussed, when the control of a PN shifts between entities that do not share call state data, the control transfer process requires a cold restart of the PN. This cold restart causes all TDM resource allocation to be reset and therefore any calls using those resources to be torn down. Since all analog phones, digital phones, and PRI trunks always use their associated port network's TDM bus for a call, the calls they are members of are reset during this failover. IP telephony does not follow the same paradigm and is discussed in the next section. The following timeline shows the effects on the calls when recoveries take place.

Figure 28 – Call Effects Based on Different Recovery Mechanisms (within 60 seconds: if the previous master returns, the EPN recovers via a warm restart and all calls within the EPN are preserved; between 60 seconds and expiration of the no service timer: if the previous master returns, the EPN recovers via a cold restart and all calls within the EPN are dropped; once the no service timer expires: the ESS takes over, causing an EPN cold restart, and all calls within the EPN are dropped)

It is very important to understand what "calls are preserved" means. If a call is preserved, that means the bearer connection and the call control are both unaffected when the recovery is complete. The next section, which deals with IP telephony, relies on the concept of "connection preservation." If only the connection is preserved, then the bearer connection is unaffected, but the call control is lost. For example, if a user is on a call and a call preserving recovery takes place, there is no effect on the end user. However, if a user is on a call and a connection preserving recovery occurs, the communication pathway is undisturbed, but all call control is lost (e.g. the user will be unable to transfer or conference the call in the future). Another question that needs to be answered is what happens to the calls while waiting for a recovery. Stated differently, if a port network fails over to an ESS server after five minutes, what happened to the existing calls during that time? The answer is that the bearer connection continues to persist until the port network is restarted, but with zero call control (the user cannot put the call on hold, retrieve another call appearance, or even hang up). This comes from the fact that if a PN loses its control link it preserves the current resource allocations. The following figure shows five phone calls which are currently active. The phones in this example are non-IP endpoints such as digital or analog.
Figure 29 – Effects on Non-IP Phone Calls during Failover

Suppose the two port networks on the right, PN #3 and PN #4, encounter some condition which causes them to fail over to an ESS server. Calls #1 and #2 are not affected by this failover since their supporting port networks do not have a control shift and they are not using any resources on PN #3 or PN #4. Call #3 drops because it is not completely independent of PN #3 (the controlling server of PN #2 tears down call #3 once it loses control of resources it was managing that are involved with the call). Calls #4 and #5 also drop because the ESS taking over performs an EPN cold restart on port networks #3 and #4.

7.2 What Happens to IP Phone Calls During Failovers ?
A traditional TDM phone (e.g. analog, digital, or BRI) is hardwired to a circuit pack which supports it. Through this wire, the phone receives signaling (e.g. when to ring or what to display). In addition, the wire is also used as the medium for bearer traffic. An H.323 IP phone does not have a single dedicated pathway for both of these functions.

Figure 30 – Control and Bearer Paths for IP and TDM Phones

The H.323 IP phone and the CLAN establish a logical connection to be used for call control signaling. The MEDPRO and the IP phone exchange RTP streams when certain types of bearer paths are required. For instance, when an IP phone dials the digital phone's extension (call #1 in Figure 30), all of the signaling goes through the CLAN, which is the gatekeeper for the SPE controlling the port network. The actual bearer communication pathway goes from the phone over IP to the MEDPRO, which is a conduit onto the TDM bus, and then to the digital phone. If the IP phone called another IP phone (call #2 in Figure 30), the signaling would still go through the CLAN, but the bearer path would be directed at the terminating IP phone rather than the MEDPRO. Unlike a digital phone, which becomes inoperable if the circuit pack supporting it fails, IP phones have the ability to shift where they get control from. In other words, if an IP phone fails to communicate with its current gatekeeper, the CLAN in this case, it attempts to get service from another one within the system. While the IP phone seeks out an alternate gatekeeper, it keeps its current bearer path connection up. If the IP phone registers with a CLAN that is being controlled by the same entity which was controlling its previous CLAN gatekeeper (shown on the left side of Figure 31), a control link is immediately established and the call is preserved. The IP phone continues to search gatekeepers until this condition is met. If it fails to find one, the IP phone will keep the existing connection up until the end user decides to terminate the call (e.g. hang up). After the end user hangs up, the IP phone will search out a gatekeeper without the conditions just discussed.
Figure 31 – H.323 IP Phone Link Bounce

That being said, if an IP phone is not using any port network resources for the bearer portion of its call, the connection is preserved through a failover. If the port network supporting the three calls in Figure 32 below fails over to an ESS, only calls #1 and #2 are dropped, since PN resources are being used for the bearer path. Call #3 has two IP phones communicating directly with each other (a shuffled call) and is connection preserving through the failover.

Figure 32 – Effects on IP Phone Calls during Failover

7.3 What Happens to H.248 Gateway Calls During Failovers ?
Up to this point in the paper the only gateways that have been discussed were port networks. There is another series of gateways that the Avaya Communication Server supports – H.248 gateways (G700, G350, and G250). These gateways, which support digital and analog phones, PRI trunks, and VoIP resources, get controlled by the SPE indirectly through CLAN gatekeepers as H.323 IP phones do and also share the ability to seek out alternate gatekeepers. Therefore, since ESS servers provide survivability to the CLAN gatekeepers, they also provide survivability for H.248 gateways. This section covers the effects on calls supported by H.248 gateways as they fail over to the control of an ESS. The ability for H.248 gateways to also utilize Local Spare Processors (LSP) as alternate sources of control is discussed in the LSP/ESS Interaction section of this paper.

Figure 33 – H.248 Gateway Integrated into the System

The figure above shows how H.248 gateways integrate into the communication system. As with H.323 IP phones, the H.248 gateway receives its control from the SPE via the CLAN gatekeeper. If the gateway loses its control link to the CLAN, it seeks out other possible gatekeepers that could provide it service. The gateway has a pre-programmed, fixed failover list of alternate gatekeepers to use if a failure occurs. The gateway registers and brings up a control link to the first alternate gatekeeper that it can contact. If the gateway registers with a CLAN that is controlled by the same SPE which controlled its previous CLAN, then all the calls on the gateway are preserved (left side of Figure 34 below). If the gateway registers with a CLAN that has a master which is unaware of its calls, then the failover is connection preserving only (right side of Figure 34 below).

Figure 34 – H.248 Gateway Control Link Bounce
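The preservation outcomes described in Sections 7.1 through 7.3 follow a common pattern, which the following sketch summarizes. It is an illustrative simplification under stated assumptions (the function and parameter names are hypothetical), not an actual Communication Manager decision routine.

```python
def failover_outcome(same_controlling_spe, bearer_uses_failed_pn_resources):
    """Rough summary of Sections 7.1 to 7.3: what happens to an existing call when
    the port network or gatekeeper it relies on fails over to an ESS."""
    if bearer_uses_failed_pn_resources:
        return "call dropped"            # the cold restart resets the TDM/MEDPRO resources in use
    if same_controlling_spe:
        return "call preserving"         # the new gatekeeper's SPE still holds the call state
    return "connection preserving"       # bearer stays up; call control (hold, transfer) is lost

# A shuffled IP-to-IP call re-registering under a different cluster:
print(failover_outcome(same_controlling_spe=False, bearer_uses_failed_pn_resources=False))
```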
Figure 33 shows how H.248 gateways integrate in the system and some example calls. If the port network containing the CLAN gatekeeper fails over to an ESS server, the existing calls are affected. Call #1 would be dropped because the bearer path of the call traverses the TDM bus of the port network. However, calls #2 and #3 only use the H.248 gateway resources for the bearer path and therefore do not lose their connections over the failover to an ESS.

7.4 How are Call Flows Altered when the System is Fragmented ?

Figure 35 – Phone Calls on Non-Faulted System

The system laid out above has four digital phones with extensions 2001, 2002, 2003, and 2004 respectively and two PSTN trunks. In addition, suppose there is a route pattern which attempts trunk #1 and then trunk #2. The following table shows the example calls and the steps involved to establish them.

Call #1: Phone #1 → Phone #2
1. Phone #1 dials "2002".
2. SPE identifies "2002" as phone #2's extension.
3. SPE is in control of phone #2 and knows it is in service.
4. SPE rings phone #2 and the user answers.
5. SPE connects a bearer path between the phones.
Call #2: Phone #3 → Phone #4 – Same as previous call.
Call #3: Phone #2 → Phone #3 – Same as previous call.
Call #4: Phone #1 → External Number
1. Phone #1 dials external number.
2. SPE determines route pattern for dialed number.
3. SPE gets 1st option of route pattern (trunk #1).
4. SPE is in control of trunk #1 and knows it is in service.
5. SPE sends call out to the PSTN through trunk #1.
Call #5: Phone #4 → External Number – Same as previous call.
Table 28 – Phone Call Establishment Steps (Non-Faulted)

Some of the five call flows just reviewed change drastically if the system becomes fragmented. The following table examines the steps taken during the same call examples when port network #2 has failed over to an ESS server. It is very important to note that these flow changes only occur because the system is fragmented. If the entire system (PN #1 and PN #2) failed over to ESS control, then the call flows would remain unchanged.

Figure 36 – Phone Calls on Fragmented System (call #3 does not succeed)

Call #1: Phone #1 → Phone #2 – Identical to non-faulted call flow.
Call #2: Phone #3 → Phone #4 – Identical to non-faulted call flow.
Call #3: Phone #2 → Phone #3
1. Phone #2 dials "2003".
2. SPE identifies "2003" as phone #3's extension.
3. SPE does not control phone #3 and therefore assumes it is out-of-service.
4. SPE sends the call to phone #3's defined coverage path.
5. If an in-service coverage path termination point is found, the call is routed there; otherwise, re-order tone is given to the originator (phone #2). (See below how phone #1 can successfully call phone #3 in a fragmented system.)
Call #4: Phone #1 → External Number – Identical to non-faulted call flow.
Call #5: Phone #4 → External Number
1. Phone #4 dials external number.
2. SPE determines route pattern for dialed number.
3. SPE gets 1st option of route pattern (trunk #1).
4. SPE does not control trunk #1 and therefore assumes it is out-of-service.
5. SPE then gets 2nd option of route pattern (trunk #2).
6. SPE is in control of trunk #2 and knows it is in service.
7. SPE sends call out to the PSTN through trunk #2.
Table 29 – Phone Call Establishment Steps (Fragmented System)

If a user desires to call a phone which is under control of a portion of the system that has been fragmented away, then the user will have to dial differently.
Assuming that DID numbers map to trunks local to the user, if the originating phone can dial the terminating phone's extension as an external number, the SPE processes it as such and routes it out through a PSTN trunk under its control. The PSTN would then route the incoming call to the port network which is fragmented away. The ESS SPE would then route the incoming PSTN call to the appropriate extension. Using the previous example, call #3 with phone #1 dialing phone #3's extension as an external number would succeed as follows:

Figure 37 – Extension to Extension Call Routed over PSTN

In addition, when a system enters into a fragmented situation, different call routing may be desired. In the How Does ESS Work section, it was discussed that changes can be made to an ESS server's translations, but any changes will be lost when the ESS resets. This behavior can be leveraged to have ESS systems utilize resources differently than the main server does without affecting the permanent system's translation set. For example, if the disaster plan requires that all incoming calls to the part of the system which is fragmented away get rerouted to a recorded announcement, administration changes can be made on the ESS system to accomplish this. However, these changes are not reflected on the main server and therefore, once the port network returns to the main server, the calls are handled as normal. This manual technique can be used if a system is going to stay in a fragmented state for an extended period of time.

7.5 What Happens to Centralized Resources when a System is Fragmented ?
A big advantage to collapsing autonomous systems together is the more efficient usage of resources such as trunks, tone detectors, and announcements. Another advantage is the centralization of adjuncts. For example, rather than having two distinct CDR systems serving two distinct systems, one centralized CDR can serve one larger PBX switch. However, the cost benefit of doing this must be weighed against the risk of being isolated away from the resource in failure situations. Suppose two separate systems (shown on the left side of Figure 38) are collapsed together and cost savings can be realized by the elimination of two PSTN trunks (shown on the right side of Figure 38). The danger of eliminating the trunks in this way (two from site #2 rather than one from each site) is that upon a network fragmentation failure which causes the ESS server to take over EPN #2 as its own stand-alone system, the users under the ESS server's control are functional but have no trunk access.

Figure 38 – Two Systems Collapsed Together

Many adjuncts, such as CMS and CDR, interface with the communication system over IP through a CLAN gatekeeper. These adjuncts, unlike H.323 IP phones and H.248 gateways, do not have the ability to find an alternate gatekeeper if a problem arises. For that reason, the adjuncts are tightly associated with a CLAN, or in other words, these adjuncts are bound to the port network. The following diagram shows an adjunct interfacing with the system. As long as the link is up, the SPE continuously transmits pertinent data over the adjunct link.
The service state of the link is inherently tied to the state of the port network and the state of the CLAN. To mitigate these shortcomings, the availability of communication between adjuncts and the SPE is increased by having multiple communication links – a primary link and an optional secondary link. The SPE does not treat them as an active/standby pair, but rather as both simultaneously functioning in an active mode (the relevant data is transmitted on both links concurrently). In some cases, both links may terminate to the same physical adjunct (shown below) and in others, the links may terminate to different physical adjuncts.

Figure 39 – Adjuncts Interfacing over IP with PBX

If a catastrophic server failure takes place and the entire system is taken over by an ESS cluster, the ESS's SPE re-establishes the adjunct links and sends the appropriate data. The ESS SPE gains control of an adjunct link once it gains control of the port network with which it is associated. The interesting cases are when the system fragments. Assume a network failure which causes PN #2 and PN #3 to fail over to the ESS server (left side of Figure 40 below). The main server continues to supply service to PN #1 and therefore still communicates over the adjunct link. The ESS server, on the other hand, controls PN #2, which has the other adjunct link. Therefore it transmits appropriate data over that link. In this situation, both fragments are still interfacing with the adjunct.

Figure 40 – Adjuncts Working in Fragmented Situations

The right side of the figure above shows a different network fragmentation whereby EPN #3 is isolated away from the rest of the system. In this case, the main server still owns both the primary and secondary links and continues to pump relevant data to the adjuncts over them. The autonomous system that was created, made up of EPN #3 and the ESS server, is unable to interface with or utilize the adjuncts to which it no longer has access. For example, suppose the adjunct links in the example were CDR links. During this fragmented operation, all calls on the main server's portion of the switch would be sent over the primary and secondary CDR links. Unfortunately, the fragmented portion of the system would not have any medium on which to output the call data, and after the SPE's internal CDR buffers filled up, the data would be lost.

7.6 What Happens to Voicemail Access when a System is Fragmented ?

Figure 41 – Call Rerouted to Voicemail

The figure above shows a voicemail system interfaced to the PBX over trunks connected to PN #1. Any calls that are attempting to terminate to phone #1, but go unanswered, are rerouted to phone #1's coverage point, which is usually voicemail. The SPE sends the call to the voicemail system and informs it who the original call is for, who is calling, and the reason the call is being rerouted to the voicemail system (e.g. busy or ring/no answer).
With this information the voicemail system can place the call into phone #1's mailbox and play the appropriate greeting. After a message is left for the phone, the voicemail instructs the SPE to turn on phone #1's message waiting indicator (MWI). If phone #1 dials the voicemail system to get its messages, the voicemail system receives the appropriate information to identify phone #1 and place it in its mailbox. If this information about the calling and called parties is not supplied, the voicemail system provides a generic greeting and allows the calling party to manually identify itself for message retrieval or to identify whom they are calling in order to leave a message. Suppose a failure occurs which causes the system to fragment in a way that PN #2 falls under the control of an ESS server. In this situation, a call needing to go to phone #1's coverage point will need an alternate route because the voicemail system is no longer directly accessible from the fragmented portion of the switch. The SPE on the ESS server is intelligent enough to reroute the call over the PSTN to get to the voicemail system. In this rerouted call through the PSTN, the ESS SPE provides all the information needed to allow the call to enter the correct mailbox. The same procedure is followed when phone #1 desires to check its messages, and the voicemail system puts phone #1 into the appropriate mailbox. The only drawbacks to operating in this environment are that all calls going to voicemail coverage on the fragmented site traverse the PSTN and that the message waiting indicators on phones not on the same part of the system as the voicemail are non-functional (no indication of new messages).

Figure 41 – Call Rerouted to Voicemail Over PSTN

Section 8 - ESS Variants Based on System Configuration
The ESS offering ensures port networks are able to receive control from alternate sources upon a catastrophic failure of the main servers or lack of network connectivity to them for an extended period of time. Up to this point in the paper, recovery descriptions and explanations have been done independently of the system's configuration. Unfortunately, there are subtle behavioral differences in port network operation depending on the type of PNC to which the PN is connected. Therefore, this section simulates both types of failure scenarios for each type of PNC configuration available and discusses the slight variations in operational behavior that occur.

8.1 Basic IP PNC Configured System
A port network can be controlled in two different manners – either directly through a resident IPSI or indirectly from a non-resident IPSI tunneled through the PNC. As the Background section covered, an IP PNC does not support tunneling of control links, so every port network using an IP PNC must contain an IPSI in order to be operational. In addition, the point of an IP PNC is to transmit bearer communication between port networks over an IP network. The figure below shows a system which is leveraging an IP PNC.

Figure 42 – Non-faulted System with IP PNC

In a non-faulted situation, every PN in the system is receiving control from the main server directly over IP (no tunneled control).
In addition, every IPSI has created a priority list consisting of the main cluster followed by the ESS cluster.

8.2 Catastrophic Server Failure in an IP PNC Environment
The simplest scenario to describe is the failover of an entire system (every port network) to an ESS cluster that is configured to have its port networks use an IP PNC. The figure below shows a catastrophic failure leaving the main cluster inoperable and all the IPSIs failing over to the ESS cluster.

Figure 43 – Catastrophic Server Failure in an IP PNC Environment

The following list discusses the steps that occur when the non-faulted system (Figure 42) encounters a server failure causing all the system's port networks to transition over to an ESS (Figure 43).
1. An event renders the main cluster inoperable.
2. The socket connections between the main cluster and the IPSIs all fail.
3. Each IPSI independently detects the loss of its control link.
4. Each IPSI removes the main cluster from its priority list and moves the ESS cluster into the 1st alternative slot.
5. Each IPSI starts its own no service timer.
6. When each IPSI's no service timer expires, it requests service from its 1st alternative (the ESS).
7. The ESS acknowledges each IPSI's service request by taking control of it.
8. Every port network is then brought back into service via a cold restart.

After the failover is completed, the system operates exactly the same as before the failure since one autonomous system image is preserved (a non-fragmented system), the IP PNC is still used as the connectivity medium for inter-port network bearer traffic, and the ESS servers have no feature debt, identical RTUs, and no performance compromises as compared to the main server. If the main server comes back into service, it will inform all the IPSIs that it is available to supply them service if needed, but the IPSIs will remain under the control of the ESS until the system administrator decides to transition them back.

8.3 Network Fragmentation Failure in an IP PNC Environment
The next scenario examined is the effect on the system if an event occurs that fragments the control network, as shown in the figure below, causing part of the system to fail over to an ESS server.

Figure 44 – Network Fragmentation in an IP PNC Environment

The following list discusses the steps that occur during the transition of a non-faulted system (Figure 42) into a fragmented system (Figure 44).
1. An event occurs that fragments the control network preventing communication between the main server and some port networks.
2. The socket connections between the main cluster and port networks #4, #5, and #6 all fail.
3. The socket connections between the ESS cluster and port networks #1, #2, and #3 all fail.
4. Each IPSI in PN #4, #5, and #6 independently detects the loss of the main server's control link.
5. Each IPSI in PN #1, #2, and #3 independently detects the loss of the ESS's keep-alive link.
6. Each IPSI in PN #4, #5, and #6 removes the main cluster from its priority list and moves the ESS cluster into the 1st alternative slot.
7. Each IPSI in PN #1, #2, and #3 removes the ESS cluster from its priority list, leaving it with no alternatives to the main cluster.
8. Each IPSI in PN #4, #5, and #6 starts its own no service timer.
9. When each IPSI's no service timer expires, it requests service from its 1st alternative (the ESS).
10. The ESS acknowledges each IPSI's service request by taking control of it.
11. Port networks #4, #5, and #6 are brought back into service via a cold restart.

Even though the same ESS server is being used to provide survivability as in the previous example, the end users do not receive 100% equivalent service. The drawbacks to running in a fragmented mode were discussed in the ESS in Control section. The important things to note are that every port network is providing service to its end users and that both fragments still use the IP network for bearer traffic between the PNs. In the following scenarios, this is not always the case. If the network fragmentation is mended, the main server will inform port networks #4, #5, and #6 that it is available to supply them service if needed, but the IPSIs will remain under the control of the ESS until the system administrator decides to transition them back. If the network experiences the same problem again (e.g. a flapping network), the IPSIs lose contact with the main server, but since they are not being controlled by it, there are no adverse effects on the end users.

8.4 Basic ATM PNC Configured System
Unlike the IP PNC discussed in the previous section, the ATM PNC supports tunneled control links and therefore not every port network is required to have an IPSI. In order for the SPE to control a non-IPSI connected PN, it must select an IPSI in another port network to indirectly support it. The SPE prefers, but in no way guarantees, that a non-IPSI connected PN be controlled through an IPSI in its same community. The following diagram, showing a non-faulted ATM PNC configured system, has PN #1 getting service through PN #2 and PN #6 getting service through PN #5. The ATM PNC in this case is being used as the inter-port network bearer medium and also as a conduit for control links to non-IPSI connected PNs. In addition, every IPSI has created a priority list consisting of the main cluster followed by the ESS cluster.

Figure 45 – Non-faulted System with ATM PNC

Be aware, however, that the stability of PN #6 is inherently tied to PN #5 and the health of the ATM PNC. Therefore a reset of PN #5 leads to a reset of PN #6. In addition, the SPE must get port network #5 into service before attempting to tunnel control through it for PN #6. The implication of this is that upon a system cold restart, PN #5 becomes operational before PN #6 does. Also, if there is some malfunction of the ATM PNC whereby communication between PN #6 and the rest of the port networks is severed, PN #6 goes out of service since, without an IPSI, there is no way to get a control link to it from the SPE. In summary, an IPSI connected port network's stability does not rely on the stability of other PNs; it comes into service faster than non-IPSI connected PNs after restarts and is not completely reliant on the ATM PNC for control.
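The control-path dependencies just described can be captured in a small sketch. The data structure and function below are hypothetical simplifications made for illustration, not part of Communication Manager.

```python
from collections import namedtuple

PortNetwork = namedtuple("PortNetwork", ["name", "has_ipsi"])

def control_path(pn, pnc_intact, in_service_ipsi_pns):
    """Illustrates how a PN in an ATM PNC environment can be controlled: directly
    through a resident IPSI, or tunneled through an in-service IPSI-connected PN
    when the PNC is intact."""
    if pn.has_ipsi:
        return "direct IPSI control"       # independent of other PNs and of the PNC
    if pnc_intact and in_service_ipsi_pns:
        # Tunneled control: an EI link through an already in-service IPSI-connected PN.
        return "tunneled control via " + in_service_ipsi_pns[0]
    return "out of service"                # no IPSI and no viable tunnel

# PN #6 from Figure 45, assuming PN #5 is already in service and the ATM PNC is healthy:
print(control_path(PortNetwork("EPN #6", has_ipsi=False), True, ["EPN #5"]))
```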
8.5 Catastrophic Server Failure in an ATM PNC Environment
The next scenario described is the failover of an entire system (every port network) to an ESS cluster that is configured to have its port networks use an ATM PNC. The figure below shows a catastrophic failure leaving the main cluster inoperable and all the IPSIs failing over to the ESS cluster.

Figure 46 – Catastrophic Server Failure in an ATM PNC Environment

The following list discusses the steps that occur when the non-faulted system (Figure 45) encounters a server failure causing all the system's port networks to transition over to an ESS (Figure 46).
1. An event renders the main cluster inoperable.
2. The socket connections between the main cluster and the IPSIs all fail.
3. Each IPSI independently detects the loss of its control link.
4. Each IPSI removes the main cluster from its priority list and moves the ESS cluster into the 1st alternative slot.
5. Each IPSI starts its own no service timer.
6. When each IPSI's no service timer expires, it requests service from its 1st alternative (the ESS).
7. The ESS acknowledges each IPSI's service request by taking control of it.
8. Every IPSI connected port network is then brought back into service via a cold restart.
9. The SPE tunnels control links to every non-IPSI connected port network and brings them into service via a cold restart.

After the failover is completed, the system operates exactly the same as before the failure since one autonomous system image is preserved (a non-fragmented system), the ATM PNC is still used as the connectivity medium for inter-port network bearer traffic, and the ESS servers have no feature debt, identical RTUs, and no performance compromises as compared to the main server. If the main server comes back into service, it will inform all the IPSIs that it is available to supply them service if needed, but the IPSIs will remain under the control of the ESS until the system administrator decides to transition them back.

8.6 Network Fragmentation Failure in an ATM PNC Environment
This next scenario, network fragmentation failure, has two variations that need to be addressed. The first and much more common variation is a failure condition that only causes the control network to fragment while leaving the ATM PNC intact. The second variation is a failure condition that causes both the control network and the ATM PNC to fragment simultaneously. This can occur if the control network and the ATM PNC network are one and the same, or by way of the infamous backhoe scenario. The following figure shows the first variation, whereby only the control network fragmented, and how the system restores control to the affected port networks.

Figure 47 – Control Network Only Fragmentation in an ATM PNC Environment

The following list discusses the steps that occur when the non-faulted system (Figure 45) encounters a control network fragmentation causing some of the system's port networks to go into fallback control (Figure 47). It is important to note that this type of failure is resolved with system capabilities which existed prior to the ESS offering.
This scenario is being reviewed to show how ESS allows other recovery mechanisms, which usually have a less dramatic effect on end users, to attempt to resolve issues before the ESS jumps in to provide service.
1. An event occurs that fragments the control network preventing communication between the main server and some port networks.
2. The socket connections between the main cluster and port networks #4 and #5 both fail.
3. The socket connections between the ESS cluster and port networks #2 and #3 both fail.
4. The main SPE detects the loss of connectivity to the IPSIs in PN #4 and #5.
5. The IPSIs in PN #4 and #5 independently detect the loss of the main server's control link.
6. The IPSIs in PN #2 and #3 independently detect the loss of the ESS's keep-alive link.
7. The IPSIs in PN #4 and #5 remove the main cluster from their priority lists and move the ESS cluster into the 1st alternative slot.
8. The IPSIs in PN #2 and #3 remove the ESS cluster from their priority lists, leaving them with no alternatives to the main cluster.
9. The IPSIs in PN #4 and #5 each start their own no service timers.
10. The main SPE takes control of PN #4, #5, and #6 indirectly over the ATM PNC through the IPSIs in PN #2 and #3.
11. The IPSIs in PN #4 and #5 detect that someone is controlling their associated port networks and cancel their no service timers.

Since the port networks on the right side of the fragmentation are taken over by the main SPE through a fallback recovery (discussed in the Reliability - Single Cluster Environments section), the port networks are brought back into service via a warm restart, implying no stable calls are affected. In addition, the IPSIs in PN #4 and #5 do not request service from the ESS server while the main server is controlling their port networks through the ATM PNC. If the tunneled link has a failure, then the IPSI will begin its no service timer once again and the process starts over at step #9 above. Once the network fragmentation is fixed, the main SPE will contact the IPSIs in PN #4 and #5 and immediately take control of them since the SPE knows it is already controlling (albeit indirectly) their associated port networks. After the SPE deems the connectivity between itself and the IPSIs stable, it transitions the control links from being tunneled through the PNC to the IPSIs directly (control link fall-up). Up to this point in the paper, the ATM PNC has been abstractly represented as a simple ATM switched network cloud. In reality, however, the cloud may represent a single ATM switch or a number of ATM switches integrated together to form the ATM PNC network. The second fragmentation variation, whereby the control network and the ATM PNC both fragment concurrently, needs to be looked at in both frameworks. Figure 48 below shows a system that has a fragmented control network and a fragmented ATM PNC. Also shown in the figure are the ATM EI boards interfacing with a local ATM switch. The combination of both of these interconnected ATM switches creates the ATM PNC infrastructure.
Figure 48 – Control Network and ATM PNC Fragmentation (Multi-switched ATM Network)

The following list discusses the steps that occur when the non-faulted system (Figure 45) encounters a control network fragmentation and ATM PNC fragmentation causing some of the system to be taken over by an ESS cluster (Figure 48).
1. An event occurs that fragments the control network preventing communication between the main server and some port networks and that fragments the ATM PNC preventing fallback recovery.
2. The socket connections between the main cluster and port networks #4 and #5 both fail.
3. The socket connections between the ESS cluster and port networks #2 and #3 both fail.
4. The IPSIs in PN #4 and #5 independently detect the loss of the main server's control link.
5. The IPSIs in PN #2 and #3 independently detect the loss of the ESS's keep-alive link.
6. The IPSIs in PN #4 and #5 remove the main cluster from their priority lists and move the ESS cluster into the 1st alternative slot.
7. The IPSIs in PN #2 and #3 remove the ESS cluster from their priority lists, leaving them with no alternatives to the main cluster.
8. The IPSIs in PN #4 and #5 each start their own no service timers.
9. When each IPSI's no service timer expires, it requests service from its 1st alternative (the ESS) since no fallback recovery took place.
10. The ESS acknowledges each IPSI's service request by taking control of it.
11. Port networks #4 and #5 are brought back into service via a cold restart.
12. The ESS tunnels a control link through the IPSI in PN #5, for this example, over the ATM PNC to port network #6 and brings PN #6 back into service via a cold restart.

The left side of the system (PN #1, #2, and #3) remains under the control of the main servers and is operating as a stand-alone system. Since the ATM switch is still interconnecting port networks #1, #2, and #3, bearer connectivity between them is possible and the tunneled control link to PN #1 is still viable. Other than being in a fragmented state, the left side of the system is operating equivalently to how it was before the failure. The right side of the system (PN #4, #5, and #6) experiences a cold restart when it is taken over by the ESS server, but then operates as it did before the failure occurred, only in a fragmented environment (see the ESS in Control section). Since the port networks (PN #4, #5, and #6) are still interconnected via an ATM switch, inter-port network bearer traffic is still possible along with tunneled control for PN #6. When the control network is healed, the main server informs the IPSIs in PN #4 and #5 that it is available to provide service and the ESS server informs the IPSIs in PN #2 and #3 that it is available to provide service. All the IPSIs make appropriate changes to their priority lists based on the recovered connectivity, but do not switch the cluster from which they are currently getting service. Immediately after the fragmentation fault occurs, the main server continuously attempts to establish tunneled control links to port networks #4, #5, and #6 through the ATM network, but fails to do so due to the lack of connectivity.
After the ATM PNC is healed, the connectivity is restored, but the tunneled control links still fail because the EI boards, where the control links terminate, reject them. The EI boards in PN #4 and #5 disallow the control link because the port network is already being controlled through the IPSIs. This is accomplished by a method discussed in the last section of this paper. The EI board in PN #6 also disallows the control link because the EI board is designed to only allow one control link at any one time and it already has a control link up to it through PN #5. Also, the ATM PNC is self-managed (see the Background section), allowing two intelligent entities (the main SPE and the ESS SPE) to both use its resources at the same time without conflicting with each other. If this were not the case, the two systems could not act independently while still allowing inter-port network bearer communication over the ATM PNC. Figure 49 below shows the same fragmentation of the control network and the ATM network as discussed above; in this case, the ATM network consists of a single ATM switch which happens to be on the left side of the fragmentation.

Figure 49 – Control Network and ATM PNC Fragmentation (Single-switched ATM Network)

The following list discusses the steps that occur when the non-faulted system (Figure 45) encounters a control network fragmentation and ATM PNC fragmentation causing some of the system to be taken over by an ESS cluster (Figure 49).
1. An event occurs that fragments the control network preventing communication between the main server and some port networks and that fragments the ATM PNC preventing fallback recovery.
2. The socket connections between the main cluster and port networks #4 and #5 both fail.
3. The socket connections between the ESS cluster and port networks #2 and #3 both fail.
4. The IPSIs in PN #4 and #5 independently detect the loss of the main server's control link.
5. The IPSIs in PN #2 and #3 independently detect the loss of the ESS's keep-alive link.
6. The IPSIs in PN #4 and #5 remove the main cluster from their priority lists and move the ESS cluster into the 1st alternative slot.
7. The IPSIs in PN #2 and #3 remove the ESS cluster from their priority lists, leaving them with no alternatives to the main cluster.
8. The IPSIs in PN #4 and #5 each start their own no service timers.
9. When each IPSI's no service timer expires, it requests service from its 1st alternative (the ESS) since no fallback recovery took place.
10. The ESS acknowledges each IPSI's service request by taking control of it.
11. Port networks #4 and #5 are brought back into service via a cold restart.

The left side of the system (PN #1, #2, and #3) operates in exactly the same manner as in the previous failure. Since the single ATM switch resides on the left side of the fragmentation, port networks #1, #2, and #3 remain interconnected. The main SPE is controlling port networks #2 and #3 through their resident IPSIs and port network #1 via the ATM PNC. However, the right side of the fragmentation is tremendously different from the previous example since there is no ATM switch interconnecting port networks #4, #5, and #6, which has two major implications.
The first issue is that no tunneled control links are possible (no viable pathway) and therefore PN #6 cannot receive control from the SPE. Port networks #4 and #5 have resident IPSIs and therefore their service state is independent of the health of the interconnecting ATM PNC. However, without an ATM switch interconnecting port networks #4 and #5, there will be no inter-port network bearer communication. For example, a call entering through a trunk in PN #4 destined for a user off of PN #5 would not be able to be completed and is instead routed to the called party's coverage. In other words, port networks #4 and #5 are in service and controlled by the same SPE (the ESS server), but function as stand-alone islands without the ATM PNC connectivity. When the faulted ATM network is fixed, two major events take place. First of all, port networks #4 and #5 will be able to communicate over the ATM PNC again. Once the SPE recognizes this, it begins to allow inter-port network bearer traffic again since a viable pathway exists. Furthermore, once PN #6 can communicate with the ATM PNC, it can receive a tunneled control link to provide it service. The problem, however, is that it is a non-deterministic race as to which SPE, the main or the ESS, gets a control link up to it first. As discussed in the previous section, when an SPE is not in control of a PN, it continuously attempts to bring up a control link through the ATM PNC. This operation usually fails because either there is no pathway for the link to be established on or the terminating EI rejects the establishment because the PN is already being controlled. In this case, however, once the link establishments are no longer blocked by the ATM PNC fragmentation, and since the main SPE and the ESS SPE do not coordinate recovery actions, it is a race condition between them with respect to which one gets control of PN #6 first. Said differently, once the ATM PNC is healed all the port networks receive service, but the administrator has no control over the decision of which SPE controls PN #6. The only way to ensure that non-IPSI connected port networks receive their control from a certain SPE is to have that SPE control all the IPSIs in the system. If other SPEs in the system do not control any IPSIs, they have no entry point into the ATM PNC and therefore never attempt to establish control links to the non-IPSI connected PNs. An undesirable scenario, called port network hijacking, can occur when not every port network has a resident IPSI. Continuing the example above, suppose the main SPE brings up a tunneled control link through PN #3 to PN #6 before the ESS server does. Port network #6 would be brought into service via a cold restart and then be part of the partial system controlled by the main SPE. Since the ESS SPE does not control PN #6, it continuously attempts to bring up a control link to it, but is blocked by the EI board since another control link is already established from the main SPE. However, if another event occurs that causes a short communication fault between the main SPE and PN #6, then PN #6 would temporarily be uncontrolled (its control link goes down). Typically, the main SPE re-establishes a control link very quickly and resumes controlling the PN by bringing it into service via a warm restart (no effect on stable calls).
Unfortunately, there is a chance that the ESS SPE brings up its control link to port network #6 during this window. If this happens, PN #6 is brought into service under the control of the ESS server via a cold restart (dropping all active calls) and the main SPE's attempts to restore service are blocked. This type of event is referred to as port network hijacking.

While a system configured to use an ATM PNC does not require an IPSI in every port network to be functional and to be protected by the ESS offering, there are a number of drawbacks to omitting them. First, a port network's service state is then inherently tied to the availability of the ATM PNC, and conditions can arise, as shown in the previous example, whereby non-IPSI port networks do not receive control from ESS clusters in failure situations. Second, there is no mechanism that allows the administrator to dictate where a non-IPSI connected port network gets control from, and it is non-deterministic where a non-IPSI connected port network gets service from during recovery. And finally, recovery of non-IPSI connected port networks takes longer than that of IPSI connected port networks, since they rely on IPSI connected port networks being in service before control links can be tunneled. For these reasons, it can be concluded that a system with IPSIs in every port network has higher availability than a system without.

8.7 Basic CSS PNC Configured System

Another configuration covered is a system configured to use a CSS PNC. As in the ATM PNC case, the CSS PNC supports not only inter-port network bearer connections but also a medium for tunneling control links. Therefore, a system leveraging a CSS PNC is not required to have an IPSI in every port network. The following diagram, showing a non-faulted CSS PNC configured system, has PN #1 getting service through PN #2 and PN #6 getting service through PN #5. The CSS PNC in this case is being used as the inter-port network bearer medium and also as a conduit for control links to non-IPSI connected PNs. In addition, every IPSI has created a priority list consisting of the main cluster followed by the ESS cluster.

Figure 50 – Non-faulted System with CSS PNC. All port networks are controlled by the main cluster; PN #1 and PN #6 receive tunneled control links through the CSS PNC, which also carries the inter-PN bearer traffic.

In the How are Port Networks Interconnected section the CSS PNC is described as SPE managed. The ramification of a PNC being SPE managed is that only one intelligent entity may utilize its resources; if two or more intelligent entities attempted to use the PNC simultaneously, many resource conflicts would be encountered, causing complete CSS PNC failure. The IP PNC and ATM PNC are self managed and therefore do not have this limitation (see Figure 44 and Figure 48 respectively). With many intelligent entities within the system (the main SPE and the ESS SPEs), the only way to guarantee that only one SPE utilizes the CSS PNC at a time is to block all other SPEs from using it. The main SPE and the ESS SPEs do not communicate about call state or resource allocation, so the blocking cannot be done on a dynamic basis. Instead, a requirement exists in the ESS offering stating that an ESS SPE never attempts to utilize the CSS PNC under any circumstances. As the fault scenarios unfold below, this has a major effect on port network recovery and on the method PNs use to communicate with each other in a failure mode.
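Before walking through the CSS PNC failure scenarios, it may help to summarize the IPSI-side behavior that all of the step lists in this section rely on: an IPSI keeps a priority list of clusters, removes a cluster from the list when the link to it is lost, starts a no service timer when it loses its controlling cluster, cancels that timer if a fallback recovery takes control of its port network, and otherwise requests service from its highest ranked remaining alternative when the timer expires. The following sketch is purely illustrative – the class and method names are invented for this summary and are not Avaya code or IPSI firmware.

# Illustrative model of the IPSI failover behavior described in these sections.
# All names are hypothetical; real IPSI firmware is not shown here.

class IpsiFailoverModel:
    def __init__(self, priority_list):
        # Priority list as learned from connected clusters, e.g. ["main", "ESS"]
        self.priority_list = list(priority_list)
        self.controlling_cluster = self.priority_list[0]
        self.timer_running = False

    def on_link_lost(self, cluster):
        """A control or keep-alive link to 'cluster' has failed."""
        if cluster in self.priority_list:
            self.priority_list.remove(cluster)
        if cluster == self.controlling_cluster:
            # Lost the controlling cluster: start the no service timer.
            self.controlling_cluster = None
            self.timer_running = True

    def on_fallback_recovery(self, cluster):
        """Some SPE took control of the PN indirectly (e.g. a tunneled EAL);
        the IPSI cancels its timer instead of requesting an ESS."""
        self.timer_running = False
        self.controlling_cluster = cluster

    def on_cluster_available(self, cluster, rank):
        """A repaired cluster reconnects: re-insert it into the priority list,
        but do NOT switch away from the current controller automatically."""
        if cluster not in self.priority_list:
            self.priority_list.insert(rank, cluster)

    def on_no_service_timer_expired(self):
        """Request service from the highest-ranked remaining alternative."""
        if self.timer_running and self.priority_list:
            self.timer_running = False
            self.controlling_cluster = self.priority_list[0]  # cold restart follows
        return self.controlling_cluster

ipsi = IpsiFailoverModel(["main", "ESS"])
ipsi.on_link_lost("main")                      # controlling cluster lost, timer starts
print(ipsi.on_no_service_timer_expired())      # -> "ESS" (the PN is then cold restarted)

Run against the scenario above, an IPSI that loses the main cluster and sees no fallback recovery ends up requesting service from the ESS, while an IPSI that only loses the ESS keep-alive link keeps the main cluster as its controller.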
8.8 Catastrophic Server Failure in a CSS PNC Environment

The next scenario described is the failover of IPSI connected port networks to an ESS cluster in a system configured to have its port networks use a CSS PNC. The figure below shows a catastrophic failure leaving the main cluster inoperable and all the IPSIs failing over to the ESS cluster.

Figure 51 – Catastrophic Server Failure in a CSS PNC Environment. PN #1 and PN #6 are out-of-service; the ESS controls PN #2 through #5 through their resident IPSIs, and inter-PN bearer traffic is carried over an IP PNC via the MEDPRO boards.

The following list discusses the steps that occur when the non-faulted system (Figure 50) encounters a server failure causing some of the system's port networks to transition over to an ESS (Figure 51).

1. An event renders the main cluster inoperable.
2. The socket connections between the main cluster and the IPSIs all fail.
3. Each IPSI independently detects the loss of its control link.
4. Each IPSI removes the main cluster from its priority list and moves the ESS cluster into the 1st alternative slot.
5. Each IPSI starts its own no service timer.
6. When each IPSI's no service timer expires, it requests service from its 1st alternative (the ESS).
7. The ESS acknowledges each IPSI's service request by taking control of it.
8. Every IPSI connected port network is then brought back into service via a cold restart.

Abiding by the rule that ESS servers can never utilize the CSS PNC prevents a system configured with a CSS PNC from operating in the same manner, or with equivalent service, when controlled by an ESS cluster as it does when controlled by the main cluster. As Figure 51 shows, port networks without IPSIs (PN #1 and #6) are not provided service by the ESS server and therefore are not supplying service to their end users. This is because, without an IPSI, the only method to control a port network is a tunneled control link through the CSS PNC, and since the ESS SPE cannot use any CSS PNC resources, no pathway exists for the control link to traverse. This leads to a very basic design principle – a port network in a CSS PNC environment must have a resident IPSI if it is to be provided additional reliability by the ESS offering.

Another side-effect of the ESS SPE's inability to use the CSS PNC is that a different communication medium between the port networks is needed. If an ESS server controls multiple CSS PNC connected port networks, it routes inter-port network bearer traffic over an IP PNC. In other words, an ESS SPE ignores the existence of EI boards in the port networks it controls and treats each PN as if it were configured to use an IP PNC. This implies that for a port network to communicate with other PNs in survivability mode, it needs a MEDPRO allowing it to interface with an IP network for VoIP traffic. If the port network does not have a MEDPRO, the ESS still provides it service, but the PN becomes its own island, unable to communicate with the other PNs controlled by the ESS.

When the main cluster is repaired, it informs all the IPSIs that it is available to provide them service if they require it. The IPSIs place the main cluster into their priority lists appropriately, but do not shift under the main cluster's control automatically. The main SPE has the ability to utilize the CSS PNC to provide service to PN #1 and #6, but it does not yet have an access point through which to do so.
Once control of one of the IPSIs is transferred back to the main SPE, it immediately tunnels control links through that IPSI to port networks #1 and #6.

8.9 Network Fragmentation Failure in a CSS PNC Environment

This next scenario, a network fragmentation failure, has two variations that need to be addressed. The first and much more common variation is a failure condition that causes only the control network to fragment while leaving the CSS PNC intact. The second variation is a failure condition that causes both the control network and the CSS PNC to fragment simultaneously. This variation is very unlikely because the control network and the CSS PNC are always completely separate networks; statistically, the only way for it to occur is if they share a single point of failure such as a conduit. If the wiring of both networks runs within the same physical conduit and that conduit is cut, both networks fragment at the same time. The following figure shows the first variation, whereby only the control network fragments, and how the system restores control to the affected port networks.

Figure 52 – Control Network Only Fragmentation in a CSS PNC Environment. All port networks remain under main cluster control; the port networks cut off from the control network receive tunneled control links over the intact CSS PNC, which continues to carry the inter-PN bearer traffic.

The following list discusses the steps that occur when the non-faulted system (Figure 50) encounters a control network fragmentation causing some of the system's port networks to go into fallback control (Figure 52). It is important to note that this type of failure is resolved with system capabilities which existed prior to the ESS offering. This scenario is being reviewed to show how ESS allows other recovery mechanisms, which usually have a less dramatic effect on end users, to attempt to resolve issues before the ESS steps in to provide control.

1. An event occurs that fragments the control network, preventing communication between the main server and some port networks.
2. The socket connections between the main cluster and port networks #4 and #5 both fail.
3. The socket connections between the ESS cluster and port networks #2 and #3 both fail.
4. The main SPE detects the loss of connectivity to the IPSIs in PN #4 and #5.
5. The IPSIs in PN #4 and #5 independently detect the loss of the main server's control link.
6. The IPSIs in PN #2 and #3 independently detect the loss of the ESS's keep-alive link.
7. The IPSIs in PN #4 and #5 remove the main cluster from their priority lists and move the ESS cluster into the 1st alternative slot.
8. The IPSIs in PN #2 and #3 remove the ESS cluster from their priority lists, leaving them with no alternatives to the main cluster.
9. The IPSIs in PN #4 and #5 each start their own no service timers.
10. The main SPE takes control of PN #4, #5, and #6 indirectly over the CSS PNC through the IPSIs in PN #2 and #3.
11. The IPSIs in PN #4 and #5 detect that someone is controlling their associated port networks and cancel their no service timers.

Since the port networks on the right side of the fragmentation were taken over by the main SPE through a fallback recovery (discussed in the Reliability – Single Cluster Environments section), they were brought back into service via a warm restart, meaning no stable calls were affected.
In addition, the IPSIs in PN #4 and #5 do not request service from the ESS server while the main server is controlling their port networks through the CSS PNC. If the tunneled link fails, the IPSI begins its no service timer once again and the process starts over at step #9 above. Once the network fragmentation is fixed, the main SPE contacts the IPSIs in PN #4 and #5 and immediately takes control of them, since the SPE knows it is already controlling (albeit indirectly) their associated port networks. After the SPE deems the connectivity between itself and the IPSIs stable, it transitions the control links from being tunneled through the PNC to running through the IPSIs directly (control link fall-up). It should be apparent that in this failure scenario, systems configured with an ATM PNC and systems configured with a CSS PNC behave identically.

Figure 53 – CSS PNC Layouts. On the left, a single carrier CSS interconnects the port networks; on the right, a multi-carrier CSS (split CSS) spreads the CSS PNC across two carriers.

The CSS PNC, as discussed in the How are Port Networks Interconnected section, can be physically laid out in two manners – a single carrier format or over multiple carriers (referred to as split CSS) – as displayed above in Figure #53. Since an ESS server never attempts to use the CSS PNC, it is inconsequential for this discussion which layout is being used. In the case of a single carrier, the port networks on the right side of the fragment lose the physical connectivity between each other that the CSS carrier provided. In the case of split CSS, the port networks on the right side of the fragment may still have physical connectivity through one of the carriers making up the CSS, but the ESS server will not leverage it. Figure #54 shows a system that has a fragmented control network and a fragmented CSS PNC.

Figure 54 – Control Network and CSS PNC Fragmentation. PN #6 is out-of-service; the ESS controls PN #4 and #5 through their IPSIs, with inter-PN bearer traffic over an IP PNC via their MEDPROs, while the main cluster continues to control PN #1, #2, and #3 with bearer traffic over the CSS PNC.

The following list discusses the steps that occur when the non-faulted system (Figure 50) encounters a control network fragmentation and a CSS PNC fragmentation, causing some of the system to be taken over by an ESS cluster (Figure 54).

1. An event occurs that fragments the control network, preventing communication between the main server and some port networks, and that fragments the CSS PNC, preventing fallback recovery.
2. The socket connections between the main cluster and port networks #4 and #5 both fail.
3. The socket connections between the ESS cluster and port networks #2 and #3 both fail.
4. The IPSIs in PN #4 and #5 independently detect the loss of the main server's control link.
5. The IPSIs in PN #2 and #3 independently detect the loss of the ESS's keep-alive link.
6. The IPSIs in PN #4 and #5 remove the main cluster from their priority lists and move the ESS cluster into the 1st alternative slot.
7. The IPSIs in PN #2 and #3 remove the ESS cluster from their priority lists, leaving them with no alternatives to the main cluster.
8. The IPSIs in PN #4 and #5 each start their own no service timers.
9. When each IPSI's no service timer expires, it requests service from its 1st alternative (the ESS) since no fallback recovery took place.
10. The ESS acknowledges each IPSI's service request by taking control of it.
11. Port networks #4 and #5 are brought back into service via a cold restart.
The left side of the system (PNs #1, #2, and #3) remains under the control of the main servers and operates as a stand-alone system. Since the CSS PNC still interconnects port networks #1, #2, and #3, bearer connectivity between them is possible and the tunneled control link to PN #1 is still viable. Other than being in a fragmented state, the left side of the system operates just as it did before the failure. The right side of the system does not recover as smoothly as in the ATM PNC environment. First, PNs #4 and #5 are supplied service by the ESS server, but the inter-port network traffic is now routed over an IP network. In addition, port network #6 is not supplied service by either the main SPE or the ESS SPE and is therefore inoperable. This implies that a CSS PNC connected port network is required to have a resident IPSI (for control) and a MEDPRO (for inter-port network bearer communication) if it is to be supplied service by an ESS in failure modes.

When the control network is healed, the main server informs the IPSIs in PN #4 and #5 that it is available to provide service, and the ESS server informs the IPSIs in PN #2 and #3 that it is available to provide service. All the IPSIs make the appropriate changes to their priority lists based on the recovered connectivity, but do not switch the cluster from which they are currently getting service. From the moment the fragmentation fault occurs, the main server continuously attempts to establish tunneled control links to port networks #4, #5, and #6 through the CSS PNC, but fails to do so due to the lack of connectivity. After the CSS PNC is healed, connectivity is restored, but the tunneled control links to PN #4 and #5 still fail because the EI boards where the control links terminate reject them. The EI boards in PN #4 and #5 disallow the control link because their port networks are already being controlled through the IPSIs; this is accomplished by a method discussed in the last section of this paper. However, the EI board in PN #6 accepts the control link because no other control link terminates on it, and the main server takes control of that port network. Unlike the ATM PNC scenario, whereby the main SPE and the ESS SPEs can race to control non-IPSI connected PNs, there is no race condition here since only the main SPE ever attempts to tunnel control links through the CSS PNC. After the administrator consolidates the system by transferring control of PN #4 and #5 from the ESS server to the main server, the bearer traffic between the port networks is sent back over the CSS PNC rather than the IP PNC.

8.10 Mixed PNC Configured System

The final configuration covered in this section is a system designed to use a mixed PNC environment. In the three previous sections, the PNC configurations discussed were all inclusive, meaning that all port networks shared the same PNC. In an IP PNC configured system, all of the port networks utilize an IP network for inter-port network bearer communication. In an ATM PNC configured system, all of the port networks utilize an ATM network for inter-port network bearer communication. In a CSS PNC configured system, all of the port networks utilize a center-stage switch for inter-port network bearer communication.
The mixed PNC configuration allows some of the port networks to utilize one instance of either a CSS PNC or an ATM PNC while the other port networks in the system utilize an IP PNC. Figure 55, in the ESS in Action – A Virtual Demo section, shows a system using the mixed PNC configuration. How a port network in a mixed PNC system operates in a failure scenario is based on the type of PNC it interfaces with. For example, port networks connected to a CSS PNC or an ATM PNC operate in survivability mode as if the entire system were using a CSS PNC or an ATM PNC respectively, and port networks connected to an IP PNC operate in survivability mode as if the entire system were configured to use an IP PNC.

Section 9 - ESS in Action - A Virtual Demo

9.1 Demo - Setup

To summarize the material presented thus far and to show how it is applied to a real system, this section takes you through a "virtual demo" of the ESS feature. The example system, shown in Figure 55 below, supplies phone service to three geographically dispersed campuses that are interconnected through an IP WAN. The main site, on the left, is an upgraded DEFINITY switch consisting of traditional MCC cabinets interconnected with a CSS PNC. The other two locations are newer sites using G650 cabinets that are interconnected to each other and back to the main site via an IP network. This layout is another example of a system configured to use a mixed PNC. Except for PN #4, every port network has a resident IPSI through which the main server controls it by means of a direct EAL. Port network #4 is a non-IPSI connected port network and gets its control from the main server indirectly via a tunneled EAL, in this case through port network #3.

Figure 55 – Communication System Layout for Virtual Demo. The main site (main cluster and ESS #10) contains PN #1 through #4 interconnected by a CSS PNC; the second site (ESS #20) contains PN #5 and #6; the third site (ESS #30) contains PN #7; the sites are interconnected over an IP WAN.

There are a number of steps that can be taken to increase a system's reliability, including the duplication of IPSIs within each PN, the duplication of the CSS PNC, and the addition of ESS servers. Since the focus of this virtual demo is the ESS feature, and to keep the failure cases as simple as possible, standard reliability (simplex IPSIs and a simplex PNC) is used in this example. Given the following list of objectives, the strategic placement of ESS servers can be determined and their prioritization can be derived.

1. All sites must be able to survive in isolation from the rest of the system.
2. All sites must be able to survive a catastrophic main server failure.
3. There is more than adequate WAN bandwidth available for port network control links and inter-port network bearer communication.
4. Upon any failure, the number of autonomous systems created should be minimized.

Objective #1 requires an ESS server to exist at every remote location since each site needs to be operational if it gets fragmented away from the rest of the system. Objective #2 subtly requires that an ESS server also be placed at the main location because that site, too, must be able to survive a catastrophic main server failure; it cannot do so if it has no local control source alternative and becomes isolated from the other sites. Objective #3 and objective #4 dictate how the priority scores and attributes for the three ESS clusters should be administered.
Since bandwidth is not an issue and the failover objective is to minimize fragmentation, none of the ESS clusters should utilize the local preference or the local only attributes which implies they also do not need to override local preference boosts with the system preferred option (see the What is a Priority Score section for more details). In addition, the objectives do not specify a desired failover order, so there are no restrictions for the priority score rankings of the ESS clusters. The administration of the ESS clusters for this demonstration is shown in the SAT screen shot below by executing the “display system-parameters ess” command. 67 display system-parameters ess ENTERPRISE SURVIVABLE SERVER INFORMATION Page 1 of 7 Cl Plat Server A Server B Pri Com Sys Loc Loc ID Type ID Node Name ID Node Name Scr Prf Prf Only -----------------------------------------------------------------------------MAIN SERVERS 1 Duplex 1 172.16 .192.1 2 172.16 .192.2 ENTERPRISE SURVIVABLE SERVERS 10 20 30 Simplex 10 Duplex 20 Simplex 30 ESS10-10 ESS20-20 ESS30-30 21 ESS20-21 75 100 50 1 1 1 1 1 1 1 1 n n n n n n n n n n n n n n n n n n n n n n n n Screen Shot 1 – ESS Cluster Administration The first observation from the ESS cluster administration is that ESS #10 and ESS #30 have platform form types (Plat Type) of simplex implying they are from the S8500 family of media servers and therefore have no Server B information. ESS #20 has a platform type of duplex meaning it is from the S87XX family of media servers. The second observation is that the ranked order of the ESS clusters is ESS #20, then ESS #10, and finally ESS #30 since they have priority scores of 100, 75 and 50 respectively. It is important to remember that the cluster IDs assigned come directly from the license file on each server and the main cluster always has an ID of #1. Once the ESS clusters are added to the system and installed they need to register with the main server to avoid alarming and to get updated translations. The status of the registration links and the ESS clusters can be seen in the SAT screen shot below using the command “status ess clusters” command. This command shows a quick view of the status of every ESS cluster in the system. It includes basic information such as if the ESS cluster is enabled (Enabled?), if the ESS cluster is registered (Registered?), and which server is active within the each cluster (Active Server ID). In addition, it also shows the administrator if the ESS cluster’s translation set is up-to-date (if the time stamp under Translations Updated for the ESS cluster matches the main cluster’s time stamp) and what software load each cluster is currently running (Software Version). 68 status ess clusters Cluster ID 1 ESS CLUSTER INFORMATION Cluster ID Enabled? 1 10 20 30 y y y y Active Server ID Registered? 1 10 20 30 y y y y Translations Updated 19:30 19:30 19:30 19:30 Software Version 11/14/2005 11/14/2005 11/14/2005 11/14/2005 R013x.01.0.626.0 R013x.01.0.626.0 R013x.01.0.626.0 R013x.01.0.626.0 Command successfully completed Command: Screen Shot 2 – ESS Cluster Registration Status As described in the How do IPSIs Manage Priority Lists section, IPSIs maintain independent priority failover lists of ESS clusters. For this reason, the command “status ess port-networks” queries each IPSI in real-time for its priority list and then displays them for the administrator. 
This command is critical in verifying the actual failover behavior of the IPSIs match the desired failover behavior as administered and in determining which clusters are currently in control of which IPSIs. status ess port-networks Cluster ID Com PN Num 1 ESS PORT NETWORK INFORMATION Port IPSI Intf Inft Ntwk Gtway Loc Type Ste Loc Pri/ Pri/ Sec Sec Loc State Cntl Connected Clus Clus(ter) ID IDs 1 1 1A01 IPSI up 1A01 1A01 actv-aa 1 1 20 10 30 2 1 2A01 IPSI up 2A01 2A01 actv-aa 1 1 20 10 30 3 1 3A01 IPSI up 3A01 3A01 actv-aa 1 1 20 10 30 4 1 4A01 EI up 3A01 5 1 5A01 IPSI up 5A01 5A01 actv-aa 1 1 20 10 30 6 1 6A01 IPSI up 6A01 6A01 actv-aa 1 1 20 10 30 7 1 7A01 IPSI up 7A01 7A01 actv-aa 1 1 20 10 30 Command successfully completed Command: Screen Shot 3 – Non-Faulted Environment IPSI Priority Lists 69 The first column (PN) on the left shows the port network number for which all the data in that row corresponds to. The next column (Com Num) shows what community the port network and its resident IPSI are assigned to. The fifth column, Port Ntwk Ste, presents the port network state, either up (in-service) or down (out-of-service), with respect to the sever executing the command. In this case, the command was run on the main server and the main cluster is currently controlling all of these port networks in a nonfaulted environment. The third and fourth columns, Intf Loc and Intf Type, along with the sixth column, IPSI Gtway Loc, describe how the port network is currently being given service by the controlling server. The interface location is the board where the EAL link is terminating to and the interface type is the type of board (EI or IPSI) where the arch-angel is residing. The IPSI gateway location field displays to the administrator which IPSI is supporting the control link for this port network. As described in the How are Port Networks Controlled section, there are two methods to control a port network -- either directly through a resident IPSI (left side of Figure 56) or indirectly though an IPSI residing in another PN (right side of Figure 56). If a port network’s IPSI gateway location is the same as the port network’s interface location, then it can be concluded that the port network is being controlled directly from a server through its own IPSI. Otherwise, if they do not match, the port network is getting controlled indirectly through another port network’s IPSI. To determine which port network is supporting another port network in these indirect control situations, figure out what port network the IPSI gateway board resides in via the “list cabinet” command. SPE SPE IPSI Gateway IPSI Gateway LAN IPSI LAN IPSI EAL #1 EPN #1 EPN #1 Interface Location (Type ! IPSI) EPN #2 EI EI PNC Direct Port Network Control Interface Location (Type ! EI) EAL #2 Indirect Port Network Control Figure 56 – Direct PN Control versus Indirect PN Control The next two columns, Pri/Sec Loc and Pri/Sec State, describe the port network’s IPSI(s), if they exist. This screen shot shows that PN #1 has its primary IPSI in board location 1A01 and it is currently active supporting the arch-angel (actv-aa). If the port network had duplicated IPSIs, a second row would appear immediately below describing the secondary IPSI. Since this is a stable non-faulted environment, the primary IPSI location should also be the IPSI gateway location (since this IPSI is supporting the PN) and the interface location (since this IPSI is supporting the arch-angel). There are failure scenarios for which these three values may not be equal. 
For example, if the port network experienced an IPSI outage, control could be tunneled through the CSS PNC. In this case, the interface location would be the EI board (where the arch-angel resides) and the IPSI gateway location would be another port network’s IPSI location (where the hybrid EAL terminates). The primary location of the IPSI would remain unchanged since that is an administered location and the primary IPSI state would be active (because it is the active IPSI regardless if it is being used or not). The rest of the row contains the results of a real-time query to the IPSIs concerning whom their master cluster is and what their priority lists currently are. The Cntl Clus ID column is the cluster ID returned by the IPSI referring to its currently controlling cluster. In this example, cluster ID #1 is returned which is the 70 main cluster’s default ID. The next columns, Connected Clus(ter) IDs, show the IPSI’s priority list. In this example based on the administration previously done, each IPSI has a preference list headed by the main cluster (ID #1) followed by ESS clusters #20, #10, and #30. It is important to note that the entry for port network #4, the non-IPSI connected PN, has significant differences from the others. First of all, the interface type is always an EI board since there are only two types of boards which can support arch-angels, EIs and IPSIs, and no IPSIs reside in the port network. The next difference is that the interface location and the IPSI gateway location are never the same since the port network is always receiving control indirectly through another port network. And finally, there are no entries for primary / secondary IPSI locations since no IPSIs exist in the port network. The last item to analyze before introducing failure into the system is the port network view from the ESS clusters themselves. The execution of the “status ess port-networks” command from ESS #20 is shown below. status ess port-networks Cluster ID Com PN Num 20 ESS PORT NETWORK INFORMATION Port IPSI Intf Inft Ntwk Gtway Loc Type Ste Loc Pri/ Pri/ Sec Sec Loc State Cntl Connected Clus Clus(ter) ID IDs 1 1 down 1A01 active 1 1 20 10 30 2 1 down 2A01 active 1 1 20 10 30 3 1 down 3A01 active 1 1 20 10 30 5 1 down 5A01 active 1 1 20 10 30 6 1 down 6A01 active 1 1 20 10 30 7 1 down 7A01 active 1 1 20 10 30 Command successfully completed Command: Screen Shot 4 – “status ess port-networks” from ESS #20 in Non-faulted Environment The most glaring difference from the execution of the command from ESS #20 as compared to when executed from the main server is the port network state and fields associated with it. Since the ESS server does not control the IPSI, and therefore not control the port network, from its perspective the port network does is not in service, or down. Also, the ESS server does not attempt to establish EAL links to the port network which implies there is no information to report in the interface location, interface type, and IPSI gateway fields. Another difference is that port network #4 has been omitted from the list of port networks. Abiding by the rule that ESS servers cannot tunnel control links through the CSS PNC and given that PN #4 is a non-IPSI connected port network, it is impossible for the ESS server, under any conditions, to take control of port network #4 or provide any relevant status of it. This same viewpoint is shared by all of the ESS clusters in this non-faulted situation. 
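The priority lists shown in Screen Shots #3 and #4 follow directly from the administration in Screen Shot #1: with the local preference, local only, and system preferred options all disabled, the IPSIs simply rank the ESS clusters behind the main cluster in order of their administered priority scores. The following short sketch is purely illustrative – the function and data structures are invented for this paper, and the real algorithm also weighs the community and preference attributes described in the What is a Priority Score section.

# Illustrative ranking sketch: with the local-preference style attributes disabled,
# an IPSI's list is the main cluster followed by the ESS clusters ordered by
# administered priority score (highest first). Data below mirrors Screen Shot 1.

ess_clusters = [
    {"cluster_id": 10, "pri_score": 75},
    {"cluster_id": 20, "pri_score": 100},
    {"cluster_id": 30, "pri_score": 50},
]

def ipsi_priority_list(main_id, clusters):
    ranked = sorted(clusters, key=lambda c: c["pri_score"], reverse=True)
    return [main_id] + [c["cluster_id"] for c in ranked]

print(ipsi_priority_list(1, ess_clusters))   # -> [1, 20, 10, 30], matching Screen Shot 3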
9.2 Demo – Catastrophic Main Server Failure

Now that the setup of the demo has been reviewed, it is time to introduce the first major fault into the system – a catastrophic main server failure. Suppose a cataclysmic event occurs that renders the entire main cluster inoperable. Within a few seconds of the failure, the IPSIs detect that the connection to the main cluster is down, which causes them to adjust their priority lists by removing the main cluster's ID and to start their no service timers. Screen Shot #5 shows the results of "status ess port-networks" executed from ESS #20 again after the failure, but before the IPSIs' no service timers have expired.

status ess port-networks
Cluster ID: 20                ESS PORT NETWORK INFORMATION

 PN  Com  Intf  Intf  Port  IPSI   Pri/Sec  Pri/Sec  Cntl     Connected
     Num  Loc   Type  Ntwk  Gtway  Loc      State    Clus ID  Clus(ter) IDs
                      Ste   Loc
  1   1               down         1A01     active   *        20 10 30
  2   1               down         2A01     active   *        20 10 30
  3   1               down         3A01     active   *        20 10 30
  5   1               down         5A01     active   *        20 10 30
  6   1               down         6A01     active   *        20 10 30
  7   1               down         7A01     active   *        20 10 30

Command successfully completed
Command:

Screen Shot 5 – Viewpoint from ESS #20 Immediately after Main Cluster Failure

There are two very important pieces of information that the results of this command give the administrator. For every IPSI in the system, the controlling cluster ID has transitioned from #1 (the main cluster's ID) to '*'. The '*' tells the administrator that the IPSIs are reporting they do not currently have a controlling cluster. Also, each IPSI's connected cluster IDs list has been adjusted by removing the main cluster's ID and shifting all the ESS clusters up one position. Since the main cluster no longer appears on any list, the administrator can determine that a catastrophic event has taken the main servers out-of-service. All of the other ESS clusters in the system share the same viewpoint as ESS #20.

After the no service timer for each IPSI expires, the IPSI sends a request for service to the highest ranking cluster in its priority list – ESS #20. ESS #20 receives each request, takes over the port network by establishing a control link, and brings the IPSI's PN back into service via a cold restart. PN #4 does not have a resident IPSI, and since the ESS servers do not support tunneled control links, port network #4 remains non-functional. From this point forward in the demo, it is important to understand the viewpoints of the system from both ESS #20 and ESS #30.
72 status ess port-networks Cluster ID Com PN Num 20 ESS PORT NETWORK INFORMATION Port IPSI Intf Inft Ntwk Gtway Loc Type Ste Loc Pri/ Pri/ Sec Sec Loc State Cntl Connected Clus Clus(ter) ID IDs 1 1 1A01 IPSI up 1A01 1A01 actv-aa 20 20 10 30 2 1 2A01 IPSI up 2A01 2A01 actv-aa 20 20 10 30 3 1 3A01 IPSI up 3A01 3A01 actv-aa 20 20 10 30 5 1 5A01 IPSI up 5A01 5A01 actv-aa 20 20 10 30 6 1 6A01 IPSI up 6A01 6A01 actv-aa 20 20 10 30 7 1 7A01 IPSI up 7A01 7A01 actv-aa 20 20 10 30 Command successfully completed Command: Screen Shot 6 – Viewpoint from ESS #20 after Failover due to Main Cluster Failure status ess port-networks Cluster ID Com PN Num 30 ESS PORT NETWORK INFORMATION Port IPSI Intf Inft Ntwk Gtway Loc Type Ste Loc Pri/ Pri/ Sec Sec Loc State Cntl Connected Clus Clus(ter) ID IDs 1 1 down active 20 20 10 30 2 1 down active 20 20 10 30 3 1 down active 20 20 10 30 5 1 down active 20 20 10 30 6 1 down active 20 20 10 30 7 1 down active 20 20 10 30 Command successfully completed Command: Screen Shot 7 - Viewpoint from ESS #30 after Failover due to Main Cluster Failure Screen shots #6 and #7 show the current control state of the system. Since the “status ess port-networks” command queries the IPSIs in real time from their priority lists, both ESS cluster should report the same information. They both show that ESS #20 is the controlling cluster for every IPSI and they both show the same priority lists for each IPSI. The viewpoint of ESS #20, since it is the controlling cluster of every port network, is that all port networks are up and their EAL links are described appropriately. ESS #30’s viewpoint is different since it is does not have any control links established and reports that from its 73 perspective, all the port networks are out-of-service. Figure 57 shows the control links for every IPSI connected port network after the system recovers from the main cluster outage. Main ESS #10 ESS #20 LAN ESS #30 LAN IPSI IPSI IPSI EPN #1 EPN #2 EPN #3 EPN #4 EI EI EI EI LAN IPSI IPSI IPSI EPN #5 EPN #6 EPN #7 EPN #4 is out-of-service CSS PNC Figure 57 – Communication System In-service after Main Cluster Failure 9.3 Demo – Extended Network Fragmentation The next phase of the demo is the introduction of a severe network fragmentation fault on the WAN link between sites #2 and #3 on top of the already failed main cluster. Once the network fragmentation occurs, as shown in Figure 58, the IPSI in port network #7 detects the connectivity loss to its controlling cluster and starts its no service timer. The IPSIs on the other side of the fragmentation detect the connectivity loss to ESS #30 and make the appropriate changes to their priority lists, but service to them is uninterrupted. Unfortunately, unlike the previous situations, the network fragmentation prevents all ESS servers from being able to contact every IPSI and therefore they are unable to provide the administrator with a complete system view. Executing “status ess port-networks” from ESS #20 provides a system view of the left side of the fragmentation. As Screen Shot #8 shows, port networks #1, #2, #3, #5, and #6 are in service and being controlled by ESS #20. In addition, the all of these port network’s IPSIs have removed ESS #30 as an alternate source of control because of the lack of network connectivity. And finally, ESS #20 does not report any status about PN #7 because it cannot communicate with it. 
status ess port-networks
Cluster ID: 20                ESS PORT NETWORK INFORMATION

 PN  Com  Intf  Intf  Port  IPSI   Pri/Sec  Pri/Sec  Cntl     Connected
     Num  Loc   Type  Ntwk  Gtway  Loc      State    Clus ID  Clus(ter) IDs
                      Ste   Loc
  1   1   1A01  IPSI  up    1A01   1A01     actv-aa  20       20 10
  2   1   2A01  IPSI  up    2A01   2A01     actv-aa  20       20 10
  3   1   3A01  IPSI  up    3A01   3A01     actv-aa  20       20 10
  5   1   5A01  IPSI  up    5A01   5A01     actv-aa  20       20 10
  6   1   6A01  IPSI  up    6A01   6A01     actv-aa  20       20 10
  7   1               down                  active

Command successfully completed
Command:

Screen Shot 8 – Viewpoint from ESS #20 Immediately after Network Fragmentation

On the other side of the fragmentation, executing the status command from ESS #30 gives a much different view of system control. Immediately after the network failure occurred, ESS #30 lost communication with every IPSI except the one residing in port network #7; hence, ESS #30 does not report any control information or current priority lists for those IPSIs. Additionally, port network #7's IPSI detects the connectivity failure to its controlling cluster (ESS #20), begins recovery actions by removing the unreachable clusters from its priority list, and starts its no service timer. The screen shot below is a snapshot of ESS #30's viewpoint during the no service timer interval.

status ess port-networks
Cluster ID: 30                ESS PORT NETWORK INFORMATION

 PN  Com  Intf  Intf  Port  IPSI   Pri/Sec  Pri/Sec  Cntl     Connected
     Num  Loc   Type  Ntwk  Gtway  Loc      State    Clus ID  Clus(ter) IDs
                      Ste   Loc
  1   1               down                  active
  2   1               down                  active
  3   1               down                  active
  5   1               down                  active
  6   1               down                  active
  7   1               down                  active   *        30

Command successfully completed
Command:

Screen Shot 9 – Viewpoint from ESS #30 Immediately after Network Fragmentation

Once the no service timer expires, PN #7's IPSI requests service from its best option, which happens to be the only ESS cluster it can communicate with (ESS #30). ESS #30, after receiving the service request, takes over PN #7 by establishing a control link and then brings it into service by cold restarting it. Figure 58 below shows the new control link from ESS #30 to PN #7 along with the unaffected control links from ESS #20 to the other port networks.

Figure 58 – Communication System In-service after Network Fragmentation. ESS #20 controls PN #1, #2, #3, #5, and #6; ESS #30 controls PN #7; EPN #4 remains out-of-service.

Executing the status command from ESS #20 does not reveal any more information than it did before ESS #30 took over PN #7. Since the network is fragmented, no pathway exists for ESS #20 to get status from PN #7's IPSI and report it to the administrator. On the other hand, "status ess port-networks" executed from ESS #30 shows that it is controlling port network #7 and that port network #7 is in-service.

status ess port-networks
Cluster ID: 30                ESS PORT NETWORK INFORMATION

 PN  Com  Intf  Intf  Port  IPSI   Pri/Sec  Pri/Sec  Cntl     Connected
     Num  Loc   Type  Ntwk  Gtway  Loc      State    Clus ID  Clus(ter) IDs
                      Ste   Loc
  1   1               down                  active
  2   1               down                  active
  3   1               down                  active
  5   1               down                  active
  6   1               down                  active
  7   1   7A01  IPSI  up    7A01   7A01     actv-aa  30       30

Command successfully completed
Command:

Screen Shot 10 – Viewpoint from ESS #30 after Failover due to Network Fragmentation

9.4 Demo – Network Fragmentation Repaired

At this point, the strategic placement of ESS servers within the system has allowed the system to continue to provide service to all its end-users through both a catastrophic main server failure and an extended network fragmentation.
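The cluster-specific views in Screen Shots #8 through #10 follow directly from IP reachability. The following sketch is purely illustrative (the function, data, and cluster names are invented for this paper, not Avaya code); it models how a cluster's reported port network view can be derived from which IPSIs that cluster can reach and which cluster each reachable IPSI reports as its controller.

# Illustrative sketch: derive a "status ess port-networks" style view per cluster
# from simple reachability data. All names and data below are hypothetical.

def cluster_view(cluster, reachable_ipsis, controller_of):
    """Return {pn: (state, reported_controller)} as seen from 'cluster'.

    reachable_ipsis: set of PNs whose IPSIs this cluster can reach over IP.
    controller_of:   {pn: controlling cluster} as reported by each IPSI.
    """
    view = {}
    for pn, controller in controller_of.items():
        if pn not in reachable_ipsis:
            view[pn] = ("down", None)          # unreachable: no control info reported
        elif controller == cluster:
            view[pn] = ("up", cluster)         # this cluster controls the PN
        else:
            view[pn] = ("down", controller)    # reachable, but controlled elsewhere
    return view

# After the WAN fragmentation between sites #2 and #3 (Figure 58):
controller_of = {1: "ESS20", 2: "ESS20", 3: "ESS20", 5: "ESS20", 6: "ESS20", 7: "ESS30"}
print(cluster_view("ESS20", {1, 2, 3, 5, 6}, controller_of))   # PN #7 reported as down
print(cluster_view("ESS30", {7}, controller_of))               # only PN #7 reported as up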
The rest of the demonstration concentrates on restoring the fragmented system back into one autonomous system controlled by the main servers. While the network is fragmented, all the servers are attempting to bring up TCP sockets to all of the IPSIs which they are not communicating with, but the attempts are always unsuccessful since network connectivity is faulted. However, once the network is repaired, all ESS servers will be able to communicate with all IPSIs and all of the socket establishment attempts will be successful. This has two major effects on the system, but does not cause any adverse effects to any end-users. Every ESS cluster is going to be able to provide a complete system view and all of the IPSIs are going to be able to include all of the ESS clusters in their priority failover lists. As Figure 59 shows, ESS #30 continues to provide service to PN #7 even though ESS #10 and ESS #20 have informed the IPSI that they are available. The results of the status commands from both ESS #20 and ESS #30 in Screen Shots #11 and #12 reflect this situation. 77 Main ESS #10 ESS #20 ESS #30 LAN LAN IPSI IPSI IPSI EPN #1 EPN #2 EPN #3 EPN #4 EI EI EI EI LAN IPSI IPSI IPSI EPN #5 EPN #6 EPN #7 EPN #4 is out-of-service CSS PNC Figure 59 –Network Fragmentation Fixed status ess port-networks Cluster ID Com PN Num 20 ESS PORT NETWORK INFORMATION Port IPSI Intf Inft Ntwk Gtway Loc Type Ste Loc Pri/ Pri/ Sec Sec Loc State Cntl Connected Clus Clus(ter) ID IDs 1 1 1A01 IPSI up 1A01 1A01 actv-aa 20 20 10 30 2 1 2A01 IPSI up 2A01 2A01 actv-aa 20 20 10 30 3 1 3A01 IPSI up 3A01 3A01 actv-aa 20 20 10 30 5 1 5A01 IPSI up 5A01 5A01 actv-aa 20 20 10 30 6 1 6A01 IPSI up 6A01 6A01 actv-aa 20 20 10 30 7 1 30 20 10 30 down active Command successfully completed Command: Screen Shot 11 - Viewpoint from ESS #20 after Network is Repaired 78 status ess port-networks Cluster ID Com PN Num 30 ESS PORT NETWORK INFORMATION Port IPSI Intf Inft Ntwk Gtway Loc Type Ste Loc Pri/ Pri/ Sec Sec Loc State Cntl Connected Clus Clus(ter) ID IDs 1 1 down active 20 20 10 30 2 1 down active 20 20 10 30 3 1 down active 20 20 10 30 5 1 down active 20 20 10 30 6 1 down active 20 20 10 30 7 1 30 20 10 30 7A01 IPSI up 7A01 7A01 actv-aa Command successfully completed Command: Screen Shot 12 – Viewpoint from ESS #30 after Network is Repaired 9.5 Demo – Main Cluster is Restored At this point a number of things can be done. The administrator can leave the system running as is only suffering from the effects of operating in a fragmented mode (see the How are Call Flows Altered when the System is Fragmented section). Another approach the administrator could take is to shift port network #7 under the control of ESS #20 and then operate as one homogenous switch with every IPSI connected port network receiving its control from one source, but the non-IPSI connected port network (PN #7) would still be out of service. This approach can and should be used if the main servers are not going to be operational for an extremely long period of time. This demo takes a another approach to returning the system to normal operation – fix the main cluster and after it becomes stable again, transition all of the port networks in a controlled manner back under its control. The transition of the port networks back to the main servers can be achieved by forcing one port network at a time or all of them at once. 
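The key behavior behind this controlled return is that a recovered cluster is only re-inserted into each IPSI's priority list; an IPSI never fails back on its own and keeps its current master until the administrator forces a takeover. The following minimal sketch is purely illustrative of that rule (the class and method names are invented and are not Avaya code).

# Illustrative sketch of the "no automatic fail-back" rule: a recovered cluster
# rejoins the IPSI's priority list, but the master only changes on an explicit,
# administrator-driven forced takeover. Names are hypothetical.

class IpsiMasterSelection:
    def __init__(self, master, priority_list):
        self.master = master                        # e.g. "ESS #20" after a failover
        self.priority_list = list(priority_list)

    def on_cluster_recovered(self, cluster, rank=0):
        # Re-insert the recovered cluster (the main goes back to the head of the
        # list), but deliberately leave the current master unchanged.
        if cluster not in self.priority_list:
            self.priority_list.insert(rank, cluster)

    def forced_takeover(self, new_master):
        # Models the call-disruptive, administrator-initiated return of control.
        if new_master in self.priority_list:
            self.master = new_master
        return self.master

ipsi = IpsiMasterSelection("ESS #20", ["ESS #20", "ESS #10", "ESS #30"])
ipsi.on_cluster_recovered("main")       # list becomes ["main", "ESS #20", ...]
print(ipsi.master)                      # still "ESS #20" - no automatic switch back
print(ipsi.forced_takeover("main"))     # control returns to the main cluster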
Once the main cluster is operational, it contacts all of the IPSIs, informs them that it is available to provide service, and gathers the IPSIs' current state information. Upon the main server's reconnection, the IPSIs insert the main cluster back into their priority lists, but continue to receive service from their current master cluster. The following screen shot shows that the main server knows the control status of the IPSI in each port network and is an alternative for each one, but is not in control of any of them.

status ess port-networks
Cluster ID: 1                 ESS PORT NETWORK INFORMATION

 PN  Com  Intf  Intf  Port  IPSI   Pri/Sec  Pri/Sec  Cntl     Connected
     Num  Loc   Type  Ntwk  Gtway  Loc      State    Clus ID  Clus(ter) IDs
                      Ste   Loc
  1   1               down                  active   20       1 20 10 30
  2   1               down                  active   20       1 20 10 30
  3   1               down                  active   20       1 20 10 30
  4   1               down
  5   1               down                  active   20       1 20 10 30
  6   1               down                  active   20       1 20 10 30
  7   1               down                  active   30       1 20 10 30

Command successfully completed
Command:

Screen Shot 13 – Viewpoint from Main Cluster after the Main Cluster has been Restored

The next screen shot shows ESS #20's viewpoint of the system after the main cluster becomes operational once again. ESS #20 is still in control of port networks #1, #2, #3, #5, and #6 even though it is no longer the best alternative according to the dynamically adjusted priority lists. The viewpoint from ESS #30 (not shown) is in complete agreement with the main cluster's and ESS #20's viewpoints; ESS #30 is still in control of port network #7.

status ess port-networks
Cluster ID: 20                ESS PORT NETWORK INFORMATION

 PN  Com  Intf  Intf  Port  IPSI   Pri/Sec  Pri/Sec  Cntl     Connected
     Num  Loc   Type  Ntwk  Gtway  Loc      State    Clus ID  Clus(ter) IDs
                      Ste   Loc
  1   1   1A01  IPSI  up    1A01   1A01     actv-aa  20       1 20 10 30
  2   1   2A01  IPSI  up    2A01   2A01     actv-aa  20       1 20 10 30
  3   1   3A01  IPSI  up    3A01   3A01     actv-aa  20       1 20 10 30
  5   1   5A01  IPSI  up    5A01   5A01     actv-aa  20       1 20 10 30
  6   1   6A01  IPSI  up    6A01   6A01     actv-aa  20       1 20 10 30
  7   1               down                  active   30       1 20 10 30

Command successfully completed
Command:

Screen Shot 14 – Viewpoint from ESS #20 after the Main Cluster has been Restored

To force all of the IPSIs to leave their current controlling cluster and return to the main server, a call disruptive command, "get forced-takeover ipserver-interface all", has been introduced. This command causes the SPE on which it is executed to send special messages to the IPSIs, informing them to shut down their current control links and to allow this new SPE to be their master. After this command is executed, the viewpoints of all the servers in the system return to their original non-faulted viewpoints, as shown earlier in Screen Shots #3 and #4, with control links as shown below.

Figure 60 – Control Returned to the Restored Main Cluster. All port networks are again controlled by the main cluster, as in the original non-faulted layout (Figure 55).

Section 10 - More Thoughts and Frequently Asked Questions

10.1 How is ESS Feature Enabled ?

The license file is extremely important to the ESS feature. In addition to providing a server with a unique identifier (module ID, otherwise known as cluster ID), which is required for proper registration, the license file also includes the necessary values that tell a media server what type it is – either the main server or an ESS server (which fundamentally changes its behavior). To this end, there are two primary customer options to check before loading a license file onto a server. FEAT_ESS is the customer option that determines whether or not the ESS feature itself is enabled.
Without this option, the main server rejects all ESS cluster registrations and therefore ESS clusters cannot be deployed into the system. The other ESS relevant customer option is FEAT_ESS_SRV. This option informs a given server whether or not it is an ESS server or one of the main servers. This particular value should be set to “no” for the main server and “yes” for every ESS server. These values in the license file can be checked by issuing the following commands from the CLI interface: statuslicense –v –f FEAT_ESS statuslicense –v –f FEAT_ESS_SRV The matrix below summarizes the customer option settings that are required. Customer Option Main Server ESS Server FEAT_ESS_SRV No Yes FEAT_ESS Yes Yes Table 30. License File Settings License files can be loaded onto a media server at any time. For a number of features, the SPE can activate or deactivate a feature based on the newly uploaded license file without performing a reboot. For example, 81 if a license file is loaded which increases the number of IP endpoints allowed in the system, the SPE will allow more IP phones to register immediately. However, the ESS customer option, FEAT_ESS, does not work that way. If the current license file loaded onto a machine has FEAT_ESS set to “no” and the new license file has FEAT_ESS set to “yes”, the main SPE needs to be rebooted in order to turn on the ESS feature after the new license file is loaded. To this end, it is wise to have FEAT_ESS enabled in the initial installation or upgrade even if no ESS servers are currently being deployed. Doing this prevents a system reboot in the future when, and if, the ESS feature is set up in the system. 10.2 What are the Minimum EI Vintages Required for the ESS Feature ? As mentioned previously, a port network is controlled by an enhanced angel or an arch-angel. The archangel is the master of the port network’s TDM bus, scanning the other angels in the carrier for information and feeding that information back to the SPE for processing. If a port network does not have an active arch-angel, then that PN is completely out of service because no information is successfully being exchanged between the SPE and the angels residing on the port network. Various recovery methods have been discussed in the single cluster reliability section which attempt to activate one of the enhanced angels residing on the port networks by bringing up an EAL link. The EAL link can terminate at possibly four different locations in a port network (a-side IPSI, b-side IPSI, a-side EI, or b-side EI). Before the introduction of ESS, there was only one SPE in the system that could instantiate the EAL links and therefore it was ensured that only one EAL would be brought up to a port network at once. If a software fault or race condition occurs that causes two active EAL links to be brought up to the same port network concurrently, then two arch-angels will be activated. This leads to a deadlocked PN and is referred to as dueling arch-angels. Due to the possibility, albeit a very low probability, of software faults and race conditions, hardware level protection was introduced to avoid this scenario of dueling arch-angels. This hardware level protection is justified because nothing can be done remotely from the servers to bring the PN back online. The only way to recover a port network that is suffering from this dueling arch-angel condition is to physically remove one of the boards supporting one of the activated arch-angels. The hardware level protection took the form of an arch-angel token. 
The concept is that one entity, and one entity only, has the ability to become the arch-angel, and only if it possesses the arch-angel token. This arch-angel token capability does not exist in all vintages of the EI boards, as shown in the table below.

EI Board   Type of PNC   Vintage   Arch-angel Token Aware
TN570      CSS           A or B    No
TN570      CSS           C or D    Yes
TN2305     ATM           A         No
TN2305     ATM           B         Yes
TN2306     ATM           A         No
TN2306     ATM           B         Yes

Table 31. Arch-Angel Token Support

While dueling arch-angels in a single cluster environment is very rare (which is why arch-angel token support on all boards is only suggested rather than required in that environment), the conditions that lead to it are almost guaranteed to arise in ESS deployments. The following figure shows port network #2 failed over to an ESS server. As mentioned previously, if an IPSI loses its connection to its controlling cluster, it runs a no-service timer before requesting service from an alternate source. During this time, the previously controlling cluster attempts traditional recovery methods. If, for whatever reason, the fall-back recovery fails, the IPSI requests service from the alternate ESS server. At this point everything is stable, with PN #1 under the control of the main server and PN #2 under the control of the ESS server. However, as far as the main server is concerned, PN #2 is out-of-service, and it therefore continuously attempts to restore service to it by re-connecting to the IPSI or establishing an EAL to the EI board (fall-back). If the connection to the IPSI finally succeeds, the IPSI allows the main server to return to its priority list, but does not instantly switch control back to it (see the How does ESS Work section). On the other hand, if the failure which was preventing the tunneled EAL from being established is rectified, the main server attempts to activate the arch-angel on the EI board. Since there is no real-time communication between the main server and the ESS servers, both servers continually attempt to control the port network, as shown below. This classic case of dueling arch-angels is avoided by having the IPSI and EI boards be token aware: in that case, the main SPE can never succeed in bringing up an EAL to the EI board while the ESS server has an established EAL to the IPSI.

Figure 61. Dueling Arch-angels in ESS Environment. On the left, the fragmented PNC prevents the main SPE's tunneled EAL to PN #2 from being established; on the right, dueling arch-angels result if the EI board in PN #2 is not token aware.

In summary, arch-angel token aware EI boards are required in any port network that contains both EI boards and IPSIs. There are three different configurations for a port network – no EI boards (a PN using an IP PNC), only EI boards (a PN using a CSS or ATM PNC with no controlling IPSIs), and both EI boards and IPSIs (a PN using a CSS or ATM PNC with controlling IPSIs). Only the final configuration requires that the EI boards be token aware.

10.3 What IPSI Vintages Support the ESS Feature ?

IPSI boards were the main new component of the PBX system which allowed the SPE to control port networks via IP as opposed to only through a private CSS or ATM PNC. The original release of the IPSI, referred to as IPSI-1 or IPSI-AP, was able to reside in the tone clock slots supported by either traditional MCC/SCC cabinets or the G600 cabinetry.
From the initial release of ACM up to release of the ESS offering, there were changes made to the IPSI’s firmware and hardware for both new feature support and bug fixes. In order for the IPSI to be able to also reside in the new G650 cabinet, it needed to be able to supply maintenance board functionality and be able to interface with a new carrier’s backplane. The new IPSI is referred to as the IPSI-2 or IPSI-BP. The new functionality needed for the ESS feature was first supported in firmware release 20 (anything previous to this release is called pre-ESS firmware) and could run on either IPSI-AP boards or IPSI-BP boards. In other words, the ESS feature does not dictate, nor require, the need for IPSI hardware versions (IPSI-AP or IPSI-BP), only the carrier type does. The table below shows which IPSI versions are compatible with which software versions. 83 Pre-ESS IPSI Firmware ESS IPSI Firmware Pre-ESS Software Supported Supported Main Server Supported Supported ESS Servers NOT SUPPORTED Supported Table 32 – IPSI Firmware Compatibility Table It should be concluded that new firmware will be backwards compatible with pre-ESS release software and also completely supported by all types of servers. In addition, the new software on main servers is backwards compatible with pre-ESS firmware, but ESS servers cannot interface with it. 10.4 What is the Effect of Non-Controlling IPSIs on ESS ? Throughout this document, IPSIs have only been discussed as an IP control interface into the port networks for the SPE. In addition, the IPSI also was described as a conglomeration of different components – a tone clock, a PKTINT, an arch-angel, and (for IPSI-BP boards only) a cabinet maintenance board. On a typical upgrade the IPSI boards replace the tone clock board and replicate its functionality. In fact, if an IPSI was inserted into a port network supported by a G3r SPE in place of a tone clock, it would report itself as a tone clock and the G3r SPE would be none the wiser. When the G650 cabinet was released it needed to have a maintenance board and a tone clock, but only one slot was available for it. Therefore a new board could have been released to support these features or the IPSI could be used instead. For an IP connected PN, the decision was simple because an IPSI was needed for control anyway. However, if the PN was interconnected via a CSS or ATM PNC an IPSI is not necessarily required for control (it could be tunneled through the EI boards). Therefore the concept of a non-controlling IPSI was created which essentially says, “The port network needs a tone clock and a maintenance board which the IPSI can provide, but do not control the PN through it”. The reason a system might be configured like this is to avoid having to run an IP network from the SPE servers to the PN itself. However, the ramification of this is that the system treats non-controlling IPSIs as purely tone clock boards which prevents any server (main or ESS) from connecting to it in order to control it. As discussed previously, the IPSI creates a failover priority list based on the ESS servers that connect to it. Since a noncontrolling IPSI never receives any connection, it does not have the ability to failover to ESS servers. An IPSI cannot be administered as a non-controlling IPSI with respect to the main server and a controlling IPSI with respect to an ESS server since the translation sets the servers are running are identical. 84 Section 11 - Index Section 1 - Introduction 1.1 What is ESS ? 1.2 How is this paper organized ? 
Section 2 - Background 2.1 What is an Avaya Media Server ? 2.2 What are Port Networks ? 2.3 How are Port Networks Interconnected ? 2.4 How are Port Networks Controlled ? 2.5 What are SPE Restarts ? 2.6 What are EPN restarts ? Section 3 - Reliability – Single Cluster Environments 3.1 What is a Cluster ? 3.2 How do single cluster environments achieve high availability ? 3.3 What is a Server Interchange ? 3.4 What is Server Arbitration ? 3.5 What is an IPSI Interchange ? 3.6 What is EI Control Fallback ? Section 4 - Reliability – Multiple Cluster Environments 4.1 How long does it take to be operational ? 4.2 What are Survivable Remote Processors (SRP) ? 4.3 What are ATM WAN Spare Pairs (ATM-WSP) ? 4.4 What are Manual Backup Servers (MBS) ? Section 5 - ESS Overview 5.1 What are Enterprise Survivable Servers (ESS) ? 5.2 What is ESS designed to protect against ? 5.3 What is ESS NOT designed to protect against ? Section 6 - How does ESS Work ? 6.1 What is ESS Registration ? 6.2 How are ESS Translations Updated ? 6.3 How do Media Servers and IPSIs Communicate ? 6.4 What is a Priority Score ? 6.5 How do IPSIs Manage Priority Lists ? 6.6 How are Communication Faults Detected ? 6.7 Under What Conditions do IPSIs Request Service from ESS Clusters ? 6.8 How Does Overriding IPSI Service Requests Work ? Section 7 - ESS in Control 7.1 What Happens to Non-IP Phone Calls During Failovers ? 7.2 What Happens to IP Phone Calls During Failovers ? 7.3 What Happens to H.248 Gateway Calls During Failovers ? 7.4 How are Call Flows altered when the System is Fragmented ? 7.5 What Happens to Centralized Resources when a System is Fragmented ? 7.6 What Happens to Voicemail Access when a System is Fragmented ? 85 Section 8 - ESS Variants Based on System Configuration 8.1 Basic IP PNC Configured System 8.2 Catastrophic Server Failure in an IP PNC Environment 8.3 Network Fragmentation Failure in an IP PNC Environment 8.4 Basic ATM PNC Configured System 8.5 Catastrophic Server Failure in an ATM PNC Environment 8.6 Network Fragmentation Failure in an ATM PNC Environment 8.7 Basic CSS PNC Configured System 8.8 Catastrophic Server Failure in a CSS PNC Environment 8.9 Network Fragmentation Failure in a CSS PNC Environment 8.10 Mixed PNC Configured System Section 9 - ESS in Action - A Virtual Demo 9.1 Demo – Setup 9.2 Demo – Catastrophic Main Server Failure 9.3 Demo – Extended Network Fragmentation 9.4 Demo –Network Fragmentation Repaired 9.5 Demo – Main Cluster is Restored Section 10 - More Thoughts and Frequently Asked Questions 10.1 How is ESS Feature Enabled ? 10.2 What are the Minimum EI Vintages Required for the ESS Feature ? 10.3 What IPSI Vintages Support the ESS Feature ? 10.4 What is the Effect of Non-Controlling IPSIs on ESS ? 86