International Engineering Consortium
Web ProForums
High-Availability Considerations for Softswitch-Based Networks

4. High-Availability Techniques for the Network Office

There are multiple techniques available to meet high-availability requirements. This section analyses these techniques:

  • Fault-tolerant hardware
  • Fault-tolerant software
  • System redundancy
  • Network redundancy

Fault-Tolerant Hardware

The traditional approach to achieving high availability in stand-alone systems is to use fault-tolerant hardware. With this technique, redundant hardware is built into the hardware platform, and the active hardware is constantly monitored for failures. When a failure is detected, switchover to the redundant hardware must occur seamlessly (i.e., no calls being handled by the component are affected). To achieve this seamless "fail-over" to standby hardware, identical software images must be executing on both the primary and redundant hardware.

The downside of fault-tolerant hardware, compared to standard hardware platforms, is cost and complexity: the cost of hardware dedicated to redundancy (hot standby) and the expensive hardware architecture needed to support fast switchover. Furthermore, this approach addresses only hardware faults and neglects other sources of failure, including software and network faults.

Fault-Tolerant Software

Fault-tolerant software monitors the "health" of individual software elements, transferring the affected functions to a different or new process upon the detection of a problem. The switchover may be made to a different version of software or the same version that was originally experiencing problems.

Detecting software problems is much more complex than detecting hardware problems. And while switchover can still be done in a matter of seconds, it is not necessarily instantaneous as with fault-tolerant hardware.

To meet the specifications of the network office, the backup process usually executes in parallel with the primary process. The primary and backup processes normally exchange context information about the processing being performed. When properly designed, software fail-over typically maintains all connections that were already established. Connections that were still being set up at the time of the fail-over, however, might not complete and would then be dropped.
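The context exchange described above can be illustrated with a minimal sketch. Everything in it is hypothetical (the class names, the call identifiers, and the choice to checkpoint a call only once it is answered); it is one plausible design under those assumptions, not how any particular softswitch works.

```python
class BackupProcess:
    """Hypothetical standby process that mirrors established-call context."""

    def __init__(self):
        self.established = {}  # call_id -> context dict

    def checkpoint(self, call_id, context):
        # Store a copy so later changes on the primary do not leak in.
        self.established[call_id] = dict(context)

    def release(self, call_id):
        self.established.pop(call_id, None)


class PrimaryProcess:
    """Hypothetical active process that checkpoints calls to the backup."""

    def __init__(self, backup):
        self.backup = backup
        self.in_setup = {}     # calls still being set up (not checkpointed)
        self.established = {}  # answered calls (checkpointed)

    def start_setup(self, call_id, caller, callee):
        # Assumption: setup state is too short-lived to be worth replicating.
        self.in_setup[call_id] = {"caller": caller, "callee": callee}

    def answer(self, call_id):
        context = self.in_setup.pop(call_id)
        self.established[call_id] = context
        self.backup.checkpoint(call_id, context)

    def hang_up(self, call_id):
        self.established.pop(call_id, None)
        self.backup.release(call_id)


backup = BackupProcess()
primary = PrimaryProcess(backup)
primary.start_setup("call-1", "alice", "bob")
primary.answer("call-1")                        # established and checkpointed
primary.start_setup("call-2", "carol", "dave")  # still being set up

# Simulated fail-over: the backup continues with only what it was told.
surviving_calls = sorted(backup.established)
lost_calls = sorted(set(primary.in_setup) - set(backup.established))
print("survive:", surviving_calls, "lost:", lost_calls)
```

In this sketch the established call survives the fail-over because its context reached the backup, while the call that was still in setup does not.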

Fault-tolerant software provides an additional benefit. The same mechanisms can run different versions of the software as the primary and backup processes, which makes it possible to upgrade the system's software gracefully without interrupting normal operation.

System Redundancy

System redundancy blends features of both hardware and software redundancy. It can also recover from some types of network failures, such as an element becoming isolated from the rest of the network.

It is typically implemented as a warm or hot standby platform, executing in parallel with the primary platform. The following is an example:

  • A softswitch might have both a primary and redundant signaling gateway element.
  • The "health" of the primary system's hardware and software is monitored.
  • If a problem is detected, fail-over to the backup system is performed in a matter of seconds.

In order to monitor the primary system, messages, more commonly known as "heartbeats," are exchanged either between the two systems or with a third system (typically a management system monitoring the overall softswitch). Similar to software redundancy, this also requires context-information exchange between the primary and backup processes. While some standardized context-sharing mechanisms are available from vendors (e.g., Sun), these mechanisms are not well suited for the real-time nature of softswitches. Therefore, context sharing is normally implemented via specialized mechanisms by the softswitch vendor.
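The heartbeat mechanism can be sketched in a few lines. This is an illustrative model only: the class name, the 3-second timeout, and the explicit `now` timestamps (used instead of a real clock so the example is deterministic) are all assumptions, not values taken from any product.

```python
class HeartbeatMonitor:
    """Tracks when the primary was last heard from (hypothetical helper)."""

    def __init__(self, timeout_s, now):
        self.timeout_s = timeout_s  # silence tolerated before fail-over
        self.last_seen = now

    def record_heartbeat(self, now):
        self.last_seen = now

    def primary_is_alive(self, now):
        return (now - self.last_seen) <= self.timeout_s


def active_system(monitor, now):
    """Return which system should be handling traffic at time `now`."""
    return "primary" if monitor.primary_is_alive(now) else "backup"


monitor = HeartbeatMonitor(timeout_s=3.0, now=0.0)
monitor.record_heartbeat(now=1.0)
monitor.record_heartbeat(now=2.0)  # last heartbeat received

state_at_4 = active_system(monitor, now=4.0)  # 2 s of silence
state_at_6 = active_system(monitor, now=6.0)  # 4 s of silence
print(state_at_4, state_at_6)
```

The timeout is the main design trade-off: a short value gives faster fail-over but risks switching on a merely delayed heartbeat, while a long value delays recovery.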

With system redundancy, it is not always necessary to have one-to-one redundancy of hardware platforms. The functions of the primary system are backed up on other systems, but this can be done in a distributed configuration. This strategy permits more efficient utilization of hardware platforms. The following is an example:

  • A softswitch may have five distinct call processing platforms.
  • If one of those platforms fails, its processing load can be distributed among the four remaining systems.
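The efficiency gain can be made concrete with a short calculation. The sketch below assumes platforms of equal capacity and a perfectly even redistribution of load, which real deployments may not achieve; the function names are illustrative.

```python
from fractions import Fraction


def load_multiplier(platforms, failed=1):
    """Factor by which each surviving platform's load grows after a failure."""
    remaining = platforms - failed
    if remaining <= 0:
        raise ValueError("no platforms left to absorb the load")
    return Fraction(platforms, remaining)


def max_normal_utilization(platforms, failed=1):
    """Highest steady-state utilization that still leaves room for fail-over."""
    return 1 / load_multiplier(platforms, failed)


distributed = max_normal_utilization(5)  # five load-sharing platforms
one_to_one = max_normal_utilization(2)   # a primary with a dedicated backup
print(f"5 platforms: run each at up to {float(distributed):.0%}")
print(f"1:1 pair:    run each at up to {float(one_to_one):.0%}")
```

Under these assumptions, each of the five platforms picks up a quarter of the failed platform's load, so it must keep that much headroom in normal operation. A one-to-one pair, by contrast, effectively holds half of its total capacity in reserve.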

Since redundancy is provided on a separate platform, different versions of software can be executed on each platform. When a new version of software is deployed on a live network, it can be deployed on just one platform first (either the primary or backup). If any problems are encountered, the previous version of software can be back in operation in seconds.

Signaling gateway redundancy requires special consideration because of the physical connectivity of the SS7 signaling links. Since these links are typically deployed in load-sharing pairs, it is advisable to distribute them between the primary and backup systems. In the recommended configuration, both pairs are in use whenever both systems are operating properly, with one software process controlling both pairs.

Network Redundancy

As discussed earlier, reliable communication between the softswitch components and other devices such as the media gateways is a critical requirement of the network office. Network redundancy is just as important, and it consists of two main pieces:

  1. Multiple network interfaces
  2. The ability to switch communication between the interfaces in real time

The approach to multiple network interfaces is fairly straightforward:

  • Each softswitch component is typically configured with a dual Ethernet network interface.
  • The component then monitors both network interfaces and their connectivity to remote systems.
  • If a network interface fails, all communication is switched to the other network interface.
  • If connectivity to a single system is lost, then communication to that specific system is switched to a different network interface.
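The selection rules in the list above can be sketched as follows. The interface names (`eth0`, `eth1`), the peer names, and the class itself are hypothetical, and the sketch only models the decision logic; detecting the failures is left out.

```python
class DualHomedComponent:
    """Chooses an outgoing interface per remote system (illustrative only)."""

    def __init__(self, interfaces, peers):
        self.interfaces = list(interfaces)  # listed in order of preference
        self.link_up = {i: True for i in self.interfaces}
        self.reachable = {(i, p): True for i in self.interfaces for p in peers}

    def interface_failed(self, interface):
        # Whole interface is down: affects traffic to every remote system.
        self.link_up[interface] = False

    def peer_unreachable(self, interface, peer):
        # Connectivity lost to one remote system over one interface only.
        self.reachable[(interface, peer)] = False

    def select_interface(self, peer):
        for interface in self.interfaces:
            if self.link_up[interface] and self.reachable[(interface, peer)]:
                return interface
        return None  # no usable path to this peer


component = DualHomedComponent(["eth0", "eth1"], ["mgw-a", "mgw-b"])

# Losing one peer over eth0 moves only that peer's traffic.
component.peer_unreachable("eth0", "mgw-b")
per_peer = {p: component.select_interface(p) for p in ("mgw-a", "mgw-b")}

# Losing eth0 entirely moves all traffic.
component.interface_failed("eth0")
after_failure = {p: component.select_interface(p) for p in ("mgw-a", "mgw-b")}
print(per_peer, after_failure)
```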

Since multiple network interfaces may be in use simultaneously, each connected to a different subnet, each interface is assigned a different IP address. Even in the event of routing problems inside the network, this approach provides fully independent communication paths.

Because multiple IP addresses may be used, switching between the interfaces in real time requires special handling. Communication between the softswitch platforms needs to be reliable. One possible approach would be to use transmission control protocol (TCP). However, TCP sessions are bound to IP addresses, so the TCP session must be re-established whenever communication is switched to a different network interface. This can result in long fail-over times and in lost or duplicated messages.

The stream control transmission protocol (SCTP) addresses that problem by offering session resiliency between softswitch platforms across multiple network interfaces. If communication between a pair of systems is switched to a different network interface in the middle of a conversation, SCTP includes mechanisms to ensure that the message stream is still delivered reliably. SCTP also provides other efficiency benefits over TCP such as selective acknowledgments.
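The idea of keeping one session alive across a path change can be illustrated with a toy model. It is not an SCTP implementation and uses no SCTP library: the classes, the path names, and the `ack_lost` flag are all invented for illustration. It shows only the general principle of numbering messages, retransmitting unacknowledged ones on another path, and discarding duplicates at the receiver.

```python
class Receiver:
    """Delivers each sequence number to the application at most once."""

    def __init__(self):
        self.seen = set()
        self.delivered = []

    def receive(self, seq, payload):
        if seq not in self.seen:  # a duplicate is acknowledged, not delivered
            self.seen.add(seq)
            self.delivered.append(payload)
        return seq  # acknowledgment for this sequence number


class Sender:
    """One logical session spanning several network paths (toy model)."""

    def __init__(self, receiver, paths):
        self.receiver = receiver
        self.path_up = dict.fromkeys(paths, True)
        self.next_seq = 0
        self.unacked = {}  # seq -> payload awaiting acknowledgment

    def send(self, payload, ack_lost=False):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        return self._transmit(seq, ack_lost)

    def _transmit(self, seq, ack_lost=False):
        path = next((p for p, up in self.path_up.items() if up), None)
        if path is None:
            return None  # no path available: keep the message queued
        ack = self.receiver.receive(seq, self.unacked[seq])
        if not ack_lost:  # ack_lost simulates a failure after delivery
            del self.unacked[ack]
        return path

    def path_failed(self, path):
        self.path_up[path] = False

    def retransmit_unacked(self):
        return [self._transmit(seq) for seq in sorted(self.unacked)]


receiver = Receiver()
sender = Sender(receiver, ["path-a", "path-b"])
sender.send("msg-1")                 # delivered and acknowledged on path-a
sender.send("msg-2", ack_lost=True)  # delivered, but the ack never arrives
sender.path_failed("path-a")
retry_paths = sender.retransmit_unacked()  # msg-2 resent on path-b
last_path = sender.send("msg-3")
print(receiver.delivered, retry_paths, last_path)
```

The session state (sequence numbers and the unacknowledged queue) is independent of any single path, which is what lets the switch happen without losing or duplicating messages in this model.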

SCTP covers a broad set of specifications that are still being developed by the Internet Engineering Task Force (IETF). Until SCTP deployment is more widespread, partial implementations or vendor-specific implementations will be necessary. This will impact the progress of standards development. Here are examples of how some softswitch vendors are implementing SCTP in various areas:

  • Implementing stabilized portions of the current standard
  • Implementing network redundancy using SCTP between their elements
  • Implementing their own protocols to deliver some of the SCTP functionality
  • Addressing network failures with the two pieces of network redundancy: multiple interfaces and protocols for real-time fail-over
