A hot restart system has the fastest recovery time but is the most complex to implement. In a hot restart model, the application saves state information about the current activity of the system. That information is given to the standby component so it is ready to take over quickly. The application must be designed to restart using this state information.
A hot restart system also requires that a standby component is designated prior to a fault management event (see Figure 2). In clustered systems (2N), this is straightforward, as there is a one-to-one correspondence between components and their standbys. In N+1 systems, hot restart requires the standby device to save the state of multiple components. The standby must have the extra capacity to do this. Otherwise, a warm restart model must be used.

Figure 2. Hot Standby
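The hot restart flow described above can be sketched in a few lines. This is only a minimal illustration, not a reference implementation: the class names (`Component`, `HotStandby`), the method names, and the `calls_in_progress` state are all hypothetical, and state is passed in memory rather than over a real interconnect.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Component:
    """An active component and its current activity (hypothetical)."""
    name: str
    state: dict = field(default_factory=dict)


@dataclass
class HotStandby:
    """A standby designated before any fault; it holds the latest checkpoint."""
    name: str
    checkpoint: dict = field(default_factory=dict)

    def receive_checkpoint(self, state: dict) -> None:
        # Deep-copy so later changes on the active side do not leak in.
        self.checkpoint = copy.deepcopy(state)

    def take_over(self) -> Component:
        # Restart from the saved state rather than from initialization.
        return Component(name=self.name, state=copy.deepcopy(self.checkpoint))


active = Component("active-1")
standby = HotStandby("standby-1")  # designated ahead of time

active.state["calls_in_progress"] = ["call-17", "call-42"]
standby.receive_checkpoint(active.state)

# Activity that starts after the last checkpoint is unknown to the standby.
active.state["calls_in_progress"].append("call-99")

# Fault on the active component: the standby resumes from its checkpoint.
recovered = standby.take_over()
print(recovered.state)
```

In this sketch the standby recovers only what was checkpointed, so how often the application saves state determines how much current activity survives a failover.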
A warm restart is similar to the hot restart model. In a warm restart model, the applications save state information about the current activity of the system, but the standby component is not designated until the fault management cycle is in progress. Only then is the standby component configured with the necessary application and state information. This adds time to the restart process but can reduce the cost of the standby components, because a standby does not have to hold state in advance (see Figure 3). Warm restart is also easier to implement in systems where the standby devices are not identical to the active devices.

Figure 3. Warm Restart in N+1
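As a rough sketch of the warm restart sequence in an N+1 arrangement, the example below keeps saved state in a shared store and picks a spare only once a fault is being handled. The names `StateStore`, `Spare`, and `warm_restart`, along with the component and application names, are assumptions made for illustration.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Spare:
    """An unconfigured spare device (hypothetical)."""
    name: str
    application: Optional[str] = None
    state: dict = field(default_factory=dict)


class StateStore:
    """Holds the state saved by each active component, keyed by name."""

    def __init__(self) -> None:
        self._saved = {}

    def save(self, component: str, application: str, state: dict) -> None:
        self._saved[component] = (application, copy.deepcopy(state))

    def load(self, component: str):
        application, state = self._saved[component]
        return application, copy.deepcopy(state)


def warm_restart(failed: str, spares: list, store: StateStore) -> Spare:
    # The standby is designated here, during fault handling, not before.
    spare = spares.pop()
    # Configuring the application and state is the extra restart time.
    spare.application, spare.state = store.load(failed)
    return spare


store = StateStore()
store.save("card-A", "router", {"routes": 120})
store.save("card-B", "billing", {"open_sessions": 7})

spares = [Spare("spare-1")]  # one spare covers both active cards (N+1)
replacement = warm_restart("card-B", spares, store)
print(replacement)
```

The spare carries no application or state until `warm_restart` runs, which is why one inexpensive spare can cover several active components at the price of a longer restart.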
A cold restart is the least complex to implement but requires the most time. In a cold restart, the standby element starts from its initialization point. A cold restart is used when no information is available about the state of the failing component, so the last known state is, in effect, initialization.
A cold restart system can be implemented with few or no changes to the applications of a system. The high-availability-specific software components can be relegated to operating system software and services. The price for this simplicity is that restart times are much longer, and current activity in the system may be lost.
The times associated with the different restart models vary depending on the implementation of the system and application software. In relative terms, if the hot restart time is 1X, the warm restart time can be 2X to 3X, and the cold restart time is approximately 10X to 100X. Restart times will be shortest in systems where the system software and applications are designed to support high availability.
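To make the relative figures concrete, the short calculation below applies the multipliers from the text to a hot restart baseline. The 2-second baseline is purely an assumed value for illustration, and the function name is hypothetical. Actual times depend on the implementation.

```python
# Relative restart times from the text, as (low, high) multiples of hot restart.
RELATIVE_RESTART_TIME = {
    "hot": (1, 1),
    "warm": (2, 3),
    "cold": (10, 100),
}


def restart_time_range(model: str, hot_seconds: float) -> tuple:
    """Return the (low, high) restart time in seconds for a restart model."""
    low, high = RELATIVE_RESTART_TIME[model]
    return (low * hot_seconds, high * hot_seconds)


# Assumed baseline: a hot restart that completes in 2 seconds.
for model in RELATIVE_RESTART_TIME:
    low, high = restart_time_range(model, hot_seconds=2.0)
    print(f"{model}: {low} s to {high} s")
```

With that assumed baseline, a warm restart would take roughly 4 to 6 seconds and a cold restart roughly 20 to 200 seconds.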


