How Session Recovery Works
This section provides an overview of how the Session Recovery feature is implemented and how the recovery process works.
The Session Recovery feature provides seamless failover and reconstruction of subscriber session information in the event of a hardware or software fault within the system, preventing a fully connected user session from being disconnected.
Session recovery is performed by mirroring key software processes (for example, session manager and AAA manager) within the system. These mirrored processes remain in an idle state (standby mode), in which they perform no processing until they are needed after a software failure (for example, a session manager task aborts).
The system spawns new instances of "standby mode" session and AAA managers for each active control processor (CP) being used. These mirrored processes require both memory and processing resources, which means that additional hardware may be required to enable this feature (see Additional ASR 5500 Hardware Requirements).
Other key system-level software tasks, such as VPN manager, are performed on a physically separate packet processing card to ensure that a double software fault (for example, session manager and VPN manager failing at the same time on the same card) cannot occur. The packet processing card that hosts the VPN manager process is in active mode and is reserved by the operating system for this sole use when session recovery is enabled.
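The placement rules above can be pictured with a small planning sketch. It is a toy model, not StarOS code: the function name `plan_standby_tasks`, the card numbers, and the assumption of exactly one standby session manager and one standby AAA manager per active CP are all illustrative.

```python
def plan_standby_tasks(active_cps_per_card, vpn_card):
    """Return the standby tasks to spawn on each packet processing card.

    active_cps_per_card maps a card number to its count of active CPs.
    vpn_card is the card reserved for the VPN manager (hypothetical input).
    Assumption: one standby session manager and one standby AAA manager
    are spawned per active CP.
    """
    plan = {}
    for card, cps in active_cps_per_card.items():
        if card == vpn_card:
            # The VPN manager card is reserved for that sole use.
            raise ValueError(f"card {card} is reserved for the VPN manager")
        plan[card] = {"standby_sessmgr": cps, "standby_aaamgr": cps}
    return plan


plan = plan_standby_tasks({1: 4, 2: 4}, vpn_card=3)
for card, tasks in sorted(plan.items()):
    print(card, tasks)
```

Because every standby task consumes memory and CPU, a plan like this is also a rough way to see why enabling the feature can require additional hardware.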
There are two modes of session recovery:
- Task recovery mode: Used when one or more session manager failures occur and are recovered without the need to use resources on a standby packet processing card. In this mode, recovery is performed by using the mirrored "standby-mode" session manager task(s) running on active packet processing cards. The "standby-mode" task is renamed, made active, and then populated using information from other tasks such as AAA manager. When a task fails, only a limited number of subscribers are affected, and they experience an outage only until the task starts back up.
- Full packet processing card recovery mode: Used when a packet processing card hardware failure occurs, or when a planned packet processing card migration fails. In this mode, the standby packet processing card is made active and the "standby-mode" session manager and AAA manager tasks on the newly activated packet processing card perform session recovery.
Session/Call state information is saved in the peer AAA manager task because each AAA manager and session manager task is paired together. These pairs are started on physically different packet processing cards to ensure task recovery.
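The task recovery flow described above can be sketched as a toy model. This is an illustration under stated assumptions, not the StarOS implementation: the class names, the `connect` and `recover_task` helpers, the task names, and the shape of the checkpointed state are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AAAManager:
    """Peer task that holds checkpointed state for one session manager."""
    card: int
    checkpoints: dict = field(default_factory=dict)  # session id -> state


@dataclass
class SessionManager:
    name: str
    card: int
    active: bool
    sessions: dict = field(default_factory=dict)  # session id -> state

    def connect(self, session_id, state, peer):
        """Record a fully connected session and checkpoint it to the peer."""
        self.sessions[session_id] = state
        peer.checkpoints[session_id] = dict(state)


def recover_task(failed, standby, peer):
    """Promote a standby session manager and repopulate it from the peer."""
    if peer.card == failed.card:
        # The pair must be on physically different cards, otherwise a
        # single card fault could take out both the task and its state.
        raise ValueError("peer AAA manager must run on a different card")
    standby.name = failed.name   # the standby task is renamed
    standby.active = True        # and made active
    standby.sessions = {sid: dict(st) for sid, st in peer.checkpoints.items()}
    return standby


aaa = AAAManager(card=2)
sessmgr = SessionManager(name="sessmgr-1", card=1, active=True)
standby = SessionManager(name="sessmgr-standby", card=1, active=False)
sessmgr.connect("sub-001", {"ip": "10.0.0.5", "bytes_in": 1200}, aaa)

# Simulate the active session manager aborting, then recover it.
recovered = recover_task(sessmgr, standby, aaa)
print(recovered.name, recovered.active, sorted(recovered.sessions))
```

The card check in `recover_task` mirrors the reason the pairs are started on different cards: the checkpoint must survive whatever fault took down the session manager.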
In some situations session recovery may not operate properly. These include:
- Additional software or hardware failures occur during the session recovery operation. For example, an AAA manager fails while the state information it contained was being used to populate the newly activated session manager task.
- A lack of hardware resources (packet processing card memory and control processors) to support session recovery.
Important: After a session recovery operation, some statistics, such as those collected and maintained on a per-manager basis (AAA Manager, Session Manager, and so on), are generally not recovered. Only accounting and billing related information is checkpointed and recovered.
Session recovery is available for the following functions:
- Any session needing L2TP LAC support (excluding regenerated PPP on top of an HA or GGSN session)
- ASR 5500 only – Closed RP PDSN services supporting simple IP, Mobile IP, and Proxy Mobile IP
- ASR 5500 only – eHRPD service (evolved High Rate Packet Data)
- ASR 5500 only – ePDG service (evolved Packet Data Gateway)
- GGSN services for IPv4 and PPP PDP contexts
- HA services supporting Mobile IP and/or Proxy Mobile IP session types with or without per-user Layer 3 tunnels
- ASR 5500 only – HNB-GW: HNB Session over IuH
- ASR 5500 only – HNB-GW: HNB-CN Session over IuPS and IuCS
- ASR 5500 only – HNB-GW: SeGW Session IPSec Tunnel
- ASR 5500 only – HSGW services for IPv4
- IPCF (Intelligent Policy Control Function)
- ASR 5500 only – IPSG-only systems (IP Services Gateway)
- LNS session types (L2TP Network Server)
- MME (Mobility Management Entity)
- ASR 5500 only – NEMO (Network Mobility)
- P-GW services for IPv4
- ASR 5500 only – PDIF (Packet Data Interworking Function)
- PDSN services supporting simple IP, Mobile IP, and Proxy Mobile IP
- S-GW (Serving Gateway)
- SGSN (Serving GPRS Support Node) services
- ASR 5000 and VPC-DI – IPv6 and IPv4IPv6 (dual) PDP session recovery is supported for 3G and 2G services
- SaMOG (S2a Mobility over GTP) Gateway (CGW and MRME)
- ASR 5500 only – SAE-GW (System Architecture Evolution Gateway)
- ASR 5500 only – SGSN services (3G and 2.5G services) for IPv4 and PPP PDP contexts
Session recovery is not supported for the following functions:
- Destination-based accounting recovery
- GGSN network initiated connections
- GGSN session using more than 1 service instance
- MIP/L2TP with IPSec integration
- MIP session with multiple concurrent bindings
- Mobile IP sessions with L2TP
- Multiple MIP sessions
- :RAB recovery
Important: Always refer to the Administration Guides for individual products for other possible session recovery and Interchassis Session Recovery (ICSR) support limitations.
When session recovery occurs, the system reconstructs the following subscriber information:
- Data and control state information required to maintain correct call behavior.
- A minimal set of subscriber data statistics, required to ensure that accounting information is maintained.
- A best-effort attempt to recover various timer values such as call duration, absolute time, and others.
- The idle time timer is reset to zero and the re-registration timer is reset to its maximum value for HA sessions to provide a more conservative approach to session recovery.
Important: Any partially connected calls (for example, a session where HA authentication was pending but has not yet been acknowledged by the AAA server) are not recovered when a failure occurs.
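The reconstruction rules above can be expressed as a short sketch. It is illustrative only: the `Checkpoint` fields, the `reconstruct` function, and the service labels are hypothetical. It also assumes that both timer resets apply to HA sessions only, and that other session types carry their checkpointed timer values over unchanged.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    session_id: str
    service: str            # illustrative label such as "HA" or "PDSN"
    fully_connected: bool
    call_duration_s: int
    idle_time_s: int
    rereg_remaining_s: int
    rereg_max_s: int


def reconstruct(cp):
    """Return the recovered session state, or None if it is not recovered."""
    if not cp.fully_connected:
        # Partially connected calls are not recovered.
        return None
    state = {
        "session_id": cp.session_id,
        "call_duration_s": cp.call_duration_s,  # best-effort timer recovery
        "idle_time_s": cp.idle_time_s,
        "rereg_remaining_s": cp.rereg_remaining_s,
    }
    if cp.service == "HA":
        # Conservative handling for HA sessions.
        state["idle_time_s"] = 0
        state["rereg_remaining_s"] = cp.rereg_max_s
    return state


ha_session = Checkpoint("sub-001", "HA", True, 300, 42, 100, 1800)
pending = Checkpoint("sub-002", "HA", False, 0, 0, 0, 1800)

print(reconstruct(ha_session))
print(reconstruct(pending))
```

Resetting the idle timer to zero and the re-registration timer to its maximum errs on the side of keeping a recovered session alive, rather than tearing it down because of timer values that may be stale.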
Note: Failure of critical tasks will result in restarting StarOS. Kernel failures, hypervisor failures, or hardware failures will result in the VM restarting or going offline. The use of ICSR between two VPC-DIs or two VPC-SIs is the recommended solution for these types of failure.