Backup and Recovery of Key KPI Statistics

This feature allows the backup of GGSN, P-GW, SAEGW, and/or S-GW counters for recovery of key KPI counter values after a session manager (SessMgr) restart.


Feature Description

Before the Backup and Recovery of Key KPI Statistics feature was implemented, statistics were not backed up and could not be recovered after a SessMgr task restart. Because of this limitation, monitoring KPIs was difficult: the GGSN, P-GW, SAEGW, and S-GW lost statistical information whenever a task restart occurred.

KPI calculation involves taking the delta between counter values sampled at two points in time and then determining the percentage of successful processing of a particular procedure in that time interval. When a SessMgr crashes and then recovers, the GGSN, P-GW, SAEGW, and S-GW lose the counter values because they are reset to zero. As a result, the KPI calculation for the next interval produces negative values. This causes a dip in the graphs plotted from the KPI values, making it difficult for the operations team to get a consistent view of network performance and to determine whether there is a genuine issue.
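The effect of a counter reset on an interval-based KPI can be illustrated with a minimal Python sketch. The counter names (attempts, successes), the sample values, and the success-rate formula are hypothetical and chosen only for illustration; they do not represent the formula used by any particular monitoring system.

```python
def interval_kpi(prev, curr):
    """Return (delta_attempts, delta_successes, success_rate_percent).

    prev and curr are cumulative counter samples taken at the start and
    end of one interval. The rate is None when the attempt delta is zero.
    """
    d_att = curr["attempts"] - prev["attempts"]
    d_suc = curr["successes"] - prev["successes"]
    rate = 100.0 * d_suc / d_att if d_att != 0 else None
    return d_att, d_suc, rate


# Hypothetical cumulative samples from three consecutive polls.
t0 = {"attempts": 1000, "successes": 990}
t1 = {"attempts": 1200, "successes": 1185}
# Assume the SessMgr restarted between t1 and t2, so counting began again at zero.
t2 = {"attempts": 50, "successes": 49}

normal = interval_kpi(t0, t1)       # healthy interval: positive deltas
after_reset = interval_kpi(t1, t2)  # reset interval: negative deltas
```

In the second interval both deltas are negative, so any KPI derived from them does not reflect what the node actually processed.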

This feature makes it possible to perform reliable KPI calculations even if a SessMgr restart occurs.

How It Works

A key set of counters used in KPI computation will be backed up for recovery if a SessMgr task restarts. The counters that will be backed up are determined by the KPIs typically used in several operator networks.

The backup of counters is enabled or disabled via configuration. The configuration specifies the product (GGSN, P-GW, SAEGW, and/or S-GW) for which counters will be backed up and also a time interval for the backup of the counters.

Architecture

When this feature is enabled (see Configuring Backup Statistics Feature below), the GGSN, P-GW, SAEGW, and/or S-GW back up only the counters maintained at the SessMgr. The recovery function does not need to be configured or started; it occurs automatically as needed when the feature is enabled.

The counters are backed up to the AAAMgr that is paired with the SessMgr.

Checkpointing

Node-level statistics are checkpointed at AAAMgr. Once statistics are backed up for a specific product, all the associated services, such as eGTP-C and GTP-U statistics, are also checkpointed.

Recovery

When a SessMgr restarts, it receives all the stored statistics from its mapped AAAMgr, and the recovered values are added to the backup counters maintained at the per-service level. This does not impact session recovery time because the backed-up counters are pushed to the SessMgr only after session recovery is complete.

Because session recovery is already complete at that point, the SessMgr may have started processing calls, in which case its counters are already being incremented. For this reason, the recovered values are always added to the existing counter values rather than replacing them. Gauge counters are checkpointed but not recovered.
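The merge step described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the product's implementation: the function name recover, the counter names, and the dictionary representation are all hypothetical.

```python
def recover(live, backed_up, gauges):
    """Merge checkpointed counters into the counters of a restarted task.

    live:      counters accumulated since the restart (calls may already
               be in progress, so these can be non-zero)
    backed_up: values from the last checkpoint
    gauges:    names of gauge counters, which are checkpointed but not recovered
    """
    merged = dict(live)
    for name, value in backed_up.items():
        if name in gauges:
            continue  # gauge: keep the live value, ignore the checkpoint
        # Recovered values are added to, not written over, the live values.
        merged[name] = merged.get(name, 0) + value
    return merged


# Hypothetical counter names and values.
checkpoint = {"create_req": 5000, "create_accept": 4950, "active_sessions": 320}
live = {"create_req": 12, "create_accept": 12, "active_sessions": 7}

result = recover(live, checkpoint, gauges={"active_sessions"})
```

Adding rather than overwriting preserves the increments made between the end of session recovery and the arrival of the backed-up values.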

Order of Statistics Collection

Checkpoint messaging is limited to a maximum of 1 MB. Before any node is picked for checkpointing, the available memory is checked. If memory is insufficient, the whole node is discarded.

Because of the 1 MB limit, the nodes/statistics to checkpoint are prioritized as follows:

  1. SAEGW statistics:

    • P-GW and S-GW service node-level statistics collected

  2. P-GW service node configuration will store the following statistics:

    • P-GW, eGTP-C ingress, GTP-U ingress, per-interface (s2a, s2b, s5s8), and GGSN (if associated) statistics collected

    • SAEGW associated P-GW service statistics not collected

  3. S-GW service node configuration will store the following statistics:

    • S-GW, eGTP-C ingress/egress, and GTP-U ingress/egress statistics collected

    • SAEGW associated S-GW service statistics not collected

  4. GGSN statistics:

    • GGSN service statistics, if not associated with P-GW service, collected

  5. Session disconnect reasons collected if GGSN/P-GW/SAEGW/S-GW is enabled

Error Handling

If adding new statistics would overflow the 1 MB buffer, that service and the corresponding node are not included, and checkpointing of any further nodes is stopped. An error-level log is generated if the total memory requirement exceeds 1 MB.
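The prioritization and overflow rule above can be modeled with a short Python sketch. The node names and sizes are invented for illustration, and treating 1 MB as 1024 * 1024 bytes is an assumption; the source does not state the exact byte count.

```python
LIMIT_BYTES = 1024 * 1024  # assumed byte value of the 1 MB checkpoint budget


def select_nodes(nodes, limit=LIMIT_BYTES):
    """Pick nodes in priority order until the next one would overflow.

    nodes is a list of (name, size_in_bytes) already sorted by priority.
    Returns (selected_names, overflowed).
    """
    selected, used = [], 0
    for name, size in nodes:
        if used + size > limit:
            # The whole node is left out and no further nodes are considered,
            # even if a later, smaller node would still fit.
            return selected, True
        selected.append(name)
        used += size
    return selected, False


# Hypothetical sizes, listed in the priority order given above.
nodes = [
    ("saegw", 400_000),
    ("pgw", 350_000),
    ("sgw", 350_000),
    ("ggsn", 100_000),
    ("disconnect-reasons", 20_000),
]
selected, overflowed = select_nodes(nodes)
```

With these sizes, the third node would exceed the budget, so selection stops there and the lower-priority nodes are skipped as well.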

Limitations

  • A backup interval must be specified, and counters are backed up only at that interval. For example, if the backup interval is 5 minutes, counters are backed up every 5 minutes. If a backup occurs at minute N and a task crash occurs at minute N+4, the GGSN, P-GW, SAEGW, and/or S-GW recover only the values backed up at minute N; the data for the intervening 4 minutes is lost.

  • Only statistics maintained at the SessMgr are backed up. Statistics at other managers are not backed up or recovered.

  • The following statistics are not considered for backup:
    • APN-level statistics

    • eGTP-C APN-QCI statistics

    • DemuxMgr statistics

  • The CLI command clear statistics does not trigger a checkpoint to delete the node statistics on the AAAMgr. The next checkpoint after timer expiry overwrites the statistics.

  • A maximum of 1 MB of statistics is stored on the AAAMgr. Services that fall beyond the maximum size limit are not backed up.

  • Setting the backup interval to shorter periods of time causes higher system overhead for checkpointing. Conversely, setting the backup interval to longer periods of time results in lower system overhead for checkpointing but a higher probability of hitting the 1 MB storage limit.

  • If SessMgr restarts and AAAMgr restarts before SessMgr recovers statistics from AAAMgr, then backed up statistics are lost.

  • This feature is not applicable for ICSR.
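The data-loss window described in the first limitation above can be worked through with a small Python sketch. The function name lost_minutes and the assumption that backups occur exactly on every interval boundary are illustrative simplifications.

```python
def lost_minutes(first_backup, interval, crash):
    """Minutes of counter updates lost when a task restarts at minute `crash`.

    Assumes backups occur exactly at first_backup, first_backup + interval,
    first_backup + 2 * interval, and so on, and that a crash occurring on a
    backup minute happens after that backup completes.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if crash < first_backup:
        raise ValueError("crash occurs before the first backup")
    return (crash - first_backup) % interval


# Backup at minute N = 100 with a 5-minute interval, crash at N + 4.
example = lost_minutes(first_backup=100, interval=5, crash=104)
```

Under these assumptions the loss is always less than one full backup interval.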

Configuring Backup Statistics Feature

For the Backup and Recovery of Key KPI Statistics feature to work, it must be enabled by configuring the backup of statistics for the GGSN, P-GW, SAEGW, and/or S-GW.

Configuration

Enabling

The following CLI commands are used to manage the Backup and Recovery of Key KPI Statistics feature.

The following configures the backup of statistics for the GGSN, P-GW, SAEGW, and/or S-GW and enables the Backup and Recovery of Key KPI Statistics feature.

configure 
      statistics-backup { ggsn | pgw | saegw | sgw }  
      exit 
 

Setting the Backup Interval

The following command configures the number of minutes (0 to 60) between each backup of the statistics. When the backup interval is not specified, a default value of 5 minutes is used.

configure 
      statistics-backup-interval minutes 
      exit 
 

Important

Setting the backup interval to shorter periods of time causes higher system overhead for checkpointing. Conversely, setting the backup interval to longer periods of time results in lower system overhead for checkpointing but a higher probability of hitting the 1 MB storage limit.


Disabling

The following disables the backing up of statistics for the GGSN, P-GW, SAEGW, and/or S-GW.

configure 
      no statistics-backup { ggsn | pgw | saegw | sgw }  
      exit 
 

Verifying the Backup Statistics Feature Configuration

Use either the show configuration command or the show configuration verbose command to display the feature configuration.

If the feature was enabled in the configuration, two lines similar to the following will appear in the output of a show configuration [ verbose ] command:

statistics-backup  pgw 
statistics-backup-interval 5 

Notes:

  • The interval displayed is 5 minutes, which is the default. If the statistics-backup-interval command is included in the configuration, the 5 is replaced by the configured number of minutes.
  • If the command to disable the feature is entered, then no statistics-backup line is displayed in the output generated by a show configuration [ verbose ] command.