Increase in Monitoring of Peers Supported Through Heartbeat Mechanism for PMIP Sessions

Feature Summary and Revision History

Summary Data

Applicable Product(s) or Functional Area

  • P-GW

  • SAEGW

Applicable Platform(s)

  • ASR 5500

  • VPC - SI

  • VPC - DI

Feature Default

Disabled - Configuration Required

Related Changes in This Release

Not Applicable

Related Documentation

  • Command Line Interface Reference

  • P-GW Administration Guide

  • SAEGW Administration Guide

Revision History

Revision Details     Release

First introduced.    21.4

Feature Description

In the existing setup, the HA Manager supports heartbeat monitoring of up to 256 peers for Proxy Mobile IPv6 (PMIP) sessions. This feature raises that limit from 256 to 128000 peers.

To allow more peers to be monitored for path failure through the heartbeat mechanism, a new CLI command, heartbeat monitor-max-peers, is added under the LMA Service Configuration mode. This feature supports the following behavior:

  • When configured, the maximum number of peers supported for heartbeat monitoring increases from 256 to 128000.

  • The first 128000 peers are identified during calls, irrespective of whether the heartbeat mechanism is enabled.

  • Separate lists are maintained for retransmission heartbeats and periodic heartbeats for batch processing.

  • The decision to monitor a peer is made at the time of call setup, recovery, and ICSR. For example, suppose that monitor-max-peers is configured (maximum of 128000 peers) and more than 256 peers are being monitored for heartbeat. If the configuration is later changed to the default (maximum of 256 peers), monitoring continues for all peers until an HA Manager recovery or ICSR occurs with the 256-peer configuration in effect.

  • The parameters for batch processing for heartbeat messages are changed as follows:

                                      Batch Size (before)   Batch Size   Batch Interval (before)   Batch Interval
    Periodic heartbeat batch          100                   550          200 ms                    200 ms
    Retransmission heartbeat batch    100                   550          200 ms                    100 ms

  • If more than 10% of peers (12800 peers) are not responding, detection of path failure is delayed. The delay avoids a large performance impact when such a condition occurs.

    If the number of retransmissions exceeds the expected batch size, heartbeat retransmissions follow the periodic timer instead of the retransmission timer. For example, suppose the heartbeat interval is 60 seconds, the retransmission timeout is 3 seconds, and the maximum number of retries is 3.

    If the number of heartbeat messages queued for retransmission exceeds the expected batch size, retransmissions occur at intervals of 60 seconds instead of every 3 seconds. Therefore, a peer path failure that is detected within a maximum of 9 seconds (3*3) under normal conditions is instead detected at 180 seconds (60*3).

  • Minimum heartbeat interval must be 60 seconds.

    If 128000 peers are configured for heartbeat monitoring, the heartbeat interval must not be configured for less than 60 seconds. If the heartbeat interval is configured for less than 60 seconds, a configuration error is displayed.

  • Minimum heartbeat retransmission timeout must be 3 seconds.

    If 128000 peers are configured for heartbeat monitoring, the heartbeat retransmission timeout must not be configured for less than 3 seconds. If the heartbeat retransmission timeout is configured for less than 3 seconds, a configuration error is displayed.

  • The CLI is configured at the service level, but the list of monitored peers is maintained at the instance level. Therefore, it is recommended that all services have the same configuration.

    If services have different configurations, the limit that applies is the one configured for the service handling the call. However, whether that limit has been reached depends on how many peers are already present in that instance.

    For example, consider two services: lma1 and lma2. lma1 has monitor-max-peers configured for 128000 peers, and lma2 uses the limit of 256 peers. If a call comes in through lma1, the limit of 128000 peers is checked. If a call comes in through lma2, the limit of 256 peers is checked. However, the 256 peers counted for lma2 may include peers that are already being monitored for lma1.
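The lma1 and lma2 example can be modeled in a few lines of Python. This is only a sketch of one reading of the text: the class HaMgrInstance, its method on_call_setup, and the peer names are hypothetical. It assumes that a service-level limit is compared against the size of the shared instance-level list and that a peer already in the list stays monitored.

```python
from dataclasses import dataclass, field

DEFAULT_MAX_PEERS = 256      # limit without monitor-max-peers (from this chapter)
EXTENDED_MAX_PEERS = 128000  # limit with monitor-max-peers (from this chapter)


@dataclass
class HaMgrInstance:
    """Hypothetical model of one instance holding the shared monitored-peer list."""
    service_limits: dict                      # service name -> configured max peers
    monitored: set = field(default_factory=set)

    def on_call_setup(self, service: str, peer: str) -> bool:
        """Return True if the peer is (or becomes) monitored for heartbeat."""
        if peer in self.monitored:
            return True
        # Assumption: the service-level limit is compared against the
        # instance-wide list size, not against a per-service count.
        if len(self.monitored) < self.service_limits[service]:
            self.monitored.add(peer)
            return True
        return False


inst = HaMgrInstance({"lma1": EXTENDED_MAX_PEERS, "lma2": DEFAULT_MAX_PEERS})

# 256 peers arrive through lma1 and fill the instance list up to lma2's limit.
for i in range(256):
    inst.on_call_setup("lma1", f"peer-{i}")

lma2_new = inst.on_call_setup("lma2", "peer-new-a")  # checked against 256
lma1_new = inst.on_call_setup("lma1", "peer-new-b")  # checked against 128000
```

Under these assumptions, a new peer arriving through lma2 is not monitored once the instance already holds 256 peers, even though all of them arrived through lma1. A new peer arriving through lma1 is still accepted. This illustrates why the same configuration is recommended for all services.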


Note

This feature is customer-specific. For more information, contact your Cisco account representative.
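The batch parameters and the delayed-detection example in the feature description can be checked with simple arithmetic. The sketch below assumes that one batch is sent per batch interval, which the text implies but does not state. The function names are illustrative only and are not part of the product.

```python
import math


def sweep_seconds(peers: int, batch_size: int, batch_interval_ms: int) -> float:
    """Time to send one heartbeat to every peer, assuming one batch per interval."""
    batches = math.ceil(peers / batch_size)
    return batches * batch_interval_ms / 1000


def detection_seconds(spacing_s: int, max_retries: int) -> int:
    """Worst-case detection time when retries are spaced spacing_s seconds apart."""
    return spacing_s * max_retries


# Periodic heartbeat batch: previous parameters versus the new ones.
before = sweep_seconds(256, 100, 200)      # 3 batches of up to 100 peers
after = sweep_seconds(128000, 550, 200)    # 233 batches of up to 550 peers

# Path failure detection with 3 retries.
normal = detection_seconds(3, 3)     # spaced by the retransmission timeout
delayed = detection_seconds(60, 3)   # spaced by the heartbeat interval
```

With these inputs, one periodic pass over 128000 peers takes about 46.6 seconds, which fits within the 60-second minimum heartbeat interval. The detection times of 9 and 180 seconds match the example given in the feature description.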


Configuring the Increase in Number of PMIP Sessions Supported with the Heartbeat Mechanism Feature

The following section provides the configuration commands to enable or disable the feature.

heartbeat monitor-max-peers

This new CLI command enables heartbeat monitoring of a maximum of 128000 peers for PMIP sessions. The command is added under the LMA Service Configuration mode.

To configure the maximum number of monitored peers, enter the following commands:

configure 
 context context_name 
  lma-service service_name 
   [ default ] heartbeat monitor-max-peers 
   end 

Notes

  • default: Restores the default limit of 256 peers monitored through the heartbeat mechanism. The heartbeat monitor-max-peers command is disabled by default.

  • heartbeat monitor-max-peers: Monitors a maximum of 128000 peers through the heartbeat mechanism.

Monitoring and Troubleshooting

This section provides information regarding show commands and/or their outputs in support of this feature.

Show Commands and/or Outputs

The output of the following CLI command has been enhanced in support of the feature.

show lma-service all

The output of the show lma-service all CLI command now includes the configured heartbeat monitor max peers value.

On configuring the new CLI – heartbeat monitor-max-peers:

show lma-service all 
  Heartbeat Support:        Enabled
  Heartbeat Interval:       60
  Heartbeat Retransmission timeout: 1
  Heartbeat Max Retransmissions: 1
  Heartbeat Monitor Max Peers: 128000
 

On configuring the default CLI – default heartbeat monitor-max-peers:

show lma-service all
  Heartbeat Support:        Enabled
  Heartbeat Interval:       60
  Heartbeat Retransmission timeout: 1
  Heartbeat Max Retransmissions: 1
  Heartbeat Monitor Max Peers: 256

Restrictions:

  • A maximum of 128000 peers can be monitored with the new CLI.

  • The following restrictions apply when configuring the heartbeat monitor-max-peers CLI command:

    At service startup time (boxer configuration boot):

    If the heartbeat interval is less than 60 seconds (for the lma-service) or the retransmission timeout is less than 3 seconds (across lma-services), the monitor-max-peers command displays a configuration error and the monitor-max-peers configuration is not applied, and vice versa.

    At the time of updating the Service configuration:

    • If monitor-max-peers is already configured in that lma-service or in any other lma-service, configuring a heartbeat interval of less than 60 seconds displays a configuration error.

    • If monitor-max-peers is already configured in that lma-service or in any other lma-service, configuring a heartbeat retransmission timeout of less than 3 seconds displays a configuration error.

    • If that lma-service or any other lma-service has a heartbeat interval of less than 60 seconds or a heartbeat retransmission timeout of less than 3 seconds, configuring monitor-max-peers displays a configuration error.

      CLI error displayed:

      configure
      context pgw
      lma-service lmav6
      heartbeat interval 40
      heartbeat monitor-max-peers 
      Failure: Recommended heartbeat interval: 60+, retransmission timeout: 3+, to configure monitor-max-peers. Please retry.
      heartbeat retransmission timeout 2
      heartbeat monitor-max-peers
      Failure: Recommended heartbeat interval: 60+, retransmission timeout: 3+, to configure monitor-max-peers. Please retry.
      end
       
      
      configure
      context pgw
      lma-service lmav6
      heartbeat interval 60
      heartbeat retransmission timeout 3
      heartbeat monitor-max-peers 
      heartbeat interval 40
      Failure: Recommended heartbeat interval: 60+, in presence of monitor-max-peers. Please retry.
      heartbeat retransmission timeout 2
      Failure: Recommended heartbeat retransmission timeout: 3+, in presence of monitor-max-peers. Please retry.
      end
      
    • The following unusual log is displayed when more than 10% of peers have path failures and the detection of path failure is delayed:

      “Retransmissons list size exceeds than expected, hb message will be sent with periodicty of configured HB interval for callid 20016”
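The configuration-time checks in the restrictions above can be sketched as a small validation routine. This is an illustration under stated assumptions, not the product's implementation. The LmaService tuple and the violations function are hypothetical names. The sketch reads "across lma-services" to mean that once any service enables monitor-max-peers, every service must meet both minimums.

```python
from typing import List, NamedTuple

MIN_INTERVAL_S = 60      # minimum heartbeat interval stated in this chapter
MIN_RETX_TIMEOUT_S = 3   # minimum retransmission timeout stated in this chapter


class LmaService(NamedTuple):
    """Hypothetical snapshot of one lma-service's heartbeat settings."""
    name: str
    interval_s: int
    retx_timeout_s: int
    monitor_max_peers: bool


def violations(services: List[LmaService]) -> List[str]:
    """Names of services whose timers are below the minimums.

    Assumption: the minimums apply to all services as soon as any one
    of them has monitor-max-peers configured.
    """
    if not any(s.monitor_max_peers for s in services):
        return []
    return [
        s.name
        for s in services
        if s.interval_s < MIN_INTERVAL_S or s.retx_timeout_s < MIN_RETX_TIMEOUT_S
    ]


# Mirrors the first CLI example: interval 40 with monitor-max-peers requested.
rejected = violations([LmaService("lmav6", 40, 3, True)])

# Timers at the minimums pass the check.
accepted = violations([LmaService("lmav6", 60, 3, True)])

# A second service with a 2-second timeout blocks the configuration.
cross_service = violations([
    LmaService("lmav6", 60, 3, True),
    LmaService("lma2", 60, 2, False),
])
```

A non-empty result corresponds to the cases in which the CLI displays a configuration error. The real check and its exact scope are defined by the product, so treat this only as a way to reason about a planned configuration.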