Information About High Availability
High availability (HA) in controllers allows you to reduce the downtime of the wireless networks that occurs due to the failover of controllers.
A 1:1 (Active:Standby-Hot) stateful switchover of access points and clients is supported (HA SSO). In an HA architecture, one controller is configured as the primary controller and another controller as the secondary controller.
After you enable HA, the primary and secondary controllers are rebooted. During the boot process, the role of the primary controller is negotiated as active and the role of the secondary controller as standby-hot. After a switchover, the secondary controller becomes the active controller and the primary controller becomes the standby-hot controller. After subsequent switchovers, the roles are interchanged between the primary and the secondary controllers. The reason or cause for most switchover events is due to a manual trigger, a controller and/or a network failure.
During an HA SSO failover event, all of the AP CAPWAP sessions and client sessions in RUN state on the controller are statefully switched over to the standby controller without interruption, except PMIPv6 clients, which will need to reconnect and authenticate to the controller following an HA SSO switchover. For additional client SSO behaviors and limitations, see the "Client SSO" section in the High Availability (SSO) Deployment Guide at:
https://www.cisco.com/c/en/us/td/docs/wireless/controller/technotes/8-1/HA_SSO_DG/High_Availability_DG.html#pgfId-53637The standby-hot controller continuously monitors the health of the active controller through its dedicated redundancy port. Both the controllers share the same configurations, including the IP address of the management interface.
Before you enable HA, ensure that both the controllers can successfully communicate with one another through their dedicated redundancy port, either through a direct cable connection or through Layer 2. For more details, see the "Redundancy Port Connectivity" section in the High Availability (SSO) Deployment Guide:
https://www.cisco.com/c/en/us/td/docs/wireless/controller/technotes/8-1/HA_SSO_DG/High_Availability_DG.html#pgfId-83028ACL and NAT IP configurations are synchronized to the HA standby controller when these parameters are configured before HA pair-up. If the NAT IP is set on the management interface, the access point sets the AP manager IP address as the NAT IP address.
The following are some guidelines for high availability:
-
We recommend that you do not pair two controllers of different hardware models. If they are paired, the higher controller model becomes the active controller and the other controller goes into maintenance mode.
-
We recommend that you do not pair two controllers on different controller software releases. If they are paired, the controller with the lower redundancy management address becomes the active controller and the other controller goes into maintenance mode.
-
All download file types, such as image, configuration, web-authentication bundle, and signature files– are downloaded on the active controller first and then pushed to the standby-hot controller.
-
Certificates should be downloaded separately on each controller before they are paired.
-
You can upload file types such as configuration files, event logs, crash files, and so on, from the standby-hot controller using the GUI or CLI of the active controller. You can also specify a suffix to the filename to identify the uploaded file.
-
To perform a peer upload, use the service port. In a management network, you can also use the redundancy management interface (RMI) that is mapped to the redundancy port or RMI VLAN, or both, where the RMI is the same as the management VLAN. Note that the RMI and the redundancy port should be in two separate Layer2 VLANs, which is a mandatory configuration.
-
If the controllers cannot reach each other through the redundant port and the RMI, the primary controller becomes active and the standby-hot controller goes into the maintenance mode.
Note
To achieve HA between two Cisco Wireless Services Module 2 (WiSM2) platforms, the controllers should be deployed on a single chassis, or on multiple chassis using a virtual switching system (VSS) and extending a redundancy VLAN between the multiple chassis.
Note
A redundancy VLAN should be a nonroutable VLAN in which a Layer 3 interface should not be created for the VLAN, and the interface should be allowed on the trunk port to extend an HA setup between multiple chassis. Redundancy VLAN should be created like any other data VLAN on Cisco IOS-based switching software. A redundancy VLAN is connected to the redundant port on Cisco WiSM2 through the backplane. It is not necessary to configure the IP address for the redundancy VLAN because the IP address is automatically generated. Also, ensure that the redundancy VLAN is not the same as the management VLAN.
For more information, see the "High Availability Connectivity Using Redundant VLAN on WiSM-2 WLC" section in the in the High Availability (SSO) Deployment Guide at:
https://www.cisco.com/c/en/us/td/docs/wireless/controller/technotes/8-1/HA_SSO_DG/High_Availability_DG.html#pgfId-43232
Note
When the RMIs for two controllers that are a pair, and that are mapped to same VLAN and connected to same Layer3 switch stop working, the standby controller is restarted.
The "mobilityHaMac is out of range" XML message is seen during the active/standby second switchover in an HA setup. This occurs if mobility HA MAC field is more than 128.
-
When HA is enabled, the standby controller always uses the Remote Method Invocation (RMI), and all the other interfaces—dynamic and management, are invalid.
Note
The RMI is meant to be used only for active and standby communications and not for any other purpose.
-
When HA is enabled, ensure that you do not use the backup image. If this image is used, the HA feature might not work as expected:
-
The service port and route information that is configured is lost after you enable SSO. You must configure the service port and route information again after you enable SSO. You can configure the service port and route information for the standby-hot controller using the peer-service-port and peer-route commands.
-
For Cisco WiSM2, service port reconfigurations are required after you enable redundancy. Otherwise, Cisco WiSM2 might not be able to communicate with the supervisor. We recommend that you enable DHCP on the service port before you enable redundancy.
-
We recommend that you do not use the reset command on the standby-hot controller directly. If you use this, unsaved configurations will be lost.
-
-
We recommend that you enable link aggregation configuration on the controllers before you enable the port channel in the infrastructure switches.
-
All the configurations that require reboot of the active controller results in the reboot of the standby-hot controller.
-
The Rogue AP Ignore list is not synchronized from the active controller to the standby-hot controller. The list is relearned through SNMP messages from Cisco Prime Infrastructure after the standby-hot controller becomes active.
-
Client SSO related guidelines:
-
The standby controller maintains two client lists: one is a list of clients in the Run state and the other is a list of transient clients in all the other states.
-
Only the clients that are in the Run state are maintained during failover. Clients that are in transition, such as roaming, 802.1X key regeneration, web authentication logout, and so on, are dissociated.
-
As with AP SSO, Client SSO is supported only on WLANs. The controllers must be in the same subnet. Layer3 connection is not supported.
-
-
In Release 7.3.x, AP SSO is supported, but client SSO is not supported, which means that after an HA setup that uses Release 7.3.x encounters a switchover, all the clients associated with the controller are deauthenticated and forced to reassociate.
-
You must manually configure the mobility MAC address on the then active controller post switchover, when a peer controller has a controller software release that is prior to Release 7.2.
Redundancy Management Interface
The active and standby-hot controllers use the RMI to check the health of the peer controller and the default gateway of the management interface through network infrastructure.
The RMI is also used to send notifications from the active controller to the standby-hot controller if a failure or manual reset occurs. The standby-hot controller uses the RMI to communicate to the syslog, NTP/SNTP server, FTP, and TFTP server.
It is mandatory to configure the IP addresses of the Redundancy Management Interface and the Management Interface in the same subnet on both the primary and secondary controllers.
Redundancy Port
The redundancy port is used for configuration, operational data synchronization, and role negotiation between the primary and secondary controllers.
The redundancy port checks for peer reachability by sending UDP keepalive messages every 100 milliseconds (default frequency) from the standby-hot controller to the active controller. If a failure of the active controller occurs, the redundancy port is used to notify the standby-hot controller.
If an NTP/SNTP server is not configured, the redundancy port performs a time synchronization from the active controller to the standby-hot controller.
In Cisco WiSM2, the redundancy VLAN must be configured on the Cisco Catalyst 6000 Supervisor Engine because there is no physical redundancy port available on Cisco WiSM2.
The redundancy port and the redundancy VLAN in Cisco WiSM2 are assigned an automatically generated IP address in which the last two octets are obtained from the last two octets of the RMI. The first two octets are always 169.254. For example, if the IP address of the RMI is 209.165.200.225, the IP address of the redundancy port is 169.254.200.225.
The redundancy ports can connect over an L2 switch. Ensure that the redundancy port round-trip time is less than 80 milliseconds if the keepalive timer is set to default, that is, 100 milliseconds, or 80 percent of the keepalive timer if you have configured the keepalive timer in the range of 100 milliseconds to 400 milliseconds. The failure detection time is calculated, for example, if the keepalive timer is set to 100 milliseconds, as follows: 3 * 100 = 300 + 60 = 360 + jitter (12 milliseconds) = ~400 milliseconds. Also, ensure that the bandwidth between redundancy ports is 60 Mbps or higher. Ensure that the maximum transmission unit (MTU) is 1500 bytes or higher.