Switchover in DCNM HA
Problem
When the Active node (node A) goes down, the Standby node (node B) takes over the Active role. However, when node A comes back up, it takes the Active role again. This condition is known as Switchover in DCNM HA. The old Active node must not take the Active role again unless a failover is triggered or the HA heartbeat instances cannot talk to each other.
Possible Cause
This occurs when the shutdown or no shutdown command is executed on the switch interfaces connected to the DCNM eth1 interfaces. Heartbeat detects a split-brain condition. Because both nodes detect this condition, both nodes shut down and restart. As a result, the DCNM Web UI on node A becomes active again once node A is operational.
Solution
The HA Ping feature allows you to shut down heartbeat instances that cannot reach (ping) a specified device on the network.
Configure the HA Ping IP address and the peer IP address by running the following commands on both HA nodes.
HA_PING_ADDRESS=
PEER_ETH1_IP=
echo "* * * * * root /sbin/ha-ping.sh" > /etc/cron.d/ha-ping
echo "IP=$HA_PING_ADDRESS" > $DCNM_HA_HOME/ha-ping.conf
echo "PEER_IP=$PEER_ETH1_IP" >> $DCNM_HA_HOME/ha-ping.conf
chkconfig heartbeat off
sed -i "s/APP_STATUS_HEARTBEAT=.*/APP_STATUS_HEARTBEAT=ha-ping/g" /root/.DO_NOT_DELETE
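As a sanity check before touching a live node, the two echo commands that build ha-ping.conf can be dry-run against a scratch directory to see the file they produce. This is only a sketch: the two addresses are documentation placeholders, and SCRATCH_DIR is a hypothetical stand-in for $DCNM_HA_HOME.

```shell
# Dry run: write ha-ping.conf into a scratch directory instead of $DCNM_HA_HOME.
# Both addresses are placeholders (assumptions), not values from a real deployment.
HA_PING_ADDRESS=192.0.2.1
PEER_ETH1_IP=192.0.2.12
SCRATCH_DIR=$(mktemp -d)

# Same two commands as in the procedure, pointed at the scratch directory.
echo "IP=$HA_PING_ADDRESS" > "$SCRATCH_DIR/ha-ping.conf"
echo "PEER_IP=$PEER_ETH1_IP" >> "$SCRATCH_DIR/ha-ping.conf"

# Print the result: an IP= line followed by a PEER_IP= line.
cat "$SCRATCH_DIR/ha-ping.conf"
```

Note that the first echo uses > (overwrite) and the second uses >> (append), so rerunning the pair always yields a clean two-line file.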
To avoid an HA switchover, increase the Heartbeat deadtime to 60 or 90 seconds. This prevents the issue from recurring when the shutdown and no shutdown commands are issued 30 to 60 seconds apart.
To increase the timers and thereby avoid the HA switchover, you must edit the deadtime specified in the /etc/ha.d/ha.cfg file.
The following example shows how to edit the /etc/ha.d/ha.cfg file.
dcnm-standby# appmgr stop ha-apps
dcnm-active# appmgr stop ha-apps
Edit the deadtime value in the /etc/ha.d/ha.cfg file on both nodes.
Run appmgr start ha-apps on the old Active node.
dcnm-active# appmgr start ha-apps
Wait until the Active node is active again. Verify the role using the show ha-role command.
Run appmgr start ha-apps on the old Standby node.
dcnm-standby# appmgr start ha-apps
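The deadtime edit in the procedure above can also be scripted. The sketch below works on a scratch copy with illustrative contents rather than on the real /etc/ha.d/ha.cfg, and it assumes the file uses Heartbeat's usual "directive value" layout (for example, a line reading deadtime 30). Check the actual file on your nodes before adapting it.

```shell
# Work on a scratch file; the sample contents are illustrative only and
# assume a "deadtime <seconds>" line, which may differ on a real node.
CFG=$(mktemp)
printf 'keepalive 2\ndeadtime 30\nwarntime 10\n' > "$CFG"

# Target value in seconds (the text recommends 60 or 90).
NEW_DEADTIME=60

# Rewrite only the line that starts with "deadtime"; other lines are untouched.
sed -i "s/^deadtime[[:space:]].*/deadtime $NEW_DEADTIME/" "$CFG"

# Confirm the change.
grep '^deadtime' "$CFG"
```

Anchoring the pattern at the start of the line keeps the substitution from touching comments or other directives that merely mention deadtime. On a real deployment the same edit would have to be made on both nodes while the HA applications are stopped, as the procedure describes.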