Cluster Quorum and Master Switch Election
This section describes the Cisco IOA cluster quorum and the process for electing the master switch in a cluster.
Node ID
Every switch in a cluster has a node ID. Cisco IOA assigns a node ID to every new switch as it is added to the cluster. The switch where the cluster is created is assigned the node ID of 1. This is the master switch. When a new switch is added to the cluster, it is assigned the next available higher node ID. For example, when a second switch is added to the cluster it gets the node ID of 2 and the third switch gets the node ID of 3, and so on.
Cluster View
The cluster view is the set of switches that are part of the operational cluster.
Cluster Quorum
For a cluster to be operational, the cluster view must in general include more than half of the configured switches. In an N-node cluster, N/2 + 1 nodes form a cluster quorum, where N/2 is rounded down to a whole number.
If N is even, N/2 nodes also form a cluster quorum, provided that they include the switch with the lowest node ID.
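The quorum logic ensures that in the event of cluster partitions at most one partition can be operational; in an even split, the partition that contains the switch with the lowest node ID remains operational. All other switches are nonoperational. This guarantees the consistency of the cluster.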
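The quorum rule can be expressed as a small predicate. The following Python sketch is illustrative only and is not Cisco IOA code: the function name `has_quorum` and the representation of a cluster view as a set of node IDs are assumptions made for this example, and N/2 is taken as integer division.

```python
def has_quorum(view, configured):
    """Return True if the switches in `view` may form an operational cluster.

    view:       set of node IDs that can currently reach each other
    configured: set of all node IDs configured in the cluster
    """
    members = set(view) & set(configured)   # ignore IDs that are not configured
    n = len(configured)
    if n == 0:
        return False
    if len(members) >= n // 2 + 1:          # more than half of the switches
        return True
    # Even-sized cluster: exactly half suffices if it holds the lowest node ID.
    return n % 2 == 0 and len(members) == n // 2 and min(configured) in members


# A partitioned three-switch cluster: only the two-switch side has a quorum.
majority_side = has_quorum({1, 2}, {1, 2, 3})
minority_side = has_quorum({3}, {1, 2, 3})
```

Because two disjoint partitions cannot both hold more than half of the switches, and only one partition can hold the lowest node ID, this predicate is true for at most one partition at a time.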
Master Switch Election
When a cluster is created, the switch on which the cluster is created becomes the cluster master switch. When the master switch fails or is rebooted, another switch takes over as the master switch. The master election logic uses the node ID and the latest cluster configuration to determine which switch in the cluster will become the master switch. The master election logic is described as follows:
- If the master switch fails in an operational cluster, the switch with the next lowest node ID takes over as the master switch.
Note that in an operational cluster, all the switches run the same cluster configuration.
- When the previous master switch comes back online and joins the cluster, it does not immediately become the master.
- When all the switches of a cluster are coming up, the switch that has the latest cluster configuration becomes the master switch. If there are multiple switches with the same configuration, the switch with the lowest node ID is chosen to be the master switch.
- Once a master switch is chosen and the cluster is operational (there is a quorum), even if a switch with a lower node ID joins the cluster at a later time, the master switch does not change.
For example, there are three switches S1, S2, and S3 with node IDs 1, 2, and 3, respectively. If switches S2 and S3 form a quorum then switch S2 becomes the master switch. Even if switch S1 with the node ID of 1 comes up and joins the cluster at a later time, switch S2 continues to be the master. However, if switch S2 goes down for any reason, switch S1 will become the master switch.
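The election rules above can be sketched as a single function. This is a hedged illustration, not the actual Cisco IOA algorithm: the name `elect_master` is hypothetical, and an integer revision number per switch (`config_version`, higher meaning newer) is an assumed stand-in for "the latest cluster configuration". The view passed in is assumed to already have a quorum.

```python
def elect_master(view, config_version, current_master=None):
    """Pick the master for the switches in `view` (assumed to have a quorum).

    view:            set of node IDs in the operational cluster
    config_version:  dict of node ID -> revision number (assumed model;
                     a higher number means a newer cluster configuration)
    current_master:  node ID of the serving master, or None
    """
    if current_master in view:
        return current_master            # a serving master is not displaced
    # Newest configuration wins; ties go to the lowest node ID.
    return min(view, key=lambda node: (-config_version[node], node))


versions = {1: 4, 2: 4, 3: 4}            # operational cluster: same configuration
master = elect_master({2, 3}, versions)                         # S2 and S3 form a quorum
master_after_join = elect_master({1, 2, 3}, versions, master)   # S1 joins later
master_after_fail = elect_master({1, 3}, versions, master)      # S2 goes down
```

The three calls replay the S1, S2, S3 example: S2 is elected, S2 remains master when S1 joins, and S1 takes over only after S2 leaves the view.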
Two-Switch Cluster Scenarios
According to the cluster quorum logic, a cluster with two configured switches can be operational if both switches are operational or the switch with the lowest node ID is operational.
In the latter case, the switch with the lowest node ID is the master of the one-switch cluster. The other switch could have failed or simply lost connectivity to the operational switch. In either case, the switch with the higher node ID would become nonoperational. If the node with the lower node ID failed, the other switch cannot form an operational cluster.
The examples that follow describe these scenarios. The first three examples consider single switch failures.
- Assume that in a two-switch cluster with switches S1 (node ID 1) and S2 (node ID 2), S1 is the master (the master has the lower node ID).
When the switches lose connectivity between them, the master switch S1 continues to be operational because it has the lower node ID and can form a one-switch cluster (N/2 with lowest node ID). Switch S2 becomes nonoperational.
- Assume that in a two-switch cluster with switches S1 (node ID 1) and S2 (node ID 2), S2 is the master (note that the master has the higher node ID because it had the latest configuration when both switches came online).
When the switches lose connectivity between them, switch S2 becomes nonoperational and S1 takes over as the master to form a 1-switch cluster. This is consistent with the quorum logic in a two-switch cluster (N/2 with lowest node ID).
- Assume a two-switch cluster with switches S1 (node ID 1) and S2 (node ID 2). If S1 fails (regardless of which switch was the master), S2 also becomes nonoperational and remains so for as long as S1 is down.
When S1 comes up, S1 and S2 will form a two-switch cluster.
The next set of examples describes reboots of both switches (S1 with node ID 1 and S2 with node ID 2):
Caution: If you perform any configuration change on a cluster, you must save the running configuration to the startup configuration by entering the copy running-config startup-config CLI command on all switches before rebooting them. Otherwise, the cluster may not form correctly after the reboot (see the third example below).
- After a reboot, if both switches S1 and S2 come up at about the same time, a two-switch cluster will be formed.
- If the cluster configurations are the same, S1 (with the lower node ID) will become the master.
- If the cluster configurations are different, the switch with the latest cluster configuration will become the master.
- After a reboot, if switch S2 comes up first, it will not be able to form a cluster until S1 also comes up. After that, the algorithm explained in the previous case will be used.
- After a reboot, if switch S1 comes up first, it will form a one-switch cluster (N/2 with lowest node ID). When S2 comes up, it will join the cluster to form a two-switch cluster.
However, if S2 comes up with the latest cluster configuration in its startup configuration (this can happen if you saved the running configuration to the startup configuration on S2 but not on S1), it will not be able to join the cluster formed by S1.
Caution: It is critical that you save the running configuration on all switches before a reboot.
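The effect of boot order and unsaved configuration can be explored with a small simulation. This is a sketch under stated assumptions, not Cisco IOA behavior verbatim: `simulate_boot` and `has_quorum` are hypothetical names, an integer revision per switch stands in for the saved cluster configuration, and switches that are already up when the cluster forms are assumed to adopt the master's configuration. The only rejection rule modeled is the one described above: a switch that arrives later with a newer configuration than the running cluster cannot join.

```python
def has_quorum(up, configured):
    """Quorum rule: more than half, or exactly half including the lowest node ID."""
    n, present = len(configured), len(up & configured)
    if present >= n // 2 + 1:
        return True
    return n % 2 == 0 and present == n // 2 and min(configured) in up


def simulate_boot(boot_order, config_version, configured):
    """Replay switches coming up one at a time.

    Returns (master, cluster members, switches that could not join).
    """
    up, cluster, rejected = set(), set(), set()
    master, cluster_version = None, None
    for node in boot_order:
        up.add(node)
        if master is None:
            if has_quorum(up, configured):
                # Cluster forms: newest configuration wins, ties to lowest ID.
                master = min(up, key=lambda s: (-config_version[s], s))
                cluster_version = config_version[master]
                cluster = set(up)
        elif config_version[node] > cluster_version:
            rejected.add(node)       # newer configuration than the running cluster
        else:
            cluster.add(node)
    return master, cluster, rejected


# S1 boots first with a stale saved configuration; S2 was saved after the change.
stale_s1 = simulate_boot([1, 2], {1: 3, 2: 5}, {1, 2})
# Same configurations, S2 boots first and must wait for S1.
s2_first = simulate_boot([2, 1], {1: 3, 2: 3}, {1, 2})
```

In the first run S1 forms a one-switch cluster and S2 is left out, which is the situation the caution warns about; in the second run the cluster forms only after S1 is up, with S1 as master.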
Three-Switch Cluster Scenarios
In a three-switch cluster, the quorum requires two switches to be in the cluster view (N/2 + 1). The examples below explain three scenarios in a three-switch cluster with switches S1 (node ID 1), S2 (node ID 2) and S3 (node ID 3). S1 is the master switch.
- In a three-switch operational cluster, if switch S3 fails or loses connectivity with the other two switches, then S3 becomes nonoperational. Switches S1 and S2 will form an operational cluster. When S3 comes up again, it will rejoin the cluster.
- In a three-switch operational cluster, if the master switch S1 fails or loses connectivity with the other two switches, then S1 becomes nonoperational. Switches S2 and S3 will form an operational cluster and S2 will be the master. When S1 comes up again, it will rejoin the cluster. Note that S2 will continue to be the master.
- If two switches fail, the cluster will become nonoperational.
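The three failure scenarios above can be replayed with a short sketch that tracks the master as the set of reachable switches changes. The function name `next_master` is hypothetical, and the sketch assumes that all switches in the operational cluster run the same cluster configuration, so that only node IDs decide the election.

```python
CONFIGURED = {1, 2, 3}   # S1, S2, S3


def next_master(reachable, master):
    """Return the master after the reachable set changes, or None if the
    remaining switches cannot form an operational cluster."""
    members = reachable & CONFIGURED
    if len(members) < len(CONFIGURED) // 2 + 1:   # three switches: need two
        return None
    if master in members:
        return master                  # a serving master stays master
    return min(members)                # otherwise the lowest node ID takes over


m = 1                                  # S1 starts as the master
after_s3_fails = next_master({1, 2}, m)
after_s1_fails = next_master({2, 3}, m)
after_s1_rejoins = next_master({1, 2, 3}, after_s1_fails)
after_two_fail = next_master({3}, after_s1_fails)
```

The odd cluster size means the lowest-node-ID tie-break for even-sized clusters never applies here.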
These examples describe reboots of all switches in the cluster:
Caution: If you perform any configuration change on a cluster, you must save the running configuration to the startup configuration by entering the copy running-config startup-config command on all switches before rebooting them. Otherwise, the cluster may not form correctly after the reboot.
- After a reboot, if all switches come up at about the same time, first a 2-switch cluster will be formed and later the third switch will be added.
- If the cluster configurations are the same, S1 (with the lowest node ID) will become the master switch, form the 2-switch cluster first, and then add the third switch.
- If the cluster configurations are different, the switch that is running the latest configuration will become the master switch, form a 2-switch cluster, and then add the third switch.
- After a reboot, if the switches come up one at a time, a 2-switch cluster will be formed after the first two switches are up. Later, when the third switch comes online, it will join the cluster.
If the third switch happens to be running the latest cluster configuration in the startup configuration (this can happen if you save the running configuration only on this switch but not on the other two), the third switch will not be able to join the cluster.
Caution: It is critical that you save the running configuration on all switches before a reboot.
Four-Switch Cluster Scenarios
The four-switch cluster scenario is very similar to the examples above. The cluster will be operational if the cluster view has at least three switches (N/2 + 1), or if the cluster view has two switches including the switch with the lowest node ID (N/2 with lowest node ID).
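To make the four-switch rule concrete, the following sketch enumerates every possible cluster view of a four-switch cluster and keeps those that are operational. The names `is_operational`, `operational_pairs`, and `operational_views` are chosen for this illustration only.

```python
from itertools import combinations

CONFIGURED = (1, 2, 3, 4)


def is_operational(view):
    """Quorum rule for the configured cluster: N/2 + 1 switches, or N/2
    switches that include the lowest node ID (N is even here)."""
    n = len(CONFIGURED)
    if len(view) >= n // 2 + 1:
        return True
    return len(view) == n // 2 and min(CONFIGURED) in view


# Two-switch views: only those containing S1 (node ID 1) are operational.
operational_pairs = [set(p) for p in combinations(CONFIGURED, 2) if is_operational(p)]

# All operational views of any size.
operational_views = [
    set(v)
    for size in range(1, len(CONFIGURED) + 1)
    for v in combinations(CONFIGURED, size)
    if is_operational(v)
]
```

Of the six possible two-switch views, only the three that contain node ID 1 qualify; together with the four three-switch views and the full cluster, that gives eight operational views in total.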