Recovering Fabric Interconnect During Upgrade
If one or both fabric interconnects fail during failover or firmware upgrade, you can recover them by using one of the following approaches:
- Recover a fabric interconnect when you do not have a working image on the fabric interconnect
- Recover a fabric interconnect when you have a working image on the fabric interconnect
- Recover an unresponsive fabric interconnect during upgrade or failover
- Recover fabric interconnects from a failed FSM during upgrade with Auto Install
Recovering Fabric Interconnects When You Do Not Have Working Images on the Fabric Interconnect or the Bootflash
You can perform these steps when one or both fabric interconnects go down during a firmware upgrade, reboot, and are stuck at the loader prompt, and you do not have working images on the fabric interconnect.
Procedure
Step 1
Reboot the switch, and in the console, press Ctrl+L as it boots to get the loader prompt.
Step 2
Configure the interface to receive the kickstart image through TFTP.
Step 3
Enter the init system command at the switch(boot)# prompt. This reformats the fabric interconnect.
Step 4
Configure the management interface.
Step 5
Copy the kickstart, system, and Cisco UCS Manager management images from the TFTP server to the bootflash.
Step 6
Create separate directories for installables and installables/switch in the bootflash.
Step 7
Copy the kickstart, system, and Cisco UCS Manager images to the installables/switch directory.
Step 8
Ensure that the management image is linked to nuova-sim-mgmt-nsg.0.1.0.001.bin. nuova-sim-mgmt-nsg.0.1.0.001.bin is the name that the reserved system image uses, and it makes the management image Cisco UCS Manager-compliant.
Step 9
Reload the switch.
Step 10
Boot from the kickstart image.
Step 11
Load the system image. The Basic System Configuration Dialog wizard appears after the system image is completely loaded. Use this wizard to configure the fabric interconnect.
Step 12
Log in to Cisco UCS Manager and download the firmware.
Step 13
After the firmware download is complete, activate the fabric interconnect firmware and Cisco UCS Manager firmware. This step updates Cisco UCS Manager and the fabric interconnects to the version you want, and then reboots them.
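The procedure above can be sketched as the following console session. This is an illustrative outline only, not output captured from a real system. The IP addresses (taken from the 192.0.2.0/24 documentation range), the TFTP server address, the fabric interconnect name UCS-A, and every name in angle brackets are placeholders assumed for this example. Lines that begin with ! are annotations, not commands to type. Exact syntax can vary by platform and release, so verify each command against the CLI reference for your version before you use it.

```
! Steps 1-2: at the loader prompt, assign a temporary address and boot the kickstart image over TFTP
loader> set ip 192.0.2.20 255.255.255.0
loader> set gw 192.0.2.1
loader> boot tftp://192.0.2.10/<kickstart-image>.bin

! Step 3: reformat the fabric interconnect
switch(boot)# init system

! Step 4: configure the management interface
switch(boot)# config t
switch(boot)(config)# interface mgmt0
switch(boot)(config-if)# ip address 192.0.2.20 255.255.255.0
switch(boot)(config-if)# no shut
switch(boot)(config-if)# exit
switch(boot)(config)# ip default-gateway 192.0.2.1
switch(boot)(config)# exit

! Step 5: copy the three images to the bootflash
switch(boot)# copy tftp://192.0.2.10/<kickstart-image>.bin bootflash:
switch(boot)# copy tftp://192.0.2.10/<system-image>.bin bootflash:
switch(boot)# copy tftp://192.0.2.10/<ucs-manager-image>.bin bootflash:

! Steps 6-7: create the directories and place the images in installables/switch
switch(boot)# mkdir bootflash:installables
switch(boot)# mkdir bootflash:installables/switch
switch(boot)# copy bootflash:<kickstart-image>.bin bootflash:installables/switch/
switch(boot)# copy bootflash:<system-image>.bin bootflash:installables/switch/
switch(boot)# copy bootflash:<ucs-manager-image>.bin bootflash:installables/switch/

! Step 8: make the management image available under the reserved name
switch(boot)# copy bootflash:installables/switch/<ucs-manager-image>.bin nuova-sim-mgmt-nsg.0.1.0.001.bin

! Steps 9-11: reload, boot the kickstart image, then load the system image
switch(boot)# reload
loader> boot <kickstart-image>.bin
switch(boot)# load <system-image>.bin

! Step 12: after initial setup, download the firmware from the Cisco UCS Manager CLI
UCS-A# scope firmware
UCS-A /firmware # download image tftp://192.0.2.10/<infrastructure-bundle>.bin
UCS-A /firmware # show download-task

! Step 13: activate the fabric interconnect and Cisco UCS Manager firmware
UCS-A# scope fabric-interconnect a
UCS-A /fabric-interconnect # activate firmware kernel-version <kernel-version> system-version <system-version>
UCS-A /fabric-interconnect* # commit-buffer
UCS-A# scope system
UCS-A /system # activate firmware <ucs-manager-version>
UCS-A /system* # commit-buffer
```

The sketch uses the same placeholder for an image wherever it appears, so the file copied in Step 5 is the file that is moved in Step 7 and booted in Steps 10 and 11.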
Recovering Fabric Interconnect During Upgrade When You Have Working Images on the Bootflash
You can perform these steps when one or both fabric interconnects go down during a firmware upgrade, reboot, and are stuck at the loader prompt.
Before you begin
You must have working images on the bootflash to perform these steps.
Procedure
Step 1
Reboot the switch, and in the console, press Ctrl+L as it boots to get the loader prompt.
Step 2
Run the dir command. The list of available kernel, system, and Cisco UCS Manager images in the bootflash appears.
Step 3
Boot the kernel firmware version from the bootflash.
Step 4
Ensure that the management image is linked to nuova-sim-mgmt-nsg.0.1.0.001.bin. nuova-sim-mgmt-nsg.0.1.0.001.bin is the name that the reserved system image uses, and it makes the management image Cisco UCS Manager-compliant.
Step 5
Load the system image.
Step 6
Log in to Cisco UCS Manager and update your fabric interconnect and Cisco UCS Manager software to the version that you want.
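Steps 2 through 5 of this procedure can be sketched as the following console session. This is a minimal outline under stated assumptions, not output from a real system. The image names in angle brackets are placeholders. Substitute the file names that the dir command lists on your bootflash. Lines that begin with ! are annotations, not commands to type.

```
! Step 2: list the images that are available on the bootflash
loader> dir

! Step 3: boot the kernel (kickstart) image
loader> boot <kickstart-image>.bin

! Step 4: make the management image available under the reserved name
switch(boot)# copy <ucs-manager-image>.bin nuova-sim-mgmt-nsg.0.1.0.001.bin

! Step 5: load the system image
switch(boot)# load <system-image>.bin
```

Choose kickstart and system images from the same release when you boot and load them, because the two images are meant to be used as a matched pair.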
Recovering Unresponsive Fabric Interconnects During Upgrade or Failover
During upgrade or failover, avoid performing the following tasks because they introduce additional risk:
- Pmon stop/start
- Fabric interconnect (FI) reboots, whether by power cycle or CLI
- HA failover
Procedure
Step 1
If the httpd_cimc.sh process is lost, as documented in CSCup70756, you lose access to the KVM. Continue with the failover or contact the Cisco Technical Assistance Center (TAC).
Step 2
If you lose access to the KVM on the primary side, continue with the failover to resolve the issue.
Step 3
If KVM is needed or is down on the subordinate side, start only that service using the debug plugin. Contact Cisco TAC to run the debug image.
Step 4
If the /dev/null issue is encountered, as documented in CSCuo50049, set the permissions to 666 with the debug plugin at both steps if required. Contact Cisco TAC to run debug commands.
Step 5
Encountering both CSCup70756 and CSCuo50049 can cause loss of the virtual IP (VIP). If the VIP is lost, do the following:
Recovering Fabric Interconnects From a Failed FSM During Upgrade With Auto Install
You can perform these steps when all the following occur:
- You are upgrading or downgrading firmware using Auto Install between Cisco UCS Manager Release 3.1(2) and Release 3.1(3) while a service pack is installed on the fabric interconnects.
- One or both fabric interconnects go down because of an FSM failure or multiple retries in the DeployPollActivate stage of the FSM.
Procedure
Step 1
When the FSM fails, or when multiple retries are observed in the DeployPollActivate stage of the FSM on the subordinate fabric interconnect, do the following:
Step 2
Upgrade the infrastructure firmware using the force option through Auto Install.
Step 3
Acknowledge the reboot of the primary fabric interconnect.
Step 4
When the FSM fails, or when multiple retries are observed in the DeployPollActivate stage of the FSM on the current subordinate fabric interconnect, do the following:
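For Steps 2 and 3 of this procedure, the Cisco UCS Manager CLI session might look like the following sketch. It is illustrative only. The fabric interconnect name UCS-A and the <infra-version> value are placeholders assumed for this example, and the confirmation prompts that the CLI prints are omitted. Lines that begin with ! are annotations, not commands to type. Verify the syntax against the CLI reference for your release.

```
UCS-A# scope firmware
UCS-A /firmware # scope auto-install

! Step 2: upgrade the infrastructure firmware with the force option
UCS-A /firmware/auto-install # install infra infra-vers <infra-version> force

! Step 3: acknowledge the reboot of the primary fabric interconnect
UCS-A /firmware/auto-install # acknowledge primary fabric-interconnect reboot
UCS-A /firmware/auto-install* # commit-buffer
```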