This document describes the possible causes and the recommended corrective actions for a Cisco Nexus 7000 6.0KW AC Power Supply Module failure alert.
Cisco recommends that you have basic knowledge of these topics:
The information in this document is based on these software and hardware versions:
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
An N7K power supply module can be listed as failed for several different reasons, each with a different impact on the power that is provided to the chassis.
A power supply module failure can be reported in several places, such as the show environment power output and the system log:
Nexus7000# show environment power
Power Supply:
Voltage: 50 Volts
Power                              Actual        Total
Supply   Model                     Output     Capacity  Status
                                 (Watts )     (Watts )
-------  -------------------  -----------  -----------  --------------
1        N7K-AC-6.0KW               350 W       6000 W  Ok
2        N7K-AC-6.0KW               470 W       6000 W  Fail/Shut
3        N7K-AC-6.0KW               313 W       6000 W  Ok
<snip>
2013 Dec 1 22:29:20.814 Nexus7000 PLATFORM-2-PS_FAIL Power supply 2
failed or shut down (Serial number AZS1000000W)
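As a rough illustration of checking the show environment power output, this Python sketch scans captured CLI text for power supplies whose Status column is anything other than Ok. The function name, the regular expression, and the assumption that each row ends in a single-word status are illustrative choices made for this example; they are not part of NX-OS.

```python
import re

# Matches one row of the power supply table, for example:
#   "2 N7K-AC-6.0KW 470 W 6000 W Fail/Shut"
# Assumption: five whitespace-separated columns with a one-word status.
ROW = re.compile(
    r"^\s*(?P<slot>\d+)\s+(?P<model>\S+)\s+"
    r"(?P<actual>\d+)\s*W\s+(?P<capacity>\d+)\s*W\s+(?P<status>\S+)\s*$"
)


def failed_supplies(cli_output: str) -> list[tuple[int, str, str]]:
    """Return (slot, model, status) for every supply whose status is not Ok."""
    failed = []
    for line in cli_output.splitlines():
        match = ROW.match(line)
        if match and match["status"].lower() != "ok":
            failed.append((int(match["slot"]), match["model"], match["status"]))
    return failed


# Rows copied from the sample output in this document.
sample = """\
1 N7K-AC-6.0KW 350 W 6000 W Ok
2 N7K-AC-6.0KW 470 W 6000 W Fail/Shut
3 N7K-AC-6.0KW 313 W 6000 W Ok
"""
print(failed_supplies(sample))
```

Header, separator, and snipped lines do not match the pattern, so they are skipped without special handling.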
When an N7K power supply module fails, the reason for the failure is saved in the onboard 8-bit registers on the Power Supply Unit (PSU). In order to view these registers, enter the show environment power detail command into the CLI and look for the Hardware alam_bits line in the output:
Nexus7000# show environment power detail
<snip>
Power Usage Summary:
--------------------
Power Supply redundancy mode (configured) PS-Redundant
Power Supply redundancy mode (operational) PS-Redundant
Total Power Capacity (based on configured mode) 12000 W
Total Power of all Inputs (cumulative) 18000 W
Total Power Output (actual draw) 3060 W
Total Power Allocated (budget) 5593 W
Total Power Available for additional modules 6407 W
Power Usage details:
--------------------
Power reserved for Supervisor(s): 420 W
Power reserved for Fabric Module(s): 500 W
Power reserved for Fan Module(s): 1273 W
Total power reserved for Sups,Fabrics,Fans: 2193 W
Are all inlet chords connected: Yes
Power supply details:
---------------------
PS_1 total capacity: 6000 W Voltage:50V
chord 1 capacity: 3000 W
chord 1 connected to 220v AC
chord 2 capacity: 3000 W
chord 2 connected to 220v AC
Software-Alarm: No
Hardware alam_bits reg0:1A, reg1: 0, reg2: 0, reg3:10
Reg0 bit1: restarted successfully
Reg0 bit3: loss of line1
Reg0 bit4: loss of line2
Reg3 bit4: reserved
PS_2 total capacity: 6000 W Voltage:50V
chord 1 capacity: 3000 W
chord 1 connected to 220v AC
chord 2 capacity: 3000 W
chord 2 connected to 220v AC
Software-Alarm: No
Hardware alam_bits reg0: 2, reg1: 0, reg2:80, reg3: 10
Reg0 bit1: restarted successfully
PS_3 total capacity: 6000 W Voltage:50V
chord 1 capacity: 3000 W
chord 1 connected to 220v AC
chord 2 capacity: 3000 W
chord 2 connected to 220v AC
Software-Alarm: No
Hardware alam_bits reg0:1A, reg1: 0, reg2: 0, reg3:10
Reg0 bit1: restarted successfully
Reg0 bit3: loss of line1
Reg0 bit4: loss of line2
Reg3 bit4: reserved
In this example, you can see that Power Supply 2 (PS_2) has reg0 set to 2, reg2 set to 80, and reg3 set to 10 (all hexadecimal values), while reg1 is 0.
In order to determine the bits that are set in the 8-bit registers, you must convert the Hexadecimal (HEX) values into 8-bit Binary values. Here is an example:
Register | HEX Value | Binary Value | Bit Set (0 Based) |
---|---|---|---|
reg0 | 2 | 0000 0010 | 1 |
reg2 | 80 | 1000 0000 | 7 |
reg3 | 10 | 0001 0000 | 4 |
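The conversion shown in the table can also be scripted. This is a minimal Python sketch, assuming the register values appear on a single Hardware alam_bits line in the format shown in the sample output; the function names and the regular expression are hypothetical helpers written for this example.

```python
import re

# Matches register fields such as "reg0:1A" or "reg2: 80".
# The values are hexadecimal, as described in this document.
REG_FIELD = re.compile(r"reg(\d)\s*:\s*([0-9A-Fa-f]+)")


def set_bits(value: int) -> list[int]:
    """Return the 0-based positions of the bits set in an 8-bit value."""
    return [bit for bit in range(8) if value & (1 << bit)]


def decode_alarm_line(line: str) -> dict[int, list[int]]:
    """Map each register number found on the line to its list of set bits."""
    return {
        int(reg): set_bits(int(hex_value, 16))
        for reg, hex_value in REG_FIELD.findall(line)
    }


# PS_2 line copied from the sample output in this document.
line = "Hardware alam_bits reg0: 2, reg1: 0, reg2:80, reg3: 10"
for reg, bits in decode_alarm_line(line).items():
    value = sum(1 << bit for bit in bits)
    print(f"reg{reg}: {value:08b} -> bits set: {bits}")
```

For the PS_2 line, this reports bit 1 in reg0, no bits in reg1, bit 7 in reg2, and bit 4 in reg3, which agrees with the table.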
Based on the tables that are provided in this section, match the register number and the bit set in order to find the failure reason and the recommended corrective action.
Register 0 (reg0)
Bit | Default Value | Bit name | Comment | Recommended Action |
---|---|---|---|---|
7 | 0 | PEC Error | Latches to 1 if a PEC error is detected on an SMBus write cycle (read cycle PECs are checked by the Supervisor). | Reset and monitor for a reoccurrence. Look for instances of PEC errors for other devices on the SMBus. |
6 | 0 | Invalid Access | Latches to 1 if a read-only or unused register or location is written to or an unused location is read. | Reset and monitor for a reoccurrence. Look for instances of errors for other devices on the SMBus. |
5 | 0 | Data out of range | Latches to 1 if an attempt to change a control register to an invalid value. | Reset and monitor for a reoccurrence. Look for instances of errors for other devices on the SMBus. |
4 | 0 | Loss of AC 2 | AC line 2 is < spec allowed. Latched | Check the AC input. |
3 | 0 | Loss of AC 1 | AC line 1 is < spec allowed. Latched | Check the AC input. |
2 | 0 | Shutdown occurred | Latches to 1 if a supply shut down has occurred. | Check the PSU switch. |
1 | 0 | Started successfully | The power supply module can restart from a shutdown condition if the event that causes the shutdown has recovered. Set this bit to 1 once the power supply module has started successfully. It can be cleared by System software by Writing 1 to this bit. This flag provides information to the controller that an event has occurred that has been resolved. This information is useful because a restart clears all status and alarm flags and an interrupt sent out from the power supply might still be outstanding for the controller to service. | Informational Only. No action is required. |
0 | 0 | Enable pin HI | The power supply is shut down because the hardware enable signal is HI. | The PSU is grounded internally, which is expected if the PSU switch is off. If the PSU switch is on, toggle the switch. Replace the PSU. |
Register 1 (reg1)
Bit | Default Value | Bit name | Comment | Recommended Action |
---|---|---|---|---|
7 | 0 | Internal fault | Internal diagnostics failed. | Potential cosmetic issue only (refer to Cisco bug ID CSCty78612). Reset the PSU. Replace the PSU. |
6 | 0 | Power Cycle Occurred | Latched to 1 if controlled shut down occurs under: 1) Power Cycle bit register 40 bit 5 has been set | Informational only. No action is required. |
5 | 0 | 50V 2 Over-current shutdown | The supply has shut down because the 50V output 2 exceeded rated current. | Check the AC input. Reset the PSU. |
4 | 0 | 50V 1 Over-current shutdown | The supply has shut down because the 50V output 1 exceeded rated current. | Check the AC input. Reset the PSU. |
3 | 0 | 3.4V Over-current shutdown | The supply has shut down because the 3.4V output exceeded rated current. | Check the AC input. Reset the PSU. |
2 | 0 | 50V 2 Over-voltage shutdown | The supply has shut down because the 50V output 2 exceeded rated voltage. | Check the AC input. Reset the PSU. |
1 | 0 | 50V 1 Over-voltage shutdown | The supply has shut down because the 50V output 1 exceeded rated voltage. | Check the AC input. Reset the PSU. |
0 | 0 | 3.4V Over-voltage shutdown | The supply has shut down because the 3.4V output exceeded rated voltage. | Check the AC input. Reset the PSU. |
Register 2 (reg2)
Bit | Default Value | Bit name | Comment | Recommended Action |
---|---|---|---|---|
7 | 0 | Fan Fault | Latches 1 if the fan speed drops below 70% of the normal operating speed. The power supply module will not shut down because of a fan fault condition. | Check fan for obstructions. Replace the PSU. |
6 | 0 | Thermal sensor Failed | One of the thermal sensors has failed. | Replace the PSU. |
5 | 0 | Boost 2 over temp. shutdown | The supply has shutdown because of a boost 2 over-temperature condition. | Check the environment. |
4 | 0 | Boost 1 over temp. shutdown | The supply has shutdown because of a boost 1 over-temperature condition. | Check the environment. |
3 | 0 | 50V 2 over temp. shutdown | The supply has shutdown because of a 50V output 2 over-temperature condition. | Check the environment. |
2 | 0 | 50V 1 over temp. shutdown | The supply has shutdown because of a 50V output 1 over-temperature condition. | Check the environment. |
1 | 0 | 3.4V over temp. shutdown | The supply has shutdown because of a 3.4V output over-temperature condition. | Check the environment. |
0 | 0 | Over-temp warning | Issued 5 seconds prior to a thermal shutdown event. | Check the environment. |
Register 3 (reg3)
Bit | Default Value | Bit name | Comment | Recommended Action |
---|---|---|---|---|
7 | 0 | Force Shut Down | If the power supply is shut down via the power knob key, then this bit will be at logic 1; otherwise, logic 0. | Informational only. No action is required. |
6 | 0 | Unused | | |
5 | 0 | Unused | | |
4 | 0 | Input Mode Change | If the input mode of AC1 or AC2 changes, this bit is set to 1. | Informational only. No action is required. |
3 | 0 | Current Share Fault | If the two modules fail to current share, this bit is set to 1. | Reset the PSU. Replace the PSU. |
2 | 0 | 50V module 2 under voltage | The 50V output of module 2 fell below the rated voltage. Alarm only if AC2 is on. | Replace the PSU. |
1 | 0 | 50V module 1 under voltage | The 50V output of module 1 fell below the rated voltage. Alarm only if AC1 is on. | Replace the PSU. |
0 | 0 | 3.4V under voltage | The 3.4V output fell below rated voltage. | Replace the PSU. |
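Once the set bits are known, the lookup against these tables can be automated. The Python sketch below copies the bit names and recommended actions from the tables into a dictionary. It assumes that the four tables describe reg0, reg1, reg2, and reg3 in that order, which is inferred from the sample output rather than stated explicitly; the dictionary and the function name are hypothetical.

```python
# (register, bit) -> (bit name, recommended action), copied from the tables.
# Assumption: the tables appear in the order reg0, reg1, reg2, reg3.
# Unused bits (reg3 bits 5 and 6) are left out on purpose.
ALARM_BITS = {
    (0, 7): ("PEC Error", "Reset and monitor for a reoccurrence."),
    (0, 6): ("Invalid Access", "Reset and monitor for a reoccurrence."),
    (0, 5): ("Data out of range", "Reset and monitor for a reoccurrence."),
    (0, 4): ("Loss of AC 2", "Check the AC input."),
    (0, 3): ("Loss of AC 1", "Check the AC input."),
    (0, 2): ("Shutdown occurred", "Check the PSU switch."),
    (0, 1): ("Started successfully", "Informational only."),
    (0, 0): ("Enable pin HI", "Toggle the PSU switch. Replace the PSU."),
    (1, 7): ("Internal fault", "Reset the PSU. Replace the PSU."),
    (1, 6): ("Power Cycle Occurred", "Informational only."),
    (1, 5): ("50V 2 Over-current shutdown", "Check the AC input. Reset the PSU."),
    (1, 4): ("50V 1 Over-current shutdown", "Check the AC input. Reset the PSU."),
    (1, 3): ("3.4V Over-current shutdown", "Check the AC input. Reset the PSU."),
    (1, 2): ("50V 2 Over-voltage shutdown", "Check the AC input. Reset the PSU."),
    (1, 1): ("50V 1 Over-voltage shutdown", "Check the AC input. Reset the PSU."),
    (1, 0): ("3.4V Over-voltage shutdown", "Check the AC input. Reset the PSU."),
    (2, 7): ("Fan Fault", "Check fan for obstructions. Replace the PSU."),
    (2, 6): ("Thermal sensor Failed", "Replace the PSU."),
    (2, 5): ("Boost 2 over temp. shutdown", "Check the environment."),
    (2, 4): ("Boost 1 over temp. shutdown", "Check the environment."),
    (2, 3): ("50V 2 over temp. shutdown", "Check the environment."),
    (2, 2): ("50V 1 over temp. shutdown", "Check the environment."),
    (2, 1): ("3.4V over temp. shutdown", "Check the environment."),
    (2, 0): ("Over-temp warning", "Check the environment."),
    (3, 7): ("Force Shut Down", "Informational only."),
    (3, 4): ("Input Mode Change", "Informational only."),
    (3, 3): ("Current Share Fault", "Reset the PSU. Replace the PSU."),
    (3, 2): ("50V module 2 under voltage", "Replace the PSU."),
    (3, 1): ("50V module 1 under voltage", "Replace the PSU."),
    (3, 0): ("3.4V under voltage", "Replace the PSU."),
}


def explain(register: int, bit: int) -> str:
    """Describe one set bit, falling back to a generic label if unlisted."""
    name, action = ALARM_BITS.get(
        (register, bit), ("Unknown or unused", "None listed.")
    )
    return f"reg{register} bit{bit}: {name} -> {action}"


# Bits set on PS_2 in the sample output: reg0 bit 1, reg2 bit 7, reg3 bit 4.
for register, bit in [(0, 1), (2, 7), (3, 4)]:
    print(explain(register, bit))
```

The actions are abbreviated in places to keep the sketch short; the tables remain the reference for the full wording.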
In the example that is used throughout this document, Register 2, Bit 7 is set for PS_2, which indicates that the power supply fan failed. The fan was checked for obstructions (as recommended in the table), but none were found. The PSU was then replaced via Return Material Authorization (RMA).
Revision | Publish Date | Comments |
---|---|---|
1.0 | 21-May-2015 | Initial Release |