THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Revision | Publish Date | Comments |
---|---|---|
1.0 | 21-Jan-22 | Initial Release |
Affected OS Type | Affected Software Product | Affected Release | Affected Release Number | Comments |
---|---|---|---|---|
NON-IOS | Unified Computing System (UCS) Server Software Bundle | 4.1 | 4.1(2b) | For FI-Attached (UCS Manager) Servers |
NON-IOS | Unified Computing System (UCS) Server Firmware | 4.1 | 4.1(2b) | For Standalone (IMC SW) Servers |
Defect ID | Headline |
---|---|
CSCvw49192 | BIOS POST hang with 2x memory refresh rate |
During the Power On Self Test (POST), the server might hang when it performs platform power characterization. This will appear in the Keyboard/Video/Mouse (KVM) Interface as being stuck at the "Loading PTU Driver" screen. The system will not progress past POST and might also assert Catastrophic Error (CATERR) in the system event logs (SELs).
This issue only applies to Intel-based B, C, HX, and S-Series Unified Computing System (UCS) servers that run the 4.1(2b) server firmware.
UCS servers will perform a power characterization to understand what the expected minimum and maximum power required will be. This power characterization occurs if "Global Power Profiling Policy" is configured in UCS Manager or will occur automatically on UCS-C rack servers after a BIOS update. This process utilizes the Intel Node Manager Power Thermal Utility (NM-PTU) driver. In some hardware configurations with "Memory Refresh Rate" as "2x Refresh", the lower power limit test will cause the CPU to not process memory transactions, which leads to the CATERR and the hang symptom. Firmware version 4.1(2b) changed the default "Memory Refresh Rate" to "2x Refresh", and thus exposed this condition.
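The trigger conditions described above can be summarized as a simple predicate. This is a minimal sketch and not Cisco tooling: the `ServerState` fields and both function names are hypothetical, introduced only for illustration. The check reports whether the preconditions named in this notice are present. It does not predict a hang, because only some hardware configurations are affected.

```python
from dataclasses import dataclass

AFFECTED_FIRMWARE = "4.1(2b)"  # release named in this notice


@dataclass
class ServerState:
    """Hypothetical snapshot of the settings this notice refers to."""
    firmware_version: str         # e.g. "4.1(2b)"
    memory_refresh_rate: str      # BIOS token value: "1x Refresh" or "2x Refresh"
    global_power_profiling: bool  # "Global Power Profiling Policy" configured in UCS Manager
    is_c_series_rack: bool        # UCS-C rack server
    bios_just_updated: bool       # a BIOS update preceded this boot


def power_characterization_expected(s: ServerState) -> bool:
    """Characterization runs if the UCS Manager policy is configured,
    or automatically on a UCS-C rack server after a BIOS update."""
    return s.global_power_profiling or (s.is_c_series_rack and s.bios_just_updated)


def hang_preconditions_met(s: ServerState) -> bool:
    """True when every precondition described in the notice is present."""
    return (
        s.firmware_version == AFFECTED_FIRMWARE
        and s.memory_refresh_rate == "2x Refresh"
        and power_characterization_expected(s)
    )
```

Setting `memory_refresh_rate` to "1x Refresh" makes the predicate false, which mirrors why the refresh-rate change works as a workaround.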
The system does not progress past power characterization. The Power Thermal Utility (PTU) driver loads and attempts to run power characterization, but the process never completes and the system hangs at the "Loading PTU Driver" screen. POST and discovery do not complete, and the system is unusable.
During the power characterization process, some memory configurations encounter a CATERR when the BIOS token "Memory Refresh Rate" is set to "2x Refresh". Along with the CATERR, additional memory errors might be logged for one or more DIMMs. These memory errors are asserted incorrectly; they are benign and do not indicate a DIMM failure. Do not replace any DIMMs that assert errors as a result of this issue. This condition is only possible on UCS M5 servers that run the 4.1(2b) firmware. Earlier UCS server generations are not impacted by this issue.
An upgrade is required to resolve this issue. Upgrade the BIOS to a fixed version to eliminate this behavior. Once the server runs a fixed BIOS, you can use the 2x memory refresh rate without concern.
As a workaround while the server runs an impacted BIOS, change the Memory Refresh Rate BIOS setting from 2x Refresh (the default) to 1x Refresh.
If your system is stuck and will not apply the new BIOS policy, you can perform a manual BIOS downgrade to the backup BIOS stored on the server. For further instructions on how to accomplish this, contact the Cisco Technical Assistance Center (TAC).
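The remediation options above follow a simple order of preference, which can be sketched as a small decision helper. The function name, its parameters, and the returned strings are hypothetical and exist only to illustrate the ordering described in this notice. It is not a Cisco-provided procedure.

```python
def recommend_action(on_fixed_bios: bool, can_apply_bios_policy: bool) -> str:
    """Return the remediation step this notice suggests for a server.

    on_fixed_bios:          the server already runs a BIOS that contains the fix
    can_apply_bios_policy:  the server is responsive enough to accept a new BIOS policy
    """
    if on_fixed_bios:
        # The 2x refresh rate is usable once the fixed BIOS is installed.
        return "none: 2x Refresh can be used on a fixed BIOS"
    if can_apply_bios_policy:
        # Interim workaround until the BIOS upgrade can be performed.
        return "workaround: set Memory Refresh Rate to 1x Refresh, then upgrade the BIOS"
    # Server is stuck and cannot take the new policy.
    return "contact TAC: manual downgrade to the backup BIOS"
```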
This issue is restricted only to the 4.1(2b) version of UCS server firmware. Only Intel-based M5 servers are impacted by this issue. Both B and C Series M5 servers are impacted. Both Standalone and UCS Manager Integrated Rack servers are impacted on the 4.1(2b) release.
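The criteria above can be applied to an exported inventory with a short filter. This is a minimal sketch: the `Server` record, its field names, and the sample inventory entries are all hypothetical and do not reflect any UCS Manager, Cisco IMC, or Intersight export format. Only the matching rule (Intel-based, M5 generation, firmware 4.1(2b)) comes from this notice.

```python
from dataclasses import dataclass

AFFECTED_FIRMWARE = "4.1(2b)"  # the only impacted release per this notice


@dataclass
class Server:
    """Hypothetical inventory record; field names are illustrative only."""
    name: str
    generation: str        # e.g. "M4", "M5"
    cpu_vendor: str        # e.g. "Intel"
    firmware_version: str  # e.g. "4.1(2b)"


def is_impacted(server: Server) -> bool:
    """Impacted only if Intel-based, M5 generation, and on firmware 4.1(2b)."""
    return (
        server.generation.upper() == "M5"
        and server.cpu_vendor.lower() == "intel"
        and server.firmware_version.strip() == AFFECTED_FIRMWARE
    )


# Made-up sample data for illustration.
inventory = [
    Server("blade-4/1", "M5", "Intel", "4.1(2b)"),
    Server("s3260-1", "M4", "Intel", "4.1(2b)"),   # impacted release, but M4 hardware
    Server("rack-2", "M5", "Intel", "4.1(1c)"),    # M5 hardware, different release
]

impacted = [s.name for s in inventory if is_impacted(s)]
print(impacted)  # ['blade-4/1']
```

Filtering on firmware version alone is not sufficient, because a server on 4.1(2b) that is not an Intel-based M5 is not impacted.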
UCS Manager
The simplest way to identify an impacted UCS Managed server is through the UCS Manager CLI. Log into your UCS Manager and enter the show server firmware bios command. In this example, server 4/1 runs the impacted 4.1(2b) version:
Standalone C-Series (Cisco Integrated Management Controller)
The Cisco Integrated Management Controller (Cisco IMC) version is listed on the Summary Home page of your HTTP UI. Log into your server’s Cisco IMC and check the BIOS Version and Firmware Version. In this example, the impacted 4.1(2b) version is shown.
Intersight
Impacted servers can be identified via Intersight. Choose OPERATE > Servers and create a new filter for Firmware Version 4.1(2b). In this example, a server runs the impacted 4.1(2b) version; however, it is an S3260 M4 server and thus NOT impacted.
If you require further assistance, or if you have any further questions regarding this field notice, please contact the Cisco Systems Technical Assistance Center (TAC).