THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Affected Product Name | Description | Comments |
---|---|---|
DS-C9396S-K9 | MDS 9396S Fabric Switch (48 ports) | |
DS-X9648-1536K9 | MDS 9700 48-Port 32-Gbps Fibre Channel Switching Module | |
Defect ID | Headline |
---|---|
CSCwh53262 | 'machine check' error triggers reload on MDS 9396S switch and warning on MDS 32 Gbps FC linecard |
This issue has platform-dependent symptoms:
Only a subset of devices are affected by this issue. It is not possible to determine in advance which devices will be affected or when.
This issue applies to the above products only when running one of the following Cisco MDS NX-OS releases:
Earlier releases were affected by a similar issue described in FN72346.
This issue is caused by a PCIe (Peripheral Component Interconnect Express) operation that might fail after an indeterminate runtime. The success or failure of the operation depends on which address is being accessed by the operation. This can lead to a variety of symptoms and prevents prediction of which devices will be affected.
The symptom is different for each product:
Reason: Reset triggered due to HA policy of Reset Service: f16_mac_usd hap reset
Version: 8.4(2d)
%MODULE-4-MOD_WARNING: Module 1 (Serial number: ...) reported warning 1/1-1/0 due to EOBC heartbeat failure on standby sup in device DEV_EOBC_MAC (device error 0xc0a0504f)
%MODULE-4-MOD_WARNING: Module 1 (Serial number: ...) reported warning 1/1-1/0 due to EOBC heartbeat failure in device DEV_EOBC_MAC (device error 0xc0a0514d)
In both cases, a machine check error and the value 'MCSR=0x80200000' are logged after the event. To determine whether a device has been affected by this issue, run the following commands:
If the device is affected by this issue, the following output is displayed:
Machine check in kernel mode on Core ... comm [...] pid [...] L2 Machine check on Core ...: MCSR=0x80200000 ...
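As a rough illustration of the check described above, the following Python sketch scans captured log text for the machine check signature quoted in this notice. The function name `find_machine_check_lines` is hypothetical, and the sketch assumes the log has already been saved as text and that the 'Machine check' wording and the MCSR value appear on the same line, as in the output shown above. It is not a Cisco tool.

```python
import re
from typing import List

# MCSR value quoted in this field notice.
AFFECTED_MCSR = 0x80200000

# Matches "MCSR=0x..." and captures the hexadecimal value.
MCSR_PATTERN = re.compile(r"MCSR=(0x[0-9a-fA-F]+)")


def find_machine_check_lines(log_text: str) -> List[str]:
    """Return lines that mention a machine check with the MCSR value above.

    Assumption: 'Machine check' and 'MCSR=...' share one log line.
    """
    hits = []
    for line in log_text.splitlines():
        match = MCSR_PATTERN.search(line)
        if match is None or "machine check" not in line.lower():
            continue
        if int(match.group(1), 16) == AFFECTED_MCSR:
            hits.append(line.strip())
    return hits


sample = "L2 Machine check on Core ...: MCSR=0x80200000 ..."
matches = find_machine_check_lines(sample)
```

An empty result only means the signature was not found in the text supplied; it does not show that a device is unaffected.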
Solution: To permanently resolve this issue, upgrade to a fixed version of Cisco MDS NX-OS. This issue is resolved in Cisco MDS NX-OS 9.4(2) and later releases.
Workaround:
For Cisco MDS 9396S Fabric switches, there is no workaround.
To determine the release of Cisco MDS NX-OS running on a device, enter the show version CLI command. The release is shown in the 'system' field, as shown in the following example:
switch# show version
...
Software
  BIOS: version 5.2.25
  loader: version N/A
  kickstart: version 9.2(1)
  system: version 9.2(1)
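For checking many devices, the 'system' field can be extracted from saved show version output and compared with the fixed release named in this notice, 9.4(2). The sketch below is a minimal illustration, not a Cisco utility: the function names are hypothetical, and it assumes that release strings follow the major.minor(maintenance) pattern with an optional letter suffix and that they order the same way as the parsed tuples. A release below 9.4(2) is not necessarily one of the affected releases; the result only indicates whether the device runs a release that contains the fix.

```python
import re

FIXED_RELEASE = "9.4(2)"  # first fixed release named in this notice


def parse_release(release: str) -> tuple:
    """Split a release such as '8.4(2d)' into a comparable tuple.

    Assumption: releases look like major.minor(maintenance[letters]).
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\((\d+)([a-z]*)\)", release.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {release!r}")
    major, minor, maintenance, suffix = match.groups()
    return (int(major), int(minor), int(maintenance), suffix)


def system_release(show_version_output: str) -> str:
    """Return the value of the 'system' field from show version output."""
    match = re.search(r"system:\s+version\s+(\S+)", show_version_output)
    if match is None:
        raise ValueError("no 'system: version' field found")
    return match.group(1)


def is_on_fixed_release(show_version_output: str) -> bool:
    """True if the system release is 9.4(2) or later (assumed ordering)."""
    current = parse_release(system_release(show_version_output))
    return current >= parse_release(FIXED_RELEASE)


sample = "kickstart: version 9.2(1)\nsystem: version 9.2(1)\n"
on_fixed = is_on_fixed_release(sample)  # False for the 9.2(1) example
```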
To determine a device's Product ID, enter the show inventory CLI command. The Product ID is shown in the 'PID' field.
switch# show inventory
NAME: "Chassis", DESCR: "MDS 9396S 96X16G FC (2 RU) Chassis"
PID: DS-C9396S-K9 , VID: V00 , SN: JPG191800A0
switch# show inventory
NAME: "Slot 1" , DESCR: "4/8/16/32 Gbps Advanced FC Module"
PID: DS-X9648-1536K9 , VID: V01 , SN: JAE210308AT
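The PID lookup above can also be applied to saved show inventory output. The following sketch, offered as an illustration under stated assumptions, lists inventory entries whose PID matches one of the two affected products in this notice. The function name `affected_items` is hypothetical, and the parsing assumes each NAME line is followed by its PID line, as in the examples above.

```python
import re
from typing import List, Tuple

# Affected Product IDs listed in this field notice.
AFFECTED_PIDS = {"DS-C9396S-K9", "DS-X9648-1536K9"}


def affected_items(show_inventory_output: str) -> List[Tuple[str, str]]:
    """Return (name, pid) pairs for inventory entries with an affected PID.

    Assumption: a NAME line precedes the PID line of the same entry.
    """
    items = []
    name = ""
    for line in show_inventory_output.splitlines():
        name_match = re.search(r'NAME:\s*"([^"]*)"', line)
        if name_match:
            name = name_match.group(1)
        pid_match = re.search(r"PID:\s*([^\s,]+)", line)
        if pid_match and pid_match.group(1) in AFFECTED_PIDS:
            items.append((name, pid_match.group(1)))
    return items


sample = (
    'NAME: "Slot 1" , DESCR: "4/8/16/32 Gbps Advanced FC Module"\n'
    "PID: DS-X9648-1536K9 , VID: V01 , SN: JAE210308AT\n"
)
found = affected_items(sample)
```

A matching PID shows only that the hardware is one of the affected products; whether the issue applies also depends on the running release.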
This issue is due to an incomplete fix for FN72346.
Version | Description | Section | Date |
---|---|---|---|
1.1 | Updated fixed releases and workarounds. | Problem Description, Workaround/Solution | 2024-DEC-16 |
1.0 | Initial Release | — | 2024-MAR-27 |
For further assistance or for more information about this field notice, contact the Cisco Technical Assistance Center (TAC) using one of the following methods:
To receive email updates about Field Notices (reliability and safety issues), Security Advisories (network security issues), and end-of-life announcements for specific Cisco products, set up a profile in My Notifications.