THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Affected Product Name | Description | Comments |
---|---|---|
DS-C9396S-K9 | MDS 9396S HW base (48 ports active) | |
DS-X9648-1536K9 | MDS 9700 48-Port 32-Gbps Fibre Channel Switching Module | |
Defect ID | Headline |
---|---|
CSCwh53262 | 'machine check' error triggers reload on MDS 9396S switch and warning on MDS 32 Gbps FC linecard |
This issue has platform-dependent symptoms, which are described for each affected product later in this notice.
This issue applies to the above products only when running one of the following Cisco MDS NX-OS releases:
Earlier releases were affected by a similar issue described in FN72346.
Only a subset of devices are affected by this issue. It is not possible to determine in advance which devices will be affected or when.
This issue is caused by a PCIe (Peripheral Component Interconnect Express) operation that might fail after an indeterminate runtime. The success or failure of the operation depends on which address is being accessed by the operation. This can lead to a variety of symptoms and prevents prediction of which devices will be affected.
The symptom is different for each product:
For a Cisco MDS 9396S fabric switch, the switch resets without any user intervention.
This is disruptive to traffic. The reason is 'HA policy of Reset' and the service is 'f16_mac_usd'. This can be seen with the show system reset-reason command:
Reason: Reset triggered due to HA policy of Reset
Service: f16_mac_usd hap reset
Version: 8.4(2d)
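When reset-reason output has been collected from several MDS 9396S switches, a small script can flag the ones that show this signature. The following is a minimal sketch, not a Cisco tool. It assumes the output of show system reset-reason has already been captured as text, and the function name is_f16_mac_usd_reset is hypothetical. It only checks for the reason and service strings quoted above, appearing one after the other.

```python
import re

# Reason and Service strings quoted in this field notice, required to be adjacent
# so that a Reason from one reset entry is not paired with a Service from another.
SIGNATURE = re.compile(
    r"Reason:\s*Reset triggered due to HA policy of Reset\s+"
    r"Service:\s*f16_mac_usd\b"
)


def is_f16_mac_usd_reset(reset_reason_text: str) -> bool:
    """Return True if the captured text contains the reset signature from this notice."""
    return SIGNATURE.search(reset_reason_text) is not None


sample = """Reason: Reset triggered due to HA policy of Reset
Service: f16_mac_usd hap reset
Version: 8.4(2d)"""

print(is_f16_mac_usd_reset(sample))  # True
```

A match only indicates that the reset reason is consistent with this issue. The machine check log entry described later in this notice is still needed to confirm it.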
For a Cisco MDS 9700 32 Gbps linecard, a transient nondisruptive EOBC heartbeat failure occurs. EOBC heartbeat failure syslog messages from the module occur on both active and standby supervisors. These can be seen with the show logging logfile command, as shown in the following example:
%MODULE-4-MOD_WARNING: Module 1 (Serial number: ...) reported warning 1/1-1/0 due to EOBC heartbeat failure on standby sup in device DEV_EOBC_MAC (device error 0xc0a0504f)
%MODULE-4-MOD_WARNING: Module 1 (Serial number: ...) reported warning 1/1-1/0 due to EOBC heartbeat failure in device DEV_EOBC_MAC (device error 0xc0a0514d)
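To find these warnings in a saved log, the two message forms shown above can be matched with one regular expression. This is a minimal sketch that assumes the log text was captured from show logging logfile and that real messages follow the same layout as the example. The function name find_eobc_warnings is hypothetical.

```python
import re

# Matches both message forms from the example: with and without "on standby sup".
EOBC_WARNING = re.compile(
    r"%MODULE-4-MOD_WARNING: Module (?P<module>\d+) .*?"
    r"due to EOBC heartbeat failure(?P<standby> on standby sup)? "
    r"in device DEV_EOBC_MAC \(device error (?P<error>0x[0-9a-fA-F]+)\)"
)


def find_eobc_warnings(log_text: str):
    """Return (module number, reported on standby sup?, device error) per matching message."""
    return [
        (int(m.group("module")), m.group("standby") is not None, m.group("error"))
        for m in EOBC_WARNING.finditer(log_text)
    ]


sample_log = (
    "%MODULE-4-MOD_WARNING: Module 1 (Serial number: ...) reported warning 1/1-1/0 "
    "due to EOBC heartbeat failure on standby sup in device DEV_EOBC_MAC "
    "(device error 0xc0a0504f)\n"
    "%MODULE-4-MOD_WARNING: Module 1 (Serial number: ...) reported warning 1/1-1/0 "
    "due to EOBC heartbeat failure in device DEV_EOBC_MAC (device error 0xc0a0514d)\n"
)

print(find_eobc_warnings(sample_log))
# [(1, True, '0xc0a0504f'), (1, False, '0xc0a0514d')]
```

The module numbers returned here identify which slots to check for the machine check entry in the kernel messages.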
In both cases, a machine check error and 'MCSR=0x80200000' value are logged after the event. To determine whether a device has been affected by this issue, run the following commands:
For Cisco MDS 9396S, use the show system internal kernel nvram-messages previous command.
For Cisco MDS 9700 32 Gbps linecards, use the slot <slot> show system internal kernel messages command, where <slot> is the slot number of the module.
If the device is affected by this issue, the following output is displayed:
Machine check in kernel mode on Core ... comm [...] pid [...] L2 Machine check on Core ...: MCSR=0x80200000 ...
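The check for the machine check signature can be automated once the kernel messages have been saved as text. The sketch below assumes only what the output above shows, namely that an affected device logs an MCSR= field with the value 0x80200000. The function name has_machine_check_signature is hypothetical.

```python
import re

# MCSR value that this field notice associates with the issue.
AFFECTED_MCSR = 0x80200000

MCSR_FIELD = re.compile(r"MCSR=(0x[0-9a-fA-F]+)")


def has_machine_check_signature(kernel_messages: str) -> bool:
    """Return True if any logged MCSR value equals the value named in this notice."""
    return any(
        int(value, 16) == AFFECTED_MCSR
        for value in MCSR_FIELD.findall(kernel_messages)
    )


sample = (
    "Machine check in kernel mode on Core ... comm [...] pid [...] "
    "L2 Machine check on Core ...: MCSR=0x80200000 ..."
)

print(has_machine_check_signature(sample))  # True
```

Comparing the parsed integer rather than the raw string keeps the check working if the hexadecimal digits are logged in a different letter case.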
Solution: To permanently resolve this issue, upgrade to a fixed version of Cisco MDS NX-OS. A fix will be available in the upcoming Cisco MDS 9.4(2) release, scheduled to be available in the second quarter of 2024.
Workaround: To temporarily prevent this issue, reset the device uptime to zero days.
To display the current uptime of a device, enter the slot x show system uptime command (where x is the slot number of the relevant device). The slot number is always 1 for MDS 9396S Fabric switches.
For Cisco MDS 9396S Fabric switches, perform a nondisruptive switch reload or a nondisruptive switch upgrade. To reload the switch nondisruptively, use the reload system non-disruptive command. The prompt takes several minutes to return while the switch CPU is reloaded. To upgrade the switch nondisruptively to another release of NX-OS, use the install all command.
To determine the release of Cisco MDS NX-OS running on a device, enter the show version CLI command. The release is shown in the "system" field, as shown in the following example:
switch# show version
...
Software
  BIOS: version 5.2.25
  loader: version N/A
  kickstart: version 9.2(1)
  system: version 9.2(1)
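For an audit across many switches, the "system" field can be extracted from saved show version output. This is a minimal sketch that assumes the field is formatted as in the example above. The function name system_release is hypothetical, and the result still has to be compared by hand against the affected releases listed in this notice.

```python
import re

# Matches only the "system:" line, not lines such as "kickstart:".
SYSTEM_FIELD = re.compile(r"\bsystem:\s+version\s+(\S+)")


def system_release(show_version_text: str):
    """Return the release string from the "system" field, or None if it is absent."""
    match = SYSTEM_FIELD.search(show_version_text)
    return match.group(1) if match else None


sample = """Software
  BIOS: version 5.2.25
  loader: version N/A
  kickstart: version 9.2(1)
  system: version 9.2(1)
"""

print(system_release(sample))  # 9.2(1)
```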
To determine a device Product ID, enter the show inventory CLI command. The Product ID is shown in the "PID" field.
For MDS 9396S fabric switches:
switch# show inventory
NAME: "Chassis", DESCR: "MDS 9396S 96X16G FC (2 RU) Chassis"
PID: DS-C9396S-K9 , VID: V00 , SN: JPG191800A0
For MDS 9700 32 Gbps FC modules:
switch# show inventory
NAME: "Slot 1" , DESCR: "4/8/16/32 Gbps Advanced FC Module"
PID: DS-X9648-1536K9 , VID: V01 , SN: JAE210308AT
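The PID check can also be scripted against saved show inventory output. The sketch below assumes the two-line NAME/PID layout shown in the examples above, and combines both examples into one sample purely for illustration. The names AFFECTED_PIDS and find_affected_products are hypothetical.

```python
import re

# Product IDs from the Affected Product table in this notice.
AFFECTED_PIDS = {"DS-C9396S-K9", "DS-X9648-1536K9"}

INVENTORY_ENTRY = re.compile(
    r'NAME:\s*"(?P<name>[^"]*)"\s*,\s*DESCR:\s*"[^"]*"\s*'
    r"PID:\s*(?P<pid>[^\s,]*)\s*,\s*VID:\s*[^\s,]*\s*,\s*SN:\s*(?P<sn>\S+)"
)


def find_affected_products(show_inventory_text: str):
    """Return (name, PID, serial number) for each entry whose PID is in the table."""
    return [
        (m.group("name"), m.group("pid"), m.group("sn"))
        for m in INVENTORY_ENTRY.finditer(show_inventory_text)
        if m.group("pid") in AFFECTED_PIDS
    ]


sample = '''NAME: "Chassis", DESCR: "MDS 9396S 96X16G FC (2 RU) Chassis"
PID: DS-C9396S-K9 , VID: V00 , SN: JPG191800A0
NAME: "Slot 1" , DESCR: "4/8/16/32 Gbps Advanced FC Module"
PID: DS-X9648-1536K9 , VID: V01 , SN: JAE210308AT
'''

for entry in find_affected_products(sample):
    print(entry)
# ('Chassis', 'DS-C9396S-K9', 'JPG191800A0')
# ('Slot 1', 'DS-X9648-1536K9', 'JAE210308AT')
```

A matching PID means only that the product is listed in this notice. Whether a particular device is exposed also depends on the NX-OS release it runs, and only a subset of devices is affected.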
This issue is due to an incomplete fix for FN72346.
Version | Description | Section | Date |
---|---|---|---|
1.0 | Initial Release | — | 2024-MAR-27 |
For further assistance or for more information about this field notice, contact the Cisco Technical Assistance Center (TAC) using one of the following methods:
To receive email updates about Field Notices (reliability and safety issues), Security Advisories (network security issues), and end-of-life announcements for specific Cisco products, set up a profile in My Notifications.