THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Affected Product Name | Description | Comments |
---|---|---|
DS-C9396S-K9 | MDS 9396S HW base (48 ports active) | |
DS-X9648-1536K9 | MDS 9700 48-Port 32-Gbps Fibre Channel Switching Module | |
Defect ID | Headline |
CSCvz61883 | Module disruptively reloads after 450-470 days uptime due to 'machine check' error |
MDS (Multilayer Director Switch) 9396S fabric switches and MDS 9700 32 Gbps FC modules might reload once during a window of 450-470 days of uptime. During the reload, ports on the affected device go down and disrupt traffic forwarding through the device.
Only a subset of devices is affected by this issue, and it is not possible to determine in advance which devices will be affected. Once a device passes 470 days of uptime, it will not be affected.
In addition, this issue applies only to these Cisco MDS NX-OS releases:
This issue is caused by a PCIe (Peripheral Component Interconnect Express) operation that might fail after a fixed period of runtime. The success or failure of the operation depends on what concurrently occurs on the CPU at that point. This can lead to a variety of symptoms and prevents prediction of which devices will be affected.
For MDS 9396S fabric switches:
If this issue is encountered, the switch always reloads, which is disruptive to traffic. The uptime at the time of reload can vary between 450 and 470 days. These lines are in the output of the show system internal kernel nvram-messages previous command after the reload:
L2 Machine check on Core 2: MCSR=0x80200000 l2plbstats1=0xc0000
00001000 100c: 00000010 f04: 00000000 10b: 00000000
The first line will match except for the core number. The second line will contain the string "100c: 00000010".
For MDS 9700 32 Gbps FC modules:
If this issue is encountered, it will occur very close to 468 days of uptime. The affected module will either:
These lines are in the output of the show logging logfile command:
%MODULE-4-MOD_WARNING: Module 1 (Serial number: ...) reported warning 1/1-1/0 due to LC CPU not responding in device DEV_EOBC_MAC (device error 0xc0a0505c)
%KERN-2-SYSTEM_MSG: [40437235.724980] LC [slot:1] CPU not responding... LC Status 0xf8 - kernel
These lines are in the output of the slot x show system internal kernel messages command (where x is the slot number of the relevant device):
[40435278.830804] L2 Machine check on Core 0: MCSR=0x80200000 l2plbstats1=0xc0000
[40435279.604197] 110c: 00001000 100c: 00010010 f04: 00000000 10b: 00000000
The first line will match except for the core number. The second line will contain the string "100c: 00000010".
The previous logs might appear more than once for a given module, a few minutes apart, which indicates multiple events.
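As an illustration of how the machine check signature above could be searched for in collected logs, here is a minimal Python sketch. It is not a Cisco tool. The function name find_machine_checks is hypothetical, and the sketch assumes that the bracketed number at the start of a kernel message is the time in seconds since boot, which is the usual Linux kernel log convention. The bracketed prefix is treated as optional because the MDS 9396S example in this notice does not show one.

```python
import re

# First line of the signature. Per the notice, it matches except for the
# core number. The "[seconds]" prefix is optional (assumption: seconds since boot).
MACHINE_CHECK = re.compile(
    r"(?:\[\s*(?P<secs>\d+\.\d+)\]\s*)?"
    r"L2 Machine check on Core (?P<core>\d+): MCSR=0x80200000"
)

SECONDS_PER_DAY = 86400


def find_machine_checks(log_text):
    """Return a list of (core, uptime_days or None), one per matching line."""
    events = []
    for match in MACHINE_CHECK.finditer(log_text):
        secs = match.group("secs")
        days = float(secs) / SECONDS_PER_DAY if secs else None
        events.append((int(match.group("core")), days))
    return events


# Sample line copied from the module example in this notice.
sample = (
    "[40435278.830804] L2 Machine check on Core 0: "
    "MCSR=0x80200000 l2plbstats1=0xc0000"
)
for core, days in find_machine_checks(sample):
    print(f"core {core}, uptime about {days:.2f} days")
```

Under that assumption, the timestamp in the sample line corresponds to roughly 468 days of uptime, which is consistent with the window described in this notice.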
This line is logged in the output of the show logging logfile command:
%MODULE-2-MOD_NOT_ALIVE: Module 1 not responding... resetting (Serial number: ...)
This line is logged in the output of the show logging logfile command:
%SYSMGR-SLOT1-2-SERVICE_CRASHED: Service "f32mac" (PID 1517) hasn't caught signal 6 (core will be saved).
In this case, the show cores command will show a core file from the module with a timestamp just before the module reload:
Module  Instance  Process-name     PID       Date(Year-Month-Day Time)
------  --------  ---------------  --------  -------------------------
1       1         f32cmn_usd       1517      2021-08-07 04:59:31
The core file might have a different process name such as "sm15_usd".
For both MDS 9396S fabric switches and MDS 9700 32 Gbps FC modules, when the device reloads, the uptime at the time of reload can be determined by examining show logging onboard boot-uptime logs for the affected device. This example has a difference of 468 days and 12 minutes between boot record times:
Sat Aug 7 05:01:32 2021: Boot Record
-----------------------------------------------------------------
Boot Time..........: Sat Aug 7 05:01:32 2021
Slot Number........: 1
…
Sun Apr 26 04:49:41 2020: Boot Record
-----------------------------------------------------------------
Boot Time..........: Sun Apr 26 04:49:41 2020
Slot Number........: 1
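The difference between two boot records can be checked with a short calculation. The following Python sketch is only an illustration: the function name uptime_between is hypothetical, and it assumes the Boot Time values use the format shown in the example above and that the system locale provides English day and month names.

```python
from datetime import datetime

# Format of the "Boot Time" values in the example, e.g. "Sat Aug 7 05:01:32 2021".
BOOT_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def uptime_between(previous_boot, latest_boot):
    """Return the elapsed time between two boot records as a timedelta."""
    start = datetime.strptime(previous_boot, BOOT_TIME_FORMAT)
    end = datetime.strptime(latest_boot, BOOT_TIME_FORMAT)
    return end - start


# Boot times copied from the example boot records in this notice.
delta = uptime_between("Sun Apr 26 04:49:41 2020", "Sat Aug 7 05:01:32 2021")
print(f"{delta.days} days, {round(delta.seconds / 60)} minutes")
```

A result that falls in the 450-470 day range is consistent with this issue. A result outside that range suggests that the reload had a different cause.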
Some platforms are affected by more than one Field Notice. Refer to the table below to determine the minimum version of Cisco MDS NX-OS to address each individual Field Notice.
Field Notice | | | |
---|---|---|---|
Summary | Memory leak reload | FCNS issue after ISSU | Machine check reload |
Affected MDS Platforms | 9148S, 9250i, 9396S | 9132T, 9148S, 9148T, 9220i, 9396S, 9396T | 9396S, 9700 32 Gbps FC module |
Cisco MDS NX-OS train | Fixed Cisco MDS NX-OS Versions | | |
8.4 train | 8.4(2c) and later | 9220i: None (note 1); others: 8.4(2d) and later | None (note 2) |
8.5 train | 8.5(1) | None | None (note 2) |
9.2 train | 9.2(1a) | 9.2(1a) and 9.2(2) | None (note 2) |
9.3 train | 9.3(1) and later | 9.3(1) and later | None (note 2) |
1. Cisco MDS 9220i is not supported in the 8.4 train.
2. Refer to Field Notice FN74128.
Refer to the table below to determine the minimum version of Cisco MDS NX-OS to address FN72223, FN72237, and FN72346 by platform.
Minimum Cisco MDS NX-OS Version to Resolve FN72223, FN72237, and FN72346
Cisco MDS NX-OS train | 9700 32 Gbps FC module | 9396T | 9148T | 9132T | 9396S | 9148S | 9250i | 9220i |
---|---|---|---|---|---|---|---|---|
8.4 train | 8.4(2d) (note 2) | 8.4(2d) | 8.4(2d) | 8.4(2d) | None (note 2) | 8.4(2d) | 8.4(2c) | None (note 1) |
8.5 train | 8.5(1) (note 2) | None | None | None | None (note 2) | None | 8.5(1) | None |
9.2 train | None (note 2) | 9.2(1a) and 9.2(2) | 9.2(1a) and 9.2(2) | 9.2(1a) and 9.2(2) | None (note 2) | 9.2(1a) | 9.2(1a) | 9.2(1a) and 9.2(2) |
9.3 train | 9.3(1) (note 2) | 9.3(1) | 9.3(1) | 9.3(1) | None (note 2) | 9.3(1) | 9.3(1) | 9.3(1) |
1. Cisco MDS 9220i is not supported in the 8.4 train.
2. Refer to Field Notice FN74128.
Solution: In order to permanently resolve this issue, upgrade to a fixed version of Cisco MDS NX-OS.
For MDS 9396S fabric switches, a nondisruptive upgrade to a fixed version might trigger the issue described in Field Notice 72237. Refer to Field Notice 72237 to determine the appropriate steps.
For MDS 9700 with 32 Gbps FC modules, the switch can be upgraded nondisruptively with the install all command.
Workaround: In order to temporarily resolve this issue, reset the device uptime to zero days. This can be used if it is not possible to apply the solution and will provide another 450 days of runtime before the issue might occur.
In order to display the current uptime of a device, enter the slot x show system uptime command (where x is the slot number of the relevant device). The slot number is always 1 for MDS 9396S fabric switches.
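Once the uptime in days is known, it can be compared with the 450-470 day window described in this notice. The following Python sketch shows one way to do that. The function names classify_uptime and days_until_window are hypothetical, the uptime is passed in as a number of days rather than parsed from command output, and treating both window boundaries as inclusive is an assumption.

```python
# Window boundaries taken from this notice (in days of uptime).
WINDOW_START_DAYS = 450
WINDOW_END_DAYS = 470


def classify_uptime(uptime_days):
    """Place an uptime relative to the 450-470 day window (bounds inclusive)."""
    if uptime_days < WINDOW_START_DAYS:
        return "before window"
    if uptime_days <= WINDOW_END_DAYS:
        return "inside window"
    return "past window"


def days_until_window(uptime_days):
    """Days left before the window opens; 0 if it has opened or passed."""
    return max(0, WINDOW_START_DAYS - uptime_days)


for days in (120, 455, 480):
    print(days, classify_uptime(days), days_until_window(days))
```

A device reported as "before window" is a candidate for the solution or workaround before day 450. A device reported as "past window" is, per this notice, no longer expected to be affected.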
For MDS 9396S fabric switches, a nondisruptive switch reload or a nondisruptive switch upgrade to a fixed version might trigger the issue described in Field Notice 72237. Refer to Field Notice 72237 to determine the appropriate steps.
For MDS 9700 32 Gbps FC modules, there are two ways to reset the device uptime. Either:
- Enter the reload module x non-disruptive command (where x is the slot number of the relevant device) in order to nondisruptively reset the uptime to zero days. If multiple modules are affected in a chassis, reload one module and wait for the CLI prompt to return before you reload the next module. The prompt takes several minutes to return while the module CPU is nondisruptively reloaded.
- Perform an ISSU/D (In-Service Software Upgrade/Downgrade) with the install all command. The ISSU/D itself causes all device uptimes in the chassis to be reset to zero days. Note that ISSU/D to the same version of Cisco MDS NX-OS will not reset the device uptime.
In order to determine the release of Cisco MDS NX-OS running on a device, enter the show version CLI command. The release is shown in the "system" field.
switch# show version
...
Software
BIOS: version 5.2.25
loader: version N/A
kickstart: version 9.2(1)
system: version 9.2(1)
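When show version output is collected from many switches, the "system" field can be extracted with a small script. The following Python sketch is an illustration only. The function names system_release and release_train are hypothetical, and the sketch assumes the output has the layout shown in the example above.

```python
import re

# Matches the "system: version 9.2(1)" line in show version output.
SYSTEM_LINE = re.compile(r"^\s*system:\s+version\s+(\S+)", re.MULTILINE)
# Extracts the train, such as "9.2", from a release such as "9.2(1)".
TRAIN = re.compile(r"^(\d+\.\d+)\(")


def system_release(show_version_output):
    """Return the release in the "system" field, or None if it is absent."""
    match = SYSTEM_LINE.search(show_version_output)
    return match.group(1) if match else None


def release_train(release):
    """Return the train part of a release string, or None if unrecognized."""
    match = TRAIN.match(release)
    return match.group(1) if match else None


# Sample text based on the show version example in this notice.
sample = """switch# show version
Software
  BIOS: version 5.2.25
  loader: version N/A
  kickstart: version 9.2(1)
  system: version 9.2(1)
"""
release = system_release(sample)
print(release, release_train(release))
```

The train value can then be looked up in the tables in this notice to find the relevant row.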
In order to determine a device Product ID, enter the show inventory CLI command. The Product ID is shown in the "PID" field.
For MDS 9396S fabric switches:
switch# show inventory
NAME: "Chassis", DESCR: "MDS 9396S 96X16G FC (2 RU) Chassis"
PID: DS-C9396S-K9 , VID: V00 , SN: JPG191800A0
For MDS 9700 32 Gbps FC modules:
switch# show inventory
NAME: "Slot 1" , DESCR: "4/8/16/32 Gbps Advanced FC Module"
PID: DS-X9648-1536K9 , VID: V01 , SN: JAE210308AT
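The PID values in show inventory output can be compared with the affected products listed at the top of this notice. The following Python sketch is a minimal example and not a Cisco tool. The function name affected_pids is hypothetical, and the sketch assumes the "PID:" field layout shown in the examples above.

```python
import re

# Product IDs listed as affected in this notice.
AFFECTED_PIDS = {"DS-C9396S-K9", "DS-X9648-1536K9"}

# Captures the value after "PID:" up to the next space or comma.
PID_FIELD = re.compile(r"PID:\s*([^\s,]+)")


def affected_pids(show_inventory_output):
    """Return the affected PIDs found in the output, in order of appearance."""
    found = PID_FIELD.findall(show_inventory_output)
    return [pid for pid in found if pid in AFFECTED_PIDS]


# Sample text copied from the show inventory examples in this notice.
sample = """NAME: "Chassis", DESCR: "MDS 9396S 96X16G FC (2 RU) Chassis"
PID: DS-C9396S-K9 , VID: V00 , SN: JPG191800A0
NAME: "Slot 1" , DESCR: "4/8/16/32 Gbps Advanced FC Module"
PID: DS-X9648-1536K9 , VID: V01 , SN: JAE210308AT
"""
print(affected_pids(sample))
```

A matching PID only shows that the hardware is one of the affected products. Whether a given device will encounter the issue cannot be determined in advance, as stated earlier in this notice.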
Refer to your Cisco MDS switch supplier for their Cisco MDS NX-OS recommendations. For the latest recommendations from Cisco, refer to the Recommended Releases for Cisco MDS 9000 Series Switches.
Version | Description | Section | Date |
1.3 | Added references to Field Notice FN74128. | Workaround/Solution | 2024-APR-23 |
1.2 | Updated the release numbers for the 9.2 train. | Workaround/Solution | 2023-OCT-04 |
1.1 | Updated the Workaround/Solution Section and Added the Additional Information Section | — | 2022-JUN-08 |
1.0 | Initial Release | — | 2022-FEB-25 |
For further assistance or for more information about this field notice, contact the Cisco Technical Assistance Center (TAC) using one of the following methods:
To receive email updates about Field Notices (reliability and safety issues), Security Advisories (network security issues), and end-of-life announcements for specific Cisco products, set up a profile in My Notifications.