THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Revision | Publish Date | Comments |
---|---|---|
1.0 | 20-May-21 | Initial Release |
2.0 | 21-May-21 | Updated the Products Affected, Workaround/Solution, and How to Identify Affected Products Sections |
2.1 | 02-Jun-21 | Updated the Workaround/Solution Section |
2.2 | 23-Dec-21 | Updated the Workaround/Solution Section |
Affected Product ID | Comments |
---|---|
NCS-5501= | Part Alternate |
NCS-5501 | |
NCS-5501-SE= | Part Alternate |
NCS-5501-SE | |
NCS-5502= | Part Alternate |
NCS-5502 | |
NCS-5502-SE= | Part Alternate |
NCS-5502-SE | |
NCS-55A2-MOD-HD-S= | Part Alternate |
NCS-55A2-MOD-HD-S | |
NC55-18H18F= | Part Alternate |
NC55-18H18F | |
NC-55-18H18F | |
NC-55-18H18F= | Part Alternate |
NC55-24H12F-SE= | Part Alternate |
NC55-24H12F-SE | |
NC-55-24H12F-SE= | Part Alternate |
NC-55-24H12F-SE | |
NC55-24X100G-SE= | Part Alternate |
NC55-24X100G-SE | |
NC-55-24X100G-SE= | Part Alternate |
NC-55-24X100G-SE | |
NC55-36X100G= | Part Alternate |
NC55-36X100G | |
NC-55-36X100G | |
NC-55-36X100G= | Part Alternate |
NC55-36X100G-S | |
NC55-36X100G-S= | Part Alternate |
NC-55-36X100G-S | |
NC-55-36X100G-S= | Part Alternate |
NC55-36X100G-A-SE= | Part Alternate |
NC55-36X100G-A-SE | |
NC-55-36X100GA-SE | |
NC-55-36X100GA-SE= | Part Alternate |
NC-55-MOD-A | |
NC-55-MOD-A= | Part Alternate |
NC-55-MOD-A-SE | |
NC-55-MOD-A-SE= | Part Alternate |
NC55-RP= | Part Alternate |
NC55-RP | |
NC55-6X200-DWDM-S= | Part Alternate |
NC55-6X200-DWDM-S | |
NC-55-6X2H-DWDM-S | |
NC-55-6X2H-DWDM-S= | Part Alternate |
NC55-MOD-A-S= | Part Alternate |
NC55-MOD-A-S | |
NC55-MOD-A-SE-S= | Part Alternate |
NC55-MOD-A-SE-S | |
Defect ID | Headline |
---|---|
CSCvx16766 | Support firmware upgrade functionality for onboard SSD |
Due to a flaw in Solid State Drive (SSD) firmware, the SSD stops responding after approximately 3.2 years of accumulated operation.
After the first unresponsive event is experienced, every subsequent power cycle of the system allows the drive to operate for another 1,008 hours (approximately six weeks) before it no longer responds.
After approximately 3.2 years (28,224 accumulated Power On Hours (POH)), a memory buffer overrun condition occurs, which triggers the firmware event in the SSD. This causes the drive to become unresponsive until the drive is power cycled.
No data loss occurs when the memory buffer overrun firmware event occurs. A power cycle restores normal operation of the drive. The drive continues to operate normally for approximately six weeks (1,008 additional accumulated POH), at which time the drive becomes unresponsive again. A power cycle of the drive re-initiates the 1,008 hours window.
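The timing figures above can be checked with simple arithmetic. The following Python sketch is only an illustration: the constant and function names are hypothetical, and it assumes a 365-day year with the drive powered on continuously.

```python
# Illustrative arithmetic only; names are hypothetical, figures come from this notice.
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: non-leap year, continuous power-on

FIRST_FAILURE_POH = 28_224  # accumulated Power On Hours before the first lockup
REPEAT_WINDOW_POH = 1_008   # additional hours gained by each power cycle


def hours_to_years(hours):
    """Convert continuous power-on hours to years."""
    return hours / (HOURS_PER_DAY * DAYS_PER_YEAR)


def hours_to_weeks(hours):
    """Convert continuous power-on hours to weeks."""
    return hours / (HOURS_PER_DAY * 7)


print(f"First lockup after about {hours_to_years(FIRST_FAILURE_POH):.1f} years")
print(f"Each power cycle adds about {hours_to_weeks(REPEAT_WINDOW_POH):.0f} weeks")
```

A device that is powered off for part of the time accumulates POH more slowly, so the calendar time to the first event can be longer than 3.2 years.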
After approximately 3.2 years of operation, router behavior becomes unpredictable because the SSD locks up.
The following is one example of an SSD in a locked state:
RP/0/RP0/CPU0:ios#show logging | inc Read-only
Here is a sample output:
Day MMM?X HH:MM:SS.048 UTC
start_backing_thread:bind: Read-only file system
start_backing_thread:bind: Read-only file system
ctrace_enable_configuration(): inotify_add_watch() failed fd 11 ctrl_file_name /var/log/ctrace/show_logging/xr_ds_capi_info/ctrace.ctrl. No such file or directory (2)
ctrace_enable_configuration2 failed with error 0x5
ctrace_enable_configuration(): inotify_add_watch() failed fd 12 ctrl_file_name /var/log/ctrace/show_logging/xr_ds_capi_error/ctrace.ctrl. No such file or directory (2)
ctrace_enable_configuration2 failed with error 0x5
ctrace_enable_configuration(): inotify_add_watch() failed fd 13 ctrl_file_name /var/log/ctrace/show_logging/xr_ds_capi_conn/ctrace.ctrl. No such file or directory (2)
ctrace_enable_configuration2 failed with error 0x5
ctrace_enable_configuration(): inotify_add_watch() failed fd 14 ctrl_file_name /var/log/ctrace/show_logging/xr_ds_capi_msc/ctrace.ctrl. No such file or directory (2)
ctrace_enable_configuration2 failed with error 0x5
start_backing_thread:bind: Read-only file system
RP/0/RP0/CPU0:Day MMM?X HH:MM:SS.418 UTC: syslog_dev[117]: syslog_infra_hm[141] PID-23708: ctrace abort handler: unable to open trace file /var/log/ctrace/_pkg_bin_logger/xr_ds_capi_msc/ctrace_0.trc (Read-only file system)))ctrace abort handler: unable to open trace file /var/log/ctrace/_pkg_bin_logger/xr_ds_capi_conn/ctrace_0.trc (Read-only file system))ctrace abort handler: unable to open trace file /var/log/ctrace/_pkg_bin_logger/xr_ds_capi_error/ctrace_0.trc (Read-only file system)
RP/0/RP0/CPU0:Day MMM?X HH:MM:SS.418 UTC: syslog_dev[117]: syslog_infra_hm[141] PID-23708: ctrace abort handler: unable to open trace file /var/log/ctrace/_pkg_bin_logger/xr_ds_capi_info/ctrace_0.trc (Read-only file system))
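If the log output has been collected to a text file, the same filter that the `inc Read-only` command applies can be reproduced offline. The following Python sketch is a minimal illustration; the function name is hypothetical, and a read-only file system can have causes other than this SSD issue, so a match is only a hint.

```python
# Hypothetical helper: flag collected log lines that suggest the SSD-backed
# file system has gone read-only, as in the sample output above.
MARKER = "Read-only file system"


def find_read_only_lines(log_text):
    """Return the log lines that contain the read-only marker."""
    return [line for line in log_text.splitlines() if MARKER in line]


# Two lines copied from the sample output in this notice.
sample = (
    "start_backing_thread:bind: Read-only file system\n"
    "ctrace_enable_configuration2 failed with error 0x5\n"
)

for line in find_read_only_lines(sample):
    print(line)
```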
In order to proactively prevent and permanently resolve this issue, and to avoid disruption to the network and operations, Cisco recommends an SSD firmware upgrade before the SSD accumulates 28,224 Power On Hours.
Note: An RMA is not recommended, as the upgrade process resolves the issue.
Precheck Before You Upgrade the SSD Firmware
Sample output:
RP/0/RP0/CPU0:ios#admin show smart-monitor location all | inc "Temperature_Celsius|ID#"
Day MMM?X HH:MM:SS.048 UTC
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
194 Temperature_Celsius 0x0022 051 022 000 Old_age Always - 128 (0 78 0 254 0)
In order to perform the system upgrade, you need to install the required SMU. Refer to the table that follows.
IOS XR Release | SMU Link | SMU Install Type |
---|---|---|
IOS XR Release 6.5.2 | ncs5500-sysadmin-6.5.2.CSCvx16766.tar | Reload* |
IOS XR Release 6.5.3 | ncs5500-sysadmin-6.5.3.CSCvx16766.tar | Reload* |
IOS XR Release 6.6.25 | ncs5500-sysadmin-6.6.25.CSCvx16766.tar | Reload* |
IOS XR Release 6.6.3 | ncs5500-sysadmin-6.6.3.CSCvx16766.tar | Reload* |
IOS XR Release 7.0.1 | ncs5500-sysadmin-7.0.1.CSCvx16766.tar | Reload* |
IOS XR Release 7.0.2 | ncs5500-sysadmin-7.0.2.CSCvx16766.tar | Reload* |
IOS XR Release 7.1.1 | ncs5500-sysadmin-7.1.1.CSCvx16766.tar | Reload* |
IOS XR Release 7.1.2 | ncs5500-sysadmin-7.1.2.CSCvx16766.tar | Reload* |
IOS XR Release 7.2.1 | ncs5500-sysadmin-7.2.1.CSCvx16766.tar | Reload* |
IOS XR Release 7.2.2 | ncs5500-sysadmin-7.2.2.CSCvx16766.tar | Process Restart** |
IOS XR Release 7.3.1 | ncs5500-sysadmin-7.3.1.CSCvx16766.tar | Process Restart** |
* If the SMU type is Reload: upon activation of the SMU, the router reloads. |
** If the SMU type is Process Restart: upon activation of the SMU, the sata_fpd process restarts on the router. |
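For scripted planning, the SMU table can be expressed as a small lookup. The Python sketch below transcribes the releases and install types from the table; the helper name is hypothetical, and the file-name pattern is simply the one that the listed SMU names follow.

```python
# Data transcribed from the SMU table in this notice; helper names are hypothetical.
RELOAD_RELEASES = [
    "6.5.2", "6.5.3", "6.6.25", "6.6.3", "7.0.1",
    "7.0.2", "7.1.1", "7.1.2", "7.2.1",
]
PROCESS_RESTART_RELEASES = ["7.2.2", "7.3.1"]


def smu_for_release(release):
    """Return (SMU file name, install type) for a release listed in the table."""
    if release in RELOAD_RELEASES:
        install_type = "Reload"
    elif release in PROCESS_RESTART_RELEASES:
        install_type = "Process Restart"
    else:
        raise KeyError(f"No SMU is listed in this notice for IOS XR {release}")
    return f"ncs5500-sysadmin-{release}.CSCvx16766.tar", install_type


name, install_type = smu_for_release("7.2.2")
print(name, "-", install_type)
```

Knowing the install type in advance helps when scheduling a maintenance window, because a Reload SMU restarts the router on activation.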
After the SMU is installed, the firmware must be upgraded. Run this command in order to check the device status and the firmware version:
Command for Fixed system:
[admin] show hw-module location <location-name> fpd
Here is a sample output for Fixed system:
RP/0/RP0/CPU0:ios#sh hw-module location 0/rp0 fpd
Location Card type HWver FPD device ATR Status Running Programd
-------------------------------------------------------------------------------
0/RP0 NCS-55A2-MOD-HD-S 1.1 MB-MIFPGA CURRENT 0.19 0.19
0/RP0 NCS-55A2-MOD-HD-S 1.1 Bootloader CURRENT 1.14 1.14
0/RP0 NCS-55A2-MOD-HD-S 1.1 CPU-IOFPGA CURRENT 1.27 1.27
0/RP0 NCS-55A2-MOD-HD-S 1.1 MB-IOFPGA CURRENT 0.18 0.18
0/RP0 NCS-55A2-MOD-HD-S 1.1 SATA-M500IT-MC NEED UPGD 2.00 2.00
0/PM0 NC55-1200W-ACFW 1.0 LIT-PriMCU-ACFW CURRENT 2.09 2.09
Note: For NCS5500 products, the FPD device name can be either SATA-M500IT-MC or SATA-M500IT-MU-A.
Command for Modular system:
[admin] show hw-module location <location-name> fpd
Sample output for Modular system (Line Card):
RP/0/RP0/CPU0:ios#sh hw-module location 0/0 fpd
Location Card type HWver FPD device ATR Status Running Programd
------------------------------------------------------------------------------------------
0/0 NC55-36X100G-S 0.6 MIFPGA CURRENT 0.07 0.07
0/0 NC55-36X100G-S 0.6 Bootloader CURRENT 1.14 1.14
0/0 NC55-36X100G-S 0.6 IOFPGA CURRENT 0.11 0.11
0/0 NC55-36X100G-S 0.6 SATA-M500IT-MC NEED UPGD 2.00 2.00
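When many cards must be checked, the FPD output can be scanned for SATA devices that still report NEED UPGD. The following Python sketch is a minimal illustration under stated assumptions: it expects one whitespace-separated row per line in the column order shown in the samples above, and the function name and regular expression are hypothetical rather than part of any Cisco tool.

```python
import re

# Assumed row layout: location, card type, HW version, FPD device,
# optional ATR flag plus status, running version, programmed version.
SATA_ROW = re.compile(
    r"^(?P<location>\S+)\s+(?P<card>\S+)\s+(?P<hwver>\S+)\s+"
    r"(?P<fpd>SATA-\S+)\s+(?P<status>.+?)\s+"
    r"(?P<running>\d+\.\d+)\s+(?P<programmed>\d+\.\d+)\s*$"
)


def sata_fpds_needing_upgrade(output):
    """Return (location, FPD name) pairs whose status contains NEED UPGD."""
    pending = []
    for line in output.splitlines():
        match = SATA_ROW.match(line.strip())
        if match and "NEED UPGD" in match.group("status"):
            pending.append((match.group("location"), match.group("fpd")))
    return pending


# Two rows taken from the modular-system sample in this notice.
sample = (
    "0/0 NC55-36X100G-S 0.6 IOFPGA CURRENT 0.11 0.11\n"
    "0/0 NC55-36X100G-S 0.6 SATA-M500IT-MC NEED UPGD 2.00 2.00\n"
)

for location, fpd in sata_fpds_needing_upgrade(sample):
    print(f"{location}: {fpd} needs upgrade")
```

Each pair returned corresponds to the `<location-name>` and `<fpd-name>` arguments of the upgrade command in this notice.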
Run this command in order to upgrade the firmware:
[admin] upgrade hw-module location <location-name> fpd <fpd-name>
Here is a sample output:
RP/0/RP0/CPU0:ios#upgrade hw-module location 0/RP0 fpd SATA-M500IT-MC
DAY MMM DD HH:MM:SS.933 UTC
upgrade command issued (use "show hw-module fpd" to check upgrade status)
RP/0/RP0/CPU0:ios#0/RP0/ADMIN0:Jan 13 06:11:09.050 UTC: fpdserv[3812]: %INFRA-FPD_Manager-1-UPGRADE_ALERT : Upgrade for the following FPDs has been committed:
0/RP0/ADMIN0:Jan 13 06:11:09.050 UTC: fpdserv[3812]: %INFRA-FPD_Manager-1-UPGRADE_ALERT : Location FPD name Force
0/RP0/ADMIN0:Jan 13 06:11:09.050 UTC: fpdserv[3812]: %INFRA-FPD_Manager-1-UPGRADE_ALERT : ==================================================
0/RP0/ADMIN0:Jan 13 06:11:09.053 UTC: fpdserv[3812]: %INFRA-FPD_Manager-1-UPGRADE_ALERT : 0/RP0 SATA-M500IT-MC FALSE
0/RP0/ADMIN0:Jan 13 06:11:17.861 UTC: sata_fpd[28389]: %INFRA-FPD_Driver-1-UPGRADE_ALERT : FPD SATA-M500IT-MC@0/RP0 image programming completed with UPGRADE DONE state Info: [SATA FPD upgrade Complete]
Run this command in order to verify that the firmware has been upgraded to the latest version:
[admin] show hw-module location <location-name> fpd
Here is a sample output:
RP/0/RP0/CPU0:ios#sh hw-module location 0/rp0 fpd
Location Card type HWver FPD device ATR Status Running Programd
-----------------------------------------------------------------------------------
0/RP0 NCS-55A2-MOD-HD-S 1.1 MB-MIFPGA CURRENT 0.19 0.19
0/RP0 NCS-55A2-MOD-HD-S 1.1 Bootloader CURRENT 1.14 1.14
0/RP0 NCS-55A2-MOD-HD-S 1.1 CPU-IOFPGA CURRENT 1.27 1.27
0/RP0 NCS-55A2-MOD-HD-S 1.1 MB-IOFPGA CURRENT 0.18 0.18
0/RP0 NCS-55A2-MOD-HD-S 1.1 SATA-M500IT-MC CURRENT 3.00 3.00
0/PM0 NC55-1200W-ACFW 1.0 LIT-PriMCU-ACFW CURRENT 2.09 2.09
Ensure that the firmware is updated to the fixed version, as shown in the table in the How to Identify Affected Products section.
For a more detailed FAQ on the SSD issue, refer to the additional information here.
Cisco has identified the product serial numbers that shipped with an affected firmware version. Refer to the Serial Number Validation section to determine whether your product is potentially affected. If the product is listed as “Affected”, check the firmware version to determine whether it has already been upgraded.
Run this command in order to check the firmware version:
admin show smart-monitor location all | inc "Location|Device Model|Firmware Version"
Here is the expected output:
RP/0/RP0/CPU0:ios#admin show smart-monitor location all | inc "Location|Device Model|Firmware Version"
Day MMM DD HH:MM:SS.672 UTC
Location : 0/RP0
Device Model: Micron_M500IT_MTFDDAT256MBD
Firmware Version: MU01.00
Check the impacted firmware versions in the table that follows:
Device Model | Impacted Firmware Version | Fixed Firmware Version |
---|---|---|
Micron_M500IT_MTFDDAT064SBD | MU01.00 / MC02.00 | MU05.00 / MC03.00 or higher |
Micron_M500IT_MTFDDAT064MBD | MU01.00 | MU05.00 or higher |
Micron_M500IT_MTFDDAT256MBD | MU01.00 / MC02.00 | MU05.00 / MC03.00 or higher |
If the firmware is found to be impacted, upgrade the firmware version with the recommended Software Maintenance Update (SMU) before the system reaches 28,224 hours of operation.
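The comparison against the table can be sketched in Python as follows. This is a minimal illustration, not a Cisco tool: the function names are hypothetical, the impacted versions are transcribed from the table above, and the parser assumes the filtered output describes a single location.

```python
# Impacted firmware versions per device model, transcribed from the table in this notice.
IMPACTED = {
    "Micron_M500IT_MTFDDAT064SBD": {"MU01.00", "MC02.00"},
    "Micron_M500IT_MTFDDAT064MBD": {"MU01.00"},
    "Micron_M500IT_MTFDDAT256MBD": {"MU01.00", "MC02.00"},
}


def parse_smart_fields(output):
    """Collect 'key: value' lines from the filtered smart-monitor output."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_impacted(output):
    """Return True if the reported model and firmware pair is listed as impacted."""
    fields = parse_smart_fields(output)
    model = fields.get("Device Model", "")
    firmware = fields.get("Firmware Version", "")
    return firmware in IMPACTED.get(model, set())


# Field lines taken from the expected output shown in this notice.
sample = (
    "Location : 0/RP0\n"
    "Device Model: Micron_M500IT_MTFDDAT256MBD\n"
    "Firmware Version: MU01.00\n"
)

print("Impacted" if is_impacted(sample) else "Not listed as impacted")
```

A model that is absent from the table is reported as not listed, which is not the same as confirmed safe; the Serial Number Validation tool remains the authoritative check.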
Run this command in order to check the accumulated Power On Hours (POH):
sysadmin-vm:0_RP0# show smart-monitor location 0/RP0 | inc Power_On_Hours
Here is the sample output:
sysadmin-vm:0_RP0# show smart-monitor location 0/RP0 | inc Power_On_Hours
DDD MMM DD HH:MM:SS.538 UTC+00:00
9 Power_On_Hours 0x0032 100 100 001 Old_age Always - 17849
In this example, the SSD issue will occur after 10,375 more hours of operation (28,224 - 17,849).
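The remaining-hours calculation can be scripted against the SMART attribute line. The Python sketch below is a minimal illustration; the function name is hypothetical, and it assumes that the raw Power_On_Hours value is the last whitespace-separated column, as in the sample output.

```python
FAILURE_THRESHOLD_POH = 28_224  # threshold stated in this notice


def remaining_hours(smart_line):
    """Return the hours left before the threshold, based on the raw POH value."""
    columns = smart_line.split()
    if len(columns) < 2 or columns[1] != "Power_On_Hours":
        raise ValueError("not a Power_On_Hours attribute line")
    power_on_hours = int(columns[-1])  # assumption: RAW_VALUE is the last column
    return max(FAILURE_THRESHOLD_POH - power_on_hours, 0)


# Attribute line taken from the sample output in this notice.
line = "9 Power_On_Hours 0x0032 100 100 001 Old_age Always - 17849"
print(remaining_hours(line))  # 28224 - 17849 = 10375
```

A result of 0 means the drive has already reached the threshold and the upgrade should not be deferred.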
This field notice provides the ability to determine if the serial number(s) of a device is impacted by this issue. In order to verify your serial number(s), enter it in the Serial Number Validation tool at https://snvui.cisco.com/snv/FN72108.
If you require further assistance or have further questions regarding this field notice, contact the Cisco Systems Technical Assistance Center (TAC).