This document describes how to analyze common hardware failure symptoms on the Cisco ASR 903 Aggregation Services Router (ASR903) and how to troubleshoot them.
Cisco recommends that you have basic knowledge of these topics:
The information in this document was created from devices in a specific lab environment where failure symptoms were observed. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
The Cisco ASR 903 Router is a fully-featured aggregation platform designed for the cost-effective delivery of converged mobile and business services. With shallow depth, low power consumption, and an extended temperature range, this compact 3 Rack-Unit (RU) router provides high service scale, full redundancy, and flexible hardware configuration. The Cisco ASR 903 Router is positioned as a pre-aggregation router in IP Radio Access Network (RAN) networks or an aggregation router in Carrier Ethernet networks.
The platform comprises the following major Field Replaceable Units (FRUs), as depicted in the figure below:
Label | Component |
1 | Interface modules (IM) |
2 | Two Route Switch Processor (RSP) unit slots. Supports RSP1A-55, RSP1B-55, RSP2A-64 and RSP2A-128 |
3 | Fan tray |
4 | Redundant DC power units |
During normal operation, any of the Field Replaceable Units (FRUs) can exhibit failure symptoms. This often leads to the replacement of components that have not actually suffered a hardware failure. By following certain troubleshooting techniques, you can often recover these modules from their failure state and thereby reduce network downtime.
Note: The router can operate with a single power supply. The secondary power supply unit must still be physically inserted, even if it is not powered on.
The Cisco ASR 903 Router uses a modular fan tray that is separate from the power supply. The fan tray contains twelve fans and provides sufficient capacity to maintain operation even in the event of a fan failure. There are two types of fan tray modules (A903-FAN and A903-FAN-E), depending on the environment where the router is used. The latter (A903-FAN-E) comes with an 8 mm fan dust filter, which prevents dust from entering the unit and avoids possible damage to the components.
Use the "show platform" or "show facility-alarm status" command to determine the status of the fans in the tray. In the event of a fan failure, the fan tray state is displayed as "fail", along with the individual fans that have failed.
ASR903#show platform | in FAN|State
Chassis type: ASR-903
Slot Type State Insert time (ago)
P2 A903-FAN-E f2, f4, f6, fail 05:00:00
ASR903#sh facility-alarm status
System Totals Critical: 1 Major: 3 Minor: 0
Source Severity Description [Index]
Fan Tray CRITICAL Multiple Fan Failures [2]
Fan Tray MAJOR Fan 2 Failure [5]
Fan Tray MAJOR Fan 4 Failure [7]
Fan Tray MAJOR Fan 6 Failure [9]
These outputs show that fans f2, f4 and f6 have failed and need to be replaced.
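When alarm output is collected from many routers, the failed fans can be extracted automatically. The following is a minimal Python sketch, not a Cisco tool: the helper name `failed_fans` is hypothetical, and the pattern assumes each alarm appears on its own line in the form shown above ("Fan Tray MAJOR Fan 2 Failure [5]"). Other software releases may format these rows differently.

```python
import re
from typing import List

# Sample text modelled on the "show facility-alarm status" output in this document.
SAMPLE = """\
System Totals Critical: 1 Major: 3 Minor: 0
Source Severity Description [Index]
Fan Tray CRITICAL Multiple Fan Failures [2]
Fan Tray MAJOR Fan 2 Failure [5]
Fan Tray MAJOR Fan 4 Failure [7]
Fan Tray MAJOR Fan 6 Failure [9]
"""

# Matches per-fan rows only, e.g. "Fan Tray MAJOR Fan 2 Failure [5]".
# The summary row "Multiple Fan Failures" is intentionally not matched.
FAN_ALARM = re.compile(
    r"^Fan Tray\s+(\w+)\s+Fan (\d+) Failure\s+\[(\d+)\]", re.MULTILINE
)


def failed_fans(output: str) -> List[int]:
    """Return the sorted fan numbers that have a per-fan failure alarm.

    Hypothetical helper; assumes one alarm row per line.
    """
    return sorted({int(m.group(2)) for m in FAN_ALARM.finditer(output)})


if __name__ == "__main__":
    print(failed_fans(SAMPLE))  # prints [2, 4, 6]
```

An empty result only means that no per-fan alarm rows were found. It does not confirm that the fan tray is healthy, so the "show platform" state should still be checked.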
In some cases, the Fan Tray may be reported as "Unknown" in the "show platform" output and the Network Management System (NMS) station may generate an alarm as well.
ASR903#sh platform | in P2
Chassis type: ASR-903
Slot Type State Insert Time (ago)
P2 Unknown N/A never
Perform the following steps which may help recover the module:
Note: There is a known cosmetic defect which is documented in CSCuu75796 where the FAN tray will be reported as unknown. To avoid erroneous failure messages, allow at least 2 minutes for the system to reinitialize after the fan tray has been removed or replaced.
ASR903#show platform | in R1
Chassis type: ASR-903
Slot Type State Insert Time (ago)
R1 A903-RSP1B-55 unknown 1d01h
One of the common reasons for the standby RSP module to exhibit this behavior is a configuration sync failure between the active and standby RSP. Run the following commands to verify this:
ASR903#show redundancy config-sync failures bem
ASR903#show redundancy config-sync failures mcl
ASR903#show redundancy config-sync failures prc
If any of the above commands reports failures, apply the following workaround and verify whether the RSP stays up.
ASR903#config terminal
ASR903(config)#redundancy
ASR903(config-red)#mode sso
ASR903(config-red)#no policy config-sync lbl prc reload
ASR903(config-red)#no policy config-sync bulk prc reload
ASR903(config-red)#end
If the RSP module remains in a boot loop, check the device logs for link errors such as those shown below. If such errors are present and a physical reseat does not clear them, the RSP module may need to be replaced.
%IOSXE-3-PLATFORM: R0/0: kernel: pciehp 0000:02:07.0:pcie24: Link Training Error occurs
%IOSXE-3-PLATFORM: R0/0: kernel: pciehp 0000:02:07.0:pcie24: Failed to check link status
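Searching a saved log file for these messages can be scripted. The sketch below is an illustration only: the function name `find_link_errors` is hypothetical, and the regular expression is inferred solely from the two sample messages quoted in this document, so it may need adjusting for other message variants.

```python
import re

# Pattern inferred from the two pciehp messages quoted in this document.
# Captures the PCI address (e.g. 0000:02:07.0) and the message text.
LINK_ERROR = re.compile(
    r"%IOSXE-3-PLATFORM: \S+ kernel:\s*pciehp (\S+?):pcie\d+: "
    r"(Link Training Error occurs|Failed to check link status)"
)

SAMPLE_LOG = """\
%IOSXE-3-PLATFORM: R0/0: kernel: pciehp 0000:02:07.0:pcie24: Link Training Error occurs
%IOSXE-3-PLATFORM: R0/0: kernel: pciehp 0000:02:07.0:pcie24: Failed to check link status
"""


def find_link_errors(log_text):
    """Return a list of (pci_address, message) tuples, one per matching line.

    Hypothetical helper; lines that do not match the pattern are ignored.
    """
    return [(m.group(1), m.group(2)) for m in LINK_ERROR.finditer(log_text)]


if __name__ == "__main__":
    for address, message in find_link_errors(SAMPLE_LOG):
        print(address, "->", message)
```

Grouping the results by PCI address shows whether the errors are confined to one slot or spread across several.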
Whenever an Interface Module (IM) is installed, it transitions through specific states (out of service -> inserted -> booting -> OK). If an IM in any of the six available slots fails to get past the booting state, perform the following steps:
ASR903#sh platform
Chassis type: ASR-903
Slot Type State Insert Time (ago)
0/4 A900-IMA8S inserted/unknown 00:27:02 (physical)
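To spot modules that have not reached the OK state across a large "show platform" capture, the rows can be parsed programmatically. The following Python sketch is an assumption-laden illustration: the helper `modules_not_ok` is hypothetical, the row layout (slot, type, state, insert time) is inferred from the outputs in this document, and the "ok, active" row is a made-up example of a healthy module added only to show the filtering.

```python
import re

# Row layout inferred from the outputs in this document:
#   <slot> <type> <state ...> <insert time> [(physical)]
# The insert time is assumed to be "never" or to start with a digit.
ROW = re.compile(
    r"^(?P<slot>\S+)\s+(?P<type>\S+)\s+(?P<state>.+?)\s+"
    r"(?P<insert_time>never|\d\S*)(?:\s+\(.*\))?\s*$"
)

# Rows modelled on this document.
DOC_ROWS = """\
Chassis type: ASR-903
Slot Type State Insert Time (ago)
0/4 A900-IMA8S inserted/unknown 00:27:02 (physical)
R1 A903-RSP1B-55 unknown 1d01h
"""

# Hypothetical healthy row, not taken from the document.
HYPOTHETICAL_OK_ROW = "R0 A903-RSP1B-55 ok, active 1d02h\n"


def modules_not_ok(output):
    """Return (slot, state) for every parsed row whose state does not start with "ok".

    Hypothetical helper; header lines and unparseable lines are skipped.
    """
    bad = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match and not match.group("state").lower().startswith("ok"):
            bad.append((match.group("slot"), match.group("state")))
    return bad


if __name__ == "__main__":
    for slot, state in modules_not_ok(DOC_ROWS + HYPOTHETICAL_OK_ROW):
        print(slot, state)
```

A module that is still listed after it has had time to boot is a candidate for the recovery steps in this section.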
ASR903#hw-module subslot 0/1 reload
Proceed with reload of module? [confirm]
%IOSXE_OIR-6-SOFT_RELOADSPA: SPA(A900-IMA1X) reloaded on subslot 0/1
%IOSXE-3-PLATFORM: R0/0: kernel:pciehp 0000:02:07.0:pcie24: Link Training Error occurs
%IOSXE-3-PLATFORM: R0/0: kernel:pciehp 0000:02:07.0:pcie24: Failed to check link status
The "link training" error means that there is a communication error on the Peripheral Component Interconnect Express (PCIe) bus for a particular slot. The PCIe hot plug module is hosted on the RSP. Perform an RSP switchover so that the modules are registered with the PCIe bus of the standby RSP. If the module recovers after the switchover, the previously active RSP module needs to be replaced.
ASR903#redundancy force-switchover
Proceed with switchover to standby RP? [confirm]
Note: For further assistance, open a service request with the Cisco Technical Assistance Center (TAC) and include details of the troubleshooting already performed as well as the ‘show tech-support’ output from the router.