This document provides information on how to troubleshoot crashes on the Cisco® ASR 1000 Series Aggregation Services Routers.
There are no specific requirements for this document.
The information in this document is based on these software and hardware versions:
All Cisco ASR 1000 Series Aggregation Services Routers, including the 1002, 1004, and 1006.
All Cisco IOS XE Software versions that support the Cisco ASR 1000 Series Aggregation Services Routers.
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
Refer to Cisco Technical Tips Conventions for more information on document conventions.
The Cisco ASR 1000 Series Aggregation Services Routers introduce Cisco IOS XE Software as their software architecture. Based on Cisco IOS Software, Cisco IOS XE Software is a modular operating system built on a Linux kernel that runs on the Route Processor (RP), Embedded Services Processor (ESP), and SPA Interface Processor (SIP). Because the IOS daemon (IOSD) and the other Cisco IOS XE processes run on top of the Linux kernel, several types of crashes can occur on the Cisco ASR 1000 Series Aggregation Services Routers, as shown in Table 1.
Table 1 – Types of Crashes
Types of Crashes | Module | Description |
---|---|---|
IOSD Crash | RP | Cisco IOS Software runs as IOSD on a Linux kernel on RP. |
SPA Driver Crash | SIP | Limited Cisco IOS Software runs to control SPA on SIP. |
Cisco IOS XE Process Crash | RP, ESP, SIP | Several Cisco IOS XE processes run on a Linux kernel. For example, the chassis manager, the forwarding manager, and the interface manager run on the RP. |
Cisco Quantum Flow Processor (QFP) Microcode Crash | ESP | The microcode runs on the QFP. The QFP is the packet forwarding ASIC on the ESP. |
Linux Kernel Crash | RP, ESP, SIP | A Linux kernel runs on the RP, ESP, and SIP. |
If a module reloads unexpectedly, make sure that the console output, the crashinfo file directory, and the core dump file directory are available for troubleshooting. The first step in determining the cause is to capture as much information about the problem as possible. This information is necessary to determine the cause of the problem:
Console logs: For more information, see Applying Correct Terminal Emulator Settings for Console Connections.
Syslog information: If you have set up the router to send logs to a syslog server, you can obtain information about what happened from that server. For details, see How to Configure Cisco Devices for Syslog.
show platform: Displays the status of the RPs, ESPs, SPAs, and power supplies.
show tech-support: Compiles the output of many different commands, including show version and show running-config. When a router runs into problems, the Cisco Technical Assistance Center (TAC) engineer usually asks for this information to troubleshoot the hardware issue. Collect the show tech-support output before you reload or power-cycle the router, because these actions can cause a loss of information about the problem.
Note: The show tech-support command does not include the show platform or show logging commands.
Boot sequence information: The complete bootup sequence, if the router experiences boot errors.
Crashinfo file (if available): See the Crashinfo File section.
Core dump file (if available): See the Core Dump File section.
Tracelog file (if available): On the Cisco ASR 1000 Series Aggregation Services Routers, the trace logs of Cisco IOS XE processes are generated under harddisk:tracelogs (ASR 1006 or ASR 1004) or bootflash:tracelogs (ASR 1002) on the active RP. When a Cisco IOS XE process crashes, the Cisco TAC engineer usually asks you to collect this information in order to troubleshoot the issue.
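As a rough illustration of the checklist above, the following Python sketch builds the list of commands to run on a given model. It uses only the commands and storage locations named in this document. The function name, the model keys, and the order of the commands are assumptions made for this example, not a Cisco tool.

```python
# Storage device that holds core dumps and tracelogs on each model.
# The ASR 1002 has no harddisk: device, so bootflash: is used instead.
# (Model keys are illustrative labels chosen for this sketch.)
STORAGE_BY_MODEL = {
    "ASR1002": "bootflash:",
    "ASR1004": "harddisk:",
    "ASR1006": "harddisk:",
}


def collection_commands(model):
    """Return the show/dir commands to capture before any reload."""
    storage = STORAGE_BY_MODEL[model]
    commands = [
        "show platform",      # not included in show tech-support
        "show logging",       # not included in show tech-support
        "show tech-support",
        "dir bootflash:",     # crashinfo files for IOSD crashes
    ]
    if storage != "bootflash:":
        # SPA driver crashinfo files are written to harddisk: on these models.
        commands.append(f"dir {storage}")
    commands.append(f"dir {storage}core")       # core dump files
    commands.append(f"dir {storage}tracelogs")  # IOS XE process trace logs
    return commands


if __name__ == "__main__":
    for command in collection_commands("ASR1006"):
        print(command)
```

The list could be pasted into a terminal session or fed to whatever automation is already in place. On an ASR 1006, the same directories should also be checked on the standby RP.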
When the IOSD or SPA driver crashes, a crashinfo file is generated under the location shown in Table 2.
Table 2 – Crashinfo File Location
Models | Types of Crashes | Crashinfo File Location |
---|---|---|
ASR 1002 | IOSD Crash, SPA Driver Crash | bootflash: on the RP |
ASR 1004, ASR 1006 | IOSD Crash | bootflash: on the RP |
ASR 1004, ASR 1006 | SPA Driver Crash | harddisk: on the RP |
Table 3 displays the crashinfo file names.
Table 3 – Crashinfo File Name
Types of Crashes | Crashinfo File Name | Example |
---|---|---|
IOSD Crash | crashinfo_RP_SlotNumber_00_Date-Time-Zone | crashinfo_RP_00_00_20080807-063430-UTC |
SPA Driver Crash | crashinfo_SIP_SlotNumber_00_Date-Time-Zone | crashinfo_SIP_00_00_20080828-084907-UTC |
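The naming scheme in Table 3 is regular enough to parse mechanically. The sketch below is a minimal Python example that assumes file names follow exactly the two patterns in the table. The function name and the returned dictionary keys are made up for this illustration.

```python
import re
from datetime import datetime

# crashinfo_<RP|SIP>_<SlotNumber>_00_<Date>-<Time>-<Zone>, as shown in Table 3.
CRASHINFO_RE = re.compile(
    r"^crashinfo_(?P<module>RP|SIP)_(?P<slot>\d+)_00_"
    r"(?P<date>\d{8})-(?P<time>\d{6})-(?P<zone>[A-Za-z]+)$"
)

# Which crash type each module prefix indicates, per Table 3.
CRASH_TYPE = {"RP": "IOSD crash", "SIP": "SPA driver crash"}


def parse_crashinfo_name(name):
    """Split a crashinfo file name into its parts, or return None."""
    match = CRASHINFO_RE.match(name)
    if match is None:
        return None
    stamp = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return {
        "crash_type": CRASH_TYPE[match["module"]],
        "slot": int(match["slot"]),
        "timestamp": stamp,  # naive datetime; the zone is kept separately
        "zone": match["zone"],
    }


if __name__ == "__main__":
    print(parse_crashinfo_name("crashinfo_RP_00_00_20080807-063430-UTC"))
```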
When a process crashes, you can find a core dump file under the location shown in Table 4. A core dump is a full copy of the memory image of the process. Save the core dump files until troubleshooting is done, because a core dump includes much more information about a crash than a crashinfo file does and is needed for deep investigation. Because the Cisco ASR 1002 Router does not have a harddisk: device, its core dump files are generated under bootflash:core/.
Table 4 – Core Dump File Location
Models | Core Dump File Location |
---|---|
ASR 1002 | bootflash:core/ on the RP |
ASR 1004, ASR 1006 | harddisk:core/ on the RP |
Core dumps of ESP and SIP processes, as well as those of RP processes, are generated under the same location. On the Cisco ASR 1006 Router, you must also check the same location on the standby RP, because after a switchover the current standby RP is the one that was active when the problem occurred. Table 5 displays the core dump file names.
Table 5 – Core Dump File Name
Types of Crashes | Core Dump File Name | Example |
---|---|---|
IOSD Crash | hostname_RP_SlotNumber_ppc_linux_iosd-_ProcessID.core.gz | Router_RP_0_ppc_linux_iosd-_17407.core.gz |
SPA Driver Crash | hostname_SIP_SlotNumber_mcpcc-lc-ms_ProcessID.core.gz | Router_SIP_1_mcpcc-lc-ms_6098.core.gz |
IOS XE Process Crash | hostname_FRU_SlotNumber_ProcessName_ProcessID.core.gz | Router_RP_0_fman_rp_28778.core.gz Router_ESP_1_cpp_cp_svr_4497.core.gz |
Cisco QFP Crash | hostname_ESP_SlotNumber_cpp-mcplo-ucode_ID.core.gz | Router_ESP_0_cpp-mcplo-ucode_042308082102.core.gz |
Linux Kernel Crash | hostname_FRU_SlotNumber_kernel.core | Router_ESP_0_kernel.core |
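Because each crash type in Table 5 leaves a distinctively named core file, the file name alone usually indicates what crashed. The following Python sketch classifies a core file name according to the table. It assumes the hostname does not itself contain a string such as `_RP_0_`, and it also accepts a kernel core name that carries a timestamp, as seen in the directory listing in the Linux kernel crash section. The function and dictionary names are hypothetical.

```python
import re

# hostname_FRU_SlotNumber_<rest>; the FRU is RP, ESP, or SIP.
FRU_RE = re.compile(
    r"^(?P<host>.+?)_(?P<fru>RP|ESP|SIP)_(?P<slot>\d+)_(?P<rest>.+)$"
)
# Kernel cores: kernel.core, optionally with a numeric timestamp.
KERNEL_RE = re.compile(r"^kernel(?:_\d+)?\.core$")
# Process cores: ProcessName_ID.core.gz
PROCESS_RE = re.compile(r"^(?P<process>.+)_(?P<ident>\d+)\.core\.gz$")

# Process names that identify a specific crash type in Table 5.
KNOWN_PROCESSES = {
    "ppc_linux_iosd-": "IOSD crash",
    "mcpcc-lc-ms": "SPA driver crash",
    "cpp-mcplo-ucode": "QFP microcode crash",
}


def classify_core_file(name):
    """Return crash type, module, slot, and process for a core file name."""
    fru_match = FRU_RE.match(name)
    if fru_match is None:
        return None
    rest = fru_match["rest"]
    if KERNEL_RE.match(rest):
        crash_type, process = "Linux kernel crash", "kernel"
    else:
        proc_match = PROCESS_RE.match(rest)
        if proc_match is None:
            return None
        process = proc_match["process"]
        # Any other process name is treated as a generic IOS XE process.
        crash_type = KNOWN_PROCESSES.get(process, "IOS XE process crash")
    return {
        "crash_type": crash_type,
        "module": fru_match["fru"],
        "slot": int(fru_match["slot"]),
        "process": process,
    }


if __name__ == "__main__":
    print(classify_core_file("Router_ESP_1_cpp_cp_svr_4497.core.gz"))
```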
The IOS daemon (IOSD) runs as its own Linux process (ppc_linux_iosd-) on the RP. In dual IOS mode (Cisco ASR 1002 Router and Cisco ASR 1004 Router only), two IOSDs run on the RP.
In order to identify an IOSD crash, look for exception output such as the following on the console. When a Cisco ASR 1002 Router or Cisco ASR 1004 Router without dual IOS mode crashes, the router reloads. When a Cisco ASR 1002 Router or Cisco ASR 1004 Router with dual IOS mode crashes, the IOSD switches over on the RP. When a Cisco ASR 1006 Router crashes, the RP switches over and the new standby RP reloads.
Exception to IOS Thread:
Frame pointer 2C111978, PC = 1029ED60
ASR1000-EXT-SIGNAL: U_SIGSEGV(11), Process = Exec
-Traceback= 1#106b90f504fce8544ce4979667ec2d5d :10000000+29ED60 :10000000+29ECB4 :10000000+2A1A9C :10000000+2A1DAC :10000000+492438 :10000000+1C22DC0 :10000000+4BBBE0
Fastpath Thread backtrace:
-Traceback= 1#106b90f504fce8544ce4979667ec2d5d c:BC16000+C2AF0 c:BC16000+C2AD0 iosd_unix:BD73000+111DC pthread:BA1B000+5DA0
Auxiliary Thread backtrace:
-Traceback= 1#106b90f504fce8544ce4979667ec2d5d pthread:BA1B000+95E4 pthread:BA1B000+95C8 c:BC16000+D7294 iosd_unix:BD73000+1A83C pthread:BA1B000+5DA0
PC = 0x1029ED60 LR = 0x1029ECB4 MSR = 0x0002D000
CTR = 0x0BD83C2C XER = 0x20000000
R0 = 0x00000000 R1 = 0x2C111978 R2 = 0x2C057890 R3 = 0x00000034
R4 = 0x000000B4 R5 = 0x0000003C R6 = 0x2C111700 R7 = 0x00000000
R8 = 0x12B04780 R9 = 0x00000000 R10 = 0x2C05048C R11 = 0x00000050
R12 = 0x22442082 R13 = 0x13B189AC R14 = 0x00000000 R15 = 0x00000000
R16 = 0x00000000 R17 = 0x00000001 R18 = 0x00000000 R19 = 0x00000000
R20 = 0x00000000 R21 = 0x00000000 R22 = 0x00000000 R23 = 0x00000001
R24 = 0x00000001 R25 = 0x34409AD4 R26 = 0x00000000 R27 = 0x2CE88448
R28 = 0x00000001 R29 = 0x00000000 R30 = 0x3467A0FC R31 = 0x2C1119B8
Writing crashinfo to bootflash:crashinfo_RP_00_00_20080904-092940-UTC
Buffered messages: (last 4096 bytes only)
...
When the IOSD crashes, the crashinfo file and core dump file are generated on the RP.
Router#dir bootflash:
Directory of bootflash:
bootflash:crashinfo_RP_00_00_20080904-092940-UTC
Router#dir harddisk:core
Directory of harddisk:core/
3620877 -rw- 10632280 Sep 4 2008 09:31:00 +00:00 Router_RP_0_ppc_linux_iosd-_17407.core.gz
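The recovery behavior described for an IOSD crash depends on the chassis model and on whether dual IOS mode is in use. As a compact summary, the following Python sketch encodes those rules as a small decision function. The function name, the model strings, and the wording of the returned descriptions are choices made for this example.

```python
def iosd_crash_recovery(model, dual_ios=False):
    """Describe what happens after an IOSD crash on the given model.

    dual_ios only applies to the ASR 1002 and ASR 1004, which run
    two IOSDs on a single RP in dual IOS mode.
    """
    if model == "ASR1006":
        # Redundant RPs: the standby RP takes over.
        return "RP switchover; the new standby RP reloads"
    if model in ("ASR1002", "ASR1004"):
        if dual_ios:
            return "IOSD switchover on the RP"
        return "router reloads"
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    print(iosd_crash_recovery("ASR1004", dual_ios=True))
```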
The SPA driver provides limited Cisco IOS functions for SPA control and runs on the SIP as the mcpcc-lc-ms process, which is one of the Cisco IOS XE processes. You can identify a SPA driver crash if you find that the mcpcc-lc-ms process is held down. After the SPA driver crashes, the SPA reloads.
Aug 28 08:52:12.418: %PMAN-3-PROCHOLDDOWN: SIP0: pman.sh: The process mcpcc-lc-ms has been helddown (rc 142)
Aug 28 08:52:12.425: %ASR1000_OIR-6-REMSPA: SPA removed from subslot 0/0, interfaces disabled
Aug 28 08:52:12.427: %SPA_OIR-6-OFFLINECARD: SPA (SPA-1X10GE-L-V2) offline in subslot 0/0
Aug 28 08:52:13.131: %ASR1000_OIR-6-INSSPA: SPA inserted in subslot 0/0
Aug 28 08:52:19.060: %LINK-3-UPDOWN: SIP0/0: Interface EOBC0/1, changed state to up
Aug 28 08:52:20.064: %SPA_OIR-6-ONLINECARD: SPA (SPA-1X10GE-L-V2) online in subslot 0/0
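When you search a long console or syslog capture, the PROCHOLDDOWN message is the line to look for. The following Python sketch extracts the module, process name, and return code from messages of the form shown above. It assumes the message text matches the samples in this document exactly. The function name is made up for this example.

```python
import re

# Matches: %PMAN-3-PROCHOLDDOWN: <FRU>: pman.sh: The process <name>
#          has been helddown (rc <code>)
HELDDOWN_RE = re.compile(
    r"%PMAN-3-PROCHOLDDOWN: (?P<fru>\S+): pman\.sh: "
    r"The process (?P<process>\S+) has been helddown \(rc (?P<rc>\d+)\)"
)


def find_helddown(log_text):
    """Return (fru, process, rc) for every held-down process in the log."""
    return [
        (m["fru"], m["process"], int(m["rc"]))
        for m in HELDDOWN_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "Aug 28 08:52:12.418: %PMAN-3-PROCHOLDDOWN: SIP0: pman.sh: "
        "The process mcpcc-lc-ms has been helddown (rc 142)"
    )
    for fru, process, rc in find_helddown(sample):
        # mcpcc-lc-ms on a SIP indicates a SPA driver crash.
        kind = "SPA driver" if process == "mcpcc-lc-ms" else "IOS XE process"
        print(f"{fru}: {process} held down (rc {rc}), {kind} crash")
```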
When the SPA driver crashes, the crashinfo file and core dump file are generated on the RP.
Router#dir harddisk:
Directory of harddisk:/
14 -rw- 224579 Aug 28 2008 08:52:06 +00:00 crashinfo_SIP_00_00_20080828-085206-UTC
Router#dir harddisk:core
Directory of harddisk:/core/
4653060 -rw- 1389762 Aug 28 2008 08:52:12 +00:00 Router_SIP_0_mcpcc-lc-ms_6985.core.gz
The Cisco IOS XE processes run on a Linux kernel on RP, ESP, and SIP. Table 6 lists their main processes. If a crash occurs, the module reloads.
Table 6 – Main Cisco IOS XE Processes
Title | Process Name | Module |
---|---|---|
Chassis Manager | cmand | RP |
Chassis Manager | cman_fp | ESP |
Chassis Manager | cmcc | SIP |
Environmental Monitoring | emd | RP, ESP, SIP |
Forwarding Manager | fman_rp | RP |
Forwarding Manager | fman_fp_image | ESP |
Host Manager | hman | RP, ESP, SIP |
Interface Manager | imand | RP |
Interface Manager | imccd | SIP |
Logging Manager | plogd | RP, ESP, SIP |
Pluggable Service | psd | RP |
QFP Client Control Process | cpp_cp_svr | ESP |
QFP Driver Process | cpp_driver | ESP |
QFP HA Server | cpp_ha_top_level_server | ESP |
QFP Client Service Process | cpp_sp_server | ESP |
Shell Manager | smand | RP |
If the cpp_cp_svr process crashes on an ESP of the Cisco ASR 1006 Router, messages such as these can appear on the console.
Jan 24 23:37:06.644 JST: %PMAN-3-PROCHOLDDOWN: F0: pman.sh: The process cpp_cp_svr has been helddown (rc 134)
Jan 24 23:37:06.727 JST: %PMAN-0-PROCFAILCRIT: F0: pvp.sh: A critical processcpp_cp_svr has failed (rc 134)
Jan 24 23:37:11.539 JST: %ASR1000_OIR-6-OFFLINECARD: Card (fp) offline in slot F0
You can find the core dump file on harddisk:core/.
Router#dir harddisk:core
Directory of harddisk:/core/
1032194 -rw- 38255956 Jan 24 2009 23:37:06 +09:00 Router_ESP_0_cpp_cp_svr_4714.core.gz
The tracelog of the process can include useful output.
Router#dir harddisk:tracelogs/cpp_cp*
Directory of harddisk:tracelogs/
4456753 -rwx 24868 Jan 24 2009 23:37:15 +09:00 cpp_cp_F0-0.log.4714.20090124233714
Cisco designed the Cisco Quantum Flow Processor (QFP) as both a hardware and a software architecture. The first generation resides on two pieces of silicon; later generations can be single-chip solutions that adhere to the same software architecture. The term "Cisco Quantum Flow Processor" alone refers to the overall hardware and software architecture of the network processor.
When the QFP ucode crashes, ESP reloads. In order to identify the QFP ucode crash, find this output on the console or the core dump file of cpp-mcplo-ucode:
Dec 17 05:50:26.417 JST: %IOSXE-3-PLATFORM: F0: cpp_cdm: CPP crashed, core file /tmp/corelink/
Router_ESP_0_cpp-mcplo-ucode_121708055026.core.gz
Dec 17 05:50:28.206 JST: %ASR1000_OIR-6-OFFLINECARD: Card (fp) offline in slot F0
You can find the core dump file under harddisk:core/.
Router#dir harddisk:core
Directory of harddisk:core/
3719171 -rw- 1572864 Dec 17 2008 05:50:31 +09:00 Router_ESP_0_cpp-mcplo-ucode_121708055026.core.gz
On the Cisco ASR 1000 Series, a Linux kernel runs on the RP, ESP, and SIP. When a Linux kernel crashes, the module reloads without any crash output. After the module boots up again, you can identify the Linux kernel crash if you find a core dump file of the Linux kernel. The kernel core file can be larger than 100 MB.
Router#dir harddisk:core
Directory of harddisk:/core/
393230 ---- 137389415 Dec 19 2008 01:19:40 +09:00 Router_RP_0_kernel_20081218161940.core
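Because a kernel crash leaves no console output, a captured directory listing is the main evidence. The following Python sketch scans dir output for kernel core files and reports their sizes. It assumes the listing columns match the samples in this document (index, permissions, size, date, time, zone offset, name). The function name is hypothetical.

```python
import re

# One entry of "dir" output, for example:
# 393230 ---- 137389415 Dec 19 2008 01:19:40 +09:00 Router_RP_0_kernel_...core
DIR_ENTRY_RE = re.compile(
    r"\b(?P<index>\d+)\s+(?P<perms>[-rwx]{4})\s+(?P<size>\d+)\s+"
    r"(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+(?P<year>\d{4})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<zone>[+-]\d{2}:\d{2})\s+"
    r"(?P<name>\S+)"
)
# Kernel core names end in kernel.core, optionally with a timestamp.
KERNEL_CORE_RE = re.compile(r"_kernel(?:_\d+)?\.core$")


def find_kernel_cores(dir_output):
    """Return (file name, size in bytes) for each kernel core listed."""
    return [
        (m["name"], int(m["size"]))
        for m in DIR_ENTRY_RE.finditer(dir_output)
        if KERNEL_CORE_RE.search(m["name"])
    ]


if __name__ == "__main__":
    listing = (
        "Router#dir harddisk:core\n"
        "Directory of harddisk:/core/\n"
        "393230 ---- 137389415 Dec 19 2008 01:19:40 +09:00 "
        "Router_RP_0_kernel_20081218161940.core\n"
    )
    for name, size in find_kernel_cores(listing):
        print(f"{name}: {size / 1_000_000:.0f} MB")
```

Given the file sizes involved, it is worth checking the free space on the destination before you copy a kernel core off the router.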
If you still need assistance after you follow the steps above and want to open a service request with the Cisco TAC, be sure to include this information to troubleshoot a router crash:
Note: Do not manually reload or power-cycle the router before you collect this information, unless a reload is required to troubleshoot the crash, because doing so can destroy information that is needed to determine the root cause of the problem.
Revision | Publish Date | Comments |
---|---|---|
1.0 | 17-Mar-2009 | Initial Release |