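This document describes how to troubleshoot and isolate the failing part or component of a Cisco 12000 Series Internet Router after you encounter various parity error messages.
Note: This document does not explain the causes of parity errors. If you are interested in a more concise definition of parity errors (also known as single event upsets, or SEUs) and their possible causes, we recommend that you read the documents linked from Increasing Network Availability.
For more information on document conventions, refer to Cisco Technical Tips Conventions.
Before you continue with this document, we recommend that you read the following documents:
The information in this document is based on the following software and hardware versions.
Cisco 12000 Series Internet Router
All versions of Cisco IOS® Software
The information in this document was created from devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If you work in a live network, make sure that you understand the potential impact of any command before you use it.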
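Most Cisco 12000 Series Internet Router route processors and line cards include Error Code Correction (ECC) functionality. However, some line cards that are still deployed in the field do not have ECC. ECC covers only the RAM or synchronous dynamic RAM (SDRAM) memory on the card. The rest of the card is not protected by ECC.
Here is the ECC capability of the line cards used in Cisco 12000 systems:
All Engine 2 and later cards have ECC.
Engine 1 cards changed to ECC after FCS.
Engine 0 cards do not have ECC.
Some cards can be upgraded to an equivalent product that integrates ECC.
The following table lists the non-ECC products and their ECC equivalents: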
Non-ECC Product | ECC Product |
---|---|
GRP(=) | GRP-B(=) |
GE-SX/LH-SC(=) | GE-GBIC-SC-B(=) |
GE-GBIC-SC-A(=) | GE-GBIC-SC-B(=) |
8FE-FX-SC(=) | 8FE-FX-SC-B(=) |
8FE-TX-RJ45(=) | 8FE-TX-RJ45-B(=) |
6DS3-SMB(=) | 6DS3-SMB-B(=) |
12DS3-SMB(=) | 12DS3-SMB-B(=) |
OC12/SRP-IR-SC(=) | OC12/SRP-IR-SC-B(=) |
OC12/SRP-MM-SC(=) | OC12/SRP-MM-SC-B(=) |
OC12/SRP-LR-SC(=) | OC12/SRP-LR-SC-B(=) |
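Note: -B and ECC are independent. -B indicates that the product is the second major orderable revision of the board. In some cases, that revision is the one that added ECC.
Cisco offers a Technology Migration Plan (TMP) that lets you upgrade non-ECC boards to the new ECC boards. A credit is applied when you purchase a new ECC board in exchange for a non-ECC board.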
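As a quick way to apply the table to a chassis inventory, the following is a minimal sketch that maps installed part numbers to their listed ECC equivalents. The pairs come from the table above (with the RJ45 and SMB spellings normalized); the function names `normalize` and `upgrade_candidates` and the inventory list format are assumptions made for illustration, not part of any Cisco tool.

```python
# Non-ECC -> ECC product pairs, taken from the table in this document.
ECC_EQUIVALENT = {
    "GRP": "GRP-B",
    "GE-SX/LH-SC": "GE-GBIC-SC-B",
    "GE-GBIC-SC-A": "GE-GBIC-SC-B",
    "8FE-FX-SC": "8FE-FX-SC-B",
    "8FE-TX-RJ45": "8FE-TX-RJ45-B",
    "6DS3-SMB": "6DS3-SMB-B",
    "12DS3-SMB": "12DS3-SMB-B",
    "OC12/SRP-IR-SC": "OC12/SRP-IR-SC-B",
    "OC12/SRP-MM-SC": "OC12/SRP-MM-SC-B",
    "OC12/SRP-LR-SC": "OC12/SRP-LR-SC-B",
}


def normalize(part_number):
    """Upper-case the part number and drop a trailing '(=)' suffix."""
    pn = part_number.strip().upper()
    if pn.endswith("(=)"):
        pn = pn[:-3]
    return pn


def upgrade_candidates(inventory):
    """Return {installed part: listed ECC product} for non-ECC parts only."""
    result = {}
    for part in inventory:
        pn = normalize(part)
        if pn in ECC_EQUIVALENT:
            result[pn] = ECC_EQUIVALENT[pn]
    return result


if __name__ == "__main__":
    # Hypothetical inventory, typed in by hand.
    print(upgrade_candidates(["GRP(=)", "GRP-B", "6ds3-smb"]))
```

Parts that are not in the left column (for example a card that is already a -B revision) are simply left out of the result.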
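The flowchart below helps you determine which component of the Cisco 12000 Series Internet Router is responsible for parity/Error Code Correction (ECC) error messages on the Gigabit Route Processor (GRP).
Note: During a parity/ECC error event, capture and record the show tech-support output and the console logs, and collect all crashinfo files.
The flowchart below helps you determine which component of a Cisco 12000 Series Internet Router line card is responsible for parity/Error Code Correction (ECC) error messages:
Note: Whenever a line card experiences a parity/ECC error event, collect as much information as possible (for details, refer to Troubleshooting Line Card Crashes on the Cisco 12000 Series Internet Router).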
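The Cisco 12000 Series Internet Router recovers from parity errors in other line card memory (SDRAM and SRAM) without crashing.
For any read or write operation on the Cisco 12000 Series Internet Router, data with bad parity can be reported by any of several parity-checking devices.
The GRP-B and the PRP use single-bit error correction and multi-bit error detection ECC for shared memory (SDRAM). Single-bit errors in SDRAM are corrected automatically, and the system continues to operate normally.
The PRP and the GRP-B have an enhanced dynamic RAM (DRAM) controller that supports ECC, so they can correct single-bit errors and report multi-bit errors. A corrected single-bit error is reported as follows: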
%Tiger-3-SBE: Single bit error detected and corrected at <address>
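Single-bit errors (SBEs) are corrected by the error correction circuitry and do not affect the operation of the GRP-B or PRP. No action is required for single-bit errors unless they occur frequently, in which case we recommend that you replace the processor board.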
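The "correct one bit, detect more than one" behaviour described above can be illustrated with a toy extended Hamming code. This is only a sketch of the general principle on a 4-bit value: the actual code width and layout used by the GRP-B and PRP memory controllers are not stated in this document, and the names `encode` and `decode` are made up for the example.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (list of bits).

    Index 0 holds the overall parity bit; indexes 1, 2 and 4 hold the
    Hamming parity bits; indexes 3, 5, 6 and 7 hold the data bits.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    c = [0] * 8
    c[3], c[5], c[6], c[7] = d
    c[1] = c[3] ^ c[5] ^ c[7]   # covers positions 1, 3, 5, 7
    c[2] = c[3] ^ c[6] ^ c[7]   # covers positions 2, 3, 6, 7
    c[4] = c[5] ^ c[6] ^ c[7]   # covers positions 4, 5, 6, 7
    c[0] = sum(c[1:]) % 2       # makes the total number of 1 bits even
    return c


def decode(codeword):
    """Return (status, nibble); status is 'ok', 'sbe-corrected' or 'mbe-detected'."""
    c = list(codeword)
    syndrome = 0
    for pos in range(1, 8):
        if c[pos]:
            syndrome ^= pos     # XOR of the positions of all set bits
    overall = sum(c) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd overall parity: exactly one bit flipped, and the syndrome
        # is its position (0 means the overall parity bit itself).
        c[syndrome] ^= 1
        status = "sbe-corrected"
    else:
        # Even overall parity but non-zero syndrome: two bits flipped.
        return "mbe-detected", None
    nibble = c[3] | (c[5] << 1) | (c[6] << 2) | (c[7] << 3)
    return status, nibble


if __name__ == "__main__":
    word = encode(0b1011)
    word[5] ^= 1                # inject a single-bit error
    print(decode(word))
    word[2] ^= 1                # inject a second error in the same word
    print(decode(word))
```

With one flipped bit the decoder recovers the original value, which corresponds to the corrected SBE message; with two flipped bits it can only report the error, which corresponds to the multi-bit case.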
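The detection of a multi-bit error is reported through a bus error exception or a CPU cache parity error exception.
A processor memory parity error message is reported if the CPU detects a parity error when it accesses the processor's external cache (L3 on the GRP) over the SysAD bus, or the CPU's internal cache memory (L1 or L2). Table 1 shows an example of the message printed for each type of cache parity error: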
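Table 1: Cache Parity Error Locations
Location of Parity Error | Error Message |
---|---|
L1 instruction cache | Error: Primary, instr cache, fields: data |
L1 data cache | Error: Primary, data cache, fields: data |
L2 instruction cache | Error: SysAD, instr cache, fields: data |
L2 data cache | Error: SysAD, data cache, fields: data |
L3 instruction cache | Error: SysAD, instr cache, fields: 1st dword |
L3 data cache | Error: SysAD, data cache, fields: 1st dword |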
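Example:
The first line of the error message indicates the location of the parity error, which can be any of the locations listed in Table 1. In this example, the location is the L3 instruction cache.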
Error: SysAD, instr cache, fields: data, 1st dword
Physical addr(21:3) 0x000000, virtual addr 0x6040BF60, vAddr(14:12) 0x3000
virtual address corresponds to main:text, cache word 0
Low Data High Data Par Low Data High Data Par
L1 Data: 0:0xAE620068 0x8C830000 0x00 1:0x50400001 0xAC600004 0x01
2:0xAC800000 0x00000000 0x02 3:0x1600000B 0x00000000 0x01
Low Data High Data Par Low Data High Data Par
DRAM Data: 0:0xAE620068 0x8C830000 0x00 1:0x50400001 0xAC600004 0x01
2:0xAC800000 0x00000000 0x02 3:0x1600000B 0x00000000 0x01
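The lookup in Table 1 can be sketched as a small classifier over the first line of the error message. The rules below are an assumption derived from the table and from the example above ("Primary" for L1, "SysAD" for L2, "SysAD" together with "1st dword" for L3); the function name `cache_error_location` is hypothetical.

```python
def cache_error_location(first_line):
    """Map the first line of a cache parity error message to a cache location.

    Returns a string such as 'L3 instruction cache', or None when the line
    does not look like one of the messages listed in Table 1.
    """
    text = first_line.strip().lower()
    if not text.startswith("error:"):
        return None

    if "instr cache" in text:
        kind = "instruction"
    elif "data cache" in text:
        kind = "data"
    else:
        return None

    if "primary" in text:
        level = "L1"
    elif "sysad" in text:
        # Per Table 1, the L3 messages mention the 1st dword field.
        level = "L3" if "1st dword" in text else "L2"
    else:
        return None

    return f"{level} {kind} cache"


if __name__ == "__main__":
    print(cache_error_location("Error: SysAD, instr cache, fields: data, 1st dword"))
```

For the sample message in this section the sketch returns the L3 instruction cache, which matches the interpretation given in the text.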
The output of show version looks similar to this:
...System was restarted by processor memory parity error at PC 0x602310D0, address 0x0 at 03:18:21 GMT Sun Oct 27 2002 ...
In the show context output, you can see that the system was restarted by a cache parity exception:
Router#show context slot 11
CRASH INFO: Slot 11, Index 1, Crash at 19:08:07 CST Thu Nov 14 2002
VERSION:
GS Software (GSR-P-M), Version 12.0(22)S1, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
TAC Support: http://www.cisco.com/tac
Compiled Mon 16-Sep-02 17:36 by nmasa
Card Type: Route Processor, S/N
LC uptime was 0 minutes.
System exception: sig=20, code=0xE42F3E4B, context=0x52CF3D44
System restarted by a Cache Parity Exception
STACK TRACE:
-Traceback= 5020453C 500E5E24 5010E6DC 5015F89C 501E9F6C 501E9F58
...
Replace the GRP or PRP after a second failure.
The following message can appear in the console output:
SEC 7: %GRP-3-PARITYERR: Parity error detected in the fabric buffers. Data (8)
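This message indicates that the fabric interface hardware on the GRP detected a parity error. The hexadecimal number indicates the error interrupt vector. This usually points to a hardware problem on the GRP that reports the error (the one in slot 7 in this example). Replace the faulty GRP the second time a similar problem occurs.
This error message is displayed when the router receives data with bad parity.
For any read or write operation on the Cisco 12000 Series Internet Router, data with bad parity can be reported by any of several parity-checking devices.
The PRP uses single-bit error correction and multi-bit error detection ECC for shared memory (SDRAM). Single-bit errors in SDRAM are corrected automatically, and the system continues to operate normally.
Single-bit errors (SBEs) are corrected by the error correction circuitry (ECC) and do not affect the operation of the PRP. No action is required unless single-bit errors occur frequently.
If the errors occur frequently, we recommend that you replace the processor board.
SDRAM Single-Bit ECC Errors
A single-bit error is one incorrect bit of data in a word read from memory. With an SBE, the error is corrected without interrupting operation.
Single-bit errors are detected, and the corrected data is delivered. For example, a single-bit error is reported on Engine 4/4+ as follows: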
SLOT 6:Jul 19 07:37:34: %TX192-3-SDRAM_SBE: Error=0x2 - DIMM1 Syndrome=0x7600 Addr=0xBEA09 Data bit80
-Traceback= 401C8C9C 401C9508 401CDE08 401CDE40 4007F674 4009ED0C 4009ECF8
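SBEs are corrected by the error correction circuitry and do not affect the operation of the line card. No action is required for single-bit errors unless they occur frequently, in which case we recommend that you replace the line card.
SDRAM Multi-Bit ECC Errors
A multi-bit error (MBE) means that more than one bit in the same word is incorrect. With an MBE, the error is detected and the line card crashes. Both SBEs and MBEs are very rare.
Here is an example of the messages printed to the console for a multi-bit ECC error in SDRAM: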
SLOT 5:Jul 25 16:58:51: %MCC192-3-SDRAM_SBE: Error=0x808 - DIMM0 Syndrome=0x31000000 Addr=0x81034 Data bit120
-Traceback= 401C8C9C 401C9508 40450018 400BF7D4
SLOT 5:Jul 25 16:58:51: %MCC192-3-SDRAM_MBE: Error=0x808 - DIMM0 Syndrome=0x18000000 Addr=0x80834
-Traceback= 401C8D88 401C9508 40450018 400BF7D4
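ECC cannot correct an MBE, so the line card crashes. The route processor then reloads the line card, and the card resumes normal operation.
Field diagnostics can be used to check line card memory for MBEs. The field diagnostics program detects an MBE as a memory error. Here is an example of a board that has multi-bit errors on the TX SDRAM and fails field diagnostics: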
FDIAG_STAT_IN_PROGRESS(5): test #12 TX SDRAM Marching Pattern
FD 5> RIM:
FD 5> TX Registers
FD 5> INT_CAUSE_REG = 0x00000680
FD 5> Unexpected L3FE Interrupt occured.
FD 5> ERROR: TX BMA Asic Interrupt Occured
FD 5> *** 0-INT: External Interrupt ***
FDIAG_STAT_DONE_FAIL(5) test_num 12, error_code 1
Field Diagnostic: ****TEST FAILURE**** slot 5: last test run 12, TX SDRAM Marching Pattern, error 1
Field Diag eeprom values: run 5 fail mode 1 (TEST FAILURE) slot 5
last test failed was 12, error code 1
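If you have a QOC48 or OC192 line card, refer to this field notice: QOC48/OC192 SBE/MBE. Otherwise, replace the line card after a second failure.
Check the value of the sig= field in the show context slot [slot#] output: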
Router#show context slot 4
CRASH INFO: Slot 4, Index 1, Crash at 04:28:56 EDT Tue Apr 20 1999
VERSION:
GS Software (GLC1-LC-M), Version 11.2(15)GS1a, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
Compiled Mon 28-Dec-98 14:53 by tamb
Card Type: 1 Port Packet Over SONET OC-12c/STM-4c, S/N CAB020500AL
System exception: SIG=20, code=0xA414EF5A, context=0x40337424
System restarted by a Cache Parity Exception
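When many crashinfo captures have to be reviewed, the sig= check can be scripted. The following is a minimal sketch that extracts the slot number and the sig value from saved show context text; the function name `crash_summary` is hypothetical, and the lookup table contains only the one value that this document explains (20, cache parity exception).

```python
import re

# Only the value discussed in this document; other sig values are not covered here.
SIGNAL_MEANINGS = {20: "cache parity exception"}


def crash_summary(show_context_output):
    """Extract slot and sig= from saved 'show context slot N' output.

    Returns None when no sig= field is present in the text.
    """
    slot = re.search(r"CRASH INFO: Slot (\d+)", show_context_output)
    # The field appears as both 'sig=' and 'SIG=' in the samples, so ignore case.
    sig = re.search(r"\bsig=(\d+)", show_context_output, re.IGNORECASE)
    if sig is None:
        return None
    value = int(sig.group(1))
    return {
        "slot": int(slot.group(1)) if slot else None,
        "sig": value,
        "meaning": SIGNAL_MEANINGS.get(value, "not covered in this document"),
    }


if __name__ == "__main__":
    sample = (
        "CRASH INFO: Slot 4, Index 1, Crash at 04:28:56 EDT Tue Apr 20 1999\n"
        "System exception: SIG=20, code=0xA414EF5A, context=0x40337424\n"
        "System restarted by a Cache Parity Exception\n"
    )
    print(crash_summary(sample))
```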
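Some cards based on the Engine 1 forwarding engine are susceptible to internal cache corruption when they operate under very specific voltage and temperature conditions.
The Cache Error Recovery Function (CERF) is a software feature on Engine 1 line cards that detects and corrects cache parity errors. It flushes the error from the external CPU cache and refreshes the cache line from DRAM. This feature adds intelligence to the CPU cache management algorithm so that the CPU can recover from cache parity errors, which prevents a line card crash without a performance penalty.
Note: CERF is on by default. The activity of this software error correction code (ECC) can be monitored with the show controller cerf command. To turn the feature off, use the no service cerf global configuration command.
For more information, refer to Field Notice: Cache Parity Error on GSR 1GE Card.
To determine which forwarding engine a line card is based on, refer to How do I determine which engine card is running in the chassis? in the Cisco 12000 Series Internet Router: Frequently Asked Questions document.
If the line card is based on Engine 1, the workaround is to upgrade the Cisco IOS software to a release that includes the Cache Error Recovery Function (CERF). This feature was first available in Cisco IOS Software Release 12.0(21)S3. If the card still crashes with cache parity exceptions after the upgrade, replace the line card.
If the line card is based on another engine type, replace the line card the second time a similar failure occurs.
The following messages can appear in the console log: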
SLOT 2:Oct 23 17:07:45.531 EST: %LC-3-L3FEERRS: L3FE DRAM error 12 address 41E9B9A0
SLOT 2:Oct 23 17:07:45.531 EST: %LC-3-L3FEERR: L3FE error: rxbma 0 addr 0 txbma 0 addr 0 dram 12 addr 41E9B9A0 io 0 addr 0
SLOT 2:Oct 23 17:07:45.531 EST: %GSR-3-INTPROC: Process Traceback= 40080BAC
-Traceback= 40357084 40495D30 40496EE0 400CCF98
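This message reports a CPU DRAM write parity error. L3FE stands for Layer 3 Forwarding Engine. Replace the line card the second time a similar problem occurs.
Here are some of the error messages that you can encounter:
In the logs of a one-port Gigabit line card: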
SLOT 5: %LCGE-3-INTR: TX GigaTranslator external interface parity error
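On newer boards, one fix replaces the TX GigaTranslator ASIC with a field-programmable gate array (FPGA). Replace the board the second time a similar problem occurs.
In the console output: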
SLOT 6: %LC-3-ECC: Salsa ECC: About to handle ECC single bit error, ECC status = 2 DRAM error status = = 21
SLOT 6: %LC-3-L3FEERR: L3FE error: rxbma 0 addr 0 txbma 0 addr 0 dram 21 addr 200020 io 0 addr 0
SLOT 6: %LC-3-ECC: Salsa ECC: Addresses: Salsa returned =429BFDE8 correcting on = 429BFDE8
SLOT 6: %MEM_ECC-3-SBE: Single bit error detected and corrected at 0x429BFDE8
SLOT 6: %MEM_ECC-3-SYNDROME_SBE: 8-bit Syndrome for the detected Single-bit error: 0x8A
SLOT 4: %MEM_ECC-3-SBE_HARD: Single bit *hard* error detected at 0x6299FB60
SLOT 1:Jun 10 05:29:47.690 EDT: %LC-3-ECC: Salsa ECC: About to handle ECC single bit error,ECC status = 0 DRAM error status =12
SLOT 6:Sep 26 15:18:01: %LC-3-SWECC: L2 event cleared: EPC = 0x40631CCC, CERR = 0xE40BB933, SysAD Addr = 1, total = 1
SLOT 0:Dec 7 13:48:11.480: %LC-3-SWECC_DATA: L2 event cleared: EPC = 0x400A8040, CERR = 0xA01DCE58, l1v = 0x41E3C20441E3C1C5, dv =0x41E3C1C441E3C204, SysAD Addr = 0, total = 1
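These messages can be broken down as follows:
%LC-3-ECC: Salsa ECC: The L3FE ASIC of the line card reported an error.
%LC-3-L3FEERR: An error was logged in the L3FE ASIC registers of the line card.
%MEM_ECC-3-SBE: A single-bit correctable error was detected on a read from DRAM. The show memory ecc command can be used to dump the single-bit errors logged so far. The same explanation applies to the %MEM_ECC-3-SBE_LIMIT error message.
%MEM_ECC-3-SYNDROME_SBE: The 8-bit syndrome for the detected single-bit error. This value does not indicate the exact position of the bit in error, but it can be used to approximate the position. The same explanation applies to the %MEM_ECC-3-SYNDROME_SBE_LIMIT error message.
In short, the line card reported a single-bit error and corrected it automatically. No action is required unless this happens frequently, in which case we recommend that you replace the line card.
%LC-3-SWECC_DATA: Indicates that a cache event was corrected with software error correction code (SWECC) on the line card in slot 0.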
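The breakdown above can be sketched as a small parser that splits a console line into slot, facility, severity, mnemonic, and text, and then attaches the explanation given in this section. The regular expression reflects only the message shapes shown in this document, and the names `parse_message` and `NOTES` are hypothetical.

```python
import re

# Optional "SLOT n:" prefix, optional timestamp, then %FACILITY-SEVERITY-MNEMONIC: text
MESSAGE_RE = re.compile(
    r"^(?:SLOT (\d+):)?.*?%([A-Z0-9_]+)-(\d)-([A-Z0-9_]+):\s*(.*)$"
)

# Explanations paraphrased from this section of the document.
NOTES = {
    ("LC", "ECC"): "error reported by the L3FE ASIC of the line card",
    ("LC", "L3FEERR"): "error logged in the L3FE ASIC registers",
    ("MEM_ECC", "SBE"): "correctable single-bit error on a read from DRAM",
    ("MEM_ECC", "SYNDROME_SBE"): "8-bit syndrome for the single-bit error",
    ("LC", "SWECC_DATA"): "cache event corrected by software ECC",
}


def parse_message(line):
    """Split one console log line into its parts; return None if it does not match."""
    match = MESSAGE_RE.match(line.strip())
    if match is None:
        return None
    slot, facility, severity, mnemonic, text = match.groups()
    return {
        "slot": int(slot) if slot is not None else None,
        "facility": facility,
        "severity": int(severity),
        "mnemonic": mnemonic,
        "text": text,
        "note": NOTES.get((facility, mnemonic)),
    }


if __name__ == "__main__":
    line = "SLOT 6: %MEM_ECC-3-SBE: Single bit error detected and corrected at 0x429BFDE8"
    print(parse_message(line))
```

Messages that are not described in this section still parse, but their note field is None.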
Another message that you can encounter is:
SLOT 4: %MEM_ECC-3-SBE_HARD: Single bit *hard* error detected at 0x6299FB60
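This message indicates that a single-bit uncorrectable error (a hard error) was detected on a CPU read from DRAM. The show memory ecc command dumps the single-bit errors logged so far and indicates the address locations of the detected hard errors.
Monitor the system with the show memory ecc command, and replace the DRAM if too many of these errors occur.
The following errors can appear in the console output: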
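The document does not say how many errors count as "too many", so any threshold is a local policy decision. As a minimal sketch under that assumption, the following counts %...-SBE messages per slot in a saved console log and lists the slots that either reach a chosen threshold or report a hard error. The function name `ecc_report` and the default threshold of 3 are made up for illustration.

```python
import re
from collections import Counter

# "SLOT n:" prefix, optional timestamp, then %FACILITY-SEVERITY-MNEMONIC:
LINE_RE = re.compile(r"^SLOT (\d+):.*?%[A-Z0-9_]+-\d-([A-Z0-9_]+):")


def ecc_report(log_lines, sbe_threshold=3):
    """Count corrected SBEs per slot and flag slots that need a closer look.

    sbe_threshold is an arbitrary, site-specific value, not a Cisco figure.
    """
    sbe_counts = Counter()
    hard_error_slots = set()
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        slot, mnemonic = int(match.group(1)), match.group(2)
        if mnemonic == "SBE_HARD":
            hard_error_slots.add(slot)
        elif mnemonic == "SBE":
            sbe_counts[slot] += 1
    frequent = {slot for slot, count in sbe_counts.items() if count >= sbe_threshold}
    return {
        "sbe_counts": dict(sbe_counts),
        "review": sorted(frequent | hard_error_slots),
    }


if __name__ == "__main__":
    log = [
        "SLOT 6: %MEM_ECC-3-SBE: Single bit error detected and corrected at 0x429BFDE8",
        "SLOT 4: %MEM_ECC-3-SBE_HARD: Single bit *hard* error detected at 0x6299FB60",
    ]
    print(ecc_report(log))
```

The counts are only as complete as the log that is fed in, so the show memory ecc output on the router remains the reference.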
SLOT 6: %LC-6-PSAECC: An TLU SDRAM ECC correctable error occurred address 19C49FD
SLOT 2:035610: Feb 26 13:09:13.628 UTC: %LC-6-PSAECC: An PLU SDRAM ECC correctable error occurred address 1956059
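This indicates that the ECC-protected SDRAM of the Packet Switching ASIC (PSA) identified a correctable single-bit error. No action is required unless these messages appear frequently, in which case we recommend that you replace the line card.
You can see these errors in the console output: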
SLOT 6:00:03:53: %PM622-3-SAR_SRAM_PARITY_ERR: (6/0): Parity error in Reassembly SAR SRAM address: 80000000.Resetting the port SLOT 3:00:00:53: %PM622-3- SAR_MULTIBIT_ECC_ERR: (3/0): Multi-bit ECC Uncorrectable error in SAR SDRAM address: 80000000. Resseting the port. SLOT 4:00:00:53: %PM622-3 SAR_SINGLE_BIT_ECC_ERR: (3/0): ECC corrected an error in SAR SDRAM address: 800000. SLOT 0:Jun 25 20:45:53 KST: %EE48-6-ALPHAECC: RX ALPHA: An PLU SDRAM ECC correctable error occured address 1000C254 SLOT 0:Jun 25 20:45:53 KST: %EE48-6-ALPHAECC2: RX ALPHA: An PLU SDRAM ECC multibit error occured at address 1000E254 SLOT 5:Nov 17 09:46:30.171: %EE48-6-ALPHA_PARITY: TX ALPHA: Transient SRAM64 parity corrected error 3E Data 0 100000 Parity bits 0 SLOT 10:Feb 21 16:55:36: %EE48-3-ALPHA_SRAM64_ERR: TX ALPHA: ALPHA_PST_RANGE_ERR error 11003F Data 0 0 Parity bits 0 SLOT 4:Jan 15 06:30:00.942 UTC: %EE48-2-GULF_TX_SRAM_ERROR: ASIC GULF: TX SRAM uncorrectable error detected. Details=0x0000 SLOT 0:Mar 16 19:50:22.464 cst: %EE48-4-QM_ZBT_PARITY: ToFab Address 0xB95E Data 0x1 SLOT 5:May 17 06:17:35.507: %EE48-4-QM_NON_ZBT_PARITY: ToFab Error 0x10000028 SLOT 5:May 17 06:17:53.883: %EE48-4-QM_ZBT_PARITY_TRANSIENT: FrFab Address 0x0 Data 0x7E SLOT 5:May 17 06:17:53.883: %EE48-4- GULF_RX_TB_PARITY_ERROR: ASIC GULF: RX telecom bus parity error on port 0 SLOT 1:Dec 13 00:27:42: %EE48-3-SRAM_PARITY: SRAM parity: Unable to find shadow 281B9EB4 SLOT 0:Aug 4 08:55:37: %EE48-3-QM_PARITY: FrFab Address 0x1859E Data 0x10 SLOT 0:Aug 4 08:55:37: %EE48-3-QM_ERROR: FrFab error register 0x80000.
The following messages can be encountered on Engine 4/4+ based line cards:
SLOT 4: %RX192-3-HINTR: status = 0x4000000, mask = 0x3FFFFFFF - Parity error on rx_pbc_mem.
-Traceback= 401C37C0 403D8814 400BE1EC
SLOT 4: %LC-3-ERR_INTR: Error interrupt occurred
-Traceback= 400CE028 400C8DF0 40010A24
or
SLOT 3: %RX192-3-HINTR: status = 0x4000000, mask = 0x3FFFFFFF - Parity error on rx_pbc_mem.
-Traceback= 406012E0 406972A0 400C555C
%FIB-3-FIBDISABLE: Fatal error, slot 3: IPC failure
or
SLOT 13:Dec 5 07:30:15.272 cst: %HERA-6-PAM_ACL_SBE: PKT CNT MEM Syndrome=0x8 Addr=0x523C SLOT 2:00:03:41: %MCC192-6-RED_PARAM1_SBE: Parameter 1 - Single Bit Error detected and corrected Syndrome = 0x7, Address = 0x43, samebit No, diffbit No SLOT 2:00:03:41: %MCC192-6-RED_PARAM2_SBE: Parameter 1 - Single Bit Error detected and corrected Syndrome = 0x7, Address = 0x43, samebit No, diffbit No SLOT 5:Apr 26 11:56:08.160: %MCC192-3-SDRAM_MBE: Error=0x200 - DIMM1 Syndrome=0x3000 Addr=0x811C3 SLOT 10:Mar 6 05:05:26.965: %RX192-3-ADJ_MEM_MBE: phy addr 0x7905E648, offset 0xBCC9, old ecc 0x0, new ecc 0x0, bit -1, value 0x0 - MBE on Adjacency Memory.. SLOT 13:Dec 5 07:30:15.272 cst: %HERA-6-PAM_ACL_MBE: PKT CNT MEM Syndrome=0x8 Addr=0x523C SLOT 2:00:03:41: %MCC192-6-RED_PARAM1_MBE: Parameter 1 - Single Bit Error detected and corrected Syndrome = 0x7, Address = 0x43, samebit No, diffbit No SLOT 2:00:03:41: %MCC192-3-RED: Error=0x80000 - RED PARAM 1 ECC SBE Error. -Traceback= 405AF5E0 405B1CEC 406DFF7C 406E057C 400FC7E SLOT 2:00:03:41: %MCC192-6-RED_PARAM2_MBE: Parameter 1 - Single Bit Error detected and corrected Syndrome = 0x7, Address = 0x43, samebit No, diffbit No Sep 8 14:32:09 jst: %MEM_ECC-3-SYNDROME_SBE_LIMIT: 8-bit Syndrome for the detected Single-bit error: 0xD5
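Symptoms of this problem include:
Cisco Express Forwarding is disabled on this line card
The associated ports remain up/up
The line card can reset automatically
If the line card does not reset, the workaround is to issue the microcode reload <slot> command.
This message does not always indicate a hardware problem with the RX192 module. Some Cisco IOS software bugs can produce this error message as a side effect. If this message appears only once, continue to monitor the board. The device is reset. If the problem persists, the card resets automatically. If this message keeps appearing, contact your Cisco technical support representative for assistance.
SBE events on E4/E4+ can be checked with the show controllers mcc192 ecc command: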
LC-Slot4#show controllers mcc192 ecc
MCC192 SDRAM ECC Counters
SBE = 0x0, MBE = 0x0
TX192 SDRAM ECC Counters
SBE = 0x0, MBE = 0x0
This reports on both the RX and TX memory.
You can see these errors in the console output:
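To track these counters over time, the command output can be saved and parsed. The following is a minimal sketch that turns the hexadecimal SBE and MBE counters into integers per block; it assumes the output layout shown above and nothing more, and the function name `parse_ecc_counters` is hypothetical.

```python
import re

# Matches e.g. "MCC192 SDRAM ECC Counters  SBE = 0x0, MBE = 0x0",
# whether the counters are on the same line or on the next one.
COUNTER_RE = re.compile(
    r"(\w+) SDRAM ECC Counters\s+SBE = (0x[0-9A-Fa-f]+), MBE = (0x[0-9A-Fa-f]+)"
)


def parse_ecc_counters(output):
    """Return {block name: {'sbe': int, 'mbe': int}} from saved command output."""
    counters = {}
    for name, sbe, mbe in COUNTER_RE.findall(output):
        counters[name] = {"sbe": int(sbe, 16), "mbe": int(mbe, 16)}
    return counters


if __name__ == "__main__":
    sample = (
        "LC-Slot4#show controllers mcc192 ecc\n"
        "MCC192 SDRAM ECC Counters\n"
        "  SBE = 0x0, MBE = 0x0\n"
        "TX192 SDRAM ECC Counters\n"
        "  SBE = 0x0, MBE = 0x0\n"
    )
    print(parse_ecc_counters(sample))
```

Comparing two snapshots taken some time apart shows whether the SBE count is growing, which is the "frequently" condition that the text asks you to watch for.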
SLOT 1:Jun 26 20:45:53 KST: %EE192-6-WAHOOECC: RX WAHOO: An PLU SDRAM ECC correctable error occured address 20000254 SLOT 9:Sep 2 21:27:49.680 GMT+8: %MCC192-3-PKTMEM_SBE: Single bit error detected and corrected SLOT 14:Jul 18 07:19:24.637: RX_XBMA: 1-bit CPUIM_ECCERR1 error 0x2 SLOT 15:Jan 4 16:53:16.591: TX_XBMA: (1) QSRAM qinfo SBE detected. info: 0x82605455 SLOT 12:Dec 12 22:34:15: %EE192-4-BM_ERRSSS: FrFab BM BADDR ECC ERR info single bit error(s) corrected, error 8250F63E count: 2 SLOT 1:Nov 22 13:40:02 JST: %EE192-3-QM_ERROR: RX_XBMA OQLLM error error register 0x1 -Traceback= 40AE71AC 406078C4 405F5EC0 SLOT 7:001113: Oct 24 10:50:28.520 BST: %EE192-3-WAHOOERRS: RX WAHOO: WAHOO_CSRAM_CNTRL_INT PIPE0 error 8 SLOT 6:Oct 4 16:48:00.487: %EE192-3-WAHOOERRSSS: RX WAHOO: WAHOO_FFCRAM_CNTRL_INT PIPE0 error 4 addr 3FBFAB8 agent 94 SLOT 7:001114: Oct 24 10:50:28.520 BST: %EE192-3-WAHOOERRSSSS: RX WAHOO: WAHOO_PPC_INT PIPE1 error pl_ctl 4000226 pl_aa_avl F9F7B pl_aa_end 7FF9 pl_aa_fatal 4800000 SLOT 6:Oct 4 16:48:00.487: %EE192-3-WAHOOERRS: RX WAHOO WAHOO_NFC_SRAM_MULTI_ECC_ERR multi-bit CSSRAM error SLOT 6:Oct 4 16:48:00.487: %EE192-3-WAHOOERRS: WAHOO_CTCAM_CNTRL_INT multi-bit CSRAM error SLOT 6:Oct 4 16:48:00.487: %EE192-3-WAHOOERRS: WAHOO_FFCRAM_CNTRL_INT MBE SLOT 6:Oct 4 16:48:00.487: %EE192-3-WAHOOERRS: FSRAM not OK WAHOO_FSRAM_CNTRL_INT ECC_1_BIT_EE | ECC_UNCORR_EE SLOT 6:Oct 4 16:48:00.487: %EE192-3-WAHOOERRS: WAHOO_CTCAM_CNTRL_INT multi-bit CSRAM error SLOT 1:00:01:14: WEEKLY_THROTTLE_SOCKEYE_SBE: SOCKEYE SBE: addr: 0xC2A007C0, synd: 0xC4 SLOT 1:00:01:14: WEEKLY_THROTTLE_CBSRAM_SBE_TX+i: CBSRAM SBE TX: 1-bit CBSRAM error. SLOT 1:00:01:14: WEEKLY_THROTTLE_CBSRAM_SBE_RX+i: CBSRAM SBE RX: 1-bit CBSRAM error. SLOT 1:00:01:14: WEEKLY_THROTTLE_CSSRAM_SBE_TX+i: CSSRAM SBE TX: 1-bit CSSRAM error. SLOT 1:00:01:14: WEEKLY_THROTTLE_CSSRAM_SBE_RX+i: CSSRAM SBE RX: 1-bit CSSRAM error. SLOT 1:00:01:14: WEEKLY_THROTTLE_CSRAM_SBE_TX+i: CSRAM SBE TX: 1-bit CSRAM error. 
SLOT 1:00:01:14: WEEKLY_THROTTLE_CSRAM_SBE_RX+i: CSRAM SBE RX: 1-bit CSRAM error. SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FW_TCAM_PRTY_TX+throttle_i: TX FTCAM PRTY error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FW_TCAM_PRTY_RX+throttle_i: RX FTCAM PRTY error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_CL_TCAM_PRTY_TX+throttle_i: TX CLTCAM PRTY error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_CL_TCAM_PRTY_RX+throttle_i: RX CLTCAM PRTY error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_NF_TCAM_PRTY_TX+throttle_i: TX NFTCAM PRTY error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_NF_TCAM_PRTY_RX+throttle_i: RX NFTCAM PRTY error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_TCAM_PRTY_VMR: TCAM PRTY VMR error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_TCAM_PRTY_NO-VMR: TCAM PRTY NO-VMR error, status = 0x3 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FCRAM_SBE_TX: FCRAM SBE TX error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FCRAM_SBE_RX: FCRAM SBE TX error, status = 0x3 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FCRAM_PER_CHIP_SBE_TX: FCRAM CHIP SBE error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_ FCRAM_PER_CHIP_SBE_RX: FCRAM CHIP SBE error, status = 0x3 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FSRAM_SBE_TX: FSRAM SBE TX error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FSRAM_SBE_RX: FSRAM SBE RX error, status = 0x3 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_ FSRAM_MBE_TX: FSRAM MBE RX error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_W_ FSRAM_MBE_RX: FSRAM MBE RX error, status = 0x3 SLOT 1:00:01:14: WEEKLY_THROTTLE_BM_ISERR_TX: ISERR TX error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_BM_ISERR_RX: ISERR RX error, status = 0x3 SLOT 1:00:01:14: WEEKLY_THROTTLE_BM_FCRAM_SBE_TX: FCRAM SBE TX error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_BM_FCRAM_SBE_RX: FCRAM SBE RX error, status = 0x3 SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_QSRAM_LINK_SBE_TX: QSRAM LINK SBE TX error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_QSRAM_LINK_SBE_RX: QSRAM LINK 
SBE RX error, status = 0x3 SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_QSRAM_QEINFO_SBE_TX: QSRAM queue info sbe tx error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_QSRAM_QEINFO_SBE_TX: QSRAM queue info sbe rx error, status = 0x3 SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_QSRAM_BADDR_SBE_TX: qsram bad addr sbe tx error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_ QM_QSRAM_BADDR_SBE_RX: qsram bad addr sbe rx error, status = 0x3 SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_OQLLM_SBE_TX: oqllm sbe tx error, status = 0x2 SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_OQLLM_SBE_RX: oqllm sbe rx error status = 0x3
You can see these errors in the console output:
SLOT 0:Jan 14 08:53:44.581 GMT: %FIA-3-RAMECCERR: To Fabric ECC error was detected Single Bit Error RAM2 status = 0x8000 Syndrome = 0x0 addr = 0x0 SLOT 6:Apr 29 09:36:12: %E6LC-4-ECC_THRESHOLD: HERMES VID SBE exceeded threshold, possible memory failure SLOT 4:*Mar 13 23:38:19.295: %E6_RX192-3-MTRIE_SBE: Head1 Syndrome=0x94 Addr=0xFFF2B -Traceback= 40544830 40546A90 40688C94 400EDC18 SLOT 7:*Mar 4 1234:19.295: %E6_RX192-3-ADJ_SBE: Syndrome=0x59 Addr=0xFFF2B -Traceback= 40000830 40036A90 40555D44 400ddd23 SLOT 14:Dec 9 20:02:29: %E6_RX192-6-PBC_SBE: Single bit error detected and corrected RLDRAM Syndrome=0x61 Addr=0xF855 Dec 9 20:02:33: %GRP-4-RSTSLOT: Resetting the card in the slot: 14,Event: linecard error report SLOT 4:06:21:43: %E6_RX192-3-ACL_SBE: ACTION MEM Syndrome=0x7 Addr=0x0 -Traceback= 40549740 4054A7E0 4068D814 400EE018 SLOT 6:Mar 28 03:30:19: %RX192-3-HINTR: status = 0x1000000000000, mask = 0x7FFFFF0FA320F - L3X SBE error. -Traceback= 405816DC 406A1010 406A1650 400F70E8 SLOT 6:Mar 28 03:30:19: %E6_RX192-6-VID_SBE: Single bit error detected and corrected VID memory Syndrome=0x19 Addr=0xE51B SLOT 6:Nov 27 23:32:36: %HERA-3-PKTMEM_SBE: Single bit error detected and corrected Error=0x80 – Syndrome=0x5100000000000000 Addr=0x894620 Data bit116 SLOT 7:Oct 2 23:32:36: %HERA-6- MCD_SBE: Single bit error detected and corrected Error=0x50 – Syndrome=0x3100000000000000 Addr=0x331110 Data bit216 SLOT 1:Jun 22 03:32:36: %HERA-6- MRW_SBE: Single bit error detected and corrected Error=0x50 – Syndrome=0x3100000000000000 Addr=0x331110 Data bit216 SLOT 12:May 24 03:03:36: %HERA-6- UPF_SBE: Single bit error detected and corrected Error=0x60 – Syndrome=0x4100000000000000 Addr=0x451140 Data bit216 SLOT 13:Dec 5 07:30:15.272 cst: %HERA-6-PAM_ACL_SBE: PKT CNT MEM Syndrome=0x8 Addr=0x523C SLOT 9:May 5 18:52:14: %HERA-6-QM_FBF_SBE: Free Block FIFO - Single Bit Error detected and corrected Syndrom = 0x10, Addr = 0x778, samebit Yes, diffbit No SLOT 9:May 5 18:52:14: %HERA-3-QM: 
Error=0x40 - FBF RAM ECC SBE. -Traceback= 405AD4CC 405AF5D0 405F2E80 406DCDB8 406DD434 400FC500 SLOT 3:Aug 16 00:45:14: %MCC192-6-RED_AQD_SBE: Average Queue Depth - Single Bit Error detected and corrected Syndrome = 0x7, Address = 0x89, samebit No, diffbit No SLOT 2:Jan 23 06:29:56 KST: %MCC192-6-RED_STAT_SBE: Statistics - Single Bit Error detected and corrected Syndrome = 0x38, Address = 0xFF, samebit No, diffbit No SLOT 4:*Mar 13 23:38:19.295: %E6_RX192-3-MTRIE_MBE: Single bit error detected and corrected Head1 Syndrome=0x94 Addr=0xFFF2B SLOT 7:*Mar 4 1234:19.295: %E6_RX192-3-ADJ_MBE: Syndrome=0x59 Addr=0xFFF2B -Traceback= 40000830 40036A90 40555D44 400ddd23 00:00:18: %E6_RX192-3-PBC_MBE: ADJ OBANK LO Syndrome=0xE5 Addr=0x142 -Traceback= 405BF8B0 405C0F08 406E8D78 406E93B8 400FCCE0 SLOT 6:Mar 28 03:30:19: %E6_RX192-6-VID_MBE: Single bit error detected and corrected VID memory Syndrome=0x19 Addr=0xE51B SLOT 0:Apr 18 06:44:53.751 GMT: %HERA-3-PKTMEM_MBE: Error=0x1010 - Syndrome=0x9900000000 SLOT 7:Oct 2 23:32:36: %HERA-6- MCD_MBE: Single bit error detected and corrected Error=0x50 – Syndrome=0x3100000000000000 Addr=0x331110 Data bit216 SLOT 1:Jun 22 03:32:36: %HERA-6- MRW_MBE: Single bit error detected and corrected Error=0x50 - Syndrome=0x3100000000000000 Addr=0x331110 Data bit216 SLOT 13:Dec 5 07:30:15.272 cst: %HERA-6-PAM_ACL_MBE: PKT CNT MEM Syndrome=0x8 Addr=0x523C SLOT 9:May 5 18:52:14: %HERA-6-QM_FBF_MBE: Free Block FIFO - Single Bit Error detected and corrected Syndrome = 0x10, Addr = 0x778, samebit Yes, diffbit No SLOT 3:Aug 16 00:45:14: %MCC192-6-RED_AQD_MBE: Average Queue Depth - Single Bit Error detected and corrected Syndrome = 0x7, Address = 0x89, samebit No, diffbit No SLOT 2:Jan 23 06:29:56 KST: %MCC192-6-RED_STAT_MBE: Statistics - Single Bit Error detected and corrected Syndrome = 0x38, Address = 0xFF, samebit No, diffbit No
You can see these errors in the console output:
SLOT 7:Jan 4 02:04:00.487: %SPA_CHOC_DSX-3-UNCOR_PARITY_ERR: SPA4/0: CHOC SPA parity error(s) encountered
SLOT 7:Jan 4 02:04:00.487: %MCT1E1-3-UNCOR_PARITY_ERR: SPA5/0: T1E1 SPA parity error(s) encountered
SLOT 3: 00:33:48: %MCT1E1-3-UNCOR_MEM_ERR: SPA3/0: 1 uncorrectable HDLC SRAM memory error(s) encountered.
SLOT 1:Oct 3 14:42:45.727: %SPA_PLIM-4-SBE_ECC: SPA-4XT3/E3[1/2] reports 2 SBE occurrence at 1 addresses
SLOT 1: Jul 22 05:26:29.613 UTC: %SPA_DATABUS-3-SPI4_SINGLE_DIP4_PARITY: SIP Sbslt 0 Ingress Sink - A single DIP4 parity error has occurred on the data bus.
SLOT 4: Dec 2 22:44:05: %SPA_DATABUS-3-SPI4_SINGLE_DIP2_PARITY: SIP Sbslt 0 Egress Source - A single DIP 2 parity error on the FIFO status bus has occurred.
SLOT 1:Oct 3 14:42:45.727: %SPA_PLIM-4-SBE_OVERFLOW: SPA-4XT3/E3[1/2] reports SBE table (2 elements) overflows
SLOT 1:Oct 3 14:42:45.727: % SPA_PLUGIN-3-SPI4_SETCB: SPA-4XT3/E3[1/2] : IPC SPI4 set callback failed(status 2).
All parity error messages related to the switch fabric cards are covered in detail in Hardware Troubleshooting for the Cisco 12000 Series Internet Router. These messages include (non-exhaustive list):
%FABRIC-3-PARITYERR: To Fabric parity error was detected. Grant parity error Data = 0x2.
SLOT 1:%FABRIC-3-PARITYERR: To Fabric parity error was detected. Grant parity error Data = 0x1
Revision | Publish Date | Comments |
---|---|---|
1.0 | 06-Dec-2002 | Initial Release |