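This document describes how to troubleshoot a hardware module in a Nexus 7000 (N7K) switch when the module is unresponsive or responds only intermittently.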
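Step 1. Run snmpwalk (that is, walk the hostname MIB) against the switch with each of the SNMPv3 users and/or SNMPv2 community strings in use.
Run this in a continuous loop.
Step 2. SSH to the problematic switch, that is, the host whose snmpwalk in Step 1 is intermittently unresponsive.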
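The continuous loop from Step 1 can be scripted. The following is a minimal Python sketch, not part of the original procedure. It assumes the net-snmp `snmpwalk` command-line tool is installed on the polling host, that SNMPv2c is used, and that sysName (OID 1.3.6.1.2.1.1.5) is the "hostname MIB" in question. The function names `snmp_probe` and `poll` are illustrative.

```python
import subprocess
import time

SYSNAME_OID = "1.3.6.1.2.1.1.5"  # sysName; assumed to be the "hostname MIB" from Step 1


def snmp_probe(host, community, timeout_s=2):
    """Return True if a single snmpwalk of sysName succeeds.

    Assumes the net-snmp `snmpwalk` binary is on PATH and SNMPv2c is in use.
    """
    cmd = ["snmpwalk", "-v2c", "-c", community,
           "-t", str(timeout_s), "-r", "0", host, SYSNAME_OID]
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout_s + 1)
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False
    return result.returncode == 0


def poll(probe, count, interval_s=1.0, clock=time.time, sleep=time.sleep):
    """Call probe() `count` times and return a list of (timestamp, ok) samples."""
    samples = []
    for _ in range(count):
        samples.append((clock(), bool(probe())))
        sleep(interval_s)
    return samples
```

For example, `poll(lambda: snmp_probe("lab-sw01", "public"), count=120)` would collect roughly two minutes of samples; the host name and community string here are placeholders. The probe is passed in as a callable so that the same loop can wrap an SSH reachability check for Step 2.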
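Because Step 1 and Step 2 are affected at the same time within a 60-second cycle, this appears to be a hardware fault in the N7K control plane, since the N7K continuously runs hardware diagnostic health checks. When you see 30 seconds of responsiveness followed by 30 seconds of unresponsiveness, and the cycle then repeats, that is a clear indication of the hardware diagnostic health check scanning all hardware. The 30 responsive seconds correspond to the scan of healthy hardware, and the 30 unresponsive seconds correspond to the failed hardware.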
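To check a capture for the 30-seconds-up, 30-seconds-down cycle described here, the samples can be collapsed into runs and compared against the expected duration. This is a sketch under stated assumptions: the input is a time-ordered list of `(timestamp_seconds, ok)` tuples, and the 5-second tolerance is an arbitrary choice, not a value from this document.

```python
def run_lengths(samples):
    """Collapse time-ordered (timestamp, ok) samples into (ok, seconds) runs.

    A run lasts from its first sample to the first sample of the next run;
    the final run ends at the last sample seen.
    """
    starts = []  # (ok, timestamp of the first sample in the run)
    for ts, ok in samples:
        if not starts or starts[-1][0] != ok:
            starts.append((ok, ts))
    if not starts:
        return []
    boundaries = [ts for _, ts in starts[1:]] + [samples[-1][0]]
    return [(ok, end - begin) for (ok, begin), end in zip(starts, boundaries)]


def looks_cyclic(runs, expected_s=30, tolerance_s=5):
    """Return True if the interior runs alternate at roughly expected_s seconds."""
    interior = runs[1:-1]  # first and last runs are usually cut off by the capture window
    states = {ok for ok, _ in interior}
    return (states == {True, False}
            and all(abs(sec - expected_s) <= tolerance_s for _, sec in interior))
```

A capture needs to span at least two full cycles for `looks_cyclic` to have both an up run and a down run in its interior.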
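Step 3. If Step 2 clearly points to a hardware fault, perform these steps:
Note: EOBC (Ethernet Out-of-Band Channel) is the internal control-plane channel that the N7K uses for communication between the SUPs, fabric modules, and line cards. If anything affects EOBC, the module named in the admin VDC-1 log file is the most likely culprit for the intermittent responses seen in the previous tests: the SUP has lost consistent communication with that module and keeps trying to recover it, which in turn causes intermittent responses from other control-plane processes.
Example: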
lab-sw01-admin-vdc-1# show logging logfile | inc EOBC
2022 Feb 22 19:46:15 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure on standby sup in device DEV_EOBC_MAC (device error 0xc0a0504f)
2022 Feb 22 19:46:15 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure in device DEV_EOBC_MAC (device error 0xc0a0514d)
2022 Feb 22 19:46:16 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure on standby sup in device DEV_EOBC_MAC (device error 0xc0a0504f)
2022 Feb 22 19:46:16 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure in device DEV_EOBC_MAC (device error 0xc0a0514d)
2022 Feb 22 19:46:21 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure in device DEV_EOBC_MAC (device error 0xc0a0514d)
2022 Feb 22 19:46:21 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure on standby sup in device DEV_EOBC_MAC (device error 0xc0a0504f)
2022 Feb 22 19:46:22 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure in device DEV_EOBC_MAC (device error 0xc0a0514d)
2022 Feb 22 19:46:23 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure on standby sup in device DEV_EOBC_MAC (device error 0xc0a0504f)
2022 Feb 22 19:46:23 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure in device DEV_EOBC_MAC (device error 0xc0a0514d)
2022 Feb 22 19:46:24 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure on standby sup in device DEV_EOBC_MAC (device error 0xc0a0504f)
2022 Feb 22 19:46:24 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure in device DEV_EOBC_MAC (device error 0xc0a0514d)
2022 Feb 22 19:46:26 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure on standby sup in device DEV_EOBC_MAC (device error 0xc0a0504f)
2022 Feb 22 19:46:26 lab-sw01-admin-vdc-1 %MODULE-4-MOD_WARNING: Module 8 (Serial number: JAA00000000) reported warning 8/1-8/0 due to EOBC heartbeat failure in device DEV_EOBC_MAC (device error 0xc0a0514d)
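This log output clearly shows that module 8 is experiencing EOBC heartbeat failures, including those reported on the standby SUP, is in an unhealthy state, and needs immediate action.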
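When the log file is long, counting the EOBC heartbeat warnings per module makes the offending slot easier to spot. The sketch below is illustrative only: the regular expression is written against the message format shown in the example above, and the labels `standby_sup` and `other` are names chosen here, not NX-OS terminology.

```python
import re
from collections import Counter

# Matches the MOD_WARNING lines shown in the example log output.
EOBC_RE = re.compile(
    r"Module (?P<mod>\d+) \(Serial number: (?P<serial>\S+)\) reported warning "
    r".*? due to EOBC heartbeat failure(?P<standby> on standby sup)? in device"
)


def count_eobc_failures(log_lines):
    """Count EOBC heartbeat warnings, keyed by (module number, kind)."""
    counts = Counter()
    for line in log_lines:
        match = EOBC_RE.search(line)
        if match:
            kind = "standby_sup" if match.group("standby") else "other"
            counts[(int(match.group("mod")), kind)] += 1
    return counts
```

Run against the twelve lines above, every count lands on module 8, which matches the conclusion drawn from the log.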
Step 1. Run show module and capture the output for reference:
lab-sw01-admin-vdc-1# show module
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 12 100 Gbps Ethernet Module N77-F312CK-26 ok
2 12 100 Gbps Ethernet Module N77-F312CK-26 ok
3 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
4 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
5 0 Supervisor Module-2 N77-SUP2E active *
6 0 Supervisor Module-2 N77-SUP2E ha-standby
7 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
8 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
Mod Sw Hw
--- --------------- ------
1 8.4(4) 1.5
2 8.4(4) 1.5
3 8.4(4) 1.9
4 8.4(4) 1.9
5 8.4(4) 1.3
6 8.4(4) 1.3
7 8.4(4) 1.2
8 8.4(4) 1.2
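Note: All modules are online (that is, ok). Module 5 is the active (active *) SUP and module 6 is the high-availability standby (ha-standby) SUP. Although the admin VDC log file contains EOBC warnings about module 8, this output reports module 8 as ok.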
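To keep the captured show module output as a baseline for later comparison, it can be reduced to a slot-to-status map. The following sketch assumes the column layout shown above (slot number, port count, then the status as the last field); the function names and the set of "healthy" statuses are assumptions made for this example.

```python
HEALTHY = {"ok", "active", "ha-standby"}  # statuses treated as normal in this sketch


def parse_show_module(text):
    """Return {slot: status} from the Mod/Ports/Module-Type/Model/Status table."""
    statuses = {}
    in_status_table = False
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if not tokens[0].isdigit():
            if not tokens[0].startswith("-"):  # a header row, not a separator row
                in_status_table = tokens[0] == "Mod" and "Module-Type" in tokens
            continue
        if in_status_table and len(tokens) >= 4 and tokens[1].isdigit():
            if tokens[-1] == "*":  # the asterisk marks the active supervisor
                tokens.pop()
            statuses[int(tokens[0])] = tokens[-1]
    return statuses


def modules_needing_attention(statuses):
    """Return the subset of slots whose status is not in HEALTHY."""
    return {slot: s for slot, s in statuses.items() if s not in HEALTHY}
```

As the note points out, a module can report ok here while still logging EOBC warnings, so an empty result from `modules_needing_attention` does not by itself clear a module.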
Step 2. Perform a switch reload or a supervisor switchover (run either one from within the admin VDC):
- switch reload
lab-sw01-admin-vdc-1# reload
- system (that is, supervisor) switchover. NOTE: This is the preferred method, because it does not impact active data flows through the box.
lab-sw01-admin-vdc-1# system switchover
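Note: In either case, before you perform the reload or the system switchover, make sure you are connected to the console of both supervisors so that you can see all supervisor output first-hand.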
Step 3. If module 8 is the suspected culprit, you might see errors for module 8 on the console during the system (that is, supervisor) switchover:
lab-sw01-admin-vdc-1(standby) login: 2022 Feb 23 02:09:45 lab-sw01-admin-vdc-1 %$ VDC-1 %$ %KERN-2-SYSTEM_MSG: [12392164.927835] Switchover started by redundancy driver - kernel
2022 Feb 23 02:09:45 lab-sw01-admin-vdc-1 %$ VDC-1 %$ %SYSMGR-2-HASWITCHOVER_PRE_START: This supervisor is becoming active (pre-start phase).
2022 Feb 23 02:09:45 lab-sw01-admin-vdc-1 %$ VDC-1 %$ %SYSMGR-2-HASWITCHOVER_START: Supervisor 6 is becoming active.
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth0/8 return status No card found in slot
2022 Feb 23 02:09:46 lab-sw01-admin-vdc-1 %$ VDC-1 %$ %SYSMGR-2-SWITCHOVER_OVER: Switchover completed.
2022 Feb 23 02:09:47 lab-sw01-admin-vdc-1 %$ VDC-1 %$ %PLATFORM-1-PFM_ALERT: Disabling ejector based shutdown on sup in slot 6
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth1/8 return status No card found in slot
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth2/8 return status No card found in slot
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth3/8 return status No card found in slot
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth4/8 return status No card found in slot
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth5/8 return status No card found in slot
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth6/8 return status No card found in slot
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth7/8 return status No card found in slot
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth8/8 return status No card found in slot
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth9/8 return status No card found in slot
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth10/8 return status No card found in slot
2022 Feb 23 02:09:46 lab-sw01-vdc-2 %$ VDC-2 %$ %ELTM-2-ELTM_INTF_TO_LTL: Failed to get LTL for interface lc-eth11/8 return status No card found in slot
Step 4. Run show module several times and watch whether, and when, module 8 comes back online:
Module 5 dropped out and is powered-up:
Module 8 dropped out and is powered-up:
lab-sw01-admin-vdc-1# show module
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 12 100 Gbps Ethernet Module N77-F312CK-26 ok
2 12 100 Gbps Ethernet Module N77-F312CK-26 ok
3 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
4 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
5 0 Supervisor Module-2 powered-up
6 0 Supervisor Module-2 N77-SUP2E active *
7 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
8 24 10/40 Gbps Ethernet Module powered-up
Mod Power-Status Reason
--- ------------ ---------------------------
8 powered-up Unknown. Issue show system reset mod ...
Mod Sw Hw
--- --------------- ------
1 8.4(4) 1.5
2 8.4(4) 1.5
3 8.4(4) 1.9
4 8.4(4) 1.9
6 8.4(4) 1.3
7 8.4(4) 1.2
lab-sw01-admin-vdc-1# 2022 Feb 23 02:11:11 lab-sw01-vdc-2 %$ VDC-2 %$ %PLATFORM-2-MOD_DETECT: Module 8 detected (Serial number JAA00000000) Module-Type 10/40 Gbps Ethernet Module Model N77-F324FQ-25
2022 Feb 23 02:11:11 lab-sw01-vdc-2 %$ VDC-2 %$ %PLATFORM-2-MOD_PWRUP: Module 8 powered up (Serial number JAA00000000)
2022 Feb 23 02:11:11 lab-sw01-admin-vdc-1 %$ VDC-1 %$ %PLATFORM-2-MOD_DETECT: Module 8 detected (Serial number JAA00000000) Module-Type 10/40 Gbps Ethernet Module Model N77-F324FQ-25
2022 Feb 23 02:11:11 lab-sw01-admin-vdc-1 %$ VDC-1 %$ %PLATFORM-2-MOD_PWRUP: Module 8 powered up (Serial number JAA00000000)
Module 8 is pwr-cycled:
lab-sw01-admin-vdc-1# show module
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 12 100 Gbps Ethernet Module N77-F312CK-26 ok
2 12 100 Gbps Ethernet Module N77-F312CK-26 ok
3 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
4 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
5 0 Supervisor Module-2 powered-up
6 0 Supervisor Module-2 N77-SUP2E active *
7 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
8 24 10/40 Gbps Ethernet Module pwr-cycld
Mod Power-Status Reason
--- ------------ ---------------------------
8 pwr-cycld Unknown. Issue show system reset mod ...
Mod Sw Hw
--- --------------- ------
1 8.4(4) 1.5
2 8.4(4) 1.5
3 8.4(4) 1.9
4 8.4(4) 1.9
6 8.4(4) 1.3
7 8.4(4) 1.2
lab-sw01-admin-vdc-1# show module
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 12 100 Gbps Ethernet Module N77-F312CK-26 ok
2 12 100 Gbps Ethernet Module N77-F312CK-26 ok
3 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
4 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
5 0 Supervisor Module-2 powered-up
6 0 Supervisor Module-2 N77-SUP2E active *
7 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
8 24 10/40 Gbps Ethernet Module N77-F324FQ-25 powered-up
Mod Sw Hw
--- --------------- ------
1 8.4(4) 1.5
2 8.4(4) 1.5
3 8.4(4) 1.9
4 8.4(4) 1.9
6 8.4(4) 1.3
7 8.4(4) 1.2
8 8.4(4) 1.2
Module 8 is checked by epld auto-upgrade and is good to go:
lab-sw01-admin-vdc-1# 2022 Feb 23 02:13:06 lab-sw01-admin-vdc-1 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: <<%EPLD_AUTO-2-AUTO_UPGRADE_CHECK>> Automatic EPLD upgrade check for module 8: EPLD versions are up to date. - epld_auto
lab-sw01-admin-vdc-1# show module
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 12 100 Gbps Ethernet Module N77-F312CK-26 ok
2 12 100 Gbps Ethernet Module N77-F312CK-26 ok
3 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
4 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
5 0 Supervisor Module-2 powered-up
6 0 Supervisor Module-2 N77-SUP2E active *
7 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
8 24 10/40 Gbps Ethernet Module N77-F324FQ-25 powered-up
Mod Sw Hw
--- --------------- ------
1 8.4(4) 1.5
2 8.4(4) 1.5
3 8.4(4) 1.9
4 8.4(4) 1.9
6 8.4(4) 1.3
7 8.4(4) 1.2
8 8.4(4) 1.2
Module 8 moves to testing by the hardware diagnostics:
lab-sw01-admin-vdc-1# show module
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 12 100 Gbps Ethernet Module N77-F312CK-26 ok
2 12 100 Gbps Ethernet Module N77-F312CK-26 ok
3 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
4 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
5 0 Supervisor Module-2 powered-up
6 0 Supervisor Module-2 N77-SUP2E active *
7 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
8 24 10/40 Gbps Ethernet Module N77-F324FQ-25 testing
Mod Sw Hw
--- --------------- ------
1 8.4(4) 1.5
2 8.4(4) 1.5
3 8.4(4) 1.9
4 8.4(4) 1.9
6 8.4(4) 1.3
7 8.4(4) 1.2
8 8.4(4) 1.2
Module 8 moves to initializing after passing hardware diagnostics:
lab-sw01-admin-vdc-1# show module
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 12 100 Gbps Ethernet Module N77-F312CK-26 ok
2 12 100 Gbps Ethernet Module N77-F312CK-26 ok
3 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
4 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
5 0 Supervisor Module-2 powered-up
6 0 Supervisor Module-2 N77-SUP2E active *
7 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
8 24 10/40 Gbps Ethernet Module N77-F324FQ-25 initializing
Mod Sw Hw
--- --------------- ------
1 8.4(4) 1.5
2 8.4(4) 1.5
3 8.4(4) 1.9
4 8.4(4) 1.9
6 8.4(4) 1.3
7 8.4(4) 1.2
8 8.4(4) 1.2
Module 8 comes online:
lab-sw01-admin-vdc-1# show module
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 12 100 Gbps Ethernet Module N77-F312CK-26 ok
2 12 100 Gbps Ethernet Module N77-F312CK-26 ok
3 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
4 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
5 0 Supervisor Module-2 powered-up
6 0 Supervisor Module-2 N77-SUP2E active *
7 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
8 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
Mod Sw Hw
--- --------------- ------
1 8.4(4) 1.5
2 8.4(4) 1.5
3 8.4(4) 1.9
4 8.4(4) 1.9
6 8.4(4) 1.3
7 8.4(4) 1.2
8 8.4(4) 1.2
Module 5 SUP comes back up (status inserted):
lab-sw01-admin-vdc-1# show module
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 12 100 Gbps Ethernet Module N77-F312CK-26 ok
2 12 100 Gbps Ethernet Module N77-F312CK-26 ok
3 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
4 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
5 0 Supervisor Module-2 N77-SUP2E inserted
6 0 Supervisor Module-2 N77-SUP2E active *
7 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
8 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
Mod Sw Hw
--- --------------- ------
1 8.4(4) 1.5
2 8.4(4) 1.5
3 8.4(4) 1.9
4 8.4(4) 1.9
5 8.4(4) 1.3
6 8.4(4) 1.3
7 8.4(4) 1.2
8 8.4(4) 1.2
Module 5 SUP becomes ha-standby:
2022 Feb 23 02:16:38 lab-sw01-admin-vdc-1 %$ VDC-1 %$ %PLATFORM-1-PFM_ALERT: Enabling ejector based shutdown on sup in slot 6
lab-sw01-admin-vdc-1# show module
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 12 100 Gbps Ethernet Module N77-F312CK-26 ok
2 12 100 Gbps Ethernet Module N77-F312CK-26 ok
3 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
4 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
5 0 Supervisor Module-2 N77-SUP2E ha-standby
6 0 Supervisor Module-2 N77-SUP2E active *
7 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
8 24 10/40 Gbps Ethernet Module N77-F324FQ-25 ok
Mod Sw Hw
--- --------------- ------
1 8.4(4) 1.5
2 8.4(4) 1.5
3 8.4(4) 1.9
4 8.4(4) 1.9
5 8.4(4) 1.3
6 8.4(4) 1.3
7 8.4(4) 1.2
8 8.4(4) 1.2
2022 Feb 23 02:15:44 lab-sw01-admin-vdc-1 %MODULE-5-MOD_OK: Module 8 is online (Serial number: JAA00000000)
2022 Feb 23 02:15:43 lab-sw01-admin-vdc-1 %SYSMGR-SLOT8-5-MODULE_ONLINE: System Manager has received notification of local module becoming online.
2022 Feb 23 02:15:44 lab-sw01-admin-vdc-1 %PLATFORM-5-MOD_STATUS: Module 8 current-status is MOD_STATUS_ONLINE/OK
2022 Feb 23 02:16:38 lab-sw01-admin-vdc-1 %MODULE-5-STANDBY_SUP_OK: Supervisor 5 is standby
Note: All modules are online (that is, ok). Module 6 is now the active (active *) SUP and module 5 is the high-availability standby (ha-standby) SUP.
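The recovery can also be confirmed from the syslog messages rather than from repeated show module captures. The sketch below pulls the MOD_OK and STANDBY_SUP_OK messages out of a list of log lines and sorts them by timestamp, since console output is not always in chronological order. The regular expression is written against the message format shown above, and the function name is illustrative.

```python
import re
from datetime import datetime

# Timestamp, then the two recovery messages seen in the log output above.
EVENT_RE = re.compile(
    r"(?P<ts>\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) \S+ .*?"
    r"%(?P<tag>MODULE-5-MOD_OK|MODULE-5-STANDBY_SUP_OK): "
    r"(?:Module|Supervisor) (?P<slot>\d+) "
)


def recovery_events(log_lines):
    """Return sorted (timestamp, tag, slot) tuples for module recovery messages."""
    events = []
    for line in log_lines:
        match = EVENT_RE.search(line)
        if match:
            # Normalize spacing before parsing, e.g. "Feb  3" becomes "Feb 3".
            stamp = " ".join(match.group("ts").split())
            ts = datetime.strptime(stamp, "%Y %b %d %H:%M:%S")
            events.append((ts, match.group("tag"), int(match.group("slot"))))
    return sorted(events)
```

With the four log lines above as input, the result contains one MOD_OK event for slot 8 followed by one STANDBY_SUP_OK event for slot 5.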
Step 5. After all modules are online, repeat Step 1 and verify that all connectivity has returned to normal.
Revision | Publish Date | Comments |
---|---|---|
1.0 | 24-Mar-2022 | Initial Release |