This document describes how to troubleshoot the "OS-SHMWIN-2-ERROR_ENCOUNTERED" error on Cisco IOS® XR routers.
Examples of the error message are:
"%OS-SHMWIN-2-ERROR_ENCOUNTERED"
LC/0/0/CPU0:Dec 16 09:45:58 : fib_mgr[260]: %OS-SHMWIN-2-ERROR_ENCOUNTERED : SHMWIN: Error encountered: System memory state is severe, please check the availability of the system memory
LC/0/0/CPU0:Dec 16 09:45:39 : l2fib[328]: %OS-SHMWIN-2-ERROR_ENCOUNTERED : SHMWIN: Error encountered: System memory state is severe, please check the availability of the system memory
RP/0/RSP0/CPU0:Aug 11 21:15:47.174 IST: show_ip_interface[65961]: %OS-SHMWIN-2-ERROR_ENCOUNTERED : SHMWIN: Error encountered: 'shmwin' detected the 'fatal' condition 'mutex operation failed'
This error indicates that the memory state of the system is severe. Specifically, there is a problem with the shared memory, which stores dynamic data used by multiple processes.
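As a concept-level illustration only (not IOS XR code), this minimal Python sketch shows what "shared memory between processes" means: one process creates a named memory window, a second process attaches to it and writes dynamic data into it. On IOS XR, the SHMWIN infrastructure fills a comparable role for processes such as fib_mgr and l2fib.

# Concept-level sketch only: two OS processes share one named memory window,
# loosely analogous to IOS XR processes attaching to a SHMWIN window.
from multiprocessing import Process, shared_memory

def writer(window_name):
    shm = shared_memory.SharedMemory(name=window_name)  # attach to the existing window
    shm.buf[:5] = b"route"                               # write dynamic data into it
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=1024, name="demo_win")
    p = Process(target=writer, args=("demo_win",))
    p.start()
    p.join()
    print(bytes(shm.buf[:5]))   # the owner reads back b'route'
    shm.close()
    shm.unlink()                # release the shared window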
First, determine the line card (or RP/RSP) and the top memory consumers.
The error message can have a process or command embedded in it. However, when the memory state is low, any operation can fail if there is not enough free memory available. You need to determine what causes the lack of free memory.
The error message itself indicates the line card. Try to find the top consumers of memory.
show memory location 0/x/CPUx
show memory summary location 0/x/CPUx
show watchdog memory-state location 0/x/CPUx
show processes memory location 0/x/CPUx
Note: There can be other error messages that indicate what the problem process is.
For example:
RP/0/RSP0/CPU0:Apr 24 11:34:33.599 EST: wdsysmon[450]: %HA-HA_WD-4-MEMORY_ALARM : Memory threshold crossed: Normal with 892.125MB free
RP/0/RSP0/CPU0:Apr 24 13:23:12.947 EST: wdsysmon[450]: %HA-HA_WD-4-MEMORY_ALARM : Memory threshold crossed: Minor with 819.199MB free
RP/0/RSP0/CPU0:Apr 24 14:32:10.086 EST: wdsysmon[450]: %HA-HA_WD-4-MEMORY_STATE_CHANGE : New memory state: Severe
RP/0/RSP0/CPU0:Apr 24 14:32:10.086 EST: wdsysmon[450]: %HA-HA_WD-4-TOP_MEMORY_USERS_WARNING : Top 5 consumers of system memory (671084 Kbytes free):
RP/0/RSP0/CPU0:Apr 24 14:32:10.086 EST: wdsysmon[450]: %HA-HA_WD-4-TOP_MEMORY_USER_WARNING : 0: Process Name: eth_server[61], pid: 57385, Heap usage 54632 Kbytes, Virtual Shared memory usage: 73116 Kbytes.
RP/0/RSP0/CPU0:Apr 24 14:32:10.086 EST: wdsysmon[450]: %HA-HA_WD-4-TOP_MEMORY_USER_WARNING : 1: Process Name: bgp[1051], pid: 553285, Heap usage 28556 Kbytes, Virtual Shared memory usage: 90512 Kbytes.
RP/0/RSP0/CPU0:Apr 24 14:32:10.087 EST: wdsysmon[450]: %HA-HA_WD-4-TOP_MEMORY_USER_WARNING : 2: Process Name: instdir[252], pid: 184387, Heap usage 24808 Kbytes, Virtual Shared memory usage: 24800 Kbytes.
RP/0/RSP0/CPU0:Apr 24 14:32:10.087 EST: wdsysmon[450]: %HA-HA_WD-4-TOP_MEMORY_USER_WARNING : 3: Process Name: parser_server[352], pid: 204908, Heap usage 21896 Kbytes, Virtual Shared memory usage: 4184784 Kbytes.
RP/0/RSP0/CPU0:Apr 24 14:32:10.087 EST: wdsysmon[450]: %HA-HA_WD-4-TOP_MEMORY_USER_WARNING : 4: Process Name: ipv6_rib[1144], pid: 549174, Heap usage 21600 Kbytes, Virtual Shared memory usage: 24688 Kbytes.
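If you save these syslog lines to a file (for example from show logging), a small script such as the hypothetical sketch below can extract the per-process heap and shared memory figures for a quick ranking. The regular expression simply mirrors the %HA-HA_WD-4-TOP_MEMORY_USER_WARNING format shown above; the file name syslog.txt is a placeholder.

# Hypothetical helper: rank processes from %HA-HA_WD-4-TOP_MEMORY_USER_WARNING
# syslog lines that were saved to a text file.
import re

PATTERN = re.compile(
    r"Process Name:\s+(?P<proc>\S+?)\[\d+\].*?"
    r"Heap usage\s+(?P<heap>\d+)\s+Kbytes.*?"
    r"Virtual Shared memory usage:\s+(?P<shared>\d+)\s+Kbytes"
)

def top_memory_users(logfile):
    users = []
    with open(logfile) as f:
        for line in f:
            match = PATTERN.search(line)
            if match:
                users.append((match.group("proc"),
                              int(match.group("heap")),
                              int(match.group("shared"))))
    return sorted(users, key=lambda u: u[1], reverse=True)  # largest heap first

for proc, heap, shared in top_memory_users("syslog.txt"):
    print(f"{proc:<16} heap {heap} KB, shared {shared} KB")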
If the process is BGP or any other routing protocol, verify that no changes were made in the network that could cause this problem.
Use these commands to get an overview of the memory that is used and to determine the processes that consume the most memory.
0/x/CPUx is the specific card from the error.
show memory summary location 0/x/CPUx
show memory summary location 0/x/CPUx
show shared-memory location 0/x/CPUx
show memory-top-consumers location 0/x/CPUx
show shmwin summary location 0/x/CPUx
Example:
RP/0/RSP1/CPU0:R1#show memory summary location 0/RSP0/CPU0
node: node0_RSP0_CPU0
------------------------------------------------------------------
 Physical Memory: 6144M total
Application Memory : 5738M (2795M available)
Image: 117M (bootram: 117M)
Reserved: 224M, IOMem: 0, flashfsys: 0
Total shared window: 76M
RP/0/RSP1/CPU0:R1#show memory summary location 0/RSP0/CPU0
node: node0_RSP0_CPU0
------------------------------------------------------------------
 Physical Memory: 6144M total
Application Memory : 5738M (2797M available)
Image: 117M (bootram: 117M)
Reserved: 224M, IOMem: 0, flashfsys: 0
Total shared window: 76M
RP/0/RSP1/CPU0:R1#show shared-memory location 0/0/cpu0
Total Shared memory: 1527M
ShmWin: 236M
Image: 703M
LTrace: 353M
AIPC: 33M
SLD: 3M
SubDB: 1M
CERRNO: 144K
GSP-CBP: 64M
EEM: 0
XOS: 4M
CHKPT: 2M
CDM: 4M
XIPC: 594K
DLL: 64K
SysLog: 0
Miscellaneous: 119M
LTrace usage details:
Used: 353M, Max: 2075M
Current: default(dynamic)
Configured: dynamic with scale-factor: 8 (changes take effect after reload)
RP/0/RP0/CPU0:R1#show memory-top-consumers location 0/RP0/CPU0
Execute 'show memory-snapshots process <> location <>' to check memory usage trend.
###################################################################
Top memory consumers on 0/RP0/CPU0 (at 2023/Nov/8/15:41:42)
###################################################################
PID Process Total(MB) Heap(MB) Shared(MB)
7366 mibd_interface 233.2 192.64 37.7
2552 spp 228.2 9.71 222.1
49132 bgp 225.9 83.62 165.9
4844 l2rib 211.8 21.12 190.1
2787 gsp 137.9 24.64 113.1
3869 mpls_lsd 122.8 12.85 107.8
3804 fib_mgr 121.0 13.43 108.7
2975 parser_server 116.7 66.39 44.6
6685 l2vpn_mgr 116.5 43.77 82.3
3310 dpa_port_mapper 114.8 2.96 110.2
RP/0/RSP1/CPU0:R1#show shmwin summary location 0/0/cpu0
----------------------------------------
Shared memory window summary information
----------------------------------------
Data for Window "subdb_sco_tbl":
-----------------------------
Virtual Memory size : 1536 MBytes
Virtual Memory Range : 0x7c000000 - 0xdc000000
Virtual Memory Group 2 size : 352 MBytes
Virtual Memory Group 2 Range : 0x66000000 - 0x7c000000
Window Name ID GRP #Usrs #Wrtrs Ownr Usage(KB) Peak(KB) Peak Timestamp
---------------- --- --- ----- ------ ---- --------- -------- -------------------
subdb_sco_tbl 70 1 1 1 158 3 0 --/--/---- --:--:--
Data for Window "ptp":
-----------------------------
ptp 131 P 1 1 0 35 35 10/18/2023 11:56:31
Data for Window "cfmd-sla":
-----------------------------
cfmd-sla 53 1 1 1 0 99 99 10/18/2023 11:56:20
Data for Window "cfmd":
-----------------------------
cfmd 36 1 1 1 0 99 99 10/18/2023 11:56:30
Data for Window "vkg_pbr_ea":
-----------------------------
vkg_pbr_ea 83 1 1 1 0 147 147 10/18/2023 11:56:27
Data for Window "span_ea_pd":
-----------------------------
span_ea_pd 40 1 1 1 362 34 34 10/18/2023 11:56:13
Data for Window "vkg_l2fib_vqi":
-----------------------------
vkg_l2fib_vqi 97 1 2 2 0 3 0 --/--/---- --:--:--
Data for Window "statsd_db":
-----------------------------
statsd_db 60 1 1 1 0 3 0 --/--/---- --:--:--
Data for Window "statsd_db_l":
-----------------------------
statsd_db_l 130 P 1 1 0 1131 1131 10/18/2023 11:56:17
Data for Window "arp":
-----------------------------
arp 20 1 1 1 0 227 227 10/18/2023 11:56:37
Data for Window "bm_lacp_tx":
-----------------------------
bm_lacp_tx 54 1 1 1 132 1 0 --/--/---- --:--:--
Data for Window "ether_ea_shm":
-----------------------------
ether_ea_shm 26 1 4 4 406 227 227 10/18/2023 11:56:27
Data for Window "vkg_l2fib_evpn":
-----------------------------
vkg_l2fib_evpn 100 1 3 3 0 3 0 --/--/---- --:--:--
Data for Window "l2fib":
-----------------------------
l2fib 14 1 10 10 262 45265 45265 11/08/2023 15:03:18
Data for Window "ether_ea_tcam":
-----------------------------
ether_ea_tcam 58 1 5 5 313 595 595 10/18/2023 11:55:55
Data for Window "vkg_vpls_mac":
-----------------------------
vkg_vpls_mac 35 1 3 3 0 6291 6291 10/25/2023 13:15:04
Data for Window "prm_stats_svr":
-----------------------------
prm_stats_svr 24 1 21 21 0 12419 12419 10/18/2023 11:56:24
Data for Window "prm_srh_main":
-----------------------------
prm_srh_main 66 1 31 31 0 60163 60163 10/18/2023 11:56:31
Data for Window "prm_tcam_mm_svr":
-----------------------------
prm_tcam_mm_svr 23 1 1 1 0 22067 22163 10/18/2023 12:04:59
Data for Window "prm_ss_lm_svr":
-----------------------------
prm_ss_lm_svr 65 1 1 1 0 3233 3233 10/18/2023 11:56:33
Data for Window "prm_ss_mm_svr":
-----------------------------
prm_ss_mm_svr 22 1 5 5 0 3867 3867 10/18/2023 11:55:52
Data for Window "vkg_gre_tcam":
-----------------------------
vkg_gre_tcam 63 1 2 2 388 35 35 10/18/2023 11:55:54
Data for Window "tunl_gre":
-----------------------------
tunl_gre 62 1 2 2 388 39 39 10/18/2023 11:55:38
Data for Window "pd_fib_cdll":
-----------------------------
pd_fib_cdll 28 1 1 1 0 35 35 10/18/2023 11:55:36
Data for Window "SMW_TEST_2":
-----------------------------
SMW_TEST_2 86 1 1 1 0 1067 1067 10/18/2023 11:55:35
Data for Window "ifc-mpls":
-----------------------------
ifc-mpls 13 1 18 18 188 7161 9057 11/02/2023 18:32:41
Data for Window "ifc-ipv6":
-----------------------------
ifc-ipv6 17 1 18 18 188 25249 25665 11/02/2023 18:33:13
Data for Window "ifc-ipv4":
-----------------------------
ifc-ipv4 16 1 18 18 188 24205 24893 10/31/2023 18:12:27
Data for Window "ifc-protomax":
-----------------------------
ifc-protomax 18 1 18 18 188 6057 6297 10/18/2023 11:56:06
Data for Window "bfd_offload_shm":
-----------------------------
bfd_offload_shm 94 1 1 1 0 2 0 --/--/---- --:--:--
Data for Window "netio_fwd":
-----------------------------
netio_fwd 34 1 1 1 0 0 0 --/--/---- --:--:--
Data for Window "mfwd_info":
-----------------------------
mfwd_info 1 1 2 2 254 1373 1373 10/18/2023 11:56:24
Data for Window "mfwdv6":
-----------------------------
mfwdv6 15 1 1 1 258 737 737 10/18/2023 11:55:57
Data for Window "vkg_bmp_adj":
-----------------------------
vkg_bmp_adj 30 1 2 2 129 235 235 10/18/2023 11:55:55
Data for Window "rewrite-db":
-----------------------------
rewrite-db 101 1 3 3 0 4115 4115 10/18/2023 11:55:32
Data for Window "inline_svc":
-----------------------------
inline_svc 88 1 1 1 0 755 755 10/18/2023 11:55:33
Data for Window "im_rd":
-----------------------------
im_rd 33 1 75 75 217 1131 1131 10/18/2023 11:55:32
Data for Window "ipv6_pmtu":
-----------------------------
ipv6_pmtu 98 1 1 1 256 3 0 --/--/---- --:--:--
Data for Window "im_db_private":
-----------------------------
im_db_private 129 P 1 1 0 1131 1131 10/18/2023 11:55:34
Data for Window "infra_ital":
-----------------------------
infra_ital 19 1 3 3 340 387 387 10/18/2023 11:55:41
Data for Window "infra_statsd":
-----------------------------
infra_statsd 8 1 5 5 370 3 0 --/--/---- --:--:--
Data for Window "ipv6_nd_pkt":
-----------------------------
ipv6_nd_pkt 128 P 1 1 0 107 107 10/18/2023 11:55:30
Data for Window "aib":
-----------------------------
aib 2 1 10 10 114 2675 2675 10/18/2023 11:56:42
Data for Window "vkg_pm":
-----------------------------
vkg_pm 5 1 34 1 313 307 307 11/03/2023 11:25:06
Data for Window "subdb_fai_tbl":
-----------------------------
subdb_fai_tbl 75 2 11 1 0 51 51 10/18/2023 11:55:26
Data for Window "subdb_ifh_tbl":
-----------------------------
subdb_ifh_tbl 74 2 2 1 0 35 35 10/18/2023 11:55:26
Data for Window "subdb_ao_tbl":
-----------------------------
subdb_ao_tbl 72 2 1 1 0 43 43 10/18/2023 11:55:26
Data for Window "subdb_do_tbl":
-----------------------------
subdb_do_tbl 73 2 11 1 0 35 35 10/18/2023 11:55:26
Data for Window "subdb_co_tbl":
-----------------------------
subdb_co_tbl 71 2 11 1 0 4107 4107 10/18/2023 11:55:26
Data for Window "rspp_ma":
-----------------------------
rspp_ma 3 1 14 14 0 3 0 --/--/---- --:--:--
Data for Window "cluster_dlm":
-----------------------------
cluster_dlm 61 1 26 26 0 3 0 --/--/---- --:--:--
Data for Window "pfm_node":
-----------------------------
pfm_node 29 1 1 1 0 195 195 10/18/2023 11:56:11
Data for Window "im_rules":
-----------------------------
im_rules 31 1 85 85 217 453 453 10/18/2023 11:55:32
Data for Window "im_db":
-----------------------------
im_db 32 1 85 1 0 2065 2065 10/18/2023 11:56:26
Data for Window "spp":
-----------------------------
spp 27 1 51 51 88 1403 1403 10/18/2023 11:56:29
Data for Window "qad":
-----------------------------
qad 6 1 1 1 0 134 134 01/01/1970 02:00:08
Data for Window "pcie-server":
-----------------------------
pcie-server 39 1 1 1 0 39 39 01/01/1970 02:00:07
---------------------------------------------
Total SHMWIN memory usage : 235 MBytes
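The listing above is long, so it can help to rank the windows by the Usage(KB) column. The hypothetical sketch below does this for output saved to a text file (shmwin_summary.txt is a placeholder) and would surface prm_srh_main and l2fib as the largest windows in this example; it assumes the column order shown in the header above.

# Hypothetical helper: rank SHMWIN windows by the Usage(KB) column of a saved
# "show shmwin summary location ..." output.
import re

# Data rows look like:
#   l2fib   14   1   10   10   262   45265   45265   11/08/2023 15:03:18
# Columns: name, ID, GRP, #Usrs, #Wrtrs, Ownr, Usage(KB), Peak(KB), timestamp
ROW = re.compile(r"^(?P<name>\S+)\s+\d+\s+\S+\s+\d+\s+\d+\s+\d+\s+(?P<usage>\d+)\s+\d+")

def rank_windows(filename, top=10):
    windows = []
    with open(filename) as f:
        for line in f:
            match = ROW.match(line.strip())
            if match:
                windows.append((match.group("name"), int(match.group("usage"))))
    return sorted(windows, key=lambda w: w[1], reverse=True)[:top]

for name, usage in rank_windows("shmwin_summary.txt"):
    print(f"{name:<20} {usage} KB")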
Verify that none of the processes have a memory leak:
You can do a "memory compare". This procedure shows how much the memory of each process has increased or decreased over a period of time (that you specify). This is an example; note the "difference" column.
RP/0/RSP0/CPU0:R1#show memory compare start
Successfully stored memory snapshot /harddisk:/malloc_dump/memcmp_start.out
RP/0/RSP0/CPU0:R1#show memory compare end
Successfully stored memory snapshot /harddisk:/malloc_dump/memcmp_end.out
RP/0/RSP0/CPU0:R1#show memory compare report
JID name mem before mem after difference mallocs restart/exit/new
--- ---- ---------- --------- ---------- ------- ----------------
376 parser_server 32069512 32070976 1464 1
463 sysdb_svr_local 10064204 10065084 880 20
459 sysdb_shared_nc 4103104 4103560 456 12
66013 exec 209964 210052 88 3
1241 xtc_agent 4796436 4796432 -4 0
1087 bgp 51646552 51646120 -432 -3
457 sysdb_mc 5094852 5094188 -664 -8
358 netio 19185724 19183804 -1920 -45
334 lpts_pa 76234948 76228484 -6464 -97
1031 ospf 9107084 9098232 -8852 -1
476 tcp 5725148 5708444 -16704 -8
254 gsp 9473460 9424452 -49008 14
1153 mdtd 25206084 24750076 -456008 -25
You are now free to remove snapshot memcmp_start.out and memcmp_end.out under /harddisk:/malloc_dump
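The pattern to look for in the report is a difference that stays positive across repeated compare runs. The hypothetical sketch below parses a saved report and lists only the processes that grew, assuming the column layout shown above (report.txt is a placeholder).

# Hypothetical helper: list processes whose memory grew between the two
# snapshots of "show memory compare report".
import re

# Rows look like: "376  parser_server  32069512  32070976  1464  1"
ROW = re.compile(r"^\s*(?P<jid>\d+)\s+(?P<name>\S+)\s+\d+\s+\d+\s+(?P<diff>-?\d+)")

def growing_processes(report_file):
    grown = []
    with open(report_file) as f:
        for line in f:
            match = ROW.match(line)
            if match and int(match.group("diff")) > 0:
                grown.append((match.group("name"), int(match.group("diff"))))
    return sorted(grown, key=lambda p: p[1], reverse=True)

for name, diff in growing_processes("report.txt"):
    print(f"{name:<20} difference {diff}")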
If ltrace requires a lot of memory and is one of the processes that consumes the most memory, consider lowering the amount of memory it uses.
This example explains how to configure ltrace to use less memory: Configure the LTRACE Scale Factor on ASR9K Route Processors and Cards for Effective Memory Management.
If you did not find the solution to your problem in this document, collect and provide these outputs:
0/x/CPUx is the specific card from the error. The job ID (JID) of a process can be found with the command show processes.
show tech-support
show hw-module fpd
show memory location 0/x/CPUx
show memory summary location all
show watchdog memory-state location all
show watchdog trace location all
show processes memory location all
show shmwin all header location 0/x/CPUx
show shmwin all bands location 0/x/CPUx
show shmwin all banks location 0/x/CPUx
show shmwin all list all location 0/x/CPUx
show shmwin all malloc-stats location 0/x/CPUx
show shmwin all mutex location 0/x/CPUx
show shmwin all participants all-stats location 0/x/CPUx
show shmwin all pool all-pools location 0/x/CPUx
show shmwin trace all location all
show memory <job id process> location 0/x/CPUx
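If you want to gather these outputs in a single pass before you open a case, one possible approach is a short collection script over SSH. The sketch below uses the third-party netmiko library with placeholder values for the host, the credentials, and the 0/x/CPUx location; only a subset of the commands is listed.

# Optional collection sketch using the third-party "netmiko" library.
# Host, credentials, and card location are placeholders; extend COMMANDS
# with the remaining outputs requested above.
from netmiko import ConnectHandler

LOCATION = "0/0/CPU0"   # replace with the card from the error message
COMMANDS = [
    "show hw-module fpd",
    f"show memory location {LOCATION}",
    "show memory summary location all",
    "show watchdog memory-state location all",
    "show watchdog trace location all",
    "show processes memory location all",
    f"show shmwin all header location {LOCATION}",
]

device = {
    "device_type": "cisco_xr",
    "host": "192.0.2.1",     # placeholder management address
    "username": "admin",     # placeholder credentials
    "password": "password",
}

with ConnectHandler(**device) as connection:
    with open("collection.txt", "w") as out:
        for command in COMMANDS:
            out.write(f"\n===== {command} =====\n")
            out.write(connection.send_command(command, read_timeout=300))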
Revision | Publish Date | Comments |
---|---|---|
1.0 | 01-Dec-2023 | Initial Release |