Introduction
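This document describes how to troubleshoot missing IP chunks on a User Plane Function (UPF) after a Redundancy Configuration Manager (RCM) switchover.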
Problem
Step 1: On the active UPF (RCM-based), instances of missing chunks are observed:
[local]UPF# context n6
[n6]UPF# show ipv6 chunks
Failure: This CLI is only for User-plane
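Note: Always check for DIMM/ECC/UEC/ADDDC errors on the source and destination UCS servers that host the UPFs, and review the RCM TAC debug data.
Step 2: On the active UPF where chunks are missing, monitor the SNMP trap events for the UPF state transition from standby to active.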
[n6]UPF# show snmp trap history verbose | grep RCM
Tuesday November 14 21:16:45 UTC 2023
Mon Oct 13 08:24:42 2023 Internal trap notification 1426 (RCMChassisState) RCM Chassis State: (0) Chassis State Init
Mon Oct 13 08:24:49 2023 Internal trap notification 1414 (RCMServiceStart) Context Name:rcm Service Name:rcm started
Mon Oct 13 08:25:04 2023 Internal trap notification 1425 (RCMTCPConnect) Context Name: rcm
Mon Oct 13 08:25:04 2023 Internal trap notification 1421 (RCMConfigPushCompleteSent) Context Name: rcm
Mon Oct 13 08:25:04 2023 Internal trap notification 1426 (RCMChassisState) RCM Chassis State: (2) Chassis State Standby
Mon Oct 13 08:33:47 2023 Internal trap notification 1420 (RCMConfigPushCompleteReceived) Context Name:
Mon Oct 13 08:33:47 2023 Internal trap notification 1421 (RCMConfigPushCompleteSent) Context Name: rcm
Mon Oct 13 08:48:10 2023 Internal trap notification 1421 (RCMConfigPushCompleteSent) Context Name: rcm
Mon Oct 13 08:48:10 2023 Internal trap notification 1420 (RCMConfigPushCompleteReceived) Context Name: up
Mon Oct 13 08:48:12 2023 Internal trap notification 1426 (RCMChassisState) RCM Chassis State: (1) Chassis State Active
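The standby-to-active transition time can also be extracted from a saved copy of the trap history. The following is a minimal Python sketch, assuming the output has been captured as text in the format shown above. The names `chassis_state_changes`, `standby_to_active`, and `SAMPLE` are illustrative and are not part of any Cisco tooling.

```python
import re
from datetime import datetime

# Matches RCMChassisState trap lines; the leading weekday is skipped on purpose.
STATE_RE = re.compile(
    r"^\w{3} (?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}) .*"
    r"\(RCMChassisState\) RCM Chassis State: \(\d+\) Chassis State (?P<state>\w+)"
)

def chassis_state_changes(text):
    """Return a list of (timestamp, state) for every RCMChassisState trap."""
    changes = []
    for line in text.splitlines():
        m = STATE_RE.match(line.strip())
        if m:
            stamp = " ".join(m.group("ts").split())  # normalize repeated spaces
            ts = datetime.strptime(stamp, "%b %d %H:%M:%S %Y")
            changes.append((ts, m.group("state")))
    return changes

def standby_to_active(changes):
    """Return the time of the first Standby -> Active transition, or None."""
    for (_, prev), (ts, cur) in zip(changes, changes[1:]):
        if prev == "Standby" and cur == "Active":
            return ts
    return None

# Sample input copied from the trap history in this document.
SAMPLE = """\
Mon Oct 13 08:24:42 2023 Internal trap notification 1426 (RCMChassisState) RCM Chassis State: (0) Chassis State Init
Mon Oct 13 08:25:04 2023 Internal trap notification 1426 (RCMChassisState) RCM Chassis State: (2) Chassis State Standby
Mon Oct 13 08:48:12 2023 Internal trap notification 1426 (RCMChassisState) RCM Chassis State: (1) Chassis State Active
"""

if __name__ == "__main__":
    print(standby_to_active(chassis_state_changes(SAMPLE)))
```

The returned timestamp is the reference point for the syslog and Sx checks in the later steps.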
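Step 3: On the active UPF where chunks are missing, check the syslogs for log events that indicate the sx-demux service was stopped for the remaining five services in the corresponding redundancy group (RG-1) at the time the standby UPF transitioned to active.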
Oct 13 08:48:11 UPF evlogd: [local-60sec11.091] [sessctrl 8066 info] [1/0/9050 <sessctrl:0> ctrl_mgrs_cfg.c:2201] [context: up, contextID: 2] [software internal system critical-info syslog] Session Controller: stopping SX-DEMUX service up1 2023-10-13T08:48:11.000+0000
Oct 13 08:48:11 UPF evlogd: [local-60sec11.483] [sessctrl 8066 info] [1/0/9050 <sessctrl:0> ctrl_mgrs_cfg.c:2201] [context: up, contextID: 2] [software internal system critical-info syslog] Session Controller: stopping SX-DEMUX service up2 2023-10-13T08:48:11.000+0000
Oct 13 08:48:11 UPF evlogd: [local-60sec11.582] [sessctrl 8066 info] [1/0/9050 <sessctrl:0> ctrl_mgrs_cfg.c:2201] [context: up, contextID: 2] [software internal system critical-info syslog] Session Controller: stopping SX-DEMUX service up3 2023-10-13T08:48:11.000+0000
Oct 13 08:48:11 UPF evlogd: [local-60sec11.726] [sessctrl 8066 info] [1/0/9050 <sessctrl:0> ctrl_mgrs_cfg.c:2201] [context: up, contextID: 2] [software internal system critical-info syslog] Session Controller: stopping SX-DEMUX service up5 2023-10-13T08:48:11.000+0000
Oct 13 08:48:18 UPF evlogd: [local-60sec18.749] [sessctrl 8066 info] [1/0/9050 <sessctrl:0> ctrl_mgrs_cfg.c:2201] [context: up, contextID: 2] [software internal system critical-info syslog] Session Controller: stopping SX-DEMUX service up6 2023-10-13T08:48:18.000+0000
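To list the stopped services without reading each line, the events can be filtered with a short script. This is a sketch only, assuming the syslog lines follow the format shown above; `stopped_sx_demux_services` and `SAMPLE_LINES` are hypothetical names, and the sample lines are abbreviated copies of the entries in this document.

```python
import re

# Captures the syslog timestamp and the service name from each stop event.
STOP_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) .*"
    r"Session Controller: stopping SX-DEMUX service (?P<svc>\S+)"
)

def stopped_sx_demux_services(lines):
    """Return (timestamp, service) pairs for each SX-DEMUX stop event."""
    events = []
    for line in lines:
        m = STOP_RE.search(line)
        if m:
            events.append((m.group("ts"), m.group("svc")))
    return events

# Abbreviated sample lines; the middle of each entry is omitted for brevity.
SAMPLE_LINES = [
    "Oct 13 08:48:11 UPF evlogd: [sessctrl 8066 info] Session Controller: stopping SX-DEMUX service up1 2023-10-13T08:48:11.000+0000",
    "Oct 13 08:48:12 UPF evlogd: [vpn 5013 error] Prefix is not allocated to this UP",
    "Oct 13 08:48:18 UPF evlogd: [sessctrl 8066 info] Session Controller: stopping SX-DEMUX service up6 2023-10-13T08:48:18.000+0000",
]

if __name__ == "__main__":
    for ts, svc in stopped_sx_demux_services(SAMPLE_LINES):
        print(ts, svc)
```

Comparing the timestamps returned here with the Chassis State Active trap shows whether the services were stopped during the switchover.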
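Step 4: On the active UPF with missing chunks, enable debug mode (cli test-commands password <password>) and run the command to monitor the Sx DeReg transactions that coincide with the time the UPF became active.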
[n6]UPF# show ip pool vpn-sx-transactions
Context: n6
Sx transactions:
sent: 0, received: 0
Failed transactions: 0
**************************************************************************************
Sx Deregistration transactions:
**************************************************************************************
Peer Address Deregistration Time
================================ ====================================================
192.168.1.55 Mon Oct 13 08:48:18 2023
192.168.1.49 Mon Oct 13 08:48:18 2023
192.168.1.49 Mon Oct 13 08:48:18 2023
192.168.2.55 Mon Oct 13 08:48:18 2023
192.168.2.55 Mon Oct 13 08:48:18 2023
192.168.2.49 Mon Oct 13 08:48:18 2023
192.168.2.49 Mon Oct 13 08:48:18 2023
[n6]UPF#
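The deregistration table can be summarized per peer once the command output is saved to a file. The following Python sketch assumes IPv4 peer addresses and the row layout shown above; `deregistrations`, `count_by_peer`, and `SAMPLE` are illustrative names, not part of the UPF CLI.

```python
import re
from collections import Counter

# One table row: an IPv4 peer address followed by the deregistration time.
ROW_RE = re.compile(
    r"^(?P<peer>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<time>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4})\s*$"
)

def deregistrations(output):
    """Return (peer, time) tuples for each row of the deregistration table."""
    rows = []
    for line in output.splitlines():
        m = ROW_RE.match(line.strip())
        if m:
            rows.append((m.group("peer"), m.group("time")))
    return rows

def count_by_peer(rows):
    """Count how many deregistrations were recorded for each peer."""
    return Counter(peer for peer, _ in rows)

# Sample rows copied from the command output in this document.
SAMPLE = """\
Peer Address Deregistration Time
================================ ====================================================
192.168.1.55 Mon Oct 13 08:48:18 2023
192.168.1.49 Mon Oct 13 08:48:18 2023
192.168.1.49 Mon Oct 13 08:48:18 2023
192.168.2.55 Mon Oct 13 08:48:18 2023
192.168.2.55 Mon Oct 13 08:48:18 2023
192.168.2.49 Mon Oct 13 08:48:18 2023
192.168.2.49 Mon Oct 13 08:48:18 2023
"""

if __name__ == "__main__":
    for peer, count in count_by_peer(deregistrations(SAMPLE)).items():
        print(peer, count)
```

A cluster of deregistrations at the same second as the switchover, as in the sample, is the pattern this step looks for.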
Step 5: On the active UPF with missing chunks, search the syslogs for entries logged around the time the UPF transitioned to the active state.
Oct 13 08:48:12 UPF evlogd: [local-60sec12.060] [vpn 5013 error] [1/0/9399 <vpnmgr:3> _cups_ip_pool.c:16149] [context: n6, contextID: 3] [software internal system syslog] #01Prefix fd12:3456:7890:abcd::/64 is not allocated to this UP: Closest chunk found with id -2146435055 prefix0: 638057330 start_prefix1: -1391067126 end_prefix1: -1391050752 2023-10-13T08:48:12.000+0000
Observe the continuous log events from the vpnmgr instance 3 task (vpnmgr:3).
localsystem:$ less UPF-Destination-UPF-Syslog.log | grep "Pool_name is not present" | head -1
Oct 13 08:48:18 UPF evlogd: [local-60sec18.811] [vpn 5013 error] [1/0/9399 <vpnmgr:3> vpn_ip_pool.c:27493] [context: n6, contextID: 3] [software internal system syslog] #01Pool_name is not present in release request for prefixfd1:3456:7892:abcd::/64 2023-10-13T08:48:18.000+0000
localsystem:$
localsystem:$ less UPF-Destination-UPF-Syslog.log | grep "Pool_name is not present" | tail -1
Oct 13 09:29:59 UPF evlogd: [local-60sec59.671] [vpn 5013 error] [1/0/9399 <vpnmgr:3> vpn_ip_pool.c:27493] [context: n6, contextID: 3] [software internal system syslog] #01Pool_name is not present in release request forprefixfd1:3456:7894:abcd::/64 2023-10-13T09:29:59.000+0000
localsystem:$
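The two grep commands above locate the first and last occurrence of the error. The same check, plus the elapsed time between the two entries, can be sketched in Python. This is an illustration under stated assumptions: `first_and_last` and `syslog_time` are hypothetical helpers, the year is passed in by the caller because the syslog prefix does not carry one, and the sample lines are abbreviated.

```python
from datetime import datetime

def first_and_last(lines, needle):
    """Return the first and last line that contain needle (None if absent)."""
    first = last = None
    for line in lines:
        if needle in line:
            if first is None:
                first = line
            last = line
    return first, last

def syslog_time(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; the year is an assumption."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")

# Abbreviated sample lines; the middle of each entry is omitted for brevity.
SAMPLE_LINES = [
    "Oct 13 08:48:12 UPF evlogd: [vpn 5013 error] Prefix is not allocated to this UP",
    "Oct 13 08:48:18 UPF evlogd: [vpn 5013 error] Pool_name is not present in release request",
    "Oct 13 09:29:59 UPF evlogd: [vpn 5013 error] Pool_name is not present in release request",
]

if __name__ == "__main__":
    first, last = first_and_last(SAMPLE_LINES, "Pool_name is not present")
    print(syslog_time(last, 2023) - syslog_time(first, 2023))
```

For a real log, pass an open file object instead of the sample list, for example `first_and_last(open("UPF-Destination-UPF-Syslog.log"), "Pool_name is not present")`.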
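Solution
To resolve this issue, refer to the bug report for more details: Cisco Bug ID CSCwh97931
The fix for this issue enhances SxDemux to prevent IP chunk cleanup during the SRP transition and improves log debuggability.
If the fix for the mentioned CDETS is not yet available in the UPF release that you use, apply this workaround:
Perform the standard N4 association disassociate/associate MOP during a maintenance window (MW).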