Troubleshoot EGTP Path Failure

Updated: December 4, 2023

Document ID: 221216

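Introduction

This document describes how to troubleshoot EGTP path failure issues.

Overview

An Evolved GPRS Tunneling Protocol (EGTP) path failure is a problem with the communication path between GTP nodes in a mobile network. GTP is the protocol used to carry user data and signaling messages between network elements.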

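Possible Causes of EGTP Path Failure

1. Connectivity issues - network connectivity problems

2. Restart counter value change

3. Huge incoming traffic requests - network congestion

4. Configuration issues such as DSCP/QoS

5. No subscribers/sessions on the EGTPC link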

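Logs Required

1. SSD/syslogs that cover the problematic time window, from at least two hours before the issue started up to the current time.

2. Logs that confirm reachability, that is, ping and traceroute for the path on which the path failure is seen.

3. A configuration comparison between problematic and non-problematic nodes.

4. Confirmation of whether there was a sudden increase in traffic or in the rejection rate on the same path.

5. Bulkstats for the problematic period that cover at least 2-3 days before the issue.

Note: Depending on the type of issue, some of the logs listed above can be requested. Not all logs are needed every time.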

Troubleshooting Commands

show egtpc peers interface

show egtpc peers path-failure-history

show egtpc statistics path-failure-reasons

show egtp-service all

show egtpc sessions

show egtpc statistics

egtpc test echo gtp-version 2 src-address <source node IP address>  peer-address <remote node IP address>

For more details on the previous command, refer to this document:

https://www.cisco.com/c/en/us/support/docs/wireless-mobility/gateway-gprs-support-node-ggsn/119246-technote-ggsn-00.html

SNMP Traps:

Sun Feb 05 03:00:20 2023 Internal trap notification 1112 (EGTPCPathFail) context s11mme, service s11-mme, interface type mme, self address x.x.x.x.,  peer address 10.0.219.57, peer old restart counter 4,  peer new restart counter 4,  peer session count  240, failure reason no-response-from-peer,  path failure detection Enabled

Tue Jul 09 18:41:36 2019 Internal trap notification 1112 (EGTPCPathFail) context pgw, service s5-s8-sgw-egtp, interface type sgw-egress, self address x.x.x.x,  peer address x.x.x.x, peer old restart counter 27,  peer new restart counter 27,  peer session count  1, failure reason no-response-from-peer,  path failure detection Enabled
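When many traps are collected, it can help to pull out the peer address, restart counters, session count, and failure reason programmatically. The following is a minimal Python sketch, not a Cisco tool. The regular expression is an assumption derived only from the two sample traps shown above, and the function name `parse_path_fail_trap` is hypothetical.

```python
import re

# Field order assumed from the sample EGTPCPathFail traps in this document.
TRAP_RE = re.compile(
    r"\(EGTPCPathFail\)\s+context\s+(?P<context>[^,]+),\s+"
    r"service\s+(?P<service>[^,]+),\s+"
    r"interface type\s+(?P<interface>[^,]+),\s+"
    r"self address\s+(?P<self_addr>[^,]+),\s+"
    r"peer address\s+(?P<peer_addr>[^,]+),\s+"
    r"peer old restart counter\s+(?P<old_rc>\d+),\s+"
    r"peer new restart counter\s+(?P<new_rc>\d+),\s+"
    r"peer session count\s+(?P<sessions>\d+),\s+"
    r"failure reason\s+(?P<reason>[^,]+)"
)

# First sample trap from this document, split across string literals.
SAMPLE_TRAP = (
    "Sun Feb 05 03:00:20 2023 Internal trap notification 1112 (EGTPCPathFail) "
    "context s11mme, service s11-mme, interface type mme, self address x.x.x.x., "
    " peer address 10.0.219.57, peer old restart counter 4, "
    " peer new restart counter 4,  peer session count  240, "
    "failure reason no-response-from-peer,  path failure detection Enabled"
)


def parse_path_fail_trap(line):
    """Return a dict of trap fields, or None if the line is not a path-failure trap."""
    match = TRAP_RE.search(line)
    if match is None:
        return None
    fields = {name: value.strip() for name, value in match.groupdict().items()}
    for key in ("old_rc", "new_rc", "sessions"):
        fields[key] = int(fields[key])
    # A differing old/new counter points at a peer restart rather than a silent peer.
    fields["restart_counter_changed"] = fields["old_rc"] != fields["new_rc"]
    return fields


info = parse_path_fail_trap(SAMPLE_TRAP)
print(info["peer_addr"], info["reason"], info["restart_counter_changed"])
```

In the sample trap the old and new restart counters are equal, so the failure reason (`no-response-from-peer`) is the field to follow up on.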

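Scenarios/Causes in Brief

Connectivity Issues - Network Connectivity Problems

Connectivity issues occur when there is a problem in the routing path between the SGSN/MME and the SPGW/GGSN, possibly on a router or a firewall.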

ping <destination IP>

traceroute <destination IP> src <source IP>

Note: Run both reachability commands from the context in which the EGTP service runs.

Restart Counter Value Change

An EGTP path maintains a restart counter at both ends of the path between the SGSN/MME and the GGSN/SPGW.

Restart_counter

For more details on this type of issue, refer to https://www.cisco.com/c/en/us/support/docs/wireless/asr-5000-series/200026-ASR-5000-Series-Troubleshooting-GTPC-and.html.
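The idea behind restart-counter detection can be illustrated with a short Python sketch: each end remembers the last restart counter it saw from a peer and compares it with the value in the next message. This is a simplified illustration under stated assumptions, not the StarOS implementation. The class and function names are hypothetical, and counter wraparound and the other rules of the 3GPP restoration procedures are deliberately not modeled.

```python
def classify_restart_counter(stored, received):
    """Compare the stored restart counter of a peer with a newly received one."""
    if stored is None:
        return "first-contact"   # nothing stored yet for this peer
    if received == stored:
        return "unchanged"       # peer has not restarted
    return "peer-restarted"      # counter changed: sessions toward the peer are suspect


class PeerRestartTracker:
    """Hypothetical helper that keeps the last seen restart counter per peer."""

    def __init__(self):
        self._counters = {}

    def observe(self, peer, received):
        verdict = classify_restart_counter(self._counters.get(peer), received)
        self._counters[peer] = received  # always remember the latest value
        return verdict


tracker = PeerRestartTracker()
for counter in (4, 4, 5):
    print(counter, tracker.observe("10.0.219.57", counter))
```

A verdict of `peer-restarted` corresponds to a trap in which the old and new restart counter values differ.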

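Huge Incoming Traffic Requests - Network Congestion

Whenever there is a sudden burst of high-traffic transactions, there is a chance of EGTP Tx and Rx packet drops. Basic checks to confirm this scenario:

1. Check whether the CPU usage of egtpinmgr is high.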

Mar 25 14:30:48 10.224.240.132 evlogd: [local-60sec48.142] [resmgr 14907 debug] [6/0/10088 <rmmgr:60> _resource_log.c:1391] [software internal system critical-info syslog] RM-60: rmmgr_collect_cpustats_coproc_done: ahm cpustats logged for egtpinmgr instance 2 in cpu warn state file <cpustats-5e7b40db-06-00-egtpinmgr-2-6192>
Mar 25 14:31:05 10.224.240.132 evlogd: [local-60sec5.707] [resmgr 14907 debug] [6/0/10088 <rmmgr:60> _resource_log.c:1391] [software internal system critical-info syslog] RM-60: rmmgr_collect_cpustats_coproc_done: ahm cpustats logged for egtpinmgr instance 2 in cpu warn state file <cpustats-5e7b40f9-06-00-egtpinmgr-2-6192>

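2. Check whether ECHO requests/responses fail (with the commands shared earlier).

3. Check whether there are any packet drops on the demux card.

All inbound EGTP traffic must go through the same egtpmgr. If a path failure is seen on one node, inbound traffic can increase, and you can see traffic drops at the egtpmgr process level. Even co-located processes must go through the same egtpmgr queue and are affected.

Here are the steps to check for packet loss. Run them over multiple iterations: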

debug shell card <> cpu 0

cat /proc/net/boxer



******** card1-cpu0 /proc/net/boxer *******
Wednesday March 25 17:34:54 AST 2020
what                   total_used next    refills     hungry  exhausted  system_rate_kbps      system_credits    max_bhn
bdp_rld           4167990936249KB  094   51064441        292          1  3557021/65000000 7825602KB/7934570KB   793457KB
what                 bhn local            remote                   ver         rx    rx_drop         tx    tx_drop    no_dest     no_src
total cpu 34         *   *                *                          *    3274522         59         60          0    1307242          0
total cpu 35         *   *                *                          *    6330639         46        121          0    1086591          0
total cpu 46         *   *                *                          *    5076520         27      15524          0     786982          0
total cpu 47         *   *                *                          * 4163101019      83922  133540922          0     886241          0

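4. If high CPU usage is seen for egtpinmgr, capture the egtpinmgr CPU profiler outputs.

If all of the above conditions hold, you can check the possible solutions in the next section.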
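Because the packet-drop check in step 3 has to be repeated over several iterations, what matters is how the drop counters grow between two captures. The following Python sketch parses the `total cpu` lines of two `/proc/net/boxer` captures and reports the increase in `rx_drop` and `tx_drop` per CPU. It assumes the column layout shown in the sample output above (the last six columns are rx, rx_drop, tx, tx_drop, no_dest, no_src). The function names are hypothetical, and the second capture is made-up data for illustration only.

```python
COUNTERS = ("rx", "rx_drop", "tx", "tx_drop", "no_dest", "no_src")


def parse_boxer_totals(text):
    """Map CPU number -> counter dict for every 'total cpu N' line."""
    totals = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 + len(COUNTERS) and parts[:2] == ["total", "cpu"]:
            values = [int(p) for p in parts[-len(COUNTERS):]]
            totals[int(parts[2])] = dict(zip(COUNTERS, values))
    return totals


def drop_deltas(before, after):
    """Return the growth of rx_drop/tx_drop per CPU between two captures."""
    deltas = {}
    for cpu, now in after.items():
        prev = before.get(cpu)
        if prev is None:
            continue  # CPU not present in the earlier capture
        deltas[cpu] = {
            "rx_drop": now["rx_drop"] - prev["rx_drop"],
            "tx_drop": now["tx_drop"] - prev["tx_drop"],
        }
    return deltas


# Two lines taken from the sample output in this document.
FIRST_CAPTURE = """\
total cpu 46         *   *   *   *    5076520         27      15524          0     786982          0
total cpu 47         *   *   *   * 4163101019      83922  133540922          0     886241          0
"""

# Hypothetical later capture, invented here to show a growing rx_drop on cpu 47.
SECOND_CAPTURE = """\
total cpu 46         *   *   *   *    5080000         27      15600          0     787000          0
total cpu 47         *   *   *   * 4163900000      84500  133600000          0     886300          0
"""

deltas = drop_deltas(parse_boxer_totals(FIRST_CAPTURE), parse_boxer_totals(SECOND_CAPTURE))
for cpu, delta in sorted(deltas.items()):
    print(cpu, delta)
```

A counter that keeps growing from one capture to the next on the CPU that hosts egtpinmgr is the signal to look for. A large but static value only shows that drops happened at some point in the past.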

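Solution

1. Increase the EGTP echo timeout. If 5 seconds does not help, you can try 15 or 25 seconds. You can discuss this tuning with your AS team.

2. Reduce the peer salvation timeout. The smaller the timeout value, the fewer inactive peers remain. You can change the timeout value with this command: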

gtpc peer-salvation min-peers 2000 timeout 24

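3. Overload protection. Overload protection can be tuned based on traffic trends. Because the exact incoming traffic rate is not known until egtpinmgr runs into problems, it is difficult to tune. In addition, wrong tuning can cause extra signaling traffic because of silent drops.

Therefore, to tune overload protection, collect packet-drop data from the demux card that hosts egtpinmgr, together with the CPU profiler outputs, as mentioned earlier.

4. No subscribers/sessions on the EGTPC link. When no sessions exist over a particular tunnel, the GTP echo functionality stops. If there are no attached subscribers, GTP-C echo messages are not sent.

These are the messages seen when the echo functionality stops: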

2019-Jul-26+08:41:51.261 [egtpmgr 143047 debug] [1/0/4626 <egtpinmgr:2> egtpmgr_pm.c:798] [context: EPC, contextID: 2]  [software internal system syslog] service : S5-S8-PGW - GTP-C Periodic ECHO timer stopped for peer address 10.2.1.51
2019-Jul-26+08:41:51.261 [egtpmgr 143048 debug] [1/0/4626 <egtpinmgr:2> egtpmgr_pm.c:818] [context: EPC, contextID: 2]  [software internal system syslog] service : S5-S8-PGW - GTP-C ECHO stopped towards peer 10.2.1.51
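To list the peers for which echo was stopped in a large syslog file, a small filter is usually enough. This Python sketch assumes the message text exactly as it appears in the two sample lines above. The function name `find_echo_stopped` is hypothetical.

```python
import re

# Pattern assumed from the sample "GTP-C ECHO stopped towards peer" line above.
ECHO_STOPPED_RE = re.compile(
    r"^(?P<timestamp>\S+)\s.*service : (?P<service>\S+) - "
    r"GTP-C ECHO stopped towards peer (?P<peer>\S+)"
)

# The two sample log lines from this document, split across string literals.
SAMPLE_LOG = (
    "2019-Jul-26+08:41:51.261 [egtpmgr 143047 debug] [1/0/4626 <egtpinmgr:2> "
    "egtpmgr_pm.c:798] [context: EPC, contextID: 2]  [software internal system syslog] "
    "service : S5-S8-PGW - GTP-C Periodic ECHO timer stopped for peer address 10.2.1.51\n"
    "2019-Jul-26+08:41:51.261 [egtpmgr 143048 debug] [1/0/4626 <egtpinmgr:2> "
    "egtpmgr_pm.c:818] [context: EPC, contextID: 2]  [software internal system syslog] "
    "service : S5-S8-PGW - GTP-C ECHO stopped towards peer 10.2.1.51\n"
)


def find_echo_stopped(log_text):
    """Return (timestamp, service, peer) for each 'ECHO stopped' event."""
    events = []
    for line in log_text.splitlines():
        match = ECHO_STOPPED_RE.search(line)
        if match:
            events.append((match["timestamp"], match["service"], match["peer"]))
    return events


for event in find_echo_stopped(SAMPLE_LOG):
    print(event)
```

Only the second sample line matches, because the first one reports the periodic timer and not the stop toward the peer. Each stop is therefore counted once.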

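Workaround

You can try to restart the egtpinmgr task to recover. However, a restart of egtpinmgr can cause a brief impact, which end users do not notice, while the NPU flows are reinstalled on the new task.

This operation is expected to complete in less than 1 second.

1. Disable path failure detection: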



    egtp-service S5-PGW

       no gtpc path-failure detection-policy

2. Kill the egtpinmgr task:

task kill facility egtpinmgr  all

3. Enable path failure detection:

    egtp-service S5-PGW

       gtpc path-failure detection-policy

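Note: Implement this workaround only in a maintenance window (MW), because it can cause some impact.

Configuration Changes

Check the configuration for DSCP/QoS, EGTP IP paths, and service mapping.

Note: These are the main causes of EGTP path failure. If none of these scenarios applies, you can collect additional traces and debug logs.

Debug Logs

(If required)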

logging filter active facility egtpc level <critical/error/debug>
logging filter active facility egtpmgr level <critical/error/debug>
logging filter active facility egtpinmgr level <critical/error/debug>

Revision History

Revision	Publish Date	Comments
1.0	04-Dec-2023	Initial Release

Contributed by Cisco Engineers

索米亚坎塔·萨胡
Cisco TAC Engineer
巴拉蒂·乔杜里
Cisco TAC Engineer
维沙卡·塔库尔
Cisco TAC Engineer
