Introduction
This document describes how to troubleshoot degradation of the 4G Attach Success Rate (ASR) Key Performance Indicator (KPI).
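Possible Scenarios
4G ASR degradation can be caused by several factors:
- Network issues
- Call-flow-specific issues
- Node-specific issues
- Configuration issues
- RAN-side issues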
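Logs Required for Initial Analysis
- KPI trend graphs that highlight the degradation.
- The KPI formula used for the measurement.
- Raw bulk statistics counters and cause-code trends since the problem began.
- Two show support details (SSD) instances captured at a 30-minute interval during the problematic period.
- Syslogs collected from two hours before the degradation until the current time.
- Capture these logs:
Mon-sub/pro traces
logging monitor msid <imsi>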
Troubleshooting Sequence
1. Identify the ASR formula:
1-((emm-msgtx-decode-failure+emm-msgtx-attach-rej-gw-reject+emm-msgtx-attach-rej-activation-reject+emm-msgtx-attach-rej-svc-temp-out-of-order+emm-msgtx-attach-rej-protocol-error+emm-msgtx-attach-auth-failed+attach-proc-fail-max-retx-auth-req+attach-proc-fail-max-retx-sec-mode-cmd+attach-proc-fail-max-retx-attach-accept+attach-proc-fail-setup-timeout-exp+attach-proc-fail-sctp-fail+attach-proc-fail-guard-timeout-exp+attach-proc-fail-max-retx-esm-info-req+emm-msgtx-attach-rej-gw-auth-failed+emm-msgtx-attach-rej-insuff-resources+emm-msgtx-attach-reject-congestion+emm-msgtx-attach-reject-severe-network-failure+emm-msgtx-network-failure ) / (epsattach-imsi-attempted+epsattach-guti-local-attempted+epsattach-guti-foreign-attempted+epsattach-ptmsi-attempted+combinedattach-imsi-attempted+combinedattach-guti-local-attached+combinedattach-guti-foreign-attempted+combinedattach-ptmsi-attempted))
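As a minimal sketch, the formula above can be evaluated in Python once the counter values are available. The counter names are copied verbatim from the formula. The function name, the dictionary input, and the sample values are assumptions for illustration, and the values passed in are assumed to be per-interval deltas rather than cumulative totals.

```python
# Failure counters (numerator), copied verbatim from the ASR formula.
FAILURE_COUNTERS = [
    "emm-msgtx-decode-failure",
    "emm-msgtx-attach-rej-gw-reject",
    "emm-msgtx-attach-rej-activation-reject",
    "emm-msgtx-attach-rej-svc-temp-out-of-order",
    "emm-msgtx-attach-rej-protocol-error",
    "emm-msgtx-attach-auth-failed",
    "attach-proc-fail-max-retx-auth-req",
    "attach-proc-fail-max-retx-sec-mode-cmd",
    "attach-proc-fail-max-retx-attach-accept",
    "attach-proc-fail-setup-timeout-exp",
    "attach-proc-fail-sctp-fail",
    "attach-proc-fail-guard-timeout-exp",
    "attach-proc-fail-max-retx-esm-info-req",
    "emm-msgtx-attach-rej-gw-auth-failed",
    "emm-msgtx-attach-rej-insuff-resources",
    "emm-msgtx-attach-reject-congestion",
    "emm-msgtx-attach-reject-severe-network-failure",
    "emm-msgtx-network-failure",
]

# Attempt counters (denominator), copied verbatim from the ASR formula.
ATTEMPT_COUNTERS = [
    "epsattach-imsi-attempted",
    "epsattach-guti-local-attempted",
    "epsattach-guti-foreign-attempted",
    "epsattach-ptmsi-attempted",
    "combinedattach-imsi-attempted",
    "combinedattach-guti-local-attached",
    "combinedattach-guti-foreign-attempted",
    "combinedattach-ptmsi-attempted",
]


def attach_success_rate(counters):
    """Return ASR as a fraction, or None when there were no attempts.

    Counters missing from the input are treated as zero.
    """
    failures = sum(counters.get(name, 0) for name in FAILURE_COUNTERS)
    attempts = sum(counters.get(name, 0) for name in ATTEMPT_COUNTERS)
    if attempts == 0:
        return None
    return 1 - failures / attempts


# Made-up sample values for one measurement interval.
sample = {
    "epsattach-imsi-attempted": 1000,
    "emm-msgtx-attach-auth-failed": 20,
    "attach-proc-fail-sctp-fail": 5,
}
print(attach_success_rate(sample))
```

Returning None for an interval with no attempts avoids a division by zero and keeps empty intervals distinguishable from intervals with a 0% success rate.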
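2. The formula uses multiple counters to calculate the ASR, so check the trend of each counter in the bulk statistics.
3. Compare the KPI trend in the non-problematic timeline with the trend in the problematic timeline.
4. Once you identify the problematic bulk statistics counter from the KPI formula, check how that counter is pegged in the call flow and try to establish a pattern.
5. Also, collect the disconnect reasons from the node over multiple iterations at intervals of 3 to 5 minutes.
You can find the delta in disconnect reasons from two SSDs collected at different timestamps. A disconnect reason whose count increases rapidly in the delta is a likely cause of the KPI degradation. For a description of all disconnect reasons, refer to the Cisco Statistics and Counters Reference: https://www.cisco.com/c/en/us/td/docs/wireless/asr_5000/21-23/Stat-Count-Reference/21-23-show-command-output/m_showsession.html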
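The comparison in steps 2 to 4 can be sketched as a ranking of the failure counters by how much their per-attempt rate changed between a non-problematic window and a problematic window. This is only an illustration: the function name, its inputs, and the sample values are hypothetical, and the counter values are assumed to be deltas over each window.

```python
def rank_counter_changes(baseline, problem, baseline_attempts, problem_attempts):
    """Rank counters by the increase in their per-attempt rate.

    baseline / problem: dicts of failure counter deltas for each window.
    *_attempts: total attach attempts in the matching window.
    Returns (counter name, rate change) pairs, largest increase first.
    """
    if baseline_attempts <= 0 or problem_attempts <= 0:
        raise ValueError("both windows need at least one attach attempt")
    changes = []
    for name in sorted(set(baseline) | set(problem)):
        base_rate = baseline.get(name, 0) / baseline_attempts
        prob_rate = problem.get(name, 0) / problem_attempts
        changes.append((name, prob_rate - base_rate))
    changes.sort(key=lambda item: item[1], reverse=True)
    return changes


# Made-up values: the auth failure counter grows much faster than the other.
baseline_window = {"emm-msgtx-attach-auth-failed": 5, "attach-proc-fail-sctp-fail": 10}
problem_window = {"emm-msgtx-attach-auth-failed": 50, "attach-proc-fail-sctp-fail": 12}

ranked = rank_counter_changes(baseline_window, problem_window, 1000, 1000)
for name, change in ranked:
    print(f"{name}: {change:+.4f}")
```

Normalizing by attempts keeps a busy-hour traffic increase from looking like a failure increase.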
show session disconnect-reasons verbose
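For an example of the troubleshooting steps for a degradation caused by an increase in the disconnect reason "MME-HSS-User-Unknown", refer to: https://www.cisco.com/c/en/us/support/docs/wireless/mme-mobility-management-entity/214633-troubleshoot-4g-asr-kpi-degradation-due.html
6. Check the EGTP statistics according to the node type.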
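The delta comparison of disconnect reasons described in step 5 can be sketched in Python. The parser assumes that each data line of the command output consists of a reason name without spaces, an integer count, and a percentage. That layout is an assumption and may need adjustment for the output of a given software release. The function names and the sample snapshots are made up for illustration.

```python
import re

# Assumed data line layout: <reason-name> <count> <percentage>
LINE_RE = re.compile(r"^\s*(\S+)\s+(\d+)\s+\d+(?:\.\d+)?\s*$")


def parse_disconnect_reasons(text):
    """Map each disconnect reason to its count; non-matching lines are skipped."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def fastest_growing(first, second, top=5):
    """Return the reasons with the largest count increase between two snapshots."""
    deltas = {reason: count - first.get(reason, 0) for reason, count in second.items()}
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Made-up snapshots taken a few minutes apart.
snapshot_1 = """\
Disconnect Reason            Num Disc   Percentage
--------------------------------------------------
Admin-disconnect                  900     90.00000
MME-HSS-User-Unknown              100     10.00000
"""
snapshot_2 = """\
Disconnect Reason            Num Disc   Percentage
--------------------------------------------------
Admin-disconnect                  950     70.37037
MME-HSS-User-Unknown              400     29.62963
"""

growth = fastest_growing(
    parse_disconnect_reasons(snapshot_1),
    parse_disconnect_reasons(snapshot_2),
)
print(growth)
```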
--- SGW end -----
show egtpc statistics interface sgw-ingress path-failure-reasons
show egtpc statistics interface sgw-ingress summary
show egtpc statistics interface sgw-ingress verbose
show egtpc statistics interface sgw-ingress sessmgr-only
show egtpc statistics interface sgw-egress path-failure-reasons
show egtpc statistics interface sgw-egress summary
show egtpc statistics interface sgw-egress verbose
show egtpc statistics interface sgw-egress sessmgr-only
---- PGW end -----
show egtpc statistics interface pgw-ingress path-failure-reasons
show egtpc statistics interface pgw-ingress summary
show egtpc statistics interface pgw-ingress verbose
show egtpc statistics interface pgw-ingress sessmgr-only
--- MME end -----
show egtpc statistics interface mme path-failure-reasons
show egtpc statistics interface mme summary
show egtpc statistics interface mme verbose
show egtpc statistics interface mme sessmgr-only
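When the same set of commands has to be collected repeatedly, the list in step 6 can be generated rather than typed by hand. The sketch below only assembles command strings from the interface names and keywords listed in this step. The function name and the node-type keys are hypothetical.

```python
# Interfaces per node type, as listed in step 6.
INTERFACES = {
    "sgw": ["sgw-ingress", "sgw-egress"],
    "pgw": ["pgw-ingress"],
    "mme": ["mme"],
}

# Output variants collected for every interface.
VIEWS = ["path-failure-reasons", "summary", "verbose", "sessmgr-only"]


def egtpc_commands(node_type):
    """Return the egtpc statistics commands for one node type."""
    return [
        f"show egtpc statistics interface {interface} {view}"
        for interface in INTERFACES[node_type]
        for view in VIEWS
    ]


for command in egtpc_commands("pgw"):
    print(command)
```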
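7. To analyze and troubleshoot the KPI degradation further, capture mon-sub/mon-pro call traces and consider using external tools to obtain Wireshark traces. These traces help identify the specific call flow that causes the problem.
The command to capture a mon-sub trace is:
monitor subscriber imsi <IMSI number> ---------- verbosity level +++++, A, S, X, Y, 19, 26, 33, 34, 35
More options can be enabled depending on the protocol or call flow that needs to be captured.
8. If traces such as mon-sub cannot be captured because the KPI degradation percentage is very small, capture system-level debug logs. Capture debug logs for sessmgr and egtpc and, if the suspected problem involves entities such as the HSS or RAN, also capture s1-ap or diameter debug logs as the specific problem requires.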
logging filter active facility sessmgr level debug
logging filter active facility egtpc level debug
logging filter active facility diameter level debug ----- depending on scenario
logging filter active facility s1-ap level debug ----- depending on scenario
logging active ----------------- to enable
no logging active ------------- to disable
Note: Debug logs can increase CPU utilization, so monitor CPU usage while debug logging is enabled.
9. Once the debug logs give you a lead, you can capture a core file for the specific event for which you see the error log:
logging enable-debug facility sessmgr instance <instance-ID> eventid 11176 line-number 3219 collect-cores 1
For example, suppose this error log appears in the debug logs, it is suspected to be a cause of the issue, and no call trace is available:
[egtpc 141027 info] [15/0/6045 <sessmgr:93> _handler_func.c:10068] [context: MME01, contextID: 6] [software internal user syslog] [mme-egress] Sending reject response for the message EGTP_MSG_UPDATE_BEARER_REQUEST with cause EGTP_CAUSE_NO_RESOURCES_AVAILABLE to <Host:x.x.x.x, Port:31456, seq_num:82011>
In this error event:
facility: sessmgr
event ID: 141027
line number: 10068
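The fields needed for the core capture command can also be pulled out of a debug log line with a regular expression. This is a sketch that assumes log lines follow the same layout as the example above. The function names and field names are hypothetical, and the generated command simply fills the syntax shown in step 9 with the parsed process name, instance, event ID, and line number.

```python
import re

# Matches: [<facility> <event id> <level>] [<card/cpu/pid> <process:instance> <file>:<line>]
LOG_RE = re.compile(
    r"\[(?P<log_facility>[\w-]+) (?P<event_id>\d+) (?P<level>\w+)\] "
    r"\[\S+ <(?P<process>[\w-]+):(?P<instance>\d+)> "
    r"(?P<source_file>\S+):(?P<line_number>\d+)\]"
)


def parse_debug_log(line):
    """Return the header fields of a debug log line, or None if it does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("event_id", "instance", "line_number"):
        fields[key] = int(fields[key])
    return fields


def core_capture_command(fields):
    """Fill the step 9 command syntax with the parsed values."""
    return (
        f"logging enable-debug facility {fields['process']} "
        f"instance {fields['instance']} eventid {fields['event_id']} "
        f"line-number {fields['line_number']} collect-cores 1"
    )


# Header of the example log line from this document.
log_line = (
    "[egtpc 141027 info] [15/0/6045 <sessmgr:93> _handler_func.c:10068] "
    "[context: MME01, contextID: 6] [software internal user syslog]"
)

parsed = parse_debug_log(log_line)
print(core_capture_command(parsed))
```

The log tag (egtpc) and the process that emitted the line (sessmgr) are kept as separate fields, because the example above uses the process name as the facility.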
These are the steps to troubleshoot this issue.