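Introduction
This document describes how to troubleshoot Session Management Function (SMF) log alerts related to the message: All Peers are Dead, Setting status code to 0.
Problem
Session impact is reported on the SMF.
Analysis
Logs Indicate All Peers Are Dead
The logs indicate that all peers in SelectedProfileName:CHF-OFF are dead.
The logs cover all endpoints configured on the SMF. When all peers in a profile are reported dead, the result is always session impact.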
master-1 b26897bce81a[2516]:
master-1 c77834f772f7[2516]: ************* TRANSACTION: 2475167152 *************
master-1 c77834f772f7[2516]: TRANSACTION SUCCESS:
master-1 c77834f772f7[2516]: GR Instance ID : 1
master-1 c77834f772f7[2516]: Txn Type : N40ChargingDataReq(3585)
master-1 c77834f772f7[2516]: Priority : 1
master-1 c77834f772f7[2516]: Session Namespace : smf(1)
master-1 c77834f772f7[2516]: CDL Slice Name : smf
master-1 c77834f772f7[2516]: LOG MESSAGES:
master-1 c77834f772f7[2516]: 2023/09/10 15:00:00.007 [ERROR] [nrfClient.Discovery.nrf] All Peers are Dead, Setting status code to 0 (timeout)
master-1 c77834f772f7[2516]: 2023/09/10 15:00:00.007 [ERROR] [nrfClient.Discovery.nrf] Message send failed, response [Type:CHF ServiceName:nchf-convergedcharging SelectedProfileName:"CHF-OFF" FailureProfile:"Fail-H-CHF-OFF" GroupID:"CHF-*" ]
master-1 c77834f772f7[2516]: ***********************************************
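According to the configuration, when the HTTP code is 504 (timeout), the SMF first tries to reach the primary server, which has the higher priority, and then tries to reach the secondary server. If that attempt also fails, the system is set to continue the session.
In this example, the secondary Charging Function (CHF) for Offline is 10.10.10.2. The SMF receives a 504 error and the action is FailureContinueAction.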
master-2 42013075464a[2621]: 2023/09/10 15:00:00.063 rest-ep [ERROR] [RestClient.go:175] [infra.rest_client.core] Error in rest call err Post "http://10.10.10.2:1090/OFFLINE/nchf-convergedcharging/v2/chargingdata": context deadline exceeded
master-2 42013075464a[2621]: 2023/09/10 15:00:00.063 rest-ep [ERROR] [Config.go:1721] [nrfClient.Discovery.nrf] Send to NF rpcName[CHF], method:[DataRequest] EndPoint[http://10.10.10.2:1090/OFFLINE/nchf-convergedcharging/v2] failed
master-2 42013075464a[2621]: ************* TRANSACTION: 2252879781 *************
master-2 42013075464a[2621]: TRANSACTION SUCCESS:
master-2 42013075464a[2621]: GR Instance ID : 1
master-2 42013075464a[2621]: Txn Type : N40ChargingDataReq(3521)
master-2 42013075464a[2621]: Priority : 1
master-2 42013075464a[2621]: Session Namespace : smf(1)
master-2 42013075464a[2621]: CDL Slice Name : smf
master-2 42013075464a[2621]: LOG MESSAGES:
master-2 42013075464a[2621]: 2023/09/10 15:00:00.063 [ERROR] [rest_ep.app.ChargingIntf] {imsi-1234567891011121:21} Received Charging Data Response error with timediff 10001557123 - Request message {{"invocationSequenceNumber":1,"invocationTimeStamp":"2025-11-10T14:29:29Z","nfConsumerIdentification":{"nFIPv4Address":"10.10.10.12","nFName":"dce0c1d7-aa37-4f2c-870b-6f7c1be10af1","nFPLMNID":{"mcc":"123","mnc":"456"},"nodeFunctionality":"SMF"},"notifyUri":"http://10.10.10.12:8195/callbacks/v2/notifyUri/1909959397/chargingNotification","pDUSessionChargingInformation":{"chargingId":1909959397,"pduSessionInformation":{"authorizedQoSInformation":{"5qi":1,"arp":{"preemptCap":"NOT_PREEMPT","preemptVuln":"PREEMPTABLE","priorityLevel":1}},"authorizedSessionAMBR":{"downlink":"2048000 bps","uplink":"2048000 bps"},"chargingCharacteristicsSelectionMode":"VISITING_DEFAULT","dnnId":"data","hPlmnId":{"mcc":"123","mnc":"456"},"networkSlicingInfo":{"sNSSAI":{"sst":1}},"pduAddress":{"iPv6dynamicPrefixFlag":true,"pduIPv6AddresswithPrefix":"x:x:x:x::"},"pduSessionID":21,"pduType":"IPV6","ratType":"WLAN","servingCNPlmnId":{"mcc":"123","mnc":"456"},"sscMode":"SSC_MODE_1","startTime":"2025-11-10T14:29:29Z"},"userInformation":{"roamerInOut":"IN_BOUND","servedGPSI":"msisdn-12345678901"},"userLocationinfo":{"n3gaLocation":{"portNumber":4505,"ueIpv4Addr":"x.x.x.x"}}},"subscriberIdentifier":{"subscriberIdentityType":"SUPI","supi":"imsi-1234567891011121"}}}
master-2 42013075464a[2621]: 2023/09/10 15:00:00.063 [ERROR] [nrfClient.SendMesg.NRF] FHI status 504 timediff 1000332537, Uri: http://10.10.10.2:1090/OFFLINE/nchf-convergedcharging/v2, retryCount = 0 loopMaxRetry = 0, maxRetry = 0
master-2 42013075464a[2621]: 2023/09/10 15:00:00.063 [ERROR] [nrfClient.Discovery.nrf] Message send failed, response [Type:CHF Http2_Status:504 FailAction:FailureContinueAction MsgType:3587 ServiceName:nchf-convergedcharging SelectedProfileName:"CHF-OFF" FailureProfile:"Fail-H-CHF-OFF" GroupID:"CHF-*" ]
master-2 42013075464a[2621]: ***********************************************
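The fields that matter in the "Message send failed" entry (Http2_Status, FailAction, SelectedProfileName) can be extracted from a collected log with a short script. This is a minimal Python sketch, not part of any SMF tooling: the function name parse_send_failure and the regular expression are assumptions made for this example, and the sample line is copied from the log above.

```python
import re

# Matches key:value pairs such as Type:CHF or SelectedProfileName:"CHF-OFF".
FIELD_RE = re.compile(r'(\w+):(?:"([^"]*)"|(\S+))')

MARKER = "Message send failed, response ["


def parse_send_failure(line):
    """Return the fields inside 'Message send failed, response [...]', or None.

    Hypothetical helper for illustration; the field layout is inferred
    from the sample log line only.
    """
    start = line.find(MARKER)
    if start == -1:
        return None
    body = line[start + len(MARKER):]
    end = body.rfind("]")
    if end != -1:
        body = body[:end]
    fields = {}
    for key, quoted, bare in FIELD_RE.findall(body):
        # Exactly one of the two value groups is non-empty for a match.
        fields[key] = quoted if bare == "" else bare
    return fields


SAMPLE = (
    'master-2 42013075464a[2621]: 2023/09/10 15:00:00.063 [ERROR] '
    '[nrfClient.Discovery.nrf] Message send failed, response [Type:CHF '
    'Http2_Status:504 FailAction:FailureContinueAction MsgType:3587 '
    'ServiceName:nchf-convergedcharging SelectedProfileName:"CHF-OFF" '
    'FailureProfile:"Fail-H-CHF-OFF" GroupID:"CHF-*" ]'
)

fields = parse_send_failure(SAMPLE)
print(fields["Http2_Status"], fields["FailAction"], fields["SelectedProfileName"])
```

Run over a full log file, the same function can be used to count failures per profile or per status code.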
SMF Checks
On the SMF, check the peers related to the endpoint that reports the issue, along with their connection times.
smf# show peers
GR                                                             POD              CONNECTED       ADDITIONAL
INSTANCE  ENDPOINT  LOCAL ADDRESS  PEER ADDRESS     DIRECTION  INSTANCE   TYPE  TIME       RPC  DETAILS     INTERFACE NAME  VRF
-------------------------------------------------------------------------------------------------------------------------------
1         <none>    192.168.1.1    10.10.10.2:1090  Outbound   rest-ep-0  Rest  4 hours    CHF  <none>      n40             NA
1         <none>    192.168.1.2    10.10.10.2:1090  Outbound   rest-ep-1  Rest  4 hours    CHF  <none>      n40             NA
1         <none>    192.168.1.3    10.10.10.1:1090  Outbound   rest-ep-2  Rest  4 hours    CHF  <none>      n40             NA
1         <none>    192.168.1.3    10.10.10.2:1090  Outbound   rest-ep-2  Rest  4 hours    CHF  <none>      n40             NA
1         <none>    192.168.1.4    10.10.10.1:1090  Outbound   rest-ep-3  Rest  4 hours    CHF  <none>      n40             NA
1         <none>    192.168.1.2    10.10.10.1:1090  Outbound   rest-ep-1  Rest  4 hours    CHF  <none>      n40             NA
1         <none>    192.168.1.4    10.10.10.2:1090  Outbound   rest-ep-3  Rest  2 hours    CHF  <none>      n40             NA
1         <none>    192.168.1.1    10.10.10.1:1090  Outbound   rest-ep-0  Rest  4 hours    CHF  <none>      n40             NA
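When the peer list is long, connections whose connected time differs from the rest can be picked out programmatically. A shorter connected time may suggest that the connection was re-established more recently than the others. The Python sketch below is illustrative only: the helper names are hypothetical, and it assumes each data row splits into exactly 13 whitespace-separated tokens with the connected time as two tokens (for example "4 hours"), as in the output above.

```python
from collections import Counter

# Data rows copied from the "show peers" output above.
SAMPLE_ROWS = """\
1 <none> 192.168.1.1 10.10.10.2:1090 Outbound rest-ep-0 Rest 4 hours CHF <none> n40 NA
1 <none> 192.168.1.2 10.10.10.2:1090 Outbound rest-ep-1 Rest 4 hours CHF <none> n40 NA
1 <none> 192.168.1.3 10.10.10.1:1090 Outbound rest-ep-2 Rest 4 hours CHF <none> n40 NA
1 <none> 192.168.1.3 10.10.10.2:1090 Outbound rest-ep-2 Rest 4 hours CHF <none> n40 NA
1 <none> 192.168.1.4 10.10.10.1:1090 Outbound rest-ep-3 Rest 4 hours CHF <none> n40 NA
1 <none> 192.168.1.2 10.10.10.1:1090 Outbound rest-ep-1 Rest 4 hours CHF <none> n40 NA
1 <none> 192.168.1.4 10.10.10.2:1090 Outbound rest-ep-3 Rest 2 hours CHF <none> n40 NA
1 <none> 192.168.1.1 10.10.10.1:1090 Outbound rest-ep-0 Rest 4 hours CHF <none> n40 NA
""".splitlines()


def parse_peer_row(row):
    """Parse one data row; return None for headers and separator lines."""
    tokens = row.split()
    if len(tokens) != 13 or not tokens[0].isdigit():
        return None
    return {
        "local": tokens[2],
        "peer": tokens[3],
        "pod": tokens[5],
        "connected": f"{tokens[7]} {tokens[8]}",
        "rpc": tokens[9],
    }


def odd_connections(rows):
    """Return peers whose connected time differs from the most common one."""
    peers = [p for p in map(parse_peer_row, rows) if p is not None]
    if not peers:
        return []
    common, _count = Counter(p["connected"] for p in peers).most_common(1)[0]
    return [p for p in peers if p["connected"] != common]


outliers = odd_connections(SAMPLE_ROWS)
for peer in outliers:
    print(peer["pod"], peer["peer"], peer["connected"])
```

For the sample rows, the only connection that stands out is rest-ep-3 toward 10.10.10.2:1090, which is the secondary CHF named in the error logs.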
// CHF related profiles
profile network-element chf CHF-OFFLINE
 nf-client-profile CHF-OFF
 failure-handling-profile Fail-H-CHF-OFF
 discovery local
exit
// Configuration of the CHF profile in which all peers are dead
profile nf-client nf-type chf
 chf-profile CHF-OFF
  locality LOC1
   priority 1
   service name type nchf-convergedcharging
    responsetimeout 1000
    endpoint-profile epprof
     capacity 10
     api-root OFFLINE
     uri-scheme http
     version
      uri-version v2
      exit
     exit
     endpoint-name ep1
      priority 1
      capacity 10
      primary ip-address ipv4 10.10.10.1
      primary ip-address port 1090
     exit
     endpoint-name ep2
      priority 2
      capacity 10
      primary ip-address ipv4 10.10.10.2
      primary ip-address port 1090
     exit
    exit
   exit
  exit
 exit
// Failure handling: on a timeout (HTTP code 504), try the secondary server once, then continue the session
profile nf-client-failure nf-type chf
 profile failure-handling Fail-H-CHF-OFF
  service name type nchf-convergedcharging
   responsetimeout 1000
   message type ChfConvergedchargingCreate
    status-code httpv2 504
     retry 1
     action continue
    exit
   exit
   message type ChfConvergedchargingUpdate
    status-code httpv2 504
     retry 1
     action continue
    exit
   exit
   message type ChfConvergedchargingDelete
    status-code httpv2 504
     retry 1
     action continue
    exit
   exit
  exit
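The behavior this configuration describes (try the endpoints in priority order, retry once on a 504, then continue the session) can be modeled in a few lines. The Python sketch below is a simplified illustration and not the SMF implementation: send_with_failover and always_timeout are hypothetical names, and the meaning of "retry 1" as one additional attempt on the next endpoint is taken from the comment in the configuration above.

```python
def send_with_failover(endpoints, send, retry=1, action="continue"):
    """Model of the failure handling described by the configuration.

    endpoints: list of (priority, address); a lower number is tried first.
    send:      callable taking an address and returning an HTTP status code.
    retry:     number of additional endpoints to try after a 504 (assumed).
    action:    what to do when every attempt times out.
    """
    ordered = sorted(endpoints)
    attempts = []
    for _priority, address in ordered[: retry + 1]:
        status = send(address)
        attempts.append((address, status))
        if status != 504:
            # Any answer other than a timeout ends the failover sequence.
            return {"outcome": "answered", "attempts": attempts}
    # Every attempt timed out: apply the configured action.
    return {"outcome": action, "attempts": attempts}


# Endpoints ep1 and ep2 from the CHF-OFF profile.
ENDPOINTS = [(2, "10.10.10.2:1090"), (1, "10.10.10.1:1090")]


def always_timeout(address):
    """Stand-in for a CHF that never answers within responsetimeout."""
    return 504


result = send_with_failover(ENDPOINTS, always_timeout)
print(result)
```

With both endpoints timing out, the model ends in "continue", which corresponds to the FailureContinueAction seen in the logs.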
Grafana Checks
A direct correlation is observed between the HTTP 504 timeouts and the time of the issue.
query: sum(increase(smf_restep_http_msg_total{nf_type="chf", namespace=~"$namespace"}[15m])) by (api_name, response_cause, response_status)
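If the result of this query is exported through the Prometheus HTTP API, the 504 counts can be totaled per API name offline. The Python sketch below only post-processes a result list in the standard instant-query format ("metric" labels plus a [timestamp, value] pair). The sample data is hypothetical, and so are the label values: the api_name strings and the assumption that response_status carries the HTTP code as the string "504" need to be checked against the real series.

```python
def timeouts_by_api(result, status="504"):
    """Sum series values per api_name for one response_status value."""
    totals = {}
    for series in result:
        labels = series["metric"]
        if labels.get("response_status") != status:
            continue
        api = labels.get("api_name", "unknown")
        # Prometheus returns sample values as strings.
        totals[api] = totals.get(api, 0.0) + float(series["value"][1])
    return totals


# Hypothetical data shaped like the "data.result" field of an instant query.
SAMPLE_RESULT = [
    {"metric": {"api_name": "example_api_a", "response_status": "504"},
     "value": [1694358000, "12"]},
    {"metric": {"api_name": "example_api_a", "response_status": "201"},
     "value": [1694358000, "340"]},
    {"metric": {"api_name": "example_api_b", "response_status": "504"},
     "value": [1694358000, "3"]},
]

totals = timeouts_by_api(SAMPLE_RESULT)
print(totals)
```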
Nexus Checks
Check whether any flaps occurred.
Nexus# show logging last 500 | include BFD
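Solution
The solution to this issue differs in this case because the SMF is the client and the CHF is the server.
The loss of connectivity is not caused by the SMF.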