Introduction
This document describes an SMF system synchronization failure and how to resolve it.
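Prerequisites
Requirements
There are no specific requirements for this document.
Components Used
This document is not restricted to specific software and hardware versions.
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, ensure that you understand the potential impact of any command.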
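Background Information
The Session Management Function (SMF) fails to start its services. When the problem occurs, alerts are raised on the Common Execution Environment (CEE).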
Problem
SMF-RCDN keeps cycling through the Ops Center system upgrade, which then fails.
On the CEE, you see these alerts:
[smf-rcdn/cee-rcdn] cee# show alerts active summary | inc ops
ops-system-sync-runni 687ca7b9266c minor 09-07T17:59:36 smf-rcdn-mas ops center system upgrade for smf-rcdn is in progress
ops-latest-sync-faile 31531915bf54 major 09-07T10:52:26 smf-rcdn-mas ops center latest system sync for smf-rcdn failed
On the SMF, these errors are seen:
[smf-rcdn/smf-rcdn] smf#
Message from confd-api-manager at 2022-09-07 17:49:32...
Helm update is STARTING. Trigger for update is STARTUP.
[smf-rcdn/smf-rcdn] smf#
Message from confd-api-manager at 2022-09-07 17:49:51...
Helm update is ERROR. Trigger for update is STARTUP. Message is:
InterruptedException: one or multiple helm chart installations failed
javax.ws.rs.WebApplicationException: HTTP 500 Internal Server Error
at com.broadhop.confd.config.proxy.dao.HelmRepositoryDAO.sendConfiguration(HelmRepositoryDAO.java:272)
at com.broadhop.confd.config.proxy.service.ConfigurationSynchManager.run(ConfigurationSynchManager.java:233)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.InterruptedException: one or multiple helm chart installations failed
at com.broadhop.confd.config.proxy.dao.HelmRepositoryDAO.sendConfiguration(HelmRepositoryDAO.java:266)
... 8 more
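Analysis
To troubleshoot this, you must review the internal logs from the SMF Ops Center pod.
In this case, no SMF application pods were started in the smf-rcdn namespace. Only the documentation, Ops Center, and smart-agent pods are present.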
cloud-user@smf-rcdn-master-1:~$ kubectl get pods -n smf-rcdn
NAME                                               READY   STATUS    RESTARTS   AGE
documentation-69768456cb-klq8d                     1/1     Running   0          102d
ops-center-smf-rcdn-ops-center-85899d6b90-9kx6h    5/5     Running   1          40m
smart-agent-smf-rcdn-ops-center-6b9cd64f85-8f8cz   1/1     Running   0          22h
cloud-user@smf-rcdn-master-1:~$
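The check above (only support pods present, no SMF application pods) can be sketched in Python for cases where the pod list is long. This is a minimal illustration, not part of the product tooling: the helper names parse_pods and only_support_pods are hypothetical, and the prefix list is an assumption taken from the sample output in this document.

```python
# Assumption: these prefixes come from the sample output in this document;
# adjust them for the pods that your Ops Center deployment actually creates.
SUPPORT_PREFIXES = ("documentation-", "ops-center-", "smart-agent-")


def parse_pods(text):
    """Parse plain `kubectl get pods` output into a list of dicts keyed by column name."""
    lines = text.splitlines()
    # Locate the header row; shell prompts or blank lines before it are ignored.
    start = next((i for i, ln in enumerate(lines) if ln.split()[:1] == ["NAME"]), None)
    if start is None:
        return []
    header = lines[start].split()
    pods = []
    for ln in lines[start + 1:]:
        fields = ln.split()
        # Rows with a different field count (blank lines, trailing prompts) are skipped.
        if len(fields) == len(header):
            pods.append(dict(zip(header, fields)))
    return pods


def only_support_pods(pods):
    """Return True when the list is non-empty and every pod is a support pod."""
    return bool(pods) and all(p["NAME"].startswith(SUPPORT_PREFIXES) for p in pods)


SAMPLE = """\
NAME READY STATUS RESTARTS AGE
documentation-69768456cb-klq8d 1/1 Running 0 102d
ops-center-smf-rcdn-ops-center-85899d6b90-9kx6h 5/5 Running 1 40m
smart-agent-smf-rcdn-ops-center-6b9cd64f85-8f8cz 1/1 Running 0 22h
"""

if __name__ == "__main__":
    pods = parse_pods(SAMPLE)
    for pod in pods:
        print(pod["NAME"], pod["STATUS"])
    print("only support pods:", only_support_pods(pods))
```

If only_support_pods returns True, the namespace matches the symptom described here, and the next step is to read the Ops Center logs.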
Note the name of the ops-center pod and collect the logs of the confd-api-bridge container.
cloud-user@smf-rcdn-master-1:~$ kubectl logs ops-center-smf-rcdn-ops-center-85899d6b90-9kx6h -n smf-rcdn -c confd-api-bridge
Preparing upgrade logic for helm ...
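The logs contain the reason why the system fails to come up. In this example, the problem is caused by the sgw-service configuration: the profile does not have interfaces configured.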
WARN [2022-09-13 19:44:55,860] com.broadhop.confd.config.proxy.dao.helm.ReleaseInstallCallable: [436] Install or upgrade failure for chart: sgw-service,
release-name: smf-rcdn-sgw-service, command: [/usr/local/bin/helm, upgrade, smf-rcdn-sgw-service, /tmp/chart1014799367411807494.tgz,
--install, -f, /tmp/override1205042274924409625.yaml, -f, /tmp/values4318819924777544020.yaml, --namespace, smf-rcdn, --dry-run]
WARN [2022-09-13 19:44:55,860] com.broadhop.confd.config.proxy.dao.helm.ReleaseInstallCallable: Command result:
Release "smf-rcdn-sgw-service" does not exist. Installing it now.
Error: template: sgw-service/templates/sgw-service.yaml:14:30: executing "sgw-service/templates/sgw-service.yaml" at
<$endpoint.service.nodeCount>: nil pointer evaluating interface {}.nodeCount
INFO [2022-09-13 19:44:55,860] com.broadhop.confd.config.proxy.dao.helm.ReleaseInstallCallable: Command result:
Release "smf-rcdn-udp-proxy" does not exist. Installing it now.
NAME: smf-rcdn-udp-proxy
LAST DEPLOYED: Tue Sep 13 19:44:55 2022
NAMESPACE: smf-rcdn
STATUS: pending-install
REVISION: 1
TEST SUITE: None
HOOKS:
MANIFEST:
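Because the confd-api-bridge log can be long, it may help to filter it for failed chart installations. The following Python sketch does that under stated assumptions: it relies only on the two message shapes visible in the excerpt above ("Install or upgrade failure for chart: ..." and a following line that begins with "Error:"), and the function name find_helm_failures is hypothetical. Other releases may log these events differently.

```python
import re

# Patterns are assumptions based on the log excerpt in this document.
FAILURE_RE = re.compile(r"Install or upgrade failure for chart: ([^,\s]+)")
ERROR_RE = re.compile(r"^Error: (.*)$")


def find_helm_failures(log_text):
    """Return a list of {'chart': ..., 'error': ...} for each failed chart in the log."""
    failures = []
    current = None
    for line in log_text.splitlines():
        match = FAILURE_RE.search(line)
        if match:
            # A new failure record starts; the error text is filled in later.
            current = {"chart": match.group(1), "error": None}
            failures.append(current)
            continue
        match = ERROR_RE.match(line.strip())
        if match and current is not None and current["error"] is None:
            # Attach the first "Error:" line that follows the failure message.
            current["error"] = match.group(1)
    return failures


SAMPLE_LOG = """\
WARN [2022-09-13 19:44:55,860] com.broadhop.confd.config.proxy.dao.helm.ReleaseInstallCallable: [436] Install or upgrade failure for chart: sgw-service,
release-name: smf-rcdn-sgw-service, command: [/usr/local/bin/helm, upgrade, smf-rcdn-sgw-service, /tmp/chart1014799367411807494.tgz,
WARN [2022-09-13 19:44:55,860] com.broadhop.confd.config.proxy.dao.helm.ReleaseInstallCallable: Command result:
Release "smf-rcdn-sgw-service" does not exist. Installing it now.
Error: template: sgw-service/templates/sgw-service.yaml:14:30: executing "sgw-service/templates/sgw-service.yaml" at
<$endpoint.service.nodeCount>: nil pointer evaluating interface {}.nodeCount
INFO [2022-09-13 19:44:55,860] com.broadhop.confd.config.proxy.dao.helm.ReleaseInstallCallable: Command result:
Release "smf-rcdn-udp-proxy" does not exist. Installing it now.
"""

if __name__ == "__main__":
    for failure in find_helm_failures(SAMPLE_LOG):
        print(failure["chart"], "->", failure["error"])
```

In practice the input would be the saved output of the kubectl logs command shown earlier. The sketch keeps only the first "Error:" line per failure, so wrapped continuation lines still need to be read in the raw log.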
On the SMF, check the output of show running-config.
The configuration contains a profile for sgw-service, but its mandatory parameters are not defined.
profile smf smfprof
 mode offline
 locality LOC1
 allowed-nssai [ slice1 ]
 instances 1 fqdn xxx
 instances 2 fqdn xxx
 plmn-list mcc 123 mnc 456
 exit
 service name nsmf-pdu
  type pdu-session
  schema http
  version 1.0.2
  http-endpoint base-url http://smf-service
  icmpv6-profile icmpprf1
  compliance-profile June19
  priority 20
  access-profile idft
  subscriber-policy polsub
 exit
exit
profile sgw cn-sgw
exit
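An empty profile such as profile sgw cn-sgw is easy to miss in a long running configuration. As a rough aid, the Python sketch below flags any profile line that is followed immediately by exit. The function name find_empty_profiles is hypothetical, and the rule is only a heuristic: whether an empty profile is actually invalid depends on the profile type and the software release.

```python
def find_empty_profiles(config_text):
    """Return the 'profile ...' lines whose block contains nothing before 'exit'."""
    # Strip indentation so the check also works on flattened configuration text.
    lines = [ln.strip() for ln in config_text.splitlines() if ln.strip()]
    empty = []
    for current, following in zip(lines, lines[1:]):
        if current.startswith("profile ") and following == "exit":
            empty.append(current)
    return empty


SAMPLE_CONFIG = """\
profile smf smfprof
 mode offline
 locality LOC1
 plmn-list mcc 123 mnc 456
 exit
 service name nsmf-pdu
  type pdu-session
 exit
exit
profile sgw cn-sgw
exit
"""

if __name__ == "__main__":
    for profile in find_empty_profiles(SAMPLE_CONFIG):
        print("empty profile:", profile)
```

The sample is an abbreviated copy of the configuration shown above. Any profile the function reports should be reviewed against the failing chart named in the confd-api-bridge log.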
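Solution
The solution is to remove the misconfiguration. In this example, the misconfiguration is the empty profile sgw cn-sgw, which must either be removed or completed with its mandatory parameters.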