Troubleshoot 5G SMI Protocol VMs Time Sync Issues

Updated: January 27, 2022

Document ID: 217662

Introduction

This document describes how to implement a workaround for Network Time Protocol (NTP) time sync issues in 5G Subscriber Microservices Infrastructure (SMI) protocol virtual machines (protocol VMs).

Prerequisites

Requirements

Cisco recommends that you have knowledge of these topics:

Cisco SMI
5G Cloud Native Deployment Platform (CNDP) architecture
Cloud - Red Hat OpenStack Platform 13 ("Queens" release)
Docker and Kubernetes

Components Used

The information in this document is based on these software and hardware versions:

SMI 2020.01.1-21
Kubernetes v1.16.2
Red Hat OpenStack Platform 13 director

The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, ensure that you understand the potential impact of any command.

Background Information

What is SMI?

Cisco SMI is a layered stack of cloud technologies and standards that enable microservices-based applications from the Cisco Mobility, Cable, and Broadband Network Gateway (BNG) business units. These applications have similar subscriber management functions and similar datastore requirements.

Attributes:

The layered cloud stack (technologies and standards) provides top-to-bottom deployments and accommodates current cloud infrastructures.
All applications share the Common Execution Environment (CEE) for non-application functions (data storage, deployment, configuration, telemetry, alarm), which provides a consistent interaction and experience for all customer touchpoints and integration points.
Applications and the CEE are deployed in microservice containers and are connected with an Intelligent Service Mesh.
An exposed API for deployment, configuration, and management enables automation.

What is CEE?

The CEE is a software solution that was developed to monitor mobile and cable applications that are deployed on the SMI. The CEE captures information (key metrics) from the applications in a centralized way for engineers to debug and troubleshoot.
The CEE is the common set of tools that are installed for all the applications. It comes equipped with a dedicated Ops Center, which provides the Command Line Interface (CLI) and APIs to manage the monitor tools. Only one CEE is available for each cluster.

What are Protocol VMs?

Protocol VMs host the Session Management Function (SMF) application microservices that are responsible for protocol translation. Two protocol VMs are deployed per instance of SMF. A protocol VM runs pods such as GTPc, Lawful Intercept (LI), RADIUS, REST, and so on.

Problem

On a node (protocol VM) that exhibits the issue, traffic to the NTP server is routed through two different interfaces. However, only one of the two networks can reach the NTP server IP address.

When you ping the NTP server for some time without specifying the interface, you see packet loss.
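The comparison described above can be sketched in shell. This is a minimal illustration, not part of the original procedure: the helper name ping_loss, the variable NTP_SERVER, and the interface names in the usage comments are placeholders to adapt to your environment.

```shell
# Hypothetical helper: read ping output on stdin and print the
# packet-loss percentage taken from the summary line.
ping_loss() {
  sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Example usage (NTP_SERVER, ens3 and ens4 are placeholders):
#   ping -c 20 "$NTP_SERVER" | ping_loss            # no interface specified
#   ping -c 20 -I ens3 "$NTP_SERVER" | ping_loss    # forced through ens3
#   ping -c 20 -I ens4 "$NTP_SERVER" | ping_loss    # forced through ens4
```

If loss appears only when no interface is specified, or only through one of the two interfaces, that is consistent with the routing problem described in this section.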

Sample alert from CEE:

[pod-name-cnat/global] cee# show alerts active summary
NAME                   UID           SEVERITY  STARTS AT       SOURCE                 SUMMARY
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
VoNR_Ded_Bearer_Creat  be183f61dd65  major     01-18T16:32:17  System                 Success percentage of VoNR - PCF Initiated Dedicated Bearer Creation procedure in NR turns less than t...
clock-is-not-in-synch  38a35e9a8bd8  major     01-18T13:11:15  pod-name-cnat-cnat-co  Clock not in synch detected on hostname pod-name-cnat-cnat-core-protocol-data1 . Ensure NTP is configu...
clock-is-not-in-synch  4a7e138b8bae  major     01-18T13:07:35  pod-name-cnat-cnat-co  Clock not in synch detected on hostname pod-name-cnat-cnat-core-protocol-ims2 . Ensure NTP is configur...
clock-is-not-in-synch  5b3e128f0101  major     01-18T12:05:55  pod-name-cnat-cnat-co  Clock not in synch detected on hostname pod-name-cnat-cnat-core-protocol-data2 . Ensure NTP is configu...
container-memory-usag  f588aa627792  critical  12-28T15:47:30  path-provisioner-v6zx  Pod cee-global/path-provisioner-v6zxj/k8s_path-provisioner_path-provisioner-v6zxj_cee-global_f1a1e8fa-...
container-memory-usag  55b2ea84466f  critical  12-27T22:27:10  path-provisioner-v6zx  Pod cee-global/path-provisioner-v6zxj/ uses high memory 82.69%.
N11_SM_Timeout_SR      f6fb82cac197  major     10-31T15:51:16  System                 This alert is fired when the increase in timeout for N11 messages toward AMF crosses threshold
Radius_Server_RTT      94ac2cff43a3  warning   10-28T05:08:56  System                 RTT for Radius Server: 216.x.x.x, Port: 1813 in namespace: smf-mvno is more than 5 ms.
Radius_Server_RTT      d02b60b3a3a4  warning   10-28T05:05:56  System                 RTT for Radius Server: 52.x.x.x, Port: 1813 in namespace: smf-mvno is more than 5 ms.
Radius_Server_RTT      9afcee101013  warning   10-28T05:05:36  System                 RTT for Radius Server: 52.x.x.x, Port: 1813 in namespace: smf-mvno is more than 5 ms.
N11_SM_Timeout_SR      206c4bbccf21  major     10-20T13:26:16  System                 This alert is fired when the increase in timeout for N11 messages toward AMF crosses threshold
N4_Total_Outbound_Int  4e0f3c1d6300  major     10-20T09:01:23  System                 This alert is fired when the percentage of N4 outbound responses sent is lesser than threshold %.
N4_Total_Outbound_Int  3e2fed624704  major     10-20T08:21:23  System                 This alert is fired when the percentage of N4 outbound responses sent is lesser than threshold %.
watchdog               97a7976a4103  minor     05-05T09:10:58  System                 This is an alert meant to ensure that the entire alerting pipeline is functional. This alert is always...

Troubleshoot

Log in to the impacted VMs and run this command: ip route | grep -i default

In this example, you can see two default routes, which is incorrect. (There should be only one default route.)

ubuntu@pod-name-cnat-cnat-core-protocol-ims2:~$ ip route | grep -i default
default via 172.16.x.x dev ens4 proto dhcp src 172.16.x.x metric 100
default via 172.16.x.x dev ens3 proto dhcp src 172.16.x.x metric 100
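To run the same check across several nodes, the default routes can be counted instead of inspected by eye. The sketch below is an illustration only; the function name count_default_routes is hypothetical and it simply counts lines of ip route output whose first field is "default".

```shell
# Hypothetical helper: read `ip route` output on stdin and print
# the number of default routes it contains.
count_default_routes() {
  awk '$1 == "default" { n++ } END { print n + 0 }'
}

# Example usage on an affected VM:
#   ip route | count_default_routes
# A result greater than 1 matches the faulty state shown above.
```

The command ip route get followed by the NTP server address can additionally show which interface the kernel selects for that destination at a given moment.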

Check the status of the chronyd (NTP) service. Even though the service is up, the log shows recurring "Can't synchronise" errors.

Sample output:

root@pod-name-cnat-cnat-core-protocol-data1:/home/ubuntu# systemctl status chronyd.service
chrony.service - chrony, an NTP client/server
Loaded: loaded (/lib/systemd/system/chrony.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2021-01-08 07:19:19 UTC; 11 months 24 days ago
Docs: man:chronyd(8)
      man:chronyc(1)
      man:chrony.conf(5)
Main PID: 4300 (chronyd)
Tasks: 1 (limit: 4915)
CGroup: /system.slice/chrony.service
        └─4300 /usr/sbin/chronyd
Dec 30 17:06:28 pod-name-cnat-cnat-core-protocol-data1 chronyd[4300]: Selected source 5.196.x.x
Dec 31 06:37:11 pod-name-cnat-cnat-core-protocol-data1 chronyd[4300]: Can't synchronise: no selectable sources
Dec 31 17:15:31 pod-name-cnat-cnat-core-protocol-data1 chronyd[4300]: Selected source 5.196.x.x
Jan 01 06:29:43 pod-name-cnat-cnat-core-protocol-data1 chronyd[4300]: Can't synchronise: no selectable sources
Jan 01 16:50:02 pod-name-cnat-cnat-core-protocol-data1 chronyd[4300]: Selected source 5.196.x.x
Jan 01 19:25:05 pod-name-cnat-cnat-core-protocol-data1 chronyd[4300]: Can't synchronise: no selectable sources
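Because systemctl status shows only the most recent log lines, it can help to count how often the error recurs over a longer period. The following is a rough sketch; the function name count_sync_failures is hypothetical, and it only matches the exact "Can't synchronise" message seen in the sample output.

```shell
# Hypothetical helper: read chronyd log lines on stdin and print how
# many of them report a failed synchronisation.
count_sync_failures() {
  # grep -c exits non-zero when the count is 0; '|| true' keeps the
  # helper usable under 'set -e'.
  grep -c "Can't synchronise" || true
}

# Example usage on an affected VM:
#   journalctl -u chrony.service --no-pager | count_sync_failures
```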

Check the status of the network-config service. In this example, the service is inactive (dead) because its process was terminated.

ubuntu@pod-name-cnat-cnat-core-protocol-data1:~$ systemctl status network-config
network-config.service - Job that configures for routes
Loaded: loaded (/etc/systemd/system/network-config.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Fri 2021-01-08 09:29:05 UTC; 1 years 0 months ago
Process: 1185 ExecStart=/etc/cisco/network-config.sh start (code=killed, signal=TERM)
Main PID: 1185 (code=killed, signal=TERM)

Workaround

Note: This procedure does not cause any downtime in the application.

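Edit the network-config service unit file (/etc/systemd/system/network-config.service) on the affected node and add these lines. Restart= and RestartSec= are [Service] options; on current systemd versions, StartLimitIntervalSec= and StartLimitBurst= are [Unit] options.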

root@pod-name-cnat-cnat-core-protocol-data1:/home/ubuntu# vim /etc/systemd/system/network-config.service
Restart=always
RestartSec=10s
StartLimitIntervalSec=300s
StartLimitBurst=30
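After you save the file, a quick check can confirm that all four settings are present. The sketch below is only an illustration: the function name missing_restart_settings is hypothetical, and it checks for the presence of each key at the start of a line, not for its value or section.

```shell
# Hypothetical helper: print the name of every expected setting that
# is NOT present in the given unit file (prints nothing if all exist).
missing_restart_settings() {
  unit_file=$1
  for key in Restart RestartSec StartLimitIntervalSec StartLimitBurst; do
    grep -q "^${key}=" "$unit_file" || echo "$key"
  done
}

# Example usage on the affected node:
#   missing_restart_settings /etc/systemd/system/network-config.service
```

Empty output means that every setting from the list above was found in the file.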

Reload the systemd configuration so that the edited unit file takes effect:

root@pod-name-cnat-cnat-core-protocol-data1:/home/ubuntu# systemctl daemon-reload

Start the network-config service:

root@pod-name-cnat-cnat-core-protocol-data1:/home/ubuntu# systemctl start network-config

Restart the chronyd service:

root@pod-name-cnat-cnat-core-protocol-data1:/home/ubuntu# systemctl restart chronyd.service

Verification

After you complete the Workaround steps, run this command from a master node to verify that the clocks on all protocol VMs are synchronized:

for server in $(awk '/protocol/{print $1}' /etc/hosts); do ssh $server "hostname && timedatectl status && chronyc sources -vn" ; done

Note: A 'System clock synchronized' value of 'yes' indicates that the issues are fixed.

Sample output:

ubuntu@pod-name-cnat-cnat-core-master1:~$ for server in $(awk '/protocol/{print $1}' /etc/hosts); do ssh $server "hostname && timedatectl status && chronyc sources -vn" ; done
pod-name-cnat-cnat-core-protocol-data1
Local time: Tue 2021-09-07 10:22:16 UTC
Universal time: Tue 2021-09-07 10:22:16 UTC
RTC time: Tue 2021-09-07 10:22:17
Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
systemd-timesyncd.service active: no
RTC in local TZ: no
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* svtsnq01.mgmt                 1   6   170   282  -1100us[-1858us] +/-   47ms
pod-name-cnat-cnat-core-protocol-data2
Local time: Tue 2021-09-07 10:22:16 UTC
Universal time: Tue 2021-09-07 10:22:16 UTC
RTC time: Tue 2021-09-07 10:22:17
Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
systemd-timesyncd.service active: no
RTC in local TZ: no
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* svtsnq01.mgmt                 1   6    74   206   -218us[-1550us] +/-   46ms
pod-name-cnat-cnat-core-protocol-ims1
Local time: Tue 2021-09-07 10:22:17 UTC
Universal time: Tue 2021-09-07 10:22:17 UTC
RTC time: Tue 2021-09-07 10:22:18
Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
systemd-timesyncd.service active: no
RTC in local TZ: no
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* svtsnq01.mgmt                 1   6    77    24   +796us[+1044us] +/-   47ms
pod-name-cnat-cnat-core-protocol-ims2
Local time: Tue 2021-09-07 10:22:17 UTC
Universal time: Tue 2021-09-07 10:22:17 UTC
RTC time: Tue 2021-09-07 10:22:18
Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
systemd-timesyncd.service active: no
RTC in local TZ: no
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* svtsnq01.mgmt                 1   6    17    98  +2176us[+2275us] +/-   47ms
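When many protocol VMs are involved, the full output is long, and only the 'System clock synchronized' field matters for a pass/fail check. The sketch below reduces the output to that field; the function name sync_state is hypothetical, and the loop in the usage comment reuses the host selection from the verification command above.

```shell
# Hypothetical helper: read `timedatectl status` output on stdin and
# print only the value of the "System clock synchronized" field.
sync_state() {
  awk -F': *' '/System clock synchronized/ { print $2 }'
}

# Example usage from a master node, one line per protocol VM:
#   for server in $(awk '/protocol/{print $1}' /etc/hosts); do
#     echo "$server: $(ssh "$server" timedatectl status | sync_state)"
#   done
```

Any host that reports a value other than yes still needs attention.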

Revision History

Revision	Publish Date	Comments
1.0	27-Jan-2022	Initial Release

Contributed by Cisco Engineers

Adithian Arathi
Cisco TAC Engineer
