Troubleshooting Information

Feature Summary and Revision History

Summary Data

Table 1. Summary Data

Applicable Product(s) or Functional Area: PCF

Applicable Platform(s): SMI

Feature Default Setting: Enabled – Configuration required to disable

Related Documentation: Not Applicable

Revision History

Table 2. Revision History

Revision Details: First introduced.

Release: 2020.01.0

Debugging the PCF Deployment Issues

This section describes how to debug the issues that may occur when you deploy PCF through the SMI Deployer.

To debug the deployment issues, use the following checklist. If the checklist does not assist you in resolving the issue, analyze the diagnostic data that is available in the form of logs.

Table 3. Troubleshooting Checklist

Task

Resolution

Verify if the Ops Center is refreshing with the latest configurations.

Manually verify if the configurations are refreshed.

If the Ops Center is not refreshing or displaying the recent changes, then reinstall the helm charts.

For information on reinstalling the charts, see Issue with Refreshing the PCF Ops Center.

Validate if the external IPs and ports are accessible.

Use Telnet or any other application protocol to access the external IP address and confirm that it is reachable.

If you are unsure of the IP address, run the following command on the Kubernetes cluster to view the configured external IP addresses and port numbers (an illustrative example follows this checklist):

kubectl get services -n namespace 

Ensure that the IP addresses and ports that are configured for PCF are open in the firewall.

Use the following command to open the ports:

firewall-cmd --zone=public --add-port=port/tcp --permanent 

Confirm if PCF connects with the other NFs.

Use the following command on the master node to verify that a healthy connection is available between the NFs:

nc -v ip_address port 

Alternatively, run the same nc -v check, or use Telnet, from the proto VM.

Validate that the successfully deployed helm chart is listed in the helm list.

Use the following steps to determine which helm chart is not listed in the helm list:

  1. Run the following on the master node to view the list of deployed helm charts:

    helm list 
  2. If the helm chart is not found, run the following command in the Ops Center operational mode to view the charts irrespective of their deployment status:

    show helm charts 
  3. Review the pcf-ops-center logs to identify the helm chart which has the issue. Depending on the issue, take the appropriate action.

    Alternatively, you can review the consolidated set of logs, using the following command:

    kubectl logs -n namespace consolidated-logging-0 

    For information about the event logs, see Event Logs.
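
For example, assuming PCF is installed in a namespace named pcf and its REST endpoint is exposed on the external IP address 192.0.2.10 and port 8080 (all three values are illustrative), the external IP and port checks in this checklist might look like the following:

  # List the services and note the EXTERNAL-IP and PORT(S) columns
  kubectl get services -n pcf
  # Confirm that the external IP address and port are reachable
  nc -vz 192.0.2.10 8080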

Issue with Refreshing the PCF Ops Center

This section describes how to refresh the PCF Ops Center to display the latest configurations.

Problem

The PCF Ops Center does not reflect the recent configuration changes, because of which you may observe stale data or not receive the expected response.

Resolution

You can refresh the PCF Ops Center using the basic and advanced steps. Perform the advanced steps only when the basic steps do not resolve the issue.

Basic

  1. Run the following to undeploy PCF from the Ops Center:

    system mode shutdown 
  2. Use the following to manually purge any pending deployments from the helm:

    helm delete --purge helm_chart_name 
  3. From the master node, run the following to delete the configMaps from the namespace where PCF is installed:

    kubectl delete cm config_map_name -n namespace 
  4. Run the following to delete the product-specific configMaps from the CNEE namespace.

    1. Use the following to list the available configMaps:

      kubectl get configmaps -n namespace 

      From the list, determine the configMap that you want to delete.

    2. Run the following to delete the configMap:

      kubectl delete configmap configmap_name -n namespace 
  5. Use the following command to reinstall the helm chart. After the chart is installed, a new instance of the PCF Ops Center is available. An illustrative run of these steps with sample values follows this procedure.

    helm upgrade --install release_name repository_name/chart_name -f file_names --namespace namespace 
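
The following is an illustrative run of steps 2 through 5, assuming a namespace named pcf, a release named pcf-ops-center, a repository alias named pcf-repo, and configMap names that are equally hypothetical; substitute the values from your deployment:

  # Step 2: purge the pending deployment (hypothetical release name)
  helm delete --purge pcf-ops-center
  # Steps 3 and 4: list the configMaps and delete the PCF-specific ones (hypothetical names)
  kubectl get configmaps -n pcf
  kubectl delete configmap pcf-ops-center-config -n pcf
  # Step 5: reinstall the chart (hypothetical repository alias and values file)
  helm upgrade --install pcf-ops-center pcf-repo/pcf-ops-center -f pcf-values.yaml --namespace pcf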

Advanced

  1. Remove the cnee-ops-center.

  2. Delete the configMaps from the namespace.

    For more information on steps 1 and 2, see the Basic steps.

  3. Install the PCF Ops Center. For information on how to deploy the PCF Ops Center, see Deploying and Accessing PCF.

If the recent configuration is still not rendered, the pods that are responsible for processing the refresh request might not be in a healthy state. To investigate the issue at the pod level, review the pod's state.

Use the following command to view the pod's details and state:

kubectl describe pod pod_name -n namespace 

Alternatively, you can review the consolidated set of logs, using the following command:

kubectl logs -n namespace consolidated-logging-0 

For information about the event logs, see Event Logs.

In the output of the kubectl get pods command, the values in the Status and Ready columns indicate the following (see the illustrative example after this list):

  • If the Status column displays the state as Running, and the Ready column has the same number of containers on both sides of the forward-slash (/), then the pod is healthy and operational. This implies that the issue is at the application level. To investigate the application issue, check the logs of all the containers residing within the pods to detect the issue. Or, log into the container and review the logs.

  • If the Status column displays the state as Pending, Waiting, or CrashLoopBackOff, then run the following to review the details such as the messages, reasons, and other relevant information:

    kubectl describe pod pod_name -n namespace 
  • If the Status is init or ContainerCreating, it signifies that the pod is in the process of starting up.

  • If the Status is Running, and in the Ready column the number of containers on both sides of forward-slash (/) are different, then the containers have issues.

    Run the following to view the details:

    kubectl describe pod pod_name -n namespace 

    When reviewing the details, if the Ready column has the value as false then it indicates that the corresponding container has issues. Review the associated logs to understand the issue.

  • If the Status and Ready columns, and logs of the container do not indicate any issue, then verify that the required ingress or the service that is required to reach the application is up and running.
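
As an illustration, the following output uses a namespace named pcf and hypothetical pod names; the first pod is healthy (2/2, Running), while the second pod has a container that is not ready and is therefore the one to describe further:

  kubectl get pods -n pcf
  NAME            READY   STATUS             RESTARTS   AGE
  pcf-rest-ep-0   2/2     Running            0          5d
  pcf-engine-0    1/2     CrashLoopBackOff   7          5d

  kubectl describe pod pcf-engine-0 -n pcf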

Subscriber Not Found or Primary Key Not Found

This section describes how to resolve the issues that report the Subscriber Not Found or Primary Key Not Found messages.

Problem

When the NFs cannot find the subscriber details, they send the Subscriber Not Found or Primary Key Not Found message to PCF.

Resolution

  1. Analyze the logs of the PCF Engine and REST endpoint pods for subscriber or primary key related issues.

    On the master node, run the following command to review the logs of the engine and rest-ep pods (a sketch with sample values follows these steps):

    kubectl logs -n namespace pod_name 
  2. Navigate to the pods and review the subscriber availability status and the subscriber count in the database. Based on the subscriber's status, take the appropriate action to resolve the issue.

    cdl show session count/summary 
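
A minimal sketch of step 1, assuming a namespace named pcf and hypothetical pod names, is the following; the grep patterns simply narrow the output to the relevant pods and messages:

  # Identify the engine and rest-ep pods (names are hypothetical)
  kubectl get pods -n pcf | grep -E 'pcf-engine|pcf-rest-ep'
  # Search their logs for the reported messages
  kubectl logs -n pcf pcf-engine-0 | grep -i "subscriber not found"
  kubectl logs -n pcf pcf-rest-ep-0 | grep -i "primary key not found"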

Message Routing Issues

This section describes how to troubleshoot the message routing issues.

Problem

You may observe a message routing failure when the PCF REST endpoint incorrectly routes a message, for example from Canary, to the PCF Engine. The issue occurs when the message is sent to an incorrect PCF group.

Resolution

The following conditions might be causing the message routing failure. Check for these conditions and correct them, if necessary.

  • From the PCF Ops Center, manually verify that the routing rules are configured correctly and they match the incoming traffic.

  • Ensure that the Istio proxy is injected in the pcf-rest-ep pod (see the sketch after this list).

  • Verify that the virtual services are generated using the istioctl command. For more information on the traffic routing logs, see Collecting the Troubleshooting Information.

  • Enable the DEBUG level for com.cisco.pcf.endpoint.routing and review the pcf-rest-ep logs for any issues. Use the following command to enable the DEBUG level:

    debug logging logger com.cisco.pcf.endpoint.routing level debug 
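
For example, to confirm that the Istio sidecar is injected and to dump the generated routing resources (the namespace pcf and the pod name are illustrative; the istioctl syntax follows the commands listed later in this chapter), you might run:

  # The container list should include istio-proxy if the sidecar is injected
  kubectl get pod pcf-rest-ep-0 -n pcf -o jsonpath='{.spec.containers[*].name}'
  # Dump the virtual services and destination rules that drive the routing
  istioctl get virtualservice -n pcf -o yaml
  istioctl get destinationrules -n pcf -o yaml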

Collecting the Troubleshooting Information

If you encounter issues in your PCF environment, gather and analyze the information associated with the failed action or process. Having this information helps you identify the component that experienced the failure and resolve the issue faster.

The following table covers the components that might experience an issue and the logs that contain the corresponding information.

Table 4. Issues

Issue

Logs

Deployment errors

Review the following logs to determine the issue. These logs assist you in identifying the component that may be the source of the error.

Use the following commands on the master node:

  • View the available pods and review the pod status:

    kubectl get pods -n namespace 

    Depending on the pod's state, perform the appropriate remediation actions. To understand the pod's states, see States.

  • View the configured helm charts and their status:

    helm list 
  • View the helm chart details for the REST endpoint:

    helm get namespace-pcf-rest-ep 

Communication issues between the NFs

  1. On the master node, run the following command to review the logs of the pod that is responsible for the communication:

    kubectl logs -n namespace pod_name 
  2. Use the tcpdump utility to trace the packets (a hedged sketch follows this table).

Registration and deregistration issues

Use the following command to review the PCF REST endpoint logs:

helm get namespace-pcf-rest-ep 

Ops Center issues

Review the logs of the pod that hosts the Ops Center to determine the issue.

kubectl logs -n namespace pod_name 

To resolve the issue, if you require the configuration information, run one of the following commands:

show full-configuration 

Or,

show running-config 

Traffic routing issues

To view the traffic routing-specific logs, use the following configuration:

kubectl get pod pcf-rest-ep_pod_name -o yaml -n namespace 
istioctl get virtualservice -n namespace -o yaml 
istioctl get destinationrules -n namespace -o yaml 

Also, review the logs of the following pods:

  • pcf-rest-ep instance

  • pcf-engine instance

  • Datastore or Session DB

Subscriber issues

Review the logs associated to the PCF Engine and REST endpoint to determine the issue.

For additional information about the subscriber availability status and the subscriber count in the database, run the following command:

cdl show session count/summary 
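
As a sketch of the tcpdump step for communication issues between the NFs, assuming the capture runs on the node that hosts the rest-ep pod and that the traffic of interest uses port 8080 (both assumptions), you might capture and then read back the packets as follows:

  # Capture the traffic on the assumed port into a file
  tcpdump -i any -n port 8080 -w /tmp/nf-capture.pcap
  # Read the captured packets back for inspection
  tcpdump -n -r /tmp/nf-capture.pcap | head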

Alerts

Alerts are notification messages that are generated when incidents requiring your attention or response occur. Review the historical and active alerts to determine the issue.

Alerts for PCF are generated through the CEE utility. To view these alerts, run the following commands in the CEE Ops Center:

For active alerts:

show alerts active 

For historical alerts:

show alerts history 

Note

You must have appropriate permission to view the alert details.


For information on application-based alerts, see PCF Application-Based Alerts.

Interface Error Codes

This section describes the codes that PCF reports for the interface errors.

Interface codes are generated as part of the logs or captured in the statistics.

The following tables describe the errors and the corresponding codes:

Table 5. N7 Error Codes

USER_UNKNOWN (400 Bad Request): The HTTP request is rejected because the end user who is specified in the request is unknown to the PCF.

ERROR_INITIAL_PARAMETERS (400 Bad Request): The HTTP request is rejected. This error is reported when the set of session or subscriber information that PCF requires for rule selection is incomplete, erroneous, or unavailable for decision making. For example, QoS, RAT type, and subscriber information.

ERROR_TRIGGER_EVENT (400 Bad Request): The HTTP request is rejected because the set of session information sent in a message that originated due to a trigger is incoherent with the previous set of session information for the same session. For example, the trigger met was RAT changed, and the RAT notified is the same as before.

TRAFFIC_MAPPING_INFO_REJECTED (403 Forbidden): The HTTP request is rejected because the PCF doesn’t accept one or more of the traffic mapping filters provided by the SMF in a PCC request.

ERROR_CONFLICTING_REQUEST (403 Forbidden): The HTTP request is rejected because the PCF can’t accept the UE-initiated resource request, as a network-initiated resource allocation is already in progress. This resource allocation has packet filters that cover the packet filters in the received UE-initiated resource request. The SMF rejects the attempt for a UE-initiated resource request.

POLICY_CONTEXT_DENIED (403 Forbidden): The HTTP request is rejected because the PCF doesn’t accept the SMF request due to operator policies and local configuration.

Table 6. N28 Error Codes

USER_UNKNOWN (400 Bad Request): The subscriber that is specified in the request isn’t known at the CHF, and the subscription can’t be created.

NO_AVAILABLE_POLICY_COUNTERS (400 Bad Request): There are no policy counters available for the subscriber at the CHF.


Note

The generic error codes are applicable for all the network interfaces.


Table 7. Generic Error Codes

TIMEOUT (408 Request Timeout): The HTTP request to the server took longer than the period that the server is configured to wait.

OVERLOAD (429 Too Many Requests): The server has received too many consecutive requests to process within a short interval.

INTERNAL_ERROR (500 Internal Server Error): The server has encountered an unexpected condition that does not have an appropriate message.

SERVICE_UNAVAILABLE (503 Service Unavailable): The server cannot process the request because it is either overloaded or unavailable due to scheduled maintenance. This is a transient state.

Forwarding the Logs to the Splunk Server

This section describes how to enable PCF to forward the logs to the Splunk server.

Splunk is a third-party monitoring application that stores the log files and provides index-based search capability. You can configure PCF to send the logs securely to a Splunk server which could be an external server.


Important

The Splunk server is a third-party component. Cisco does not take the responsibility of installing, configuring, or maintaining this server.


Use the following configuration to forward the logs to the Splunk server.

config 
  debug splunk 
    batch-count no_events_batch 
    batch-interval-ms batch_interval_ms 
    batch-size-bytes batch_size 
    hec-token hec_token 
    hec-url hec_url 
    end 

The following is an example configuration:

configure
 debug splunk hec-url https://splunk.10.86.73.80.nip.io:8088
 debug splunk hec-token 68a81ab4-eae9-4361-92ea-b948f31d26ef 
 debug splunk batch-interval-ms 100
 debug splunk batch-count 10
 debug splunk batch-size-bytes 102400
 end 

NOTES:

  • debug splunk —Enters the configuration debug mode.

  • batch-count no_events_batch —Specify the maximum number of events to be sent in each batch.

  • batch-interval-ms batch_interval_ms —Specify the interval in milliseconds at which a batch event is sent.

  • batch-size-bytes batch_size —Specify the maximum size in bytes of each batch of events.

  • hec-token hec_token —Specify the HTTP Event Collector (HEC) token for the Splunk server.

  • hec-url hec_url —Specify the protocol, hostname, and HTTP Event Collector port of the Splunk server. The default port is 8088.

Pods stop running when PCF is upgraded through the Rolling Upgrade process

This section describes how to ensure that the pods are running when PCF is upgraded.

Problem

When the PCF version is upgraded to the subsequent available version, some pods such as CRD and Policy Engine stop running.

Resolution

Whenever you configure PCF, ensure that you configure the following parameters. An illustrative configuration with sample values follows this list:

  • db global-settings db-replica replica_count

  • db spr shard-count shard_count

  • rest-endpoint ips ip_address1, ip_address2, ip_address3

  • rest-endpoint port port_number

  • engine engine_name

    replicas replica_count

    unified-api-replicas api_replica_count

    subversion-run-url repository_url

    subversion-config-url configuration_url

    tracing-service-name service_name

  • service-registration profile locality profile_name

  • service-registration profile plmn-list [ mcc mnc ]

  • service-registration profile snssais [ sst sd ]
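
The following is a minimal sketch of such a configuration in the Ops Center CLI, using the config/end structure shown in the Splunk example above and purely illustrative values (the replica count, shard count, IP addresses, and port are hypothetical); the engine and service-registration parameters listed above must also be set according to your deployment:

  config
    db global-settings db-replica 2
    db spr shard-count 2
    rest-endpoint ips 192.0.2.11, 192.0.2.12, 192.0.2.13
    rest-endpoint port 8080
    end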