Introduction
This document describes how to troubleshoot the consolidated-engine.log generation issue in Cisco Policy Suite (CPS).
Prerequisites
Requirements
Cisco recommends that you have knowledge of Cisco Policy Suite (CPS) and privileged root access to the CPS CLI.
Components Used
The information in this document is based on these software and hardware versions:
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, ensure that you understand the potential impact of any command.
Background Information
In CPS, policy engine logs are collected from all Quantum Network Suite (QNS) Virtual Machines (VMs) and segregated on the pcrfclient VM.
The Logback framework is used to collect the policy engine-related logs, which are saved and segregated on the active pcrfclient VM.
Logback is a logging framework for Java applications, created as a successor to the popular log4j project.
Here is the relevant configuration from the /etc/broadhop/logback.xml file for the generation and collection of engine logs.
1. Policy engine logs are sent to the SOCKET appender.
<logger name="policy.engine" level="info" additivity="false">
<appender-ref ref="SOCKET" />
</logger>
2. The SOCKET appender references the SOCKET-BASE appender.
<appender name="SOCKET" class="com.broadhop.logging.appenders.AsynchAppender">
<appender-ref ref="SOCKET-BASE"/>
3. The SOCKET-BASE appender configuration sends the logs to a remote host and port.
<appender name="SOCKET-BASE" class="com.broadhop.logging.net.SocketAppender">
<RemoteHost>${logging.controlcenter.host:-lbvip02}</RemoteHost>
<Port>${logging.controlcenter.port:-5644}</Port>
<ReconnectionDelay>10000</ReconnectionDelay>
<IncludeCallerData>false</IncludeCallerData>
</appender>
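The ${property:-default} expressions in the SOCKET-BASE appender are standard Logback variable substitution: if the named JVM system property is not defined, the value after ":-" is used. By default, therefore, the engine logs are sent to lbvip02 on port 5644, and a lost connection is retried every 10000 ms. As an illustrative sketch (the property names come from the configuration above; how CPS supplies them to the JVM can vary by release), the defaults could be overridden with system properties at JVM startup:
java -Dlogging.controlcenter.host=lbvip02 -Dlogging.controlcenter.port=5644 ...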
Problem
If there is any kind of network flap or TCP-related error within the CPS environment, the pcrfclient VM stops receiving the SOCKET appender logs from the individual VMs.
Port 5644, configured under the SOCKET-BASE appender, shows connections stuck in the TIME_WAIT state.
[root@dc1-pcrfclient01 ~]# netstat -plan|grep 5644
tcp6 0 0 192.168.10.135:5644 192.168.10.137:47876 TIME_WAIT -
tcp6 0 0 192.168.10.135:5644 192.168.10.137:57042 TIME_WAIT -
tcp6 0 0 192.168.10.135:5644 192.168.10.137:60888 TIME_WAIT -
tcp6 0 0 192.168.10.135:5644 192.168.10.137:60570 TIME_WAIT -
tcp6 0 0 192.168.10.135:5644 192.168.10.137:32902 TIME_WAIT -
tcp6 0 0 192.168.10.135:5644 192.168.10.137:57052 TIME_WAIT -
tcp6 0 0 192.168.10.135:5644 192.168.10.137:47640 TIME_WAIT -
tcp6 0 0 192.168.10.135:5644 192.168.10.137:36484 TIME_WAIT -
tcp6 0 0 192.168.10.135:5644 192.168.10.137:57040 TIME_WAIT -
tcp6 0 0 192.168.10.135:5644 192.168.10.137:55788 TIME_WAIT -
[root@dc1-pcrfclient01 ~]#
If you check the same status after a few minutes, there are no entries related to port 5644.
[root@dc1-pcrfclient01 ~]# netstat -plan|grep 5644
[root@dc1-pcrfclient01 ~]#
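To watch the connection states in real time while you troubleshoot, you can poll the port. This is a generic Linux check rather than a CPS-specific command; the watch utility is assumed to be available on the pcrfclient VM.
[root@dc1-pcrfclient01 ~]# watch -n 2 'netstat -plan | grep 5644'
In a healthy setup you can expect a LISTEN entry for the local receiver process and ESTABLISHED connections from the QNS VMs; TIME_WAIT entries that drain away with nothing replacing them indicate that the SOCKET appender connections do not recover.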
Solution
To restore the SOCKET connection, restart the qns-1 process on the active pcrfclient VM.
[root@dc1-pcrfclient01 ~]# monit stop qns-1
[root@dc1-pcrfclient01 ~]# monit status qns-1
Monit 5.26.0 uptime: 4d 22h 43m
Process 'qns-1'
status Not monitored
monitoring status Not monitored
monitoring mode active
on reboot start
data collected Tue, 04 Jan 2022 11:52:38
[root@dc1-pcrfclient01 ~]# monit start qns-1
[root@dc1-pcrfclient01 ~]# monit status qns-1
Monit 5.26.0 uptime: 4d 22h 42m
Process 'qns-1'
status OK
monitoring status Monitored
monitoring mode active
on reboot start
pid 25368
parent pid 1
uid 0
effective uid 0
gid 0
uptime 0m
threads 31
children 0
cpu 0.0%
cpu total 0.0%
memory 1.2% [197.4 MB]
memory total 1.2% [197.4 MB]
security attribute -
disk read 0 B/s [112 kB total]
disk write 0 B/s [60.2 MB total]
port response time -
data collected Tue, 04 Jan 2022 11:51:04
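After the restart, verify that the receiver listens on port 5644 again and that the QNS VMs re-establish their connections, then confirm that the consolidated engine log grows again. The log path shown here is the usual location on the pcrfclient VM and is stated as an assumption; adjust it to match your deployment.
[root@dc1-pcrfclient01 ~]# netstat -plan | grep 5644
[root@dc1-pcrfclient01 ~]# tail -f /var/log/broadhop/consolidated-engine.log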