Configure the Cluster Manager CEE to Prevent a Node-Exporter Disk Full Condition

Updated: May 4, 2022

Document ID: 217862

Introduction

This document describes a node-exporter disk full problem observed in a user's network and how to configure the Cluster Manager CEE to prevent it.

Background

When an audit of the Cluster Manager Common Execution Environment (CEE) is performed, the audit result indicates the node-exporter disk is full.

Problem

A critical severity alert was raised on the CEE because a disk full condition is projected to occur within the next 24 hours:

"Device /dev/sda3 of node-exporter cee03/node-exporter-4dd4a4dd4a is projected to be full within the next 24 hours"

Analysis

The alert is reported by the CEE, which tracks hardware issues for the rack and projects that the disk will be full within the next 24 hours.

cisco@deployer-cm-primary:~$ kubectl get pods -A -o wide | grep node
cee03 node-exporter-4dd4a4dd4a 1/1 Running 1 111d 10.10.1.1 deployer-cm-primary <none> <none>

root@deployer-cm-primary:/# df -h
Filesystem Size Used Avail Use% Mounted on
overlay 568G 171G 368G 32% /
tmpfs 64M 0 64M 0% /dev
tmpfs 189G 0 189G 0% /sys/fs/cgroup
tmpfs 189G 0 189G 0% /host/sys/fs/cgroup
/dev/sda1 9.8G 3.5G 5.9G 37% /host/root
udev 189G 0 189G 0% /host/root/dev
tmpfs 189G 0 189G 0% /host/root/dev/shm
tmpfs 38G 15M 38G 1% /host/root/run
tmpfs 5.0M 0 5.0M 0% /host/root/run/lock
/dev/sda3 71G 67G 435M 100% /host/root/var/log
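
The same check can be scripted instead of read off the df table by eye. The sketch below is a minimal example, not part of the original procedure: the function names and the default threshold of 90 percent are hypothetical, and the path to check depends on where the command is run (for example /host/root/var/log in the listing above).

```shell
# Print the Use% value (digits only) of the filesystem that holds the given path.
# Uses POSIX-format `df -P` output, where column 5 is the capacity.
fs_use_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit 0) when usage is at or above the threshold.
# The default of 90 is a hypothetical value chosen for illustration.
fs_over_threshold() {
    usage=$(fs_use_percent "$1")
    [ "$usage" -ge "${2:-90}" ]
}
```

For example, `fs_over_threshold /host/root/var/log 90 && echo "log filesystem is nearly full"` would print the message for the 100% usage shown above.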

Audit activity appears to fill the /dev/sda3 disk, which is mounted at /host/root/var/log. A per-directory breakdown shows where the space is used.

root@deployer-cm-primary:/host/root/var/log# du -h --max-depth=1
76M ./sysstat
16K ./lost+found
4.0K ./containers
4.0K ./landscape
9.3M ./calico
1.1G ./apiserver
808K ./pods
5.6G ./journal
60G ./audit
36K ./apt
67G .
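
To make the largest consumer stand out, the du output can be sorted by size. This is a small sketch under the assumption that GNU du is available (the --max-depth option used above is a GNU extension); the function name largest_dirs is hypothetical.

```shell
# List a directory and its immediate subdirectories, largest first.
# Sizes are printed in KiB (du -k) so that a plain numeric sort works.
# The second argument limits the number of lines shown (default 5).
largest_dirs() {
    du -k --max-depth=1 "$1" 2>/dev/null | sort -rn | head -n "${2:-5}"
}
```

Run against the log directory, for example `largest_dirs /host/root/var/log 3`, the first line is the directory total and the following lines are the biggest subdirectories, which in the listing above would be ./audit.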

A check of the audit daemon configuration shows that max_log_file_action is set to keep_logs, so audit logs are retained rather than overwritten. As a result, the node-exporter disk full condition is likely to occur.

cisco@deployer-cm-primary:~$ sudo cat /etc/audit/auditd.conf
#
# This file controls the configuration of the audit daemon
#

local_events = yes
write_logs = yes
log_file = /var/log/audit/audit.log
log_group = adm
log_format = RAW
flush = INCREMENTAL_ASYNC
freq = 50
max_log_file = 8
num_logs = 5
priority_boost = 4
disp_qos = lossy
dispatcher = /sbin/audispd
name_format = NONE
##name = mydomain
max_log_file_action = keep_logs
space_left = 75
space_left_action = email
verify_email = yes
action_mail_acct = root
admin_space_left = 50
admin_space_left_action = halt
disk_full_action = SUSPEND
disk_error_action = SUSPEND
use_libwrap = yes
##tcp_listen_port = 60
tcp_listen_queue = 5
tcp_max_per_addr = 1
##tcp_client_ports = 1024-65535
tcp_client_max_idle = 0
enable_krb5 = no
krb5_principal = auditd
##krb5_key_file = /etc/audit/audit.key
distribute_network = no
cisco@deployer-cm-primary:~$
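
The settings that matter here are max_log_file (the size of one log file, in megabytes), num_logs (the number of files kept when the action is rotate), and max_log_file_action. With keep_logs, num_logs is not applied, so the log set can grow without bound. The sketch below reads those settings from a file in this format and computes the approximate cap that rotate would impose. The function names are hypothetical and the parser assumes simple key = value lines like those shown above.

```shell
# Print the value of one "key = value" setting from an auditd.conf-style file.
# Comment lines (starting with #) are ignored; the key must match exactly.
audit_setting() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }
        {
            k = $1; v = $2
            gsub(/[ \t]/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) { print v; exit }
        }' "$1"
}

# Approximate upper bound, in MB, on the rotated log set:
# max_log_file * num_logs. Only meaningful when the action is rotate.
audit_log_cap_mb() {
    size=$(audit_setting "$1" max_log_file)
    count=$(audit_setting "$1" num_logs)
    echo $(( size * count ))
}
```

With the values in the listing above (max_log_file = 8, num_logs = 5), `audit_log_cap_mb /etc/audit/auditd.conf` prints 40, while `audit_setting /etc/audit/auditd.conf max_log_file_action` prints keep_logs.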

Solution

Perform the steps listed next on both the deployer-cm-primary and the deployer-cm-secondary to remediate the potential node-exporter disk full condition. First, open the audit daemon configuration file in an editor.

sudo vim /etc/audit/auditd.conf

Then, change the max_log_file_action setting in the file from keep_logs to rotate.

max_log_file_action = rotate
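
The same change can be made without an interactive editor, which may help when it has to be repeated on both nodes. The sketch below is one possible approach rather than the documented procedure: the function name set_rotate is hypothetical, and it assumes a sed that accepts -i with a backup suffix and -E (GNU sed does).

```shell
# Set max_log_file_action to rotate in the given file.
# The original file is kept alongside it with a .bak suffix.
# Only the max_log_file_action line is rewritten; other lines are untouched.
set_rotate() {
    sed -i.bak -E \
        's/^([[:space:]]*max_log_file_action[[:space:]]*=[[:space:]]*).*/\1rotate/' \
        "$1"
}
```

On the real configuration file the sed command would need to run with sudo against /etc/audit/auditd.conf. The backup copy, auditd.conf.bak, can be used to restore the previous setting.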

After the setting is changed, restart the auditd service.

sudo systemctl restart auditd.service

Verify that the critical alert has cleared.
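
In addition to the alert, the audit log directory itself can be watched to confirm that it has stopped growing. The sketch below is an illustrative helper, not part of the original procedure: the function name is hypothetical, and the default directory is taken from the log_file setting shown earlier.

```shell
# Print "<number of audit.log* files> <total KiB used by the directory>".
# The directory defaults to /var/log/audit, as in the log_file setting.
audit_log_summary() {
    dir=${1:-/var/log/audit}
    count=$(find "$dir" -maxdepth 1 -type f -name 'audit.log*' | wc -l)
    size=$(du -sk "$dir" | awk '{ print $1 }')
    echo "$((count)) $size"
}
```

Running `sudo` with this check at intervals and comparing the two numbers gives a simple view of whether the directory is still growing after the change.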

Revision History

Revision	Publish Date	Comments
1.0	10-May-2022	Initial Release

Contributed by Cisco Engineers

Nebojsa Kosanovic
Customer Delivery Engineer
James Dunk
Customer Delivery
