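Introduction
This document describes how to troubleshoot path-provisioner memory alerts seen in the Policy Control Function (PCF).
Prerequisites
Requirements
Cisco recommends that you have knowledge of these topics:
- PCF
- 5G Cloud Native Deployment Platform (CNDP)
- Docker and Kubernetes
Components Used
The information in this document is based on these software and hardware versions: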
- PCF REL_2023.01.2
- Kubernetes v1.24.6
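The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, ensure that you understand the potential impact of any command.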
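Background Information
In this setup, the CNDP hosts the PCF.
In the context of computer systems and infrastructure, a path provisioner is typically a component or tool that manages and provisions storage paths or volumes for applications or services.
A path provisioner is usually associated with dynamic storage allocation and management in cloud environments or containerized setups. It lets applications or containers request storage volumes or paths on demand, without manual intervention or pre-allocation.
A path provisioner can handle tasks such as creating or mounting storage volumes, managing access permissions, and mapping volumes to specific application instances. It abstracts the underlying storage infrastructure and gives applications a simplified interface to storage resources.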
Problem
Log in to the Common Execution Environment (CEE) Ops Center and check whether the path-provisioner pods report out-of-memory (OOM) alerts.
Command:
cee# show alerts active summary
Example:
[pcf01/pcfapp] cee# show alerts active summary
NAME UID SEVERITY STARTS AT DURATION SOURCE SUMMARY
--------------------------------------------------------------------------------------------------------------------------------------------
container-memory-usag 10659b0bcae0 critical 01-22T22:59:46 path-provisioner-pxps Pod cee-pcf/path-provisioner-pxpss/k8s_path-p...
container-memory-usag b2f10b3725e7 critical 01-22T15:51:36 path-provisioner-pxps Pod cee-pcf/path-provisioner-pxpss/ uses high...
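When the active alert list is long, it can help to reduce it to the path-provisioner rows. The sketch below assumes the summary output has been saved to a text file; the `pp_alerts` function and the `alerts.txt` file name are hypothetical and are not part of the CEE CLI.

```shell
# Hypothetical helper (not part of the CEE CLI): reads saved
# "show alerts active summary" output on stdin and prints the UID,
# severity, and start time of every row that mentions path-provisioner.
pp_alerts() {
  awk '/path-provisioner/ { print $2, $3, $4 }'
}

# Usage (alerts.txt is an assumed file holding the saved summary):
#   pp_alerts < alerts.txt
```

Note that the summary truncates long names, so the pod name shown in the SOURCE column may be shorter than the real pod name.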
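Analysis
When you receive an alert for high memory usage on a path-provisioner pod or container, no immediate action is strictly required: Kubernetes (K8s) restarts the pod once it reaches its maximum memory limit.
Alternatively, you can restart the pod manually once it exceeds the 80% threshold, in order to clear the high-memory alert.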
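The threshold check can be sketched as simple integer arithmetic. This is a minimal illustration that assumes you have already obtained the container memory usage and its memory limit in MiB (for example from `kubectl top pod`, which requires the metrics API, and from the pod spec). The function names and the sample numbers in the usage note are hypothetical; the source does not state the actual limit.

```shell
# mem_pct USED_MI LIMIT_MI
# Prints memory usage as an integer percentage of the limit (rounded down).
mem_pct() {
  echo $(( $1 * 100 / $2 ))
}

# over_threshold USED_MI LIMIT_MI [THRESHOLD_PCT]
# Succeeds (exit 0) when usage is strictly above the threshold.
# THRESHOLD_PCT defaults to 80, the value named in this document.
over_threshold() {
  [ $(( $1 * 100 )) -gt $(( $2 * ${3:-80} )) ]
}

# Possible ways to obtain the inputs (values then need converting to MiB):
#   kubectl top pod -n cee-pcf path-provisioner-pxpss
#   kubectl get pod -n cee-pcf path-provisioner-pxpss \
#     -o jsonpath='{.spec.containers[*].resources.limits.memory}'
```

For example, with an illustrative limit of 512 MiB, `over_threshold 410 512` succeeds while `over_threshold 400 512` does not.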
Step 1: Check the pod name reported in the active alerts summary and verify it against the output of this command.
Command:
cloud-user@pcf01-master-1:~$ kubectl get pods --all-namespaces | grep "path-provisioner"
Example:
cloud-user@pcf01-master-1:~$ kubectl get pods --all-namespaces | grep "path-provisioner"
NAMESPACE NAME READY STATUS RESTARTS AGE
cee-pcf path-provisioner-27bjx 1/1 Running 0 110d
cee-pcf path-provisioner-4mlq8 1/1 Running 0 110d
cee-pcf path-provisioner-4zvjd 1/1 Running 0 110d
cee-pcf path-provisioner-566pn 1/1 Running 0 110d
cee-pcf path-provisioner-6d2dr 1/1 Running 0 110d
cee-pcf path-provisioner-7g6l4 1/1 Running 0 110d
cee-pcf path-provisioner-8psnx 1/1 Running 0 110d
cee-pcf path-provisioner-94p9f 1/1 Running 0 110d
cee-pcf path-provisioner-bfr5w 1/1 Running 0 110d
cee-pcf path-provisioner-clpq6 1/1 Running 0 110d
cee-pcf path-provisioner-dbjft 1/1 Running 0 110d
cee-pcf path-provisioner-dx9ts 1/1 Running 0 110d
cee-pcf path-provisioner-fx72h 1/1 Running 0 110d
cee-pcf path-provisioner-hbxgd 1/1 Running 0 110d
cee-pcf path-provisioner-k6fzc 1/1 Running 0 110d
cee-pcf path-provisioner-l4mzz 1/1 Running 0 110d
cee-pcf path-provisioner-ldxbb 1/1 Running 0 110d
cee-pcf path-provisioner-lf2xx 1/1 Running 0 110d
cee-pcf path-provisioner-lxrjx 1/1 Running 0 110d
cee-pcf path-provisioner-mjhlw 1/1 Running 0 110d
cee-pcf path-provisioner-pq65p 1/1 Running 0 110d
cee-pcf path-provisioner-pxpss 1/1 Running 0 110d
cee-pcf path-provisioner-q4b7m 1/1 Running 0 110d
cee-pcf path-provisioner-qlkjb 1/1 Running 0 110d
cee-pcf path-provisioner-s2jth 1/1 Running 0 110d
cee-pcf path-provisioner-vhzhg 1/1 Running 0 110d
cee-pcf path-provisioner-wqpmr 1/1 Running 0 110d
cee-pcf path-provisioner-xj5k4 1/1 Running 0 110d
cee-pcf path-provisioner-z4h98 1/1 Running 0 110d
cloud-user@pcf01-master-1:~$
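The SOURCE column of the alert summary shows a truncated name (`path-provisioner-pxps`), so the full pod name has to be resolved from the pod list. A minimal sketch, assuming the `kubectl get pods --all-namespaces` output is piped in; `match_pods` is a hypothetical helper name.

```shell
# match_pods PREFIX
# Reads "kubectl get pods --all-namespaces" output on stdin and prints the
# namespace and name of every pod whose name starts with PREFIX.
match_pods() {
  awk -v p="$1" 'index($2, p) == 1 { print $1, $2 }'
}

# Usage:
#   kubectl get pods --all-namespaces | match_pods path-provisioner-pxps
```

If more than one pod matches the truncated prefix, compare against the full pod path shown in the SUMMARY column of the alert before you act on any of them.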
Step 2: Verify the total count of active path-provisioner pods.
cloud-user@pcf01-master-1:~$ kubectl get pods --all-namespaces | grep "path-provisioner" | wc -l
29
cloud-user@pcf01-master-1:~$
Solution
Step 1: Log in to the master node and restart the path-provisioner pod in the CEE namespace.
cloud-user@pcf01-master-1:~$ kubectl delete pod -n cee-pcf path-provisioner-pxpss
pod "path-provisioner-pxpss" deleted
Step 2: Verify that Kubernetes has brought the pod back online.
cloud-user@pcf01-master-1:~$ kubectl get pods --all-namespaces | grep "path-provisioner"
cee-pcf path-provisioner-27bjx 1/1 Running 0 110d
cee-pcf path-provisioner-4mlq8 1/1 Running 0 110d
cee-pcf path-provisioner-4zvjd 1/1 Running 0 110d
cee-pcf path-provisioner-566pn 1/1 Running 0 110d
cee-pcf path-provisioner-6d2dr 1/1 Running 0 110d
cee-pcf path-provisioner-7g6l4 1/1 Running 0 110d
cee-pcf path-provisioner-8psnx 1/1 Running 0 110d
cee-pcf path-provisioner-94p9f 1/1 Running 0 110d
cee-pcf path-provisioner-bfr5w 1/1 Running 0 110d
cee-pcf path-provisioner-clpq6 1/1 Running 0 110d
cee-pcf path-provisioner-dbjft 1/1 Running 0 110d
cee-pcf path-provisioner-dx9ts 1/1 Running 0 110d
cee-pcf path-provisioner-fx72h 1/1 Running 0 110d
cee-pcf path-provisioner-hbxgd 1/1 Running 0 110d
cee-pcf path-provisioner-k6fzc 1/1 Running 0 110d
cee-pcf path-provisioner-l4mzz 1/1 Running 0 110d
cee-pcf path-provisioner-ldxbb 1/1 Running 0 110d
cee-pcf path-provisioner-lf2xx 1/1 Running 0 110d
cee-pcf path-provisioner-lxrjx 1/1 Running 0 110d
cee-pcf path-provisioner-mjhlw 1/1 Running 0 110d
cee-pcf path-provisioner-pq65p 1/1 Running 0 110d
cee-pcf path-provisioner-pxpss 1/1 Running 0 7s
cee-pcf path-provisioner-q4b7m 1/1 Running 0 110d
cee-pcf path-provisioner-qlkjb 1/1 Running 0 110d
cee-pcf path-provisioner-s2jth 1/1 Running 0 110d
cee-pcf path-provisioner-vhzhg 1/1 Running 0 110d
cee-pcf path-provisioner-wqpmr 1/1 Running 0 110d
cee-pcf path-provisioner-xj5k4 1/1 Running 0 110d
cee-pcf path-provisioner-z4h98 1/1 Running 0 110d
cloud-user@pcf01-master-1:~$
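In a long pod list, the recreated pod is the one with a short AGE value (7s in the output above). The sketch below picks out such pods by treating any AGE without a day, hour, or year unit as recent. That cutoff and the `recent_pods` name are assumptions made for illustration.

```shell
# recent_pods
# Reads "kubectl get pods --all-namespaces" output on stdin and prints the
# name and age of path-provisioner pods whose AGE (last column) contains
# only minutes and/or seconds, that is, pods created less than an hour ago.
recent_pods() {
  awk '/path-provisioner/ && $NF !~ /[ydh]/ { print $2, $NF }'
}

# Usage:
#   kubectl get pods --all-namespaces | recent_pods
```

The last column is used for the age rather than a fixed column number because the RESTARTS column can contain extra words in some kubectl versions.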
Step 3: Verify that the total count of active path-provisioner pods is the same as before the restart.
cloud-user@pcf01-master-1:~$ kubectl get pods --all-namespaces | grep "path-provisioner" | wc -l
29
cloud-user@pcf01-master-1:~$
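A plain line count includes pods in any state, so a pod that is stuck or crash-looping still counts toward the total. As a stricter variant, the sketch below counts only pods that are Running with all containers ready. `count_ready` is a hypothetical helper, not a kubectl feature.

```shell
# count_ready
# Reads "kubectl get pods --all-namespaces" output on stdin and prints how
# many path-provisioner pods are in the Running state with every container
# ready (READY column of the form N/N).
count_ready() {
  awk '/path-provisioner/ {
         split($3, r, "/")
         if ($4 == "Running" && r[1] == r[2]) n++
       }
       END { print n + 0 }'
}

# Usage:
#   kubectl get pods --all-namespaces | count_ready
```

Comparing this number with the plain `wc -l` count shows at a glance whether any path-provisioner pod is not fully up.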
Step 4: Check the active alerts and ensure that the path-provisioner alerts are cleared.
[pcf01/pcfapp] cee# show alerts active summary
NAME UID SEVERITY STARTS AT SOURCE SUMMARY
-----------------------------------------------------------------------------------------------------------------
watchdog 02d125c1ba48 minor 03-29T10:48:08 System This is an alert meant to ensure that the entire a...
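This final check can also be scripted against a saved copy of the summary. The sketch assumes the output was saved to a file; the `pp_alerts_cleared` function and the `alerts_after.txt` file name are hypothetical.

```shell
# pp_alerts_cleared
# Reads saved "show alerts active summary" output on stdin.
# Succeeds (exit 0) only when no row mentions path-provisioner.
pp_alerts_cleared() {
  ! grep -q 'path-provisioner'
}

# Usage (alerts_after.txt is an assumed file holding the saved summary):
#   pp_alerts_cleared < alerts_after.txt && echo "path-provisioner alerts cleared"
```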