Introduction
This document describes how to resolve high reclaimable memory utilization on the Management Input/Output (MIO) card of the ASR 5500 after Event Data Record (EDR) enablement.
Problem
The ASR 5500 chassis showed high reclaimable memory usage on the active Management Input/Output (MIO) card after EDR configuration had been added.
Background Information
The ASR 5500 uses an array of solid state drives (SSDs) for short-term persistent storage. The drives are configured as a RAID 5 array, which is called the hd-raid. Various data records are stored on the hd-raid as files, and these files are later transferred off the ASR 5500. The number of records and files can be large, which creates a large number of reclaimable memory pages to hold the file data. Reclaimable pages are file-backed pages (i.e., pages that are allocated through mapped files) that are not currently mapped to any process. In terms of the /proc/meminfo fields, reclaimable memory is calculated as Active(file) + Inactive(file) - Mapped.
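The calculation above can be sketched in a few lines of Python. This is a minimal illustration, assuming the input is text in the usual /proc/meminfo "Name: value kB" layout; the function name reclaimable_kb and the sample values are hypothetical and are not taken from a real chassis.

```python
def reclaimable_kb(meminfo_text):
    """Return Active(file) + Inactive(file) - Mapped, in kB."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip banner lines that have no "Name: value" form
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields["Active(file)"] + fields["Inactive(file)"] - fields["Mapped"]


# Illustrative values only (assumption), not measured on an ASR 5500.
SAMPLE = """\
Active(file):    4000000 kB
Inactive(file): 60000000 kB
Mapped:           500000 kB
"""

print(reclaimable_kb(SAMPLE))  # 63500000
```

The three fields are the only inputs needed, so the same function works on a full /proc/meminfo dump as long as those fields are present.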
If memory reclaim reaches a certain threshold, the process that requests memory can be blocked while pages are reclaimed. If this happens during a critical task, the card might not respond in time, and the system can trigger a card switchover. The minimum, low, and high watermark values determine when the Kernel Swap Daemon (kswapd) starts and stops. kswapd is an asynchronous process that performs these reclaims until free memory rises above the high mark.
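The watermark behavior can be pictured with a small model. This is a simplified sketch of the general Linux behavior, not of the StarOS kernel specifically: it assumes kswapd wakes when free pages drop below the low mark, keeps running until they exceed the high mark, and that allocations below the minimum mark reclaim synchronously and therefore block. The function name and all watermark values are hypothetical.

```python
def reclaim_state(free_pages, wm_min, wm_low, wm_high, kswapd_running):
    """Return (description, kswapd_running_after) for one observation."""
    if free_pages < wm_min:
        # Below the minimum mark the allocating process reclaims by itself.
        return "direct reclaim (process blocks)", True
    if free_pages < wm_low:
        # Below the low mark kswapd is woken up.
        return "kswapd reclaiming in background", True
    if free_pages < wm_high and kswapd_running:
        # Once started, kswapd continues until the high mark is passed.
        return "kswapd reclaiming in background", True
    return "no reclaim", False


# Hypothetical watermarks and free-page readings (assumptions).
WM_MIN, WM_LOW, WM_HIGH = 1000, 2000, 4000
running = False
for free in (5000, 2500, 1500, 800, 2500, 4500):
    state, running = reclaim_state(free, WM_MIN, WM_LOW, WM_HIGH, running)
    print(free, state)
```

The second and fifth readings are both 2500 pages, yet only the fifth reports background reclaim, because kswapd was already running at that point and free memory had not yet passed the high mark.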
Examples of the memory details for MIO cards before and after EDR configuration are shown.
Before EDR enablement, cached memory was about 0.8 GB:
******* card5-cpu0 /proc/meminfo *******
MemTotal: 98941752 kB
MemFree: 93932096 kB
Buffers: 4324 kB
Cached: 838580 kB
After EDR enablement, cached memory grew to about 70 GB:
******** card5-cpu0 /proc/meminfo *******
MemTotal: 98941752 kB
MemFree: 21543700 kB
Buffers: 4004 kB
Cached: 70505556 kB
Card 5, CPU 0:
Status : Active, Kernel Running, Tasks Running
File Usage : 12320 open files, 9881352 available
Memory Usage : 8875M 9.0% used, 67804M 69.0% reclaimable
Memory Details:
Static : 1437M kernel, 243M image
System : 63M tmp, 3M buffers, 3077M kcache, 68004M cache
Process/Task : 3707M (1276M small, 2082M huge, 349M other)
Other : 141M shared data
Free : 21624M free
Usable : 94940M usable (21624M free, 141M shared data, 67804M reclaimable, 4728M reserved by tasks)
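To put the two /proc/meminfo samples side by side, the Cached values can be converted and expressed as a share of MemTotal. The short sketch below uses only the numbers shown in the outputs above; the helper name summarize is hypothetical, and the conversion assumes 1 GiB = 1024 * 1024 kB.

```python
KB_PER_GIB = 1024 * 1024

# Values copied from the /proc/meminfo samples for card5-cpu0.
MEM_TOTAL_KB = 98941752
CACHED_BEFORE_KB = 838580
CACHED_AFTER_KB = 70505556


def summarize(cached_kb, total_kb):
    """Return (cached size in GiB, cached share of total in percent)."""
    gib = round(cached_kb / KB_PER_GIB, 1)
    pct = round(100 * cached_kb / total_kb, 1)
    return gib, pct


print(summarize(CACHED_BEFORE_KB, MEM_TOTAL_KB))  # (0.8, 0.8)
print(summarize(CACHED_AFTER_KB, MEM_TOTAL_KB))   # (67.2, 71.3)
```

In binary units the cache after EDR enablement is about 67.2 GiB (about 70.5 GB in decimal units), or roughly 71% of total memory, which is in line with the 69.0% reclaimable figure reported in the Memory Usage line.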
Solution
A high volume of generated EDRs, combined with a long interval before old records are purged, can cause high reclaimable memory usage. Verify the time between when the files are pushed off the ASR 5500 and when old files are purged, and adjust the file purge timer to suit the node operations. The general flow of the memory lifecycle is shown in the picture.
Note: The files must be deleted after they have been transferred off the ASR 5500. The preferred method is the cdr remove-file-after-transfer configuration, which applies to both CDRs and EDRs.
The commands to enable the deletion are shown in the snippet.
[local]ASR5500# config
[local]ASR5500(config)# context (name)
ASR5500(config-ctx)# edr-module active-charging-service
ASR5500(config-ctx)# cdr use-harddisk
ASR5500(config-ctx)# cdr remove-file-after-transfer
Useful Commands
show cdr statistics
show cpu info card [5|6] verbose | grep reclaimable
- To monitor reclaimable memory. The output shows the values for the last 5 minutes, the last 15 minutes, the minimum, and the maximum, respectively.
show cdr file-space-usage
show gtpp storage-server local file statistics
In this example output, up to 87803M (89.3% of memory) is reclaimable and can be purged.
[local]ASR5500# show cpu info card 5 verbose | grep reclaim
Memory Usage : 10984M 11.2% used, 86380M 87.9% reclaimable
Usable : 74076M usable (939M free, 86380M reclaimable, 13242M reserved by tasks)
Memory Usage : 10985M 11.2% used, 86445M 87.9% reclaimable
Usable : 74065M usable (872M free, 86445M reclaimable, 13253M reserved by tasks)
Memory Usage : 11064M 11.3% used, 86387M 87.9% reclaimable
Usable : 73904M usable (851M free, 86387M reclaimable, 13334M reserved by tasks)
Memory Usage : 9803M 10.0% used, 87803M 89.3% reclaimable
Usable : -NA- (697M free, 87803M reclaimable, 13511M reserved by tasks)
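When this output is collected regularly, the reclaimable percentage can be extracted and compared against a limit. The sketch below is one possible way to do that in Python, assuming the "Memory Usage" lines keep the layout shown above; the function names and the 80% limit are hypothetical and are not a Cisco-recommended threshold.

```python
import re

# Matches lines such as:
#   Memory Usage : 10984M 11.2% used, 86380M 87.9% reclaimable
RECLAIM_RE = re.compile(
    r"Memory Usage\s*:\s*\d+M\s+[\d.]+% used,\s*(\d+)M\s+([\d.]+)% reclaimable"
)


def reclaimable_readings(output):
    """Return a list of (megabytes, percent) for every Memory Usage line."""
    return [(int(m.group(1)), float(m.group(2))) for m in RECLAIM_RE.finditer(output)]


def exceeds(output, limit_pct):
    """True if any reading is above limit_pct (hypothetical limit)."""
    return any(pct > limit_pct for _, pct in reclaimable_readings(output))


# Lines copied from the example output above.
OUTPUT = """\
Memory Usage : 10984M 11.2% used, 86380M 87.9% reclaimable
Usable : 74076M usable (939M free, 86380M reclaimable, 13242M reserved by tasks)
Memory Usage : 9803M 10.0% used, 87803M 89.3% reclaimable
Usable : -NA- (697M free, 87803M reclaimable, 13511M reserved by tasks)
"""

print(reclaimable_readings(OUTPUT))  # [(86380, 87.9), (87803, 89.3)]
print(exceeds(OUTPUT, 80.0))         # True
```

Only the "Memory Usage" lines are parsed, so the "Usable" lines, including the one that reports -NA-, are ignored.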
The purge interval for processed files is configured under the GTPP group, for example:
gtpp group <>
gtpp storage-server local file purge-processed-files purge-interval 720
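As a rough aid for choosing the purge interval, the amount of processed file data that can remain on the hd-raid between purges can be estimated from the file rate and file size. The sketch below is back-of-the-envelope arithmetic only: the function name, the file rate, and the file size are hypothetical, and it assumes the purge interval is expressed in minutes and that files are not removed by any other mechanism before the purge runs.

```python
def retained_kb(files_per_minute, avg_file_kb, purge_interval_min):
    """Upper-bound estimate of file data kept until the next purge, in kB."""
    return files_per_minute * avg_file_kb * purge_interval_min


# Hypothetical traffic profile (assumption): 10 files per minute, 5000 kB each.
# 720 is the purge-interval value from the configuration line above,
# treated here as minutes (assumption).
print(retained_kb(10, 5000, 720))  # 36000000
print(retained_kb(10, 5000, 60))   # 3000000
```

With these illustrative inputs, a shorter interval reduces the retained data proportionally, which is the effect the purge timer adjustment is meant to achieve.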