Table Of Contents
Embedded Resource Manager (ERM)
Prerequisites for Embedded Resource Manager
Restrictions for Embedded Resource Manager
Information About Embedded Resource Manager
Benefits of the Embedded Resource Manager
Resource Accounting and Thresholds Tracking in ERM
System Resources Monitored by the Embedded Resource Manager
How to Configure Embedded Resource Manager
Managing Resource Utilization by Defining Resource Policy
Setting Expected Operating Ranges for Buffer Resources
Setting Expected Operating Ranges for CPU Resources
Setting Expected Operating Ranges for Memory Resources
Enabling Automatic Tuning of Buffers
Configuring a CPU Process to Be Included in the Extended Load Monitor Report
Managing Extended CPU Load Monitoring
Managing Automatic CPUHOG Profiling
Applying a Policy to Resource Users
Setting a Critical Rising Threshold for Global I/O Memory
Configuration Examples for Embedded Resource Manager
Managing Resource Utilization by Defining Resource Policy: Example
Setting Expected Operating Ranges for Resource Owners: Example
Setting a System Global Thresholding Policy for I/O Memory: Example
Feature Information for Embedded Resource Manager
Embedded Resource Manager (ERM)
First Published: December 07, 2004
Last Updated: February 27, 2008
The Embedded Resource Manager (ERM) feature allows you to monitor internal system resource utilization for specific resources such as the buffer, memory, and CPU. ERM monitors resource utilization from the perspective of various subsystems within the Cisco IOS software such as resource owners (ROs) and resource users (RUs). ERM allows you to configure threshold values for system resources.
The ERM infrastructure is designed to allow for granular monitoring on a task basis within the Cisco IOS software. Network administrators can define thresholds to create notifications according to the real-time resource consumption. ERM goes beyond simply monitoring for total CPU utilization. Through the use of ERM, network administrators and operators can gain a better understanding of the device's operational characteristics, leading to better insight into system scalability and improved system availability.
Finding Feature Information in This Module
Your Cisco IOS software release may not support all of the features documented in this module. To reach links to specific feature documentation in this module and to see a list of the releases in which each feature is supported, use the "Command Reference" section.
Finding Support Information for Platforms and Cisco IOS and Catalyst OS Software Images
Use Cisco Feature Navigator to find information about platform support and Cisco IOS and Catalyst OS software image support. To access Cisco Feature Navigator, go to http://www.cisco.com/go/cfn. An account on Cisco.com is not required.
Contents
• Prerequisites for Embedded Resource Manager
• Restrictions for Embedded Resource Manager
• Information About Embedded Resource Manager
• How to Configure Embedded Resource Manager
• Configuration Examples for Embedded Resource Manager
Prerequisites for Embedded Resource Manager
You must be running Cisco IOS Release 12.4(6)T or a later release to use the Packet Memory Reclamation functionality.
Restrictions for Embedded Resource Manager
Additional instructions from a Cisco technical support representative may be required.
Information About Embedded Resource Manager
ERM promotes resource availability by providing the infrastructure to track resource usage.
To configure threshold values for resource manager entities, you should understand the following concepts:
• Benefits of the Embedded Resource Manager
• Resource Accounting and Thresholds Tracking in ERM
• System Resources Monitored by the Embedded Resource Manager
Benefits of the Embedded Resource Manager
The ERM framework tracks resource utilization and resource depletion by monitoring finite resources. Support for monitoring CPU, buffer, and memory utilization at a global or IOS-process level is available.
The ERM framework provides a mechanism to send notifications whenever the specified threshold values are exceeded by any resource user. This notification helps network administrators diagnose any CPU, buffer, and memory utilization issues.
The ERM architecture is illustrated in Figure 1.
Figure 1 ERM Architecture
ERM provides a framework for monitoring any finite resource within the Cisco IOS software and provides information that a user can analyze to better understand how network changes might impact system operation. ERM helps in addressing infrastructure problems such as reloads, memory allocation failure, and high CPU utilization by performing the following functions:
• Monitoring system resource usage.
• Setting the resource threshold at a granular level.
• Generating alerts when resource utilization reaches the specified level.
• Generating internal events using the Cisco IOS Embedded Event Manager feature.
Resource Accounting and Thresholds Tracking in ERM
ERM tracks the resource usage for each RU internally. An RU is a subsystem or process task within the Cisco IOS software; for example, the Open Shortest Path First (OSPF) hello process is a resource user. Threshold limits are used to notify network operators of specific conditions. The ERM infrastructure provides a means to notify the internal RU subsystem of threshold indications as well. The resource accounting is performed by individual ROs. ROs are part of the Cisco IOS software and are responsible for monitoring certain resources such as the memory, CPU, and buffer. When the utilization for an RU exceeds the threshold value you have set, the ROs send internal notifications to the RUs and to network administrators in the form of system logging (syslog) messages or Simple Network Management Protocol (SNMP) alerts.
You can set rising and falling values for critical, major, and minor levels of thresholds. When the resource utilization exceeds the rising threshold level, an Up notification is sent. When the resource utilization falls below the falling threshold level, a Down notification is sent.
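The rising and falling behavior described above can be pictured as a small state machine per severity level. The following Python sketch is illustrative only and is not Cisco IOS code: the class name, the sample values, and the chosen thresholds are assumptions, and the optional interval setting is ignored for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdLevel:
    """One severity level with a rising and a falling threshold (percent)."""
    name: str
    rising: float
    falling: float
    raised: bool = False  # True after an Up notification, until a Down is sent

    def update(self, utilization: float) -> Optional[str]:
        """Return "Up", "Down", or None for one utilization sample."""
        if not self.raised and utilization > self.rising:
            self.raised = True
            return "Up"
        if self.raised and utilization < self.falling:
            self.raised = False
            return "Down"
        return None


def run(levels, samples):
    """Feed each sample to every level and collect the notifications."""
    events = []
    for sample in samples:
        for level in levels:
            event = level.update(sample)
            if event:
                events.append((sample, level.name, event))
    return events


# Hypothetical thresholds and utilization samples, in percent.
levels = [
    ThresholdLevel("minor", rising=60, falling=50),
    ThresholdLevel("major", rising=75, falling=65),
    ThresholdLevel("critical", rising=90, falling=80),
]
events = run(levels, [55, 70, 95, 85, 62, 40])
for event in events:
    print(event)
```

Keeping the falling value below the rising value means that a utilization figure hovering near one threshold does not produce a stream of alternating Up and Down notifications.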
ERM provides for three types of thresholds to be defined:
• The System Global Threshold is the point when the entire resource reaches a specified value. A notification is sent to all RUs once the threshold is exceeded.
• The User Local Threshold is the point when a specified RU's utilization exceeds the configured limit.
• The User Global Threshold is the point when the entire resource reaches a configured value. A notification is sent to the specified RU once the threshold is exceeded.
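The difference between the three threshold types is which measurement is compared with the limit and who is notified. A minimal Python sketch follows, assuming that a user local threshold notifies the RU whose own utilization crossed the limit (the text above does not state the recipient explicitly). The function name, RU names, and figures are hypothetical.

```python
def notified_users(kind, usage, limit, target=None):
    """Return the RUs that would be notified for one threshold check.

    usage  -- dict mapping RU name to its share of the resource (percent)
    limit  -- configured threshold (percent)
    target -- the RU the policy applies to (user thresholds only)
    """
    total = sum(usage.values())
    if kind == "system_global":
        # Whole-resource utilization is checked; every RU is notified.
        return sorted(usage) if total > limit else []
    if kind == "user_global":
        # Whole-resource utilization is checked; only the target RU is notified.
        return [target] if total > limit else []
    if kind == "user_local":
        # Only the target RU's own utilization is checked (recipient assumed).
        return [target] if usage.get(target, 0) > limit else []
    raise ValueError(f"unknown threshold kind: {kind}")


# Hypothetical utilization figures, in percent of the whole resource.
usage = {"ospf_hello": 30, "ipc": 45, "ethernet": 10}

print(notified_users("system_global", usage, limit=80))
print(notified_users("user_global", usage, limit=80, target="ipc"))
print(notified_users("user_local", usage, limit=40, target="ospf_hello"))
print(notified_users("user_local", usage, limit=40, target="ipc"))
```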
System Resources Monitored by the Embedded Resource Manager
ERM monitors CPU, buffer, and memory utilization at a global and task-based level. To avoid infrastructure issues and promote the availability of system resources, the resource owners described in the following sections are monitored:
CPU Resource Owner
The ERM feature uses the existing loadometer process to calculate the load information displayed by the show processes cpu command. This method generates a report of the extended load statistics and adds it to a circular buffer every five seconds. You can obtain a record of the load statistics for the past one minute through the CLI. This feature also provides an intelligent CPUHOG profiling mechanism that helps to reduce the time required to diagnose error conditions.
The functions described in the following sections help in load monitoring.
• Snapshot Management Using Event Trace
Loadometer Process
The loadometer process generates an extended load monitor report every five seconds. The loadometer function, which calculates process CPU usage percentage, is enhanced to generate the loadometer process reports.
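The relationship between the five-second report interval and the one-minute history can be sketched with a fixed-size circular buffer. This Python fragment is an illustration, not the loadometer implementation; the history size of 12 matches the default one-minute history described later in this module, while the function name and report contents are assumptions.

```python
from collections import deque

REPORT_INTERVAL_SECONDS = 5   # one extended load report every five seconds
HISTORY_SIZE = 12             # 12 reports x 5 seconds = 60 seconds of history

# A deque with maxlen discards the oldest entry when a new one is appended,
# which is the behavior of a circular buffer.
history = deque(maxlen=HISTORY_SIZE)


def record_report(buffer, timestamp, cpu_percent):
    """Append one load report (hypothetical fields) to the circular buffer."""
    buffer.append((timestamp, cpu_percent))


# Simulate 20 reports; only the most recent 12 are retained.
for tick in range(20):
    record_report(history, tick * REPORT_INTERVAL_SECONDS, 10 + tick)

covered_seconds = len(history) * REPORT_INTERVAL_SECONDS
print(len(history), history[0], history[-1], covered_seconds)
```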
Scheduler
The scheduler collects data when a process is executed, which enables the loadometer to generate reports. The scheduler collects data when the process is launched or when the process transfers control to the scheduler.
Snapshot Management Using Event Trace
Snapshot management manages the buffer in which snapshots of reports are stored. The snapshot management infrastructure stores, displays, and releases the snapshots.
Automatic CPUHOG Profiling
The timer Interrupt Service Routine (ISR) provides automatic CPUHOG profiling. The timer ISR begins profiling a process when it notices that the process has exceeded the configured value or a default of twice the maximum scheduling quantum (maximum time taken for the execution of a task).
On beginning the profiling, the timer ISR saves the interrupted program counter (pc) and return address (ra) in a preallocated buffer. This process provides information that can help the user analyze the CPUHOG.
The profiling continues until the CPUHOG is reported or the buffer is full. To analyze the computation of a long-running process, you must specify a process ID (PID) and a threshold to start the profiling. When this process runs for longer than the specified time (in milliseconds), the profiling begins.
When the data belonging to a particular process exceeds the default size of the buffer, it is reported as a CPUHOG. The default size of the buffer is 1250 entries and can store up to five seconds of profiling data.
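The buffer figures above imply a sampling rate. If entries are recorded at a uniform rate (an assumption; the text does not say so), 1250 entries spanning five seconds works out to one entry every 4 milliseconds. The Python sketch below shows that arithmetic and a preallocated-style buffer that stops accepting samples when full. The class and its fields are hypothetical.

```python
BUFFER_ENTRIES = 1250      # default buffer size, in entries
BUFFER_SPAN_MS = 5000      # up to five seconds of profiling data

# Implied spacing between samples, assuming a uniform sampling rate.
implied_interval_ms = BUFFER_SPAN_MS / BUFFER_ENTRIES


class ProfileBuffer:
    """Fixed-capacity store for (program counter, return address) samples."""

    def __init__(self, capacity=BUFFER_ENTRIES):
        self.capacity = capacity
        self.samples = []

    def record(self, pc, ra):
        """Store one sample; return False once the buffer is full."""
        if len(self.samples) >= self.capacity:
            return False  # profiling stops; the process is reported as a CPUHOG
        self.samples.append((pc, ra))
        return True


# Small capacity so the full condition is easy to see.
buf = ProfileBuffer(capacity=3)
results = [buf.record(0x1000 + i, 0x2000 + i) for i in range(4)]
print(implied_interval_ms, results)
```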
Memory Resource Owner
The Embedded Resource Manager feature enhances the memory manager in Cisco IOS devices. The enhancements are described in the following sections:
• Interface Wedging and Packet Memory Leaks
• Memory Resource Reclamation for Interfaces
Memory Usage History
The Embedded Resource Manager feature helps in maintaining memory fragmentation information and thus reduces the need for maintenance of separate scripts for collecting such information.
Memory Accounting
ERM performs the accounting of information for memory by tracking the memory usage of individual RUs. When a process is created, a corresponding RU is also created, against which the usage of memory is recorded. The process of RU creation helps the user to migrate from a process-based accounting to a resource user-based accounting scheme for memory.
The memory RO maintains a global threshold and a per-RU memory usage threshold that can be configured through the ERM infrastructure. The memory RO also tracks the global free memory. When a particular RU's memory usage exceeds the global free memory, a notification is sent to the registered resource monitors (RMs). Similarly, when a particular RU exceeds its threshold of memory usage, a notification is sent to that RU. These notifications are sent using the ERM infrastructure.
A memory RO has the intelligence to assign memory to an RU. When a memory RO receives an allocation request, the memory is assigned to the current RU. When a free request is received, the memory RO reduces the memory assigned to the RU.
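The charge-on-allocate, credit-on-free accounting described above can be sketched as follows. This is a simplified Python model, not the Cisco IOS memory manager: the class, the block handles, and the assumption that a freed block is credited back to the RU that was current when it was allocated are all illustrative.

```python
import itertools


class MemoryAccounting:
    """Tracks bytes charged to each resource user (RU)."""

    def __init__(self):
        self.usage = {}                 # RU name -> bytes currently charged
        self._blocks = {}               # block handle -> (RU name, size)
        self._ids = itertools.count(1)  # hypothetical block handles
        self.current_ru = None          # RU that is running right now

    def allocate(self, size):
        """Charge an allocation to the current RU and return a block handle."""
        ru = self.current_ru
        block = next(self._ids)
        self._blocks[block] = (ru, size)
        self.usage[ru] = self.usage.get(ru, 0) + size
        return block

    def free(self, block):
        """Credit the freed block back to the RU that was charged for it."""
        ru, size = self._blocks.pop(block)
        self.usage[ru] -= size


acct = MemoryAccounting()
acct.current_ru = "ospf_hello"
b1 = acct.allocate(1024)
acct.current_ru = "ipc"
b2 = acct.allocate(512)
b3 = acct.allocate(256)
acct.free(b2)
print(acct.usage)
```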
Interface Wedging and Packet Memory Leaks
In certain situations, errors in the system accounting of incoming packets can occur, leading to a "memory leak" caused by the input queue. When there is a leak in an interface's input queue, gradually the queue reaches its maximum permitted value, causing the interface to become "wedged." A wedged interface may no longer process incoming packets. Packet memory leaks can cause interface input queue wedges.
The Packet Memory Reclamation functionality improves the infrastructure for preventing wedged interface input queues, and it provides a method for changing the defaults of that infrastructure. The Embedded Resource Manager provides the Packet Memory Reclamation functionality for "unwedging" interface input queues and configuring the system to detect and rectify packet leaks.
Note
To use the Packet Memory Reclamation functionality, you must be running Cisco IOS Release 12.4(6)T or a later release. Additional troubleshooting (debugging) commands were introduced by this enhancement for use by technical support representatives in specific situations.
Memory Resource Reclamation for Interfaces
The Garbage Detection process works in conjunction with the Memory RO in achieving interface unwedging (for more details, see the Memory Leak Detector feature guide that is part of the Cisco IOS Configuration Fundamentals Configuration Guide, Release 12.4T).
As part of the reclamation process, incoming packets that belong to a leaked input queue can be deallocated and reused. This feature provides a command (critical rising) that can be used to fine-tune memory resource reclamation.
Note
Configuration of this feature will typically be needed only as part of a troubleshooting process with a Cisco Technical Support representative. Additional configuration tasks or special technical support commands may be required before this feature can be effectively used. Additional memory debug leak internal service commands are made available to Cisco Technical Support engineers for use in specific situations.
The deallocation procedure is triggered when a check is made to see if packets are using too much memory. Thresholds for the memory RO can be configured using a global policy of any level.
The purpose of configuring this memory policy is to find a balance between the utilization of the Memory Leak Detector (that can become resource intensive) and the need to detect packet memory leaks. Ideally, the system should perform deallocation only when it becomes absolutely necessary.
The critical rising command allows you to set a rising and falling threshold percentage for critical levels of I/O memory usage, and to specify an interval for those values. These values trigger the Memory Leak Detector process and, if needed, the deallocation procedure.
For example, if memory usage is more than that of the rising threshold of 75 percent of total I/O memory for more than 5 seconds, the "critical" notification is generated within the system and a callback is issued. As an action in the callback, a check is made to see if the packets are using too much memory. When the packets have used too much memory, the deallocation procedure begins. If the deallocation procedure does not bring memory utilization below the lower threshold value, the deallocation procedure is periodically reattempted. Once the memory usage falls below the configured threshold value, the periodic attempts to deallocate are stopped.
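The "more than 75 percent for more than 5 seconds" condition in the example above can be sketched as a sustained-threshold check. The Python function below is an illustration under stated assumptions: usage is sampled once per second, the sample values are invented, and the follow-up deallocation and retry logic is not modeled.

```python
def first_critical_time(samples, rising=75.0, interval=5):
    """Return the time at which the critical notification would fire.

    samples  -- list of (time_in_seconds, io_memory_percent), in time order
    rising   -- rising threshold, in percent of total I/O memory
    interval -- usage must stay above the threshold for more than this
                many seconds before the notification is generated
    Returns None if the condition is never met.
    """
    above_since = None
    for time_s, percent in samples:
        if percent > rising:
            if above_since is None:
                above_since = time_s      # start of the current excursion
            if time_s - above_since > interval:
                return time_s
        else:
            above_since = None            # excursion ended; start over
    return None


# Hypothetical one-second samples of I/O memory usage (percent).
readings = [70, 80, 82, 74, 78, 79, 81, 85, 90, 88, 91]
samples = list(enumerate(readings))

fired_at = first_critical_time(samples)
print(fired_at)
```

In this sample data, the brief excursion at seconds 1 and 2 does not trigger a notification because usage drops back below the threshold at second 3. Only the sustained excursion that begins at second 4 does.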
Memory Leak Reclamation
The Packet Memory Reclamation feature uses the ERM infrastructure to clean up and reclaim leaked Cisco IOS packet memory.
This feature uses the Memory Leak Detector process (sometimes referred to as the Garbage Detection or GD process) and the memory-manager RO functionality to reclaim packet memory.
I/O Memory
The I/O memory pool is one of the memory types in Cisco IOS software. The input queue buffers use memory from this pool for processing.
Buffer Resource Owner
The Embedded Resource Manager feature addresses the recurring problems of the Buffer Manager described in the following sections.
Automatic Buffer Tuning
The Embedded Resource Manager feature allows you to automatically tune the buffers using the buffer tune automatic command. The buffer RO tunes permanent memory in particle pools based on the usage of the buffer pool.
The buffer RO tracks the number of failures and the availability of memory in the buffer pool. When the number of failures increases above 1 percent of the buffer hits or when no memory is available in the buffer pool, the buffer RO performs an automatic tuning.
Note
Before enabling automatic tuning of buffers, check the first lines of the show memory command output to ensure that sufficient free I/O memory or main memory is available.
Here are some keywords from the buffer tune command that can help you verify if you have sufficient I/O memory:
• permanent: take the number of total buffers in a pool and add 20 percent.
• min-free: set the min-free keyword to 20 to 30 percent of the permanent number of allocated buffers in the pool.
• max-free: set the max-free keyword to a value greater than the sum of permanent and minimum values.
However, when there is a traffic burst, the Cisco IOS device may not have enough time to create the new buffers and the number of failures may continue to increase.
The Embedded Resource Manager feature monitors the buffer pool every minute for tuning (that is, for number of hits, number of failures, and the number of counters created). When buffer tuning is enabled, the buffer RO automatically tunes the buffers when required.
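The sizing guidance for the permanent, min-free, and max-free keywords and the 1 percent failure trigger described in this section reduce to simple arithmetic. The Python sketch below works through it with integer math. The function names, the choice of 25 percent for min-free, and the use of "sum plus one" as the smallest acceptable max-free value are assumptions made for illustration.

```python
def percent_of(value, percent):
    """Return percent of value, rounded up to a whole buffer."""
    return (value * percent + 99) // 100


def suggested_buffer_settings(total_buffers, min_free_percent=25):
    """Apply the rules of thumb for permanent, min-free, and max-free."""
    permanent = total_buffers + percent_of(total_buffers, 20)  # total + 20 percent
    min_free = percent_of(permanent, min_free_percent)         # 20 to 30 percent
    max_free = permanent + min_free + 1                        # greater than the sum
    return {"permanent": permanent, "min-free": min_free, "max-free": max_free}


def should_tune(hits, failures, free_pool_memory):
    """Tune when failures exceed 1 percent of hits or the pool has no memory."""
    return failures * 100 > hits or free_pool_memory == 0


settings = suggested_buffer_settings(100)
print(settings)
print(should_tune(hits=10000, failures=101, free_pool_memory=4096))
print(should_tune(hits=10000, failures=100, free_pool_memory=4096))
print(should_tune(hits=10000, failures=0, free_pool_memory=0))
```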
Buffer Leak Detection
The Embedded Resource Manager feature allows Cisco IOS devices to detect and diagnose potential buffer leaks. All the buffers in a pool are linked so that they can be traced easily. The number of buffers allocated for incoming and outgoing packets in each buffer pool is tracked and can be displayed in the show buffers leak command output.
Buffer Accounting
The Embedded Resource Manager feature consists of mechanisms to account for the usage of buffers. All buffers are owned by the pool manager process (buffer RU). When an RU requests a buffer, the allocated buffer is allotted to that RU. When the RU returns the buffer, it is deducted from the RU's account. The packet type from the output of the show buffers usage command indicates the RU to which the packet belongs.
Buffer Usage Thresholding
The Embedded Resource Manager feature provides a facility to manage high buffer utilization. The buffer manager RO registers as an RU with the memory RO. The buffer manager RU is set before a memory allocation is made for creating new buffers. The buffer manager also registers as an RO. When a buffer is allocated, the current RU (if any) is charged with the memory allocation. The buffer manager RO registers for the notifications from the memory manager for the processor and I/O memory pool. If the I/O memory pool runs short of memory, the buffer manager tries to free the lists of all the buffer pools. If your Cisco IOS device does not support I/O memory, the buffer manager registers for notifications from the processor memory.
Cisco IOS software maintains a threshold per buffer pool. When a particular pool exceeds the specified threshold, ERM sends a notification to all the RUs in that pool, so that the RUs can take corrective measures. Thresholds are configured for public buffer pools only.
Global notification is set for every pool in the system; that is, one notification for all pools in the public pool and one notification for each pool in the private pool. Threshold notifications are sent to only those RUs that have registered with the ROs for getting notifications. A list of RUs that have registered with the RO is maintained by the RO. When the threshold of a particular RU is exceeded, then that RU is notified and marked notified. When the buffers are recovered, the notified RUs are moved back to the original list.
For example, an Ethernet driver RU is allocated buffers from some particular private pool. Another RU, Inter Processor Communication (IPC), is added to the list. In this case, when the pool runs low on buffers, the IPC RU gets a notification and it can take corrective measures.
You can configure threshold values as percentages of the total buffers available in the public pool. Total buffer is the sum of maximum allowed buffers and the permanent pools in the public buffer pool. If these values change due to buffer tuning, then the threshold values also change. For example, if the configuration requires that a notification be sent when the IPC RU is holding more than 40 percent of Ethernet buffers and the sum of permanent and maximum allowed for Ethernet buffers is 150 percent, then the Ethernet pool is notified when the IPC RU is holding 60 percent.
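The example above is easier to follow as arithmetic if the figures 150 and 60 are read as buffer counts, which is an interpretive assumption: 40 percent of a total of 150 buffers is 60 buffers. The Python sketch below shows the calculation and how the effective threshold moves when buffer tuning changes the pool size. The split of 150 into permanent and maximum-allowed buffers is hypothetical.

```python
def threshold_in_buffers(threshold_percent, permanent, max_allowed):
    """Convert a percentage threshold into a buffer count.

    The total is the sum of the permanent buffers and the maximum allowed
    buffers in the pool, as described in the text.
    """
    total = permanent + max_allowed
    return total * threshold_percent // 100


# Hypothetical split that adds up to the total of 150 used in the example.
before_tuning = threshold_in_buffers(40, permanent=50, max_allowed=100)

# If tuning raises the permanent count, the same 40 percent threshold
# corresponds to more buffers.
after_tuning = threshold_in_buffers(40, permanent=100, max_allowed=100)

print(before_tuning, after_tuning)
```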
Resource Policy Templates
Resource owner policy is a template used by the ROs to associate an RU with a set of thresholds that are configured through the CLI. This template can be used to specify system global, user local, and per user global thresholds. A particular resource group or RU can have only one policy associated with it. The policy template for ROs is maintained by the ERM framework.
When a policy template is associated with a user type and its instance (RUs), the thresholds configured in that policy are applied based on the RU to RO relationship. This method ignores any RO configuration that may not be applicable to the RU.
How to Configure Embedded Resource Manager
This section contains the following procedures.
• Managing Resource Utilization by Defining Resource Policy (required)
• Setting Expected Operating Ranges for Buffer Resources (required)
• Setting Expected Operating Ranges for CPU Resources (required)
• Setting Expected Operating Ranges for Memory Resources (required)
• Enabling Automatic Tuning of Buffers (required)
• Managing Memory Usage History (required)
• Configuring a CPU Process to Be Included in the Extended Load Monitor Report (required)
• Managing Extended CPU Load Monitoring (required)
• Managing Automatic CPUHOG Profiling (required)
• Applying a Policy to Resource Users (optional)
• Setting a Critical Rising Threshold for Global I/O Memory (optional)
• Verifying ERM Operations (optional)
• Troubleshooting Tips (optional)
Managing Resource Utilization by Defining Resource Policy
Perform this task to configure a resource policy for ERM.
SUMMARY STEPS
1. enable
2. configure terminal
3. resource policy
4. policy policy-name [global | type resource-user-type]
DETAILED STEPS
Setting Expected Operating Ranges for Buffer Resources
Perform this task to configure threshold values for buffer RO.
SUMMARY STEPS
1. enable
2. configure terminal
3. resource policy
4. policy policy-name [global | type resource-user-type]
5. system
   or
   slot slot-number
6. buffer public
7. critical rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
   or
   major rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
   or
   minor rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
8. exit
DETAILED STEPS
Setting Expected Operating Ranges for CPU Resources
Perform this task to configure threshold values for the CPU RO.
SUMMARY STEPS
1. enable
2. configure terminal
3. resource policy
4. policy policy-name [global | type resource-user-type]
5. system
   or
   slot slot-number
6. cpu interrupt
7. critical rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] global
   or
   major rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] global
   or
   minor rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] global
8. exit
9. cpu process
10. critical rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
    or
    major rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
    or
    minor rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
11. exit
12. cpu total
13. critical rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] global
    or
    major rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] global
    or
    minor rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] global
14. exit
DETAILED STEPS
Setting Expected Operating Ranges for Memory Resources
Perform this task to configure threshold values for the memory RO.
Note
When the Packet Memory Reclamation functionality is enabled and the configured threshold value for the memory RO is violated, the system verifies whether the memory is hogged by the buffers. If 70 percent of the memory is used by the buffers, the system activates the Memory Leak Detector process (sometimes referred to as the "Garbage Detection" or "GD" process) to clean up the memory. (For more details, see the Memory Leak Detector feature guide that is part of the Cisco IOS Configuration Fundamentals Configuration Guide, Release 12.4T.)
SUMMARY STEPS
1. enable
2. configure terminal
3. resource policy
4. policy policy-name [global | type resource-user-type]
5. system
   or
   slot slot-number
6. memory io
7. critical rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
   or
   major rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
   or
   minor rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
8. exit
9. memory processor
10. critical rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
    or
    major rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
    or
    minor rising rising-threshold-value [interval interval-value] [falling falling-threshold-value [interval interval-value]] [global]
11. exit
DETAILED STEPS
Enabling Automatic Tuning of Buffers
Perform this task to enable automatic tuning of buffers.
SUMMARY STEPS
1. enable
2. configure terminal
3. buffer tune automatic
DETAILED STEPS
Managing Memory Usage History
Perform this task to change the number of hours for which the memory log is maintained.
SUMMARY STEPS
1. enable
2. configure terminal
3. memory statistics history table number-of-hours
DETAILED STEPS
Configuring a CPU Process to Be Included in the Extended Load Monitor Report
Perform this task to configure a process (or processes) to be included in the extended load monitor report.
SUMMARY STEPS
1. enable
2. monitor processes cpu extended process-id-list
DETAILED STEPS
Managing Extended CPU Load Monitoring
Perform this task to change the history size in the collection report for extended CPU load.
Restrictions
You cannot disable this feature completely. If the command is not configured, the default behavior is to collect a one-minute history. The one-minute history is equivalent to a history size of 12, because one report is collected every five seconds.
SUMMARY STEPS
1. enable
2. configure terminal
3. process cpu extended history history-size
DETAILED STEPS
Managing Automatic CPUHOG Profiling
Perform this task to enable automatic profiling of CPUHOGs by the CPU Resource Owner. The CPU Resource Owner predicts when a process could hog CPU and begins profiling that process at the same time. This function is enabled by default.
SUMMARY STEPS
1. enable
2. configure terminal
3. processes cpu autoprofile hog
DETAILED STEPS


