Introduction
This document describes how to monitor CPU utilization in order to troubleshoot high CPU usage by the SNMP process.
Prerequisites
Requirements
Cisco recommends that you have basic knowledge of the Cisco IOS® XE WLC 9800 series.
Components Used
The information in this document is based on hardware versions Cisco IOS®-XE WLC 9800 series and is not restricted to specific software versions.
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, ensure that you understand the potential impact of any command.
Background Information
First, confirm that the SNMP process is the one with high CPU utilization. For further investigation, collect the logs described in this document while the issue is seen. Do this during non-business hours, because the collection can affect performance.
Monitor
Example:
PID  Runtime(ms)  Invoked   uSecs  5Sec    1Min   5Min   TTY  Process
736  6846005      11045858  619    88.09%  9.15%  3.28%  0    SNMP ENGINE
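If the check is repeated often, a line like the one in the example can be parsed offline. This is a minimal Python sketch, not part of the controller tooling: the function names and the 80% threshold are illustrative assumptions, and the sample line is copied from the example in this section.

```python
import re

# Sample line copied from the example above.
SAMPLE = "736 6846005 11045858 619 88.09% 9.15% 3.28% 0 SNMP ENGINE"

# Columns: PID, Runtime(ms), Invoked, uSecs, 5Sec, 1Min, 5Min, TTY, Process
LINE_RE = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<runtime_ms>\d+)\s+(?P<invoked>\d+)\s+(?P<usecs>\d+)\s+"
    r"(?P<five_sec>[\d.]+)%\s+(?P<one_min>[\d.]+)%\s+(?P<five_min>[\d.]+)%\s+"
    r"(?P<tty>\d+)\s+(?P<process>.+?)\s*$"
)


def parse_cpu_line(line):
    """Return the columns of one process line as a dict, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    row = match.groupdict()
    for key in ("pid", "runtime_ms", "invoked", "usecs", "tty"):
        row[key] = int(row[key])
    for key in ("five_sec", "one_min", "five_min"):
        row[key] = float(row[key])
    return row


def snmp_is_high(line, threshold_pct=80.0):
    """True when the line is the SNMP ENGINE process and its 5Sec value meets the threshold.

    threshold_pct is a hypothetical value chosen for illustration only.
    """
    row = parse_cpu_line(line)
    return bool(row) and row["process"] == "SNMP ENGINE" and row["five_sec"] >= threshold_pct


if __name__ == "__main__":
    print(parse_cpu_line(SAMPLE))
    print(snmp_is_high(SAMPLE))
```

Header lines and unrelated output do not match the pattern, so parse_cpu_line returns None for them and they can be skipped.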
Troubleshoot
Open two WLC CLI sessions to gather these logs:
Session-1:
show snmp stats hosts
debug snmp packet
debug snmp detail
terminal monitor
Session-2: These commands collect statistics over an interval. Enable service internal first in order to run them.
conf t
service internal
end
wr
test snmp cpu-stats start
show snmp cpu-stats
test snmp cpu-stats stop
Also, check the MIB that is being used for polling on the SNMP server when the issue is seen.
EEM script
Follow these steps during non-production hours.
Step 1. Run these commands:
conf t
service internal
end
wr
Step 2. Configure the EEM script for SNMP statistics (copy and paste this script into the controller CLI):
conf t
no event manager applet snmp-1
event manager applet snmp-1
event none maxrun 2000
action 10 cli command "enable"
action 11 cli command "terminal length 0"
action 11.1 puts "Script starts"
action 12 cli command "debug snmp packet"
action 13 cli command "debug snmp detail"
action 14 cli command "debug snmp request"
action 20.1 cli command "show clock"
action 21 regexp "(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) ([1-9]|0[1-9]|[1-2][0-9]|3[0-1]) (20[1-9][0-9])" "$_cli_result" time2 month day year
action 22 puts "$time2"
action 23 puts "$month"
action 24 puts "$day"
action 25 puts "$year"
action 26 cli command "show clock | append flash:/snmp-cpu-logs-$year$month$day.txt"
action 27 cli command "show snmp stats hosts | append flash:/snmp-cpu-logs-$year$month$day.txt"
action 30 cli command "test snmp cpu-stats start"
action 35 set iter 1
action 36 while $iter le 6
action 40 cli command "show snmp cpu-stats | append flash:/snmp-cpu-logs-$year$month$day.txt"
action 40.1 puts "Iterator:$iter"
action 41 wait 300
action 43 cli command "show clock | append flash:/snmp-cpu-logs-$year$month$day.txt"
action 44 increment iter 1
action 45 end
action 50 cli command "test snmp cpu-stats stop"
action 55 cli command "no debug snmp packet"
action 56 cli command "no debug snmp detail"
action 57 cli command "no debug snmp request"
action 58 puts "Script ends"
end
wr
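Action 21 in the script extracts the month, day, and year from the show clock output, and actions 26 onward use those values to build the log file name. The same pattern can be tried offline. This is a minimal Python sketch: the sample clock strings are assumed for illustration, log_file_name is a hypothetical helper, and Python regex matching is used here in place of the EEM regexp action.

```python
import re

# Same pattern as action 21 in the script.
DATE_RE = re.compile(
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"([1-9]|0[1-9]|[1-2][0-9]|3[0-1]) "
    r"(20[1-9][0-9])"
)


def log_file_name(show_clock_output):
    """Build the log file name the script would append to, or None if no date is found."""
    match = DATE_RE.search(show_clock_output)
    if match is None:
        return None
    month, day, year = match.groups()
    return f"flash:/snmp-cpu-logs-{year}{month}{day}.txt"


if __name__ == "__main__":
    # Assumed sample of show clock output, for illustration only.
    print(log_file_name("*10:15:42.123 UTC Mon Mar 4 2024"))
```

The file name therefore carries the date of the run, which is the name to look for when the log is downloaded.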
Step 3. Run the script with this command:
#event manager run snmp-1
Note: Run the script during non-business hours, while the issue is seen.
Wait for the script to finish and return the CLI prompt. It usually takes about 30 minutes.
Step 4. The script takes at most 30-40 minutes to run. It prints an “Iterator” message on each of its six passes and, after the last pass, prints “Script ends”.
Once the script ends, go to GUI > Administration > File Manager > Bootflash. Right-click the snmp-cpu-logs-<year><month><day>.txt log (the script adds the date of the run to the file name) to download it and share it with TAC.
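The runtime of about 30 minutes follows from the loop in the script: six passes with a 300-second wait each, under a maxrun limit of 2000 seconds. This is a short Python sketch of the arithmetic. The constant names are illustrative, the values are taken from the script, and the time spent in the CLI commands themselves is not modeled.

```python
# Values taken from the EEM script (while $iter le 6, wait 300, maxrun 2000).
ITERATIONS = 6
WAIT_SECONDS = 300
MAXRUN_SECONDS = 2000

# Time spent in the wait actions alone.
total_wait_seconds = ITERATIONS * WAIT_SECONDS

# Margin left for the CLI commands before maxrun stops the applet.
headroom_seconds = MAXRUN_SECONDS - total_wait_seconds

if __name__ == "__main__":
    print(f"wait time: {total_wait_seconds} s ({total_wait_seconds / 60:.0f} min)")
    print(f"headroom before maxrun: {headroom_seconds} s")
```

If the number of passes or the wait time is raised, maxrun needs to be raised with it, otherwise the applet is stopped before it prints “Script ends”.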
Step 5. Check the MIB file that is used to poll the WLC at the time of the issue.
Sample reference output:
DR            5sec%  1min%  5min%  Running(ms)  Time(usecs)  Invoked  OID
556272A00320  0.00   6.03   3.30   59           59408        44       vmMembershipSummaryEntry.2
556272A00320  50.48  9.68   4.09   59           59659        44       vmMembershipSummaryEntry.3
556272A00320  0.23   1.60   2.23   0            8333         6        clcCdpApCacheApName
556272A00320  0.19   1.62   2.24   2            6999         5        bsnDot11EssMacFiltering
556272A00320  0.23   1.60   2.23   2            3792         24       bsnDot11EssAdminStatus
556272A00320  0.23   1.60   2.23   2            4000         2        bsnDot11EssSecurityAuthType
556272A00320  0.23   1.60   2.23   2            3541         24       bsnDot11EssRowStatus
556272A00320  0.23   1.60   2.23   2            3500         2        bsnDot11EssWmePolicySet
On a C9800-40, SNMP utilization in the 70-90% range is normal.
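When the collected log is long, it helps to rank the OIDs by their 5sec% value to see which one drives the CPU. This is a minimal Python sketch that assumes the eight-column layout of the sample output in this section. The function names are hypothetical, and the sample rows are copied from that output.

```python
# First rows of the sample output above; the column layout is assumed to stay the same.
SAMPLE_OUTPUT = """\
DR 5sec% 1min% 5min% Running(ms) Time(usecs) Invoked OID
556272A00320 0.00 6.03 3.30 59 59408 44 vmMembershipSummaryEntry.2
556272A00320 50.48 9.68 4.09 59 59659 44 vmMembershipSummaryEntry.3
556272A00320 0.23 1.60 2.23 0 8333 6 clcCdpApCacheApName
556272A00320 0.19 1.62 2.24 2 6999 5 bsnDot11EssMacFiltering
"""


def parse_cpu_stats(text):
    """Parse data rows into dicts; the header and any unrelated lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8:
            continue
        try:
            five_sec, one_min, five_min = (float(v) for v in fields[1:4])
            running_ms, time_usecs, invoked = (int(v) for v in fields[4:7])
        except ValueError:
            continue  # header line or a line that is not a data row
        rows.append({
            "five_sec": five_sec,
            "one_min": one_min,
            "five_min": five_min,
            "running_ms": running_ms,
            "time_usecs": time_usecs,
            "invoked": invoked,
            "oid": fields[7],
        })
    return rows


def top_oids(text, count=3):
    """Return the rows with the highest 5sec% value, highest first."""
    rows = parse_cpu_stats(text)
    return sorted(rows, key=lambda row: row["five_sec"], reverse=True)[:count]


if __name__ == "__main__":
    for row in top_oids(SAMPLE_OUTPUT):
        print(f'{row["five_sec"]:6.2f}%  {row["oid"]}')
```

In the sample, vmMembershipSummaryEntry.3 ranks first, which points to the object to review on the SNMP server.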
Conclusion
If you use SNMP to poll different OIDs, configure this CLI as a best practice to reduce the possible impact on the C9800 CPU:
C9800(config)#snmp-server subagent cache
With this command, the cache is cleared after 60 seconds. To change the interval, use this CLI:
C9800(config)#snmp-server subagent cache timeout ?
<1-100> cache timeout interval (default 60 seconds)
To keep the SNMP process from consuming more CPU, limit the polling that the SNMP server performs with the MIB. An SNMP object identifier with a high queue time can be disabled or removed from the MIB on the SNMP server.
Here is a reference list of objects that can be disabled if they are not needed:
clcCdpApCacheApName
bsnDot11EssMacFiltering
bsnDot11EssAdminStatus
bsnDot11EssSecurityAuthType
bsnDot11EssRowStatus
bsnDot11EssWmePolicySetting
bsnMobileStationIpAddress
bsnMobileStationUserName
bsnMobileStationAPMacAddr
bsnMobileStationAPIfSlotId
bsnMobileStationEssIndex
bsnMobileStationSsid
bsnMobileStationAID
bsnMobileStationStatus
bsnAPIfDot11BeaconPeriod
bsnGlobalDot11PrivacyOptionImplemented
bsnGlobalDot11MultiDomainCapabilityImplemented
bsnGlobalDot11MultiDomainCapabilityEnabled
bsnGlobalDot11CountryIndex
bsnGlobalDot11LoadBalancing
bsnGlobalDot11bDot11gSupport
The “bsn station” objects spend additional time in the SNMP queue because they retrieve additional details.
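To compare the list of objects that a poller requests against the reference list, a simple filter is enough. This is a minimal Python sketch: the names in CANDIDATES are copied from the reference list in this section, while the polled list, the function name, and the handling of a trailing instance suffix are assumptions for illustration.

```python
# Names copied from the reference list above.
CANDIDATES = {
    "clcCdpApCacheApName", "bsnDot11EssMacFiltering", "bsnDot11EssAdminStatus",
    "bsnDot11EssSecurityAuthType", "bsnDot11EssRowStatus", "bsnDot11EssWmePolicySetting",
    "bsnMobileStationIpAddress", "bsnMobileStationUserName", "bsnMobileStationAPMacAddr",
    "bsnMobileStationAPIfSlotId", "bsnMobileStationEssIndex", "bsnMobileStationSsid",
    "bsnMobileStationAID", "bsnMobileStationStatus", "bsnAPIfDot11BeaconPeriod",
    "bsnGlobalDot11PrivacyOptionImplemented",
    "bsnGlobalDot11MultiDomainCapabilityImplemented",
    "bsnGlobalDot11MultiDomainCapabilityEnabled", "bsnGlobalDot11CountryIndex",
    "bsnGlobalDot11LoadBalancing", "bsnGlobalDot11bDot11gSupport",
}


def removal_candidates(polled_objects):
    """Return the polled objects that appear in the reference list.

    A trailing instance suffix such as ".1" is ignored for the comparison
    (an assumption about how the poller writes its object names).
    """
    return [name for name in polled_objects if name.split(".")[0] in CANDIDATES]


if __name__ == "__main__":
    # Hypothetical poller configuration, for illustration only.
    polled = ["sysUpTime", "bsnMobileStationSsid", "bsnDot11EssRowStatus.1"]
    print(removal_candidates(polled))
```

Whether an object that matches is actually safe to drop still depends on what the monitoring system needs from it.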
Tip: The best practice is to reduce the polling frequency (that is, to lengthen the poll interval) based on the number of nodes on the network, and to remove the MIBs that are not required.
Related Information
For more information on SNMP on C9800 please refer to this link: