This document describes Embedded Event Manager (EEM) applets that are used in networks where Performance Routing (PfR) optimizes traffic through multiple Border Relays (BRs). Some forwarding loops are also observed. The applets are used in order to collect data when a loop is observed and mitigate the impact of a forwarding loop.
There are no specific requirements for this document.
The information in this document is based on Cisco IOS® software that supports EEM Version 4.0.
In order to check the EEM version supported by your Cisco IOS release, use this command:
Router#sh event manager version | i Embedded
Embedded Event Manager Version 4.00
Router#
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
When PfR controls a Traffic Class (TC), it creates a dynamic route map/Access Control List (ACL) on the BRs. The route map on a BR with a selected exit points to a selected exit, while a route map on other BRs points to an internal interface (next-hop = selected BR).
A problem occurs when the dynamic ACLs are not synched properly between the different BRs (due to bugs, for example).
In this picture, the focus is on TC matching any IP packets destined to 172.16.1.0/24 with DSCP EF. In this scenario, the related ACL entry is removed from the selected BR (BR-2), but not from BR-1. Packets of that TC hit on BR-2 with the prefix entry that matches all IP packets destined to 172.16.1.0/24. The selected exit for the prefix entry is Exit-1, so the related route-map/ACL on BR-2 points to BR-1.
The packets of that TC now loop between the BRs untill the Time To Live (TTL) reaches 0.
This document provides the necessary EEM applets used in order to:
The applets used in the case of a Master Controller (MC)/BR combo are much easier (when MC runs on one of the BRs). The scenario with dedicated MCs is also covered.
This section describes the Access-lists used for this process, as well as Applet Log files.
In order to detect forwarding loops, the applet relies on an ACL to match packets with low TTL.
It is recommended to use ACE matching on 2x consecutive, relatively low, TTL values (20 and 21) in order to get one (and only one) hit for each packet that loops between BRs. The TTL value used should not be too low in order to avoid frequent hits from traceroute packets.
interface gig0/0 (internal interface)
ip access-group LOOP in
!
ip access-list extended LOOP
permit ip 10.116.48.0 0.0.31.255 any ttl range 20 21
permit ip any any
The ACL should be placed on the internal interface reported in the show pfr master border topology command output.
The source IP range (here 10.116.48.0/20) should match the internal network(s) (prefixes reachable via internal interfaces).
In order to identify which remote site/TC is impacted by the loop, you can add a second ACL outbound on the interface, with more specific ACEs for each remote site/TC.
interface gig0/0 (internal interface)
ip access-group LOOP-DETAIL out
!
ip access-list extended LOOP-DETAIL
permit ip 10.116.48.0 0.0.31.255 10.116.132.0 0.0.0.255 ttl range 20 21
permit ip 10.116.48.0 0.0.31.255 10.116.128.0 0.0.0.255 ttl range 20 21
.... (add here one line per remote site)
permit ip any an
The destination IP matches the subnet in the different remote sites:
10.116.132.0/24 -> site-1
10.116.128.0/24 -> site-2
You can also add several lines per remote site if you need to identify the exact TC impacted by the loop.
The applet checks the hitcounts of the ACE matching on the TTL in the ACL loop every thirty seconds. Based on the outcome of these checks, the applet might perform these tasks:
The applet maintains a log file that keeps track of the number of hitcounts (when the count is greater than 0), and any encounters of temporary loops (when THRESHOLD_1 is exceeded but not THRESHOLD_2) or a real loop (when both THRESHOLD_1 and THRESHOLD_2 are exceeded).
These are the simplest scenarios described in this document. Loop detection and PfR clearing are done on the same device, so there is no need to enter device EEM applet communication. A separate applet runs on a MC/BR combo and other BRs.
This output displays important information for the applet used on MC/BR combo. Here are some important notes for this specific output:
event manager environment THRESHOLD_1 1000
event manager environment THRESHOLD_2 500
event manager environment DISK bootflash
!
event manager applet LOOP-MON authorization bypass
event timer watchdog name LOOP time 30
action 100 cli command "enable"
action 110 cli command "show ip access-list LOOP"
action 120 set regexp_substr 0
action 130 regexp "range 20 21 \(([0-9]+) matches\)"
$_cli_result _regexp_result regexp_substr
action 140 cli command "clear ip access-list counters LOOP"
action 150 if $regexp_substr gt 0
action 200 set MATCHES $regexp_substr
action 210 file open LOGS $DISK:script-logs.txt a
action 220 cli command "enable"
action 230 cli command "show clock"
action 240 regexp "[0-9]+:[0-9]+:[0-9]+.[0-9]+ est [A-Za-z]+
[A-Za-z]+ [0-9]+ 201[0-9]" $_cli_result _regexp_result
action 250 set TIME $_regexp_result
action 260 if $MATCHES gt $THRESHOLD_1
action 270 wait 15
action 280 cli command "show ip access-list LOOP"
action 290 set regexp_substr 0
action 300 regexp "range 20 21 \(([0-9]+) matches\)"
$_cli_result _regexp_result regexp_substr
action 310 if $regexp_substr gt $THRESHOLD_2
action 320 cli command "enable"
action 330 cli command "show ip access-list LOOP-DETAIL
| tee /append $DISK:script-output-$_event_pub_sec.txt"
action 340 cli command "show pfr master traffic-class perf det
| tee /append $DISK:script-output-$_event_pub_sec.txt"
action 350 cli command "show route-map dynamic detail
| tee /append $DISK:script-output-$_event_pub_sec.txt"
action 360 cli command "show ip route
| tee /append $DISK:script-output-$_event_pub_sec.txt"
action 370 cli command "clear pfr master *"
action 380 cli command "clear ip access-list counters LOOP-DETAIL"
action 390 file puts LOGS "$TIME - LOOP DETECTED - PfR CLEARED -
matches $MATCHES > $THRESHOLD_1 and $regexp_substr
> $THRESHOLD_2 - see $DISK:script-output-$_event_pub_sec.txt"
action 400 syslog priority emergencies msg "LOOP DETECTED -
PfR CLEARED - see $DISK:script-output-$_event_pub_sec.txt !"
action 410 else
action 420 file puts LOGS "$TIME - TEMPORARY LOOP : matches
$MATCHES > $THRESHOLD_1 and $regexp_substr < or = $THRESHOLD_2"
action 430 cli command "clear ip access-list counters LOOP-DETAIL"
action 440 end
action 450 else
action 460 cli command "en"
action 470 cli command "clear ip access-list counters LOOP-DETAIL"
action 480 file puts LOGS "$TIME - number of matches =
$MATCHES < $THRESHOLD_1"
action 490 end
action 500 else
action 510 cli command "clear ip access-list counters LOOP-DETAIL"
action 520 end
This section describes the applet used for other BRs. Here are some important notes for this specific output:
event manager environment THRESHOLD 700
event manager environment DISK flash 0
!
event manager applet LOOP-BR authorization bypass
event timer watchdog name LOOP time 20
action 100 cli command "enable"
action 110 cli command "show ip access-list LOOP"
action 120 set regexp_substr 0
action 130 regexp "range 20 21 \(([0-9]+) matches\)"
$_cli_result _regexp_result regexp_substr
action 140 cli command "clear ip access-list counters LOOP"
action 150 if $regexp_substr gt 0
action 160 set MATCHES $regexp_substr
action 170 file open LOGS $DISK:script-logs.txt a
action 180 cli command "show clock"
action 190 regexp "[0-9]+:[0-9]+:[0-9]+.[0-9]+
est [A-Za-z]+ [A-Za-z]+ [0-9]+ 201[0-9]" $_cli_result _regexp_result
action 200 set TIME $_regexp_result
action 210 if $MATCHES gt $THRESHOLD
action 220 cli command "enable"
action 230 cli command "show route-map dynamic detail | tee /append
$DISK:script-output-$_event_pub_sec.txt"
action 240 cli command "show ip route | tee /append
$DISK:script-output-$_event_pub_sec.txt"
action 250 file puts LOGS "$TIME : matches = $MATCHES >
$THRESHOLD - see $DISK:script-output-$_event_pub_sec.txt"
action 260 syslog priority emergencies msg "LOOP DETECTED -
Outputs captured - see $DISK:script-output-$_event_pub_sec.txt !"
action 270 else
action 280 file puts LOGS "$TIME : matches = $MATCHES < or = $THRESHOLD"
action 290 end
action 300 end
The loop detection and PfR clearing/stats collection is completed on different devices that must have inter-device EEM applet communication. The communication between the devices occurs in different ways. This document describes device communication via objects tracked in order to check reachability of dedicated loopbacks advertised in IGP. When an event is detected, the loopback is shut down, which allows applets on remote devices to launch when the object that is tracked goes offline. You can use different loopbacks if different information needs to be exchanged.
These applets and communication methods are used:
Applet Name | Where ? | What ? | Trigger ? | Communication ? |
LOOP-BR | BRs | Check ACL hitcounts in order to detect loops |
Periodic | shut Loop100 |
LOOP-MC | MC | - Collect PfR infos - Clears PfR |
Track reachability Loop100 | shut Loop200 |
COLLECT-BR | BRs | Collect Information | Track reachability Loop200 | none |
Here is an image that illustrates this:
This is the process used by the applets:
This section describes how to create loopbacks (ensure the IPs are advertised on the IGP) and track objects.
Here are some important points to keep in mind when you create track objects:
BR-1
interface Loopback100
ip address 10.100.100.1 255.255.255.255
!
track timer ip route 1
track 1 ip route 10.100.100.200 255.255.255.255 reachability
BR-2
interface Loopback100
ip address 10.100.100.2 255.255.255.255
!
track timer ip route 1
track 1 ip route 10.100.100.200 255.255.255.255 reachability
MC
interface Loopback200
ip address 10.100.100.200 255.255.255.255
!
track timer ip route 1
track 1 ip route 10.100.100.1 255.255.255.255 reachability
track 2 ip route 10.100.100.2 255.255.255.255 reachability
track 11 ip route 10.116.100.1 255.255.255.255 reachability
track 12 ip route 10.116.100.2 255.255.255.255 reachability
track 20 list boolean and
object 11
object 12
LOOP-BR
This section describes how to create loopbacks on the BRs. Here are some important notes to keep in mind:
event manager environment THRESHOLD_1 100event manager environment
THRESHOLD_2 500event manager environment DISK bootflash
!event manager applet LOOP-BR authorization bypass
event timer watchdog name LOOP time 30 maxrun 27
action 100 cli command "enable"
action 110 cli command "show ip access-list LOOP"
action 120 set regexp_substr 0
action 130 regexp "range 20 21 \(([0-9]+) matches\)"
$_cli_result _regexp_result regexp_substr
action 140 cli command "clear ip access-list counters LOOP"
action 150 if $regexp_substr gt 0
action 200 set MATCHES $regexp_substr
action 210 file open LOGS $DISK:script-detect-logs.txt a
action 220 cli command "enable"
action 230 cli command "show clock"
action 240 regexp "[0-9]+:[0-9]+:[0-9]+.[0-9]+
est [A-Za-z]+ [A-Za-z]+ [0-9]+ 201[0-9]"
$_cli_result _regexp_result
action 250 set TIME $_regexp_result
action 260 if $MATCHES gt $THRESHOLD_1
action 270 wait 15
action 280 cli command "show ip access-list LOOP"
action 290 set regexp_substr 0
action 300 regexp "range 20 21 \(([0-9]+) matches\)"
$_cli_result _regexp_result regexp_substr
action 310 if $regexp_substr gt $THRESHOLD_2
action 320 cli command "enable"
action 330 cli command "conf t"
action 340 cli command "interface loop100"
action 350 cli command "shut"
action 360 file puts LOGS "$TIME - LOOP DETECTED - Message sent to MC -
matches $MATCHES > $THRESHOLD_1 and $regexp_substr > $THRESHOLD_2"
action 370 wait 5
action 375 cli command "enable"
action 380 cli command "conf t"
action 390 cli command "interface loop100"
action 400 cli command "no shut"
action 410 else
action 420 file puts LOGS "$TIME - TEMPORARY LOOP : matches $MATCHES >
$THRESHOLD_1 and $regexp_substr < or = $THRESHOLD_2"
action 430 cli command "clear ip access-list counters LOOP-DETAIL"
action 440 end
action 450 else
action 460 cli command "en"
action 470 cli command "clear ip access-list counters LOOP-DETAIL"
action 480 file puts LOGS "$TIME - number of matches =
$MATCHES < $THRESHOLD_1"
action 490 end
action 500 else
action 510 cli command "clear ip access-list counters LOOP-DETAIL"
action 520 end
LOOP-MC
This section describes how to create loopbacks on the MC. Here are some important notes to keep in mind:
event manage environment DISK bootflash
event manager applet LOOP-MC authorization bypass
event syslog pattern "10.100.100.[0-9]/32 reachability Up->Dow" ratelimit 60
action 100 file open LOGS $DISK:script-logs.txt a
action 110 regexp "10.100.100.[0-9]" "$_syslog_msg" _regexp_result
action 120 set BR $_regexp_result
action 130 wait 2
action 140 track read 20
action 150 if $_track_state eq "up"
action 160 cli command "enable"
action 170 cli command "show clock"
action 180 regexp "[0-9]+:[0-9]+:[0-9]+.[0-9]+
est [A-Za-z]+ [A-Za-z]+ [0-9]+ 201[0-9]"
"$_cli_result" _regexp_result
action 190 set TIME "$_regexp_result"
action 200 cli command "show pfr master traffic-class perf det
| tee /append $DISK:script-output-$_event_pub_sec.txt"
action 210 cli command "conf t"
action 220 cli command "interface loop200"
action 230 cli command "shut"
action 240 wait 10
action 250 cli command "conf t"
action 260 cli command "interface loop200"
action 270 cli command "no shut"
action 280 cli command "end"
action 290 cli command "clear pfr master *"
action 300 file puts LOGS "$TIME - LOOP DETECTED by $BR -
PfR CLEARED - see $DISK:script-output-$_event_pub_sec.txt"
action 310 syslog priority emergencies msg "LOOP DETECTED by $BR -
PfR CLEARED - see $DISK:script-output-$_event_pub_sec.txt !"
action 320 else
action 330 file puts LOGS "$TIME - REACHABILITY LOST with
$BR - REACHABILITY TO ALL BRs NOT OK - NO ACTION"
action 340 end
COLLECT-BR
This section describes how to collect the BR. The applet launches in when a BR looses reachability to Loopback200 (10.100.100.200) on MC. The commands used in order to collect are listed in actions 120, 130, and 140.
event manager environment DISK bootflash
event manager applet COLLECT-BR authorization bypass
event syslog pattern "10.100.100.200/32 reachability Up->Dow" ratelimit 45
action 100 file open LOGS $DISK:script-collect-logs.txt a
action 110 cli command "enable"
action 120 cli command "sh ip access-list LOOP-DETAIL |
tee /append $DISK:script-output-$_event_pub_sec.txt"
action 130 cli command "show route-map dynamic detail
| tee /append $DISK:script-output-$_event_pub_sec.txt"
action 140 cli command "show ip route | tee /append
$DISK:script-output-$_event_pub_sec.txt"
action 150 cli command "show clock"
action 160 regexp "[0-9]+:[0-9]+:[0-9]+.[0-9]+ CET [A-Za-z]+ [A-Za-z]+
[0-9]+ 201[0-9]" "$_cli_result" _regexp_result
action 170 set TIME "$_regexp_result"
action 180 file puts LOGS "$TIME - OUTPUTs COLLECTED -
see $DISK:script-output-$_event_pub_sec.txt"
SYSLOG-MC
Here is the syslog on MC when a loop is detected:
MC#
*Mar 8 08:52:12.529: %TRACKING-5-STATE: 1 ip route 10.100.100.1/32
reachability Up->Down
MC#
*Mar 8 08:52:16.683: %LINEPROTO-5-UPDOWN:
Line protocol on Interface Loopback200, changed state to down
*Mar 8 08:52:16.683: %LINK-5-CHANGED: Interface Loopback200,
changed state to administratively down
MC#
*Mar 8 08:52:19.531: %TRACKING-5-STATE: 1
ip route 10.100.100.1/32 reachability Down->Up
MC#
*Mar 8 08:52:24.727: %SYS-5-CONFIG_I: Configured from console by
on vty0 (EEM:LOOP-MC)
*Mar 8 08:52:24.744: %PFR_MC-1-ALERT: MC is inactive due to PfR
minimum requirements not met;
Less than two external interfaces are operational
MC#
*Mar 8 08:52:24.757: %HA_EM-0-LOG: LOOP-MC:
LOOP DETECTED by 10.100.100.1 - PfR CLEARED
- see unix:script-output-1362732732.txt !
MC#
*Mar 8 08:52:26.723: %LINEPROTO-5-UPDOWN:
Line protocol on Interface Loopback200, changed state to up
MC#
*Mar 8 08:52:26.723: %LINK-3-UPDOWN: Interface Loopback200,
changed state to up
MC#
*Mar 8 08:52:29.840: %PFR_MC-5-MC_STATUS_CHANGE: MC is UP
*Mar 8 08:52:30.549: %TRACKING-5-STATE: 2
ip route 10.100.100.2/32 reachability Up->Down
MC#
*Mar 8 08:52:37.549: %TRACKING-5-STATE: 2
ip route 10.100.100.2/32 reachability Down->Up
MC#
Revision | Publish Date | Comments |
---|---|---|
1.0 |
27-May-2013 |
Initial Release |