Question
How can the current failure rate for an alarm be determined at any time after the initial trigger? The initial failure rate is easy to find, but how would one find the current failure rate, possibly hours later?
Answer
While the initial SNMP trap, log entry, and alarm record the initial failure rate of a monitored threshold, it may not be obvious how to get the current failure rate. The "Outstanding alarms" section at the bottom of the "show threshold" output reports, for each outstanding alarm, the failure rate measured in the last polling interval (the "Last Measured" field).
[local]PDSN> show threshold
Friday June 12 04:39:32 UTC 2015
Outstanding alarms:
Threshold Name: aaa-acct-failure-rate
Alarm Source: System
Last Measured: 77%
Raise Time: 2015-Jun-11+22:15:05
[local]PDSN> show thresh
Friday June 12 05:34:04 UTC 2015
Outstanding alarms:
Threshold Name: aaa-acct-failure-rate
Alarm Source: System
Last Measured: 65%
Raise Time: 2015-Jun-11+22:15:05
[local]PDSN> show thresh
Friday June 12 06:06:07 UTC 2015
Outstanding alarms:
Threshold Name: aaa-acct-failure-rate
Alarm Source: System
Last Measured: 61%
Raise Time: 2015-Jun-11+22:15:05
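When the current rate needs to be tracked over several hours, the "Outstanding alarms" block shown above can be pulled out of captured CLI text with a small script. The following is a minimal sketch only, assuming the output has already been saved as text and that the field labels match the samples above; the function name parse_outstanding_alarms and the SAMPLE string are hypothetical and not part of any Cisco tool.

```python
# Hypothetical helper: extracts outstanding-alarm fields from saved
# "show threshold" output. Field labels are taken from the samples above.
SAMPLE = """\
[local]PDSN> show threshold
Friday June 12 04:39:32 UTC 2015
Outstanding alarms:
Threshold Name: aaa-acct-failure-rate
Alarm Source: System
Last Measured: 77%
Raise Time: 2015-Jun-11+22:15:05
"""

FIELDS = ("Alarm Source", "Last Measured", "Raise Time")


def parse_outstanding_alarms(text):
    """Return one dict per 'Threshold Name' entry found in the text."""
    alarms = []
    current = None
    for line in text.splitlines():
        # Split on the first colon only, so timestamps in values stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Threshold Name":
            current = {"name": value}
            alarms.append(current)
        elif current is not None and key in FIELDS:
            current[key] = value
    return alarms


for alarm in parse_outstanding_alarms(SAMPLE):
    print(alarm["name"], alarm["Last Measured"], alarm["Raise Time"])
```

Values are kept as strings because other thresholds may not be reported as percentages. Running the same parser over each captured output gives the "Last Measured" trend while the "Raise Time" stays constant.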
[local]PDSN> show alarm outstanding verbose
Friday June 12 04:41:28 UTC 2015
Severity Object     Timestamp                          Alarm ID
-------- ---------- ---------------------------------- ---------------------
Alarm Details
--------------------------------------------------------------------------------
Minor    Chassis    Thursday June 11 22:15:05 U        5770524519230406656
<28:aaa-acct-failure-rate> has reached or exceeded the configured threshold <25%>, the measured value is <32%>. It is detected at <System>.
2015-Jun-11+22:15:05.418 [alarmctrl 65201 info] [8/0/5185 <evlogd:0> alarmctrl.c:192] [software internal system critical-info syslog] Alarm condition: id 5015057a08690000 (Minor): <28:aaa-acct-failure-rate> has reached or exceeded the configured threshold <25%>, the measured value is <32%>. It is detected at <System>.
Thu Jun 11 22:15:05 2015 Internal trap notification 222 (ThreshAAAAcctFailRate) threshold 25% measured value 32%
Issue clears:
2015-Jun-12+07:15:05.210 [alarmctrl 65200 info] [8/0/5185 <evlogd:0> alarmctrl.c:285] [software internal system critical-info syslog] Alarm cleared: id 5015057a08690000: <28:aaa-acct-failure-rate> has reached or exceeded the configured threshold <25%>, the measured value is <32%>. It is detected at <System>.
Fri Jun 12 07:15:05 2015 Internal trap notification 223 (ThreshClearAAAAcctFailRate) threshold 20% measured value 0%
[local]PDSN> show threshold
Friday June 12 13:45:26 UTC 2015
...
No outstanding alarm
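The raise and clear traps shown earlier carry only the value measured at the moment they were generated, which is why they do not reflect the rate in between. As an illustration, the sketch below extracts the threshold and measured value from such trap lines. It assumes the trap text follows exactly the format of the two lines in this note; the regular expression and the parse_trap name are hypothetical.

```python
import re

# Pattern modeled on the two trap lines in this note; other trap types
# may be formatted differently (assumption).
TRAP_RE = re.compile(
    r"Internal trap notification (?P<id>\d+) \((?P<name>\w+)\) "
    r"threshold (?P<threshold>\d+)% measured value (?P<measured>\d+)%"
)


def parse_trap(line):
    """Return trap id, name, threshold and measured value, or None."""
    m = TRAP_RE.search(line)
    if m is None:
        return None
    return {
        "id": int(m.group("id")),
        "name": m.group("name"),
        "threshold": int(m.group("threshold")),
        "measured": int(m.group("measured")),
    }


RAISE = ("Thu Jun 11 22:15:05 2015 Internal trap notification 222 "
         "(ThreshAAAAcctFailRate) threshold 25% measured value 32%")
CLEAR = ("Fri Jun 12 07:15:05 2015 Internal trap notification 223 "
         "(ThreshClearAAAAcctFailRate) threshold 20% measured value 0%")

for line in (RAISE, CLEAR):
    print(parse_trap(line))
```

In this example the raise trap reports 32% against a 25% threshold and the clear trap reports 0% against a 20% threshold; the intermediate values (77%, 65%, 61%) appear only in the "show threshold" output.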