Onboard Failure Logging

Onboard Failure Logging (OBFL) gathers boot, environmental, and critical hardware data for field-replaceable units (FRUs) and stores it in the nonvolatile memory of the FRU. This information supports troubleshooting, testing, and diagnosis when a failure or other error occurs, and it improves the accuracy of hardware troubleshooting and root cause isolation. Stored OBFL data can be retrieved after a failure and is accessible even if the card does not boot.

Because OBFL is on by default, data is collected and stored as soon as the card is installed. If a problem occurs, the data can provide information about historical environmental conditions, uptime, downtime, errors, and other operating conditions.


Note


OBFL is activated by default in all cards and cannot be disabled.


Feature History for Implementing OBFL

| Release | Modification |
|---|---|
| Release 7.0.11 | This feature was introduced. |

Prerequisites

You must be in a user group associated with a task group that includes the proper task IDs. The command reference guides include the task IDs required for each command. If you suspect user group assignment is preventing you from using a command, contact your AAA administrator for assistance.

Information About OBFL

OBFL is enabled by default. It collects and stores both baseline and event-driven information in the nonvolatile memory of each supported card. The data collected includes the following:

  • Alarms

  • Boot time

  • Field Programmable Device (FPD) Upgrade data

  • FRU part serial number

  • Temperature and voltage at boot

  • Temperature and voltage history

  • Total run time

This data is collected in two ways: as baseline data and as event-driven data.

Baseline Data Collection

Baseline data is stored independent of hardware or software failures and includes the information given in the following table.

Table 1. Data Types for Baseline Data Collection

| Data Type | Details |
|---|---|
| Installation | Chassis serial number and slot number are stored at initial boot. |
| Run-time | The total run-time that can be recorded is limited by the size of the history buffer used for logging. Run-time is measured with the local router clock at a logging granularity of 15 minutes. |
| Temperature | Information from the temperature sensors is recorded after boot. Subsequent recordings are made only for variations beyond preset thresholds. |
| Voltage | Information from the voltage sensors is recorded after boot. Subsequent recordings are made only for variations beyond preset thresholds. |
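The threshold-based recording described for temperature and voltage can be pictured with a minimal Python sketch. It is an illustration only, not the OBFL implementation: the class name `ThresholdRecorder`, the `delta` parameter, and the rule of comparing each reading against the last stored value are all assumptions, as are the sample readings (spaced 900 seconds apart to mirror the 15-minute granularity mentioned for run-time).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ThresholdRecorder:
    """Stores the first reading, then only readings that differ enough.

    Hypothetical model: `delta` stands in for a preset threshold.
    """
    delta: float
    history: List[Tuple[int, float]] = field(default_factory=list)
    _last: Optional[float] = None  # last value that was actually stored

    def observe(self, timestamp: int, value: float) -> bool:
        """Store the reading if it is the first one or moved by >= delta."""
        if self._last is None or abs(value - self._last) >= self.delta:
            self.history.append((timestamp, value))
            self._last = value
            return True
        return False


# Assumed sample data: (seconds since boot, temperature reading).
recorder = ThresholdRecorder(delta=5.0)
for t, temp in [(0, 40.0), (900, 41.5), (1800, 46.0), (2700, 47.0), (3600, 39.0)]:
    recorder.observe(t, temp)

print(recorder.history)
```

With these assumed inputs, only the boot reading and the two readings that moved at least 5.0 away from the last stored value are kept, which is why small fluctuations do not consume history space.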

Event-Driven Data Collection

Event-driven data includes card failure events such as card crashes, memory errors, ASIC resets, and similar hardware failure indications.

Table 2. Data Types for Event-Driven Data Collection

| Data Type | Details |
|---|---|
| Alarm | Major and critical alarm state changes. |
| FPD | FPD upgrade information. |
| Inventory | IDPROM information and card state changes. |
| Temperature | Inlet and hot point temperature value changes beyond the thresholds set in the hardware inventory XML files. |
| Uptime | Card uptime and location history, including the most recent time the card OBFL disk was cleared. |
| Voltage | Voltage value changes beyond the thresholds set in the hardware inventory XML files. |
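As a rough illustration of the Alarm entry in Table 2, the following Python sketch keeps only state changes for major and critical alarms. Everything here is hypothetical: the tuple layout, the function name `state_changes`, and the alarm names in the sample are invented for the example and do not reflect how OBFL represents alarms internally.

```python
from typing import Dict, Iterable, List, Tuple

# Severities that Table 2 says are logged.
LOGGED_SEVERITIES = {"major", "critical"}

# Hypothetical shape: (alarm name, severity, state).
Observation = Tuple[str, str, str]


def state_changes(observations: Iterable[Observation]) -> List[Observation]:
    """Return major/critical observations whose state differs from the last one seen."""
    last_state: Dict[str, str] = {}
    logged: List[Observation] = []
    for name, severity, state in observations:
        if severity not in LOGGED_SEVERITIES:
            continue  # minor alarms are ignored in this sketch
        if last_state.get(name) != state:
            logged.append((name, severity, state))
            last_state[name] = state
    return logged


# Invented alarm names, for illustration only.
sample = [
    ("fan-speed", "minor", "set"),
    ("temp-high", "major", "set"),
    ("temp-high", "major", "set"),    # repeated state, not a change
    ("temp-high", "major", "clear"),
    ("power-loss", "critical", "set"),
]
events = state_changes(sample)
print(events)
```

The repeated `set` observation is dropped because it is not a state change, and the minor alarm never qualifies.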

Supported Cards and Platform

FRUs that have sufficient nonvolatile memory available for OBFL data storage support OBFL. The following table provides information about the OBFL support for different FRUs on the Cisco 8000 Series router.

Table 3. OBFL Support on Cisco 8000 Series Router

| Card Type | Cisco 8000 Series Router |
|---|---|
| Route processor | Supported |
| Fabric cards | Supported |
| Line card | Supported |
| Power supply cards | Not Supported |
| Fan tray | Not Supported |

Monitoring and Maintaining OBFL

Use the commands described in this section to display the status of OBFL and the data that OBFL has collected. Enter these commands in EXEC mode.

Procedure

Step 1

Example:

Router# show logging onboard uptime

Purpose: Displays stored OBFL data for all nodes or for a specified node.

See the Onboard Failure Logging Commands module in the System Monitoring Command Reference for Cisco 8000 Series Routers.

Step 2

Command or Action: show process | include obfl

Example:

Router# show process | include obfl

Purpose: Confirms that the OBFL environmental monitor process is operating.

Step 3

Command or Action: show process obflmgr

Example:

Router# show process obflmgr

Purpose: Displays details about the OBFL manager process.
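When command output has been captured to a file or a string, the effect of the `| include obfl` filter in Step 2 can be approximated offline. The Python sketch below is an assumption-laden stand-in, not a description of the router's own filter: the function name `include` is hypothetical, and the sample text is placeholder content rather than real router output.

```python
import re
from typing import List


def include(text: str, pattern: str) -> List[str]:
    """Return the lines of `text` that contain a match for `pattern`."""
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]


# Placeholder text only; this is NOT real `show process` output.
captured = (
    "first placeholder line\n"
    "placeholder line naming obflmgr\n"
    "last placeholder line"
)
matches = include(captured, "obfl")
print(matches)
```

An empty result from such a filter would suggest that no line in the captured text mentions an OBFL process, which is the condition Step 2 is meant to detect.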

Clearing OBFL Data

To erase all OBFL data on a specific card, use the following command:

clear logging onboard [location node-id]


Caution


The clear logging onboard command permanently deletes all OBFL data for a node. Do not clear the OBFL logs without a specific reason, because OBFL data is used to diagnose and resolve problems in FRUs.
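Because the deletion is permanent, automation that issues this command may benefit from an explicit confirmation step. The following Python sketch only assembles the command string from the syntax shown above; the function name `build_clear_command`, the `confirmed` flag, and the node ID used in the usage lines are assumptions for illustration, and the sketch does not connect to a router.

```python
from typing import Optional


def build_clear_command(node_id: Optional[str] = None, *, confirmed: bool = False) -> str:
    """Build a `clear logging onboard` command string.

    Raises ValueError unless the caller explicitly passes confirmed=True,
    as a guard against accidental permanent deletion.
    """
    if not confirmed:
        raise ValueError("clearing OBFL data is permanent; pass confirmed=True")
    command = "clear logging onboard"
    if node_id:
        # Optional keyword from the documented syntax: [location node-id]
        command += f" location {node_id}"
    return command


# "0/RP0/CPU0" is a placeholder node ID chosen for this example.
print(build_clear_command("0/RP0/CPU0", confirmed=True))
```

Calling the function without `confirmed=True` raises an error instead of returning a command, so a script cannot build the destructive command by accident.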



Note


The obflmgr process automatically removes old log files to make room for new log files as needed. No manual intervention is required to free up OBFL disk space.
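The source does not state which policy obflmgr uses to choose files for removal. As one plausible model, the Python sketch below evicts the oldest files first until a new file fits within a fixed byte budget. The class name `LogStore`, the capacity figure, and the file names are all hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


class LogStore:
    """Toy model of a fixed-size log area with oldest-first eviction (assumed policy)."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.files: Deque[Tuple[str, int]] = deque()  # oldest entry on the left

    def add(self, name: str, size: int) -> List[str]:
        """Add a file, removing the oldest files as needed; return removed names."""
        if size > self.capacity_bytes:
            raise ValueError("file is larger than the total capacity")
        removed: List[str] = []
        while self.used_bytes + size > self.capacity_bytes:
            old_name, old_size = self.files.popleft()
            self.used_bytes -= old_size
            removed.append(old_name)
        self.files.append((name, size))
        self.used_bytes += size
        return removed


# Invented file names and sizes, for illustration only.
store = LogStore(capacity_bytes=100)
store.add("log-a", 40)
store.add("log-b", 40)
evicted = store.add("log-c", 40)
print(evicted, list(store.files))
```

In this model the third file does not fit alongside the first two, so the oldest file is removed automatically and no manual cleanup is needed.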


For more information, see the Onboard Failure Logging Commands module in the System Monitoring Command Reference for Cisco 8000 Series Routers.