Configuring Online Diagnostics

Information About Configuring Online Diagnostics

With online diagnostics, you can test and verify the hardware functionality of a device while the device is connected to a live network. Online diagnostics contains packet-switching tests that check different hardware components and verify the data path and control signals.

Online diagnostics detects problems in these areas:

  • Hardware components

  • Interfaces (Ethernet ports and so forth)

  • Solder joints

Online diagnostics are categorized as on-demand, scheduled, or health-monitoring diagnostics. On-demand diagnostics run from the CLI; scheduled diagnostics run at user-designated intervals or at specified times when the device is connected to a live network; and health-monitoring diagnostics run in the background at user-defined intervals. Each health-monitoring test runs every 90, 100, or 150 seconds, depending on the test.
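As a rough illustration of what these intervals mean in practice, the following Python sketch tabulates the health-monitoring intervals stated in the individual test descriptions in this chapter and computes how many runs per hour each interval implies. The dictionary and function names are illustrative only and are not part of any Cisco tool.

```python
# Health-monitoring intervals in seconds, as stated in the test
# descriptions in this chapter. The names used here are illustrative.
HEALTH_MONITORING_INTERVALS = {
    "TestThermal": 90,
    "TestScratchRegister": 90,
    "TestConsistencyCheck": 90,
    "TestFantray": 100,
    "TestPortTxMonitoring": 150,
}


def runs_per_hour(interval_seconds):
    """Return how many complete runs fit into one hour."""
    return 3600 // interval_seconds


if __name__ == "__main__":
    for test, interval in HEALTH_MONITORING_INTERVALS.items():
        print(f"{test}: every {interval} s, {runs_per_hour(interval)} runs per hour")
```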

After you configure online diagnostics, you can manually start diagnostic tests or display the test results. You can also see which tests are configured for the device and the diagnostic tests that have already run.

Generic Online Diagnostics (GOLD) Tests


Note


  • Before you enable online diagnostics tests, enable console logging to see all the warning messages.

  • While tests are running, all the ports are shut down because a stress test is performed with the ports looped internally, and external traffic might affect the test results. To return the switch to normal operation, reboot it. When you run the command to reload the switch, the system asks whether the configuration should be saved. Do not save the configuration.

  • If you run tests on other modules, you must reset the module after the test has been initiated and has completed.


The following sections provide information about GOLD tests.

TestGoldPktLoopback

This GOLD packet loopback test verifies the MAC-level loopback functionality. In this test, a GOLD packet, for which the Unified Access Data Plane (UADP) ASIC provides hardware support, is sent. The packet loops back at the MAC level and is matched against the stored packet.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Run this on-demand test as per requirement.

Default

Off.

Initial release

Cisco IOS XE Gibraltar 16.11.1.

Corrective action

Displays a syslog message if the test fails for a port.

Hardware support

All line cards. Not supported on supervisor engines.

TestOBFL

This test verifies the Onboard Failure Logging (OBFL) capabilities. During this test, a diagnostic message is logged to OBFL.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Run this on-demand test as per requirement.

Default

Off.

Initial release

Cisco IOS XE Gibraltar 16.11.1.

Corrective action

Displays a syslog message if the test fails for a port.

Hardware support

All line cards and supervisor engines.

TestFantray

This test verifies that a fan tray has been inserted and is working properly on the board. This test runs every 100 seconds.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Do not disable. This can be run as a health-monitoring test and as an on-demand test.

Default

On.

Initial release

Cisco IOS XE Gibraltar 16.11.1.

Corrective action

Displays a syslog message if the fan tray is not present, or if any of the fans fail.

Hardware support

Only supervisor engines.

TestPhyLoopback

This PHY loopback test verifies the PHY-level loopback functionality. In this test, a packet is sent, looped back at the PHY level, and matched against the stored packet. This test cannot be run as a health-monitoring test.

Attribute

Description

Disruptive or Nondisruptive

Disruptive.

Recommendation

Run this as an on-demand test as per requirement.

Default

Off.

Initial release

Cisco IOS XE Amsterdam 17.1.1.

Corrective action

Displays a syslog message if the test fails for any port.

Hardware support

Only on the C9600-LC-48TX line card.

TestThermal

This test verifies that the temperature reading from a device sensor is below the yellow temperature threshold. This test runs every 90 seconds.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Do not disable. Run this as an on-demand test and a health-monitoring test.

Default

On.

Initial release

Cisco IOS XE Gibraltar 16.11.1.

Corrective action

Displays a syslog message if the test fails.

Hardware support

All line cards and supervisor engines.

TestScratchRegister

This Scratch Register test monitors the health of ASICs by writing values into registers and reading back the values from these registers. This test runs every 90 seconds.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Do not disable. This can be run as a health-monitoring test and also as an on-demand test.

Default

On.

Initial release

Cisco IOS XE Gibraltar 16.11.1.

Corrective action

Displays a syslog message if the test fails.

Hardware support

Only supervisor engines.
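The write-and-read-back idea behind TestScratchRegister can be pictured with a small simulation. The Python sketch below is not the switch's implementation: the register class, its 32-bit width, the fault model, and the bit patterns are all assumptions chosen for illustration.

```python
class SimulatedRegister:
    """A stand-in for a hardware scratch register (hypothetical, 32 bits wide)."""

    def __init__(self, stuck_at_zero_mask=0):
        self._value = 0
        # Bits set in this mask always read back as 0, to simulate a fault.
        self._stuck_mask = stuck_at_zero_mask

    def write(self, value):
        self._value = value & 0xFFFFFFFF & ~self._stuck_mask

    def read(self):
        return self._value


# Illustrative patterns: all zeros, all ones, and two alternating-bit patterns.
PATTERNS = (0x00000000, 0xFFFFFFFF, 0xAAAAAAAA, 0x55555555)


def scratch_register_ok(register):
    """Write each pattern and confirm that the same value is read back."""
    for pattern in PATTERNS:
        register.write(pattern)
        if register.read() != pattern:
            return False
    return True
```

A healthy simulated register passes every pattern, while a register created with a nonzero stuck-at-zero mask fails as soon as a pattern sets one of the faulty bits.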

TestConsistencyCheck

This test checks if the hardware programming is correct. It checks with the forwarding object manager to identify incomplete entries or long-pending configurations to hardware. This test runs every 90 seconds.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Do not disable. This can be run as a health-monitoring test and also as an on-demand test.

Default

On.

Initial release

Cisco IOS XE Amsterdam 17.2.1.

Corrective action

Displays a syslog message if the test fails.

Hardware support

Only supervisor engines.

TestPortTxMonitoring

This test monitors the transmit counters of a connected interface. It verifies whether a connected port is able to send packets. This test runs every 150 seconds.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Do not disable. This can be run as a health-monitoring test and also as an on-demand test.

Default

On.

Initial release

Cisco IOS XE Gibraltar 16.11.1.

Corrective action

Displays a syslog message if the test fails for a port.

Hardware support

All line cards. Not supported on supervisor engines.
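One way to picture transmit-counter monitoring is to compare two samples of the counters and flag ports whose count has not increased. The Python sketch below assumes exactly that comparison; this chapter does not describe the actual logic of TestPortTxMonitoring, and the function name and interface names are hypothetical.

```python
def ports_not_transmitting(previous, current):
    """Return ports whose transmit counter did not increase between samples.

    Both arguments map an interface name to its transmit packet count.
    Ports that appear only in the current sample are ignored, because
    there is no earlier value to compare against.
    """
    stalled = []
    for port, count in current.items():
        if port in previous and count <= previous[port]:
            stalled.append(port)
    return sorted(stalled)


if __name__ == "__main__":
    # Hypothetical counter samples taken 150 seconds apart.
    earlier = {"TenGigabitEthernet1/0/1": 100, "TenGigabitEthernet1/0/2": 50}
    later = {"TenGigabitEthernet1/0/1": 180, "TenGigabitEthernet1/0/2": 50}
    print(ports_not_transmitting(earlier, later))
```

A port that has nothing to send would also show an unchanged counter, so a comparison like this is only a starting point for investigation.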

How to Configure Online Diagnostics

The following sections provide information about the various procedures that comprise the online diagnostics configuration.

Starting Online Diagnostic Tests

After you configure diagnostic tests to run on a device, use the diagnostic start privileged EXEC command to begin diagnostic testing.

After starting the tests, you cannot stop the testing process midway.

Use the diagnostic start module privileged EXEC command to manually start online diagnostic testing:

Procedure

Command or Action

Purpose

diagnostic start module number test {name | test-id | test-id-range | all | basic | complete | minimal | non-disruptive | per-port}

Example:



Device# diagnostic start module 2 test basic

Starts the diagnostic tests.

You can specify the tests by using one of these options:

  • name : Enters the name of the test.

  • test-id : Enters the ID number of the test.

  • test-id-range : Enters the range of test IDs by using integers separated by a comma and a hyphen.

  • all : Starts all of the tests.

  • basic : Starts the basic test suite.

  • complete : Starts the complete test suite.

  • minimal : Starts the minimal bootup test suite.

  • non-disruptive : Starts the nondisruptive test suite.

  • per-port : Starts the per-port test suite.
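When the command is generated by a script, the options above can be assembled programmatically. The following Python sketch builds the command string from the syntax shown in this procedure. The helper names are hypothetical, and the exact formatting the CLI accepts for a test ID range (here, for example, 1-3,5) is an assumption based on the description "integers separated by a comma and a hyphen".

```python
def format_test_id_range(test_ids):
    """Collapse test IDs into a comma-and-hyphen range string."""
    ids = sorted(set(test_ids))
    if not ids:
        raise ValueError("at least one test ID is required")
    parts = []
    start = prev = ids[0]
    for n in ids[1:]:
        if n == prev + 1:
            prev = n
            continue
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = n
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


def build_diagnostic_start(module, selection):
    """Return a 'diagnostic start' command line.

    selection may be a test name, a suite keyword such as 'basic',
    a single test ID, or a collection of test IDs.
    """
    if isinstance(selection, (list, tuple, set)):
        test_arg = format_test_id_range(selection)
    else:
        test_arg = str(selection)
    return f"diagnostic start module {module} test {test_arg}"


if __name__ == "__main__":
    print(build_diagnostic_start(2, "basic"))
    print(build_diagnostic_start(3, [1, 2, 3, 5]))
```

The sketch only builds the text of the command. It does not check that a test name or ID exists on the module.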

Configuring Online Diagnostics

You must configure the failure threshold and the interval between tests before enabling diagnostic monitoring.
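To make the role of a failure threshold concrete, the following Python sketch tracks the results of one repeatedly run test and reports when the threshold is reached. It assumes that the threshold counts consecutive failures and that a pass resets the count; this chapter does not state how the device counts failures, and the class name is hypothetical.

```python
class FailureThresholdMonitor:
    """Track consecutive failures of one test against a threshold (illustrative)."""

    def __init__(self, failure_threshold):
        if failure_threshold < 1:
            raise ValueError("failure_threshold must be at least 1")
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def record(self, passed):
        """Record one test result; return True once the threshold is reached."""
        if passed:
            # Assumption: a passing run resets the failure count.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.failure_threshold


if __name__ == "__main__":
    monitor = FailureThresholdMonitor(failure_threshold=3)
    for result in (False, False, True, False, False, False):
        print(result, monitor.record(result))
```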

Monitoring and Maintaining Online Diagnostics

You can display the online diagnostic tests that are configured for a device or a device stack and check the test results by using the privileged EXEC show commands in this table:

Table 1. Commands for Diagnostic Test Configuration and Results

Command

Purpose

show diagnostic content module [number | all]

Displays the online diagnostics configured for a switch.

show diagnostic status

Displays the diagnostic tests that are currently running.

show diagnostic result module [number | all] [detail | test {name | test-id | test-id-range | all} [detail]]

Displays the online diagnostics test results.

show diagnostic post

Displays the POST results. (The output is the same as the show post command output.)

show diagnostic events {event-type | module}

Displays diagnostic events such as error, information, or warning based on the test result.

show diagnostic description module [number] test { name | test-id | all }

Displays a short description of an individual test or of all the tests.

Configuration Examples for Online Diagnostics

The following sections provide examples of online diagnostics configurations.

Examples: Start Diagnostic Tests

This example shows how to start a diagnostic test by using the test name:


Device# diagnostic start module 3 test DiagFanTest

This example shows how to start all of the diagnostic tests:


Device# diagnostic start module 3 test all

Example: Displaying Online Diagnostics

This example shows how to display on-demand diagnostic settings:

Device# show diagnostic ondemand settings

Test iterations = 1
Action on test failure = continue

This example shows how to display diagnostic events for errors:


Device# show diagnostic events event-type error

Diagnostic events (storage for 500 events, 0 events recorded)
Number of events matching above criteria = 0

No diagnostic log entry exists.
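If this output is collected by a script, the summary lines can be turned into numbers. The Python sketch below parses the two summary lines exactly as they appear in the example above. The function name is hypothetical, and the patterns are only assumed to hold for output in this format.

```python
import re

HEADER = re.compile(
    r"Diagnostic events \(storage for (\d+) events, (\d+) events recorded\)"
)
MATCHING = re.compile(r"Number of events matching above criteria = (\d+)")


def summarize_events(output):
    """Extract the event counts from 'show diagnostic events' output."""
    header = HEADER.search(output)
    matching = MATCHING.search(output)
    if header is None or matching is None:
        raise ValueError("unrecognized 'show diagnostic events' output")
    return {
        "capacity": int(header.group(1)),
        "recorded": int(header.group(2)),
        "matching": int(matching.group(1)),
    }


SAMPLE = """\
Diagnostic events (storage for 500 events, 0 events recorded)
Number of events matching above criteria = 0

No diagnostic log entry exists.
"""

if __name__ == "__main__":
    print(summarize_events(SAMPLE))
```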

This example shows how to display the description for a diagnostic test:

Device# show diagnostic description module 3 test all
TestGoldPktLoopback : 
	The GOLD packet Loopback test verifies the MAC level loopback
	functionality. In this test, a GOLD packet, for which doppler
	provides the support in hardware, is sent. The packet loops back
	at MAC level and is matched against the stored packet. It is a
	non-disruptive test.

TestFantray : 
	This test verifies all fan modules have been inserted and working
	properly on the board. It is a non-disruptive test and can be
	run as a health monitoring test.

TestPhyLoopback : 
	The PHY Loopback test verifies the PHY level loopback
	functionality. In this test, a packet is sent which loops back
	at PHY level and is matched against the stored packet. It is a 
	disruptive test and cannot be run as a health monitoring test.

TestThermal : 
	This test verifies the temperature reading from the sensor is
	below the yellow temperature threshold. It is a non-disruptive
	test and can be run as a health monitoring test.

TestScratchRegister : 
	The Scratch Register test monitors the health of
	application-specific integrated circuits (ASICs) by writing values
	into registers and reading back the values from these registers.
	It is a non-disruptive test and can be run as a health monitoring
	test.

TestMemory : 
	This test runs the exhaustive ASIC memory test during normal
	switch operation. Switch utilizes mbist for this test. Memory test
	is very disruptive in nature and requires switch reboot after
	the test.
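The descriptions in this output state whether each test is disruptive, which can be useful to check before starting a test on a live device. The following Python sketch splits output in the format shown above into one description per test and classifies each test from its wording. The function names are hypothetical, and the classification is a text heuristic that relies on the phrasing in this example.

```python
import re


def parse_descriptions(output):
    """Map each test name to its description, joined into a single line."""
    descriptions = {}
    current = None
    for line in output.splitlines():
        header = re.match(r"^(\S+)\s*:\s*$", line)
        if header:
            current = header.group(1)
            descriptions[current] = []
        elif current and line.strip():
            descriptions[current].append(line.strip())
    return {name: " ".join(parts) for name, parts in descriptions.items()}


def is_disruptive(description):
    """Heuristic: 'disruptive' appears and 'non-disruptive' does not."""
    text = description.lower()
    return "disruptive" in text and "non-disruptive" not in text


SAMPLE = """\
TestThermal :
    This test verifies the temperature reading from the sensor is
    below the yellow temperature threshold. It is a non-disruptive
    test and can be run as a health monitoring test.

TestPhyLoopback :
    The PHY Loopback test verifies the PHY level loopback
    functionality. In this test, a packet is sent which loops back
    at PHY level and is matched against the stored packet. It is a
    disruptive test and cannot be run as a health monitoring test.
"""

if __name__ == "__main__":
    for name, text in parse_descriptions(SAMPLE).items():
        print(name, "disruptive" if is_disruptive(text) else "non-disruptive")
```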

  

Additional References for Online Diagnostics

Related Documents

Related Topic

Document Title

Complete syntax and usage information for the commands used in this chapter

Command Reference (Catalyst 9600 Series Switches)

Feature History for Configuring Online Diagnostics

This table provides release and related information for features explained in this module.

These features are available on all releases subsequent to the one they were introduced in, unless noted otherwise.

Release

Feature

Feature Information

Cisco IOS XE Gibraltar 16.11.1

Online Diagnostics

With online diagnostics, you can test and verify the hardware functionality of the device while the device is connected to a live network.

Cisco IOS XE Cupertino 17.7.1

Online Diagnostics

Support for this feature was introduced on the Cisco Catalyst 9600 Series Supervisor 2 Module.

Use Cisco Feature Navigator to find information about platform and software image support. To access Cisco Feature Navigator, go to http://www.cisco.com/go/cfn.