Alarm Troubleshooting

This chapter gives a description, severity, and troubleshooting procedure for each commonly encountered Cisco NCS 1002 alarm and condition. When an alarm is raised, refer to its clearing procedure.

Alarm Logical Objects

Alarms are raised against logical objects. These objects represent physical entities, such as cards and ports, and logical entities, such as trunks.

The table below lists all logical alarm objects used in this chapter.

Table 1. Alarm Logical Object Type Definitions

Logical Object    Definition

FT                Fan-tray assembly.
PPM               Pluggable port module (PPM, also called SFP), referring to MXP and TXP cards.
PSU               Power Supply Unit.
Chassis           Chassis equipment.
Client            Ethernet.
Temp Sensor       Temperature Sensor.
Voltage Sensor    Voltage Sensor.
Trunk             DWDM.

Alarm Severity

The DWDM system uses Telcordia-devised standard severities for alarms and conditions: Critical (CR), Major (MJ), Minor (MN), Not Alarmed (NA), and Not Reported (NR). These are described below:

  • Critical (CR) alarm—Indicates severe, service-affecting trouble that needs immediate correction.

  • Major (MJ) alarm—Indicates a serious alarm, but the trouble has less impact on the network.

  • Minor (MN) condition—Indicates alarms that do not affect service.

  • Not Alarmed (NA) condition—Indicates conditions that are information indicators.

  • Not Reported (NR) condition—Indicates a condition that occurs as a secondary result of another event.
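The severity codes above can be modeled in a monitoring script. The following is a minimal sketch in Python, not part of the NCS 1002 software: the names `Severity`, `parse_severity`, and `most_urgent` are hypothetical, and the urgency ranking simply follows the order in which the severities are listed above.

```python
from enum import Enum


class Severity(Enum):
    """Severity codes as listed in this section."""
    CRITICAL = "CR"
    MAJOR = "MJ"
    MINOR = "MN"
    NOT_ALARMED = "NA"
    NOT_REPORTED = "NR"


# Assumed ranking for illustration: the order of the list above,
# with Critical treated as the most urgent.
_RANK = {sev: index for index, sev in enumerate(Severity)}


def parse_severity(code: str) -> Severity:
    """Convert a two-letter code such as 'MJ' into a Severity member."""
    return Severity(code.strip().upper())


def most_urgent(codes):
    """Return the most urgent severity among the given codes."""
    return min((parse_severity(c) for c in codes), key=_RANK.__getitem__)
```

For example, `most_urgent(["MN", "CR", "NA"])` returns `Severity.CRITICAL`.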

Alarms

This section lists the NCS 1002 alarms alphabetically. The severity, description, and troubleshooting procedure accompany each alarm.

ALL-FAN-TRAY-REMOVAL Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: FT

The ALL-FAN-TRAY-REMOVAL alarm is raised on the NCS 1002 when all the fan trays are removed from the chassis.

CD Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: TRUNK

The Chromatic Dispersion (CD) alarm is raised on the NCS 1002 when the detected chromatic dispersion value is above or below the configured threshold values. You can configure the chromatic dispersion threshold values using the controller optics R/S/I/P cd-high-threshold and controller optics R/S/I/P cd-low-threshold commands in the config mode. The range is -70000 to +70000 ps/nm when the trunk bit rate is 100G. The range is -20000 to +20000 ps/nm when the trunk bit rate is 200G or 250G.
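The threshold ranges above depend on the trunk bit rate. The following minimal Python sketch checks a candidate threshold against those ranges and evaluates the alarm condition as described in the text. The function and dictionary names are hypothetical, and the code does not interact with the device.

```python
# Allowed chromatic dispersion threshold range (ps/nm) per trunk bit rate,
# taken from the description above.
CD_RANGE_PS_NM = {
    "100G": (-70000, 70000),
    "200G": (-20000, 20000),
    "250G": (-20000, 20000),
}


def cd_threshold_in_range(trunk_rate: str, value_ps_nm: int) -> bool:
    """Return True if the threshold is configurable for this trunk bit rate."""
    low, high = CD_RANGE_PS_NM[trunk_rate]
    return low <= value_ps_nm <= high


def cd_alarm_expected(measured: int, low_threshold: int, high_threshold: int) -> bool:
    """Return True if the measured CD is above or below the configured thresholds."""
    return measured < low_threshold or measured > high_threshold
```

A value of 65000 ps/nm is therefore acceptable as a threshold at 100G but not at 200G or 250G.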

Clear the CD Alarm

Procedure

Step 1

Verify the value of the chromatic dispersion threshold of the NCS 1002 using the show controller optics R/S/I/P command.

Step 2

If the value is not within the threshold range, configure the chromatic dispersion threshold using the controller optics R/S/I/P cd-high-threshold and controller optics R/S/I/P cd-low-threshold commands in the config mode.

Step 3

If the value is within the range of the chromatic dispersion threshold, contact Cisco Technical Assistance Center (TAC).

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


DAISYDUKE-CPU-FPGA-PCIE-ERROR Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: EQUIPMENT

The DAISYDUKE-CPU-FPGA-PCIE-ERROR alarm occurs when the Daisy Duke CPU FPGA is unable to communicate with the CPU controller due to a Peripheral Component Interconnect Express (PCIe) error.

DAISYDUKE-CPU-PROCESSOR-HOT Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: EQUIPMENT

The DAISYDUKE-CPU-PROCESSOR-HOT alarm occurs when the Daisy Duke FPGA detects that the temperature of the CPU processor is high.

Clear the DAISYDUKE-CPU-PROCESSOR-HOT Alarm

Procedure

Verify that all the fans are functioning properly.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


DGD Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: TRUNK

The Differential Group Delay (DGD) alarm is raised on the NCS 1002 when the value of the differential group delay read by the pluggable port module exceeds the configured threshold value. You can configure the threshold value using the controller optics R/S/I/P dgd-high-threshold command in the config mode.

Clear the DGD Alarm

Procedure

Step 1

Verify the value of the differential group delay threshold of the NCS 1002 using the show controller optics R/S/I/P command.

Step 2

If the value is not within the threshold range, configure the differential group delay threshold using the controller optics R/S/I/P dgd-high-threshold command in the config mode. The range is from 0 to 18000 (in units of 0.01 ps).

Step 3

If the value is within the range of the differential group delay threshold, contact Cisco TAC.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).
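The DGD threshold in Step 2 is expressed in units of 0.01 ps, so the range of 0 to 18000 corresponds to 0 ps to 180.00 ps. The following minimal Python sketch converts between picoseconds and configuration units. The function names are hypothetical and the code only illustrates the arithmetic.

```python
# Range from Step 2 above, in units of 0.01 ps.
DGD_MIN_UNITS = 0
DGD_MAX_UNITS = 18000


def dgd_ps_to_units(ps: float) -> int:
    """Convert a DGD threshold in ps to units of 0.01 ps, checking the range."""
    units = round(ps * 100)
    if not DGD_MIN_UNITS <= units <= DGD_MAX_UNITS:
        raise ValueError(f"{ps} ps is outside the range 0 to 180.00 ps")
    return units


def dgd_units_to_ps(units: int) -> float:
    """Convert a value in units of 0.01 ps back to ps."""
    return units / 100
```

For example, a threshold of 25.5 ps corresponds to a configured value of 2550.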


EQPT-FAIL-SLICE Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: Slice

The Equipment Fail (EQPT-FAIL) slice alarm is raised on the NCS 1002 when any component or device within the slice fails. The alarm is tagged as Non-Service-Affecting (NSA) if the device is not in the service path, for example, when the failed device is on a port that is not provisioned.

Clear the EQPT-FAIL-SLICE Alarm

Procedure

Step 1

Reload the host using the hw-module location 0/RP0 reload command.

Step 2

If the alarm does not clear, do a traffic-impacting system reload using the hw-module location all reload command.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/cisco/web/support/index.html for more information or call Cisco TAC (1 800 553-2447).


FAN FAIL Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: FT

The FAN FAIL alarm is raised on the NCS 1002 when one of the three fans fails. When any fan fails, the temperature of the NCS 1002 can rise above its normal operating range. This condition can trigger the TEMP THRESHOLD alarm.

FAN MISSING Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: FT

The FAN MISSING alarm is raised on the NCS 1002 when one of the three fans is missing or is not correctly inserted. When any fan is missing, the temperature of the NCS 1002 can rise above its normal operating range. This condition can trigger the TEMP THRESHOLD alarm.

Clear the FAN MISSING Alarm

Procedure

Step 1

Verify that a fan is missing or is not correctly inserted.

Step 2

Insert a fan. The fan should run immediately when correctly inserted.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


FAN-TRAY-INSERT Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: FT

The FAN-TRAY-INSERT alarm is raised on the NCS 1002 when the fan tray is detected in the chassis.

Clear the FAN-TRAY-INSERT Alarm

Procedure

This alarm clears automatically.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


FAN-TRAY-REMOVAL Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: FT

The FAN-TRAY-REMOVAL alarm is raised on the NCS 1002 when any of the three fans is removed from the chassis.

HI-LASERBIAS Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: PPM

The Equipment High Transmit Laser Bias (HI-LASERBIAS) current alarm is raised against the NCS 1002 laser performance. This alarm occurs when the physical pluggable port laser detects a laser bias value beyond the configured high threshold. The alarm indicates that the port laser has reached the maximum laser bias tolerance. You can configure the threshold value using the controller optics R/S/I/P lbc-high-threshold command in the config mode.

Laser bias typically starts at about 30 percent of the manufacturer maximum laser bias specification and increases as the laser ages. If the HI-LASERBIAS alarm threshold is set at 100 percent of the maximum, the laser usability has ended. If the threshold is set at 90 percent of the maximum, the card is still usable for several weeks or months before it needs to be replaced.
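The relationship between laser bias and the threshold can be illustrated with a minimal Python sketch. It assumes that the bias current and the manufacturer maximum are available in the same unit (mA is used here as an assumption), and that the alarm condition is a simple comparison of the bias percentage against the threshold. The function names are hypothetical.

```python
def laser_bias_percent(bias_ma: float, max_bias_ma: float) -> float:
    """Express the laser bias as a percentage of the manufacturer maximum.

    The unit (mA) is an assumption for illustration; any consistent unit works.
    """
    if max_bias_ma <= 0:
        raise ValueError("maximum laser bias must be positive")
    return 100.0 * bias_ma / max_bias_ma


def hi_laserbias_expected(bias_percent: float, threshold_percent: float = 90.0) -> bool:
    """Return True if the bias is beyond the configured high threshold."""
    return bias_percent > threshold_percent
```

With a threshold of 90 percent, a laser at the typical starting point of about 30 percent does not meet the alarm condition, whereas a laser at 95 percent does.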

Clear the HI-LASERBIAS Alarm

Procedure

Step 1

Verify the value of the laser bias high threshold of NCS 1002 using the show controller optics R/S/I/P command.

Step 2

If the value is not within the high laser bias threshold range, configure the high laser bias threshold using the controller optics R/S/I/P lbc-high-threshold command in the config mode. The range is from 0 to 100%.

Step 3

If the value is within the range of the high laser bias threshold, physically replace the pluggable module. Replacement is not urgent and can be scheduled during a maintenance window.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


HI-RXPOWER Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: PPM

The Equipment High Receive Power (HI-RXPOWER) alarm is an indicator of the received optical signal power of NCS 1002. This alarm occurs on the client optics controller when the measured individual lane optical signal power of the received signal exceeds the default or user-defined threshold. This alarm occurs on the trunk optics controller when the total optical signal power of the received signal exceeds the default or user-defined threshold. You can configure the threshold value using the controller optics R/S/I/P rx-high-threshold command in the config mode.

Clear the HI-RXPOWER Alarm

Procedure

Step 1

Verify the value of the high receive power threshold of the NCS 1002 using the show controller optics R/S/I/P command.

Step 2

If the value is not within the high receive power threshold range, configure the high receive power threshold using the controller optics R/S/I/P rx-high-threshold command in the config mode. The range is from -400 to 300 in units of 0.1 dBm (that is, -40.0 dBm to +30.0 dBm).

Step 3

If the value is within the range of the high receive power threshold, use a standard power meter to verify whether the optical input power exceeds the expected power threshold.

Step 4

Change the threshold value or use an attenuator to reduce the input power to the desired level.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).
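The power thresholds in this chapter are configured in units of 0.1 dBm. The following minimal Python sketch converts a power level in dBm to configuration units, checks it against the -400 to 300 range given in Step 2, and converts dBm to milliwatts using the standard relation mW = 10^(dBm/10). The function names are hypothetical.

```python
# Range from Step 2 above, in units of 0.1 dBm.
POWER_MIN_UNITS = -400
POWER_MAX_UNITS = 300


def dbm_to_units(dbm: float) -> int:
    """Convert a power threshold in dBm to units of 0.1 dBm, checking the range."""
    units = round(dbm * 10)
    if not POWER_MIN_UNITS <= units <= POWER_MAX_UNITS:
        raise ValueError(f"{dbm} dBm is outside the configurable range")
    return units


def dbm_to_mw(dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (dbm / 10)
```

For example, a threshold of -12.5 dBm corresponds to a configured value of -125.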


HI-TXPOWER Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: PPM

The Equipment High Transmit Power (HI-TXPOWER) alarm is an indicator of the transmitted optical signal power of NCS 1002. This alarm occurs on the client optics controller when the measured individual lane optical signal power of the transmitted signal exceeds the default or user-defined threshold. This alarm occurs on the trunk optics controller when the total optical signal power of the transmitted signal exceeds the default or user-defined threshold. You can configure the threshold value using the controller optics R/S/I/P tx-high-threshold command in the config mode.

Clear the HI-TXPOWER Alarm

Procedure

Step 1

Verify the value of the high transmit power threshold of NCS 1002 using the show controller optics R/S/I/P command.

Step 2

If the value is not within the high transmit power threshold range, configure the high transmit power threshold using the controller optics R/S/I/P tx-high-threshold command in the config mode. The range is from -400 to 300 in units of 0.1 dBm (that is, -40.0 dBm to +30.0 dBm).

Step 3

If the value is within the range of the high transmit power threshold, use a standard power meter to verify whether the optical output power exceeds the expected power threshold. If so, replace the pluggable module at the first opportunity.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


HIBER Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: Client

The High Bit Error Rate (HIBER) alarm is raised on the NCS 1002 when the client and trunk ports receive 16 or more invalid sync-headers in 125 microseconds. This alarm occurs when the NCS 1002 is configured with 10 GE or 10 GE Fibre Channel (FC) payloads.

Limitation:

The HIBER alarm is detected on Rx for 0.5 seconds. After 0.5 seconds, the link is brought down automatically, which leads to loss of block lock (SYNCLOSS). Because of this limitation, the user might not see the HIBER alarm through the CLI, as the NCS1K/XR alarm reporting soak time for raising an alarm is 2 seconds.
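The timing behind this limitation can be illustrated with a minimal Python sketch. It assumes a simplified soak model in which a condition is reported only if it persists for at least the soak time. The function and constant names are hypothetical and do not correspond to NCS 1002 software internals.

```python
HIBER_DURATION_S = 0.5  # time HIBER is present before the link is brought down
RAISE_SOAK_S = 2.0      # alarm reporting soak time for raising an alarm


def is_reported(condition_duration_s: float, soak_s: float = RAISE_SOAK_S) -> bool:
    """Simplified model: a condition is reported only if it outlasts the soak time."""
    return condition_duration_s >= soak_s
```

Under this model, `is_reported(HIBER_DURATION_S)` is false, because the 0.5-second HIBER condition ends before the 2-second soak time elapses.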

Clear the HIBER Alarm

Procedure

The alarm clears under the following conditions:

  • When the card port does not receive a high bit error rate.

  • When the optical connectors are cleaned.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


IMPROPRMVL Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: PPM

The Improper Removal (IMPROPRMVL) alarm occurs when a physical pluggable is absent on a service-provisioned port of the NCS 1002.

Clear the IMPROPRMVL Alarm

Procedure

Step 1

Verify whether the pluggable is plugged into the port of the NCS 1002 using the show inventory command.

Step 2

If the QSFP or CFP is not plugged into the port, insert the appropriate QSFP or CFP.

Note

Before you configure the client data rate of the NCS 1002, ensure that the appropriate QSFP or CFP is plugged into the port of the NCS 1002.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


LO-RXPOWER Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: PPM

The Equipment Low Receive Power (LO-RXPOWER) alarm is an indicator of the received optical signal power of NCS 1002. This alarm occurs on the client optics controller when the measured individual lane optical signal power of the received signal falls below the default or user-defined threshold. This alarm occurs on the trunk optics controller when the total optical signal power of the received signal falls below the default or user-defined threshold. You can configure the threshold value using the controller optics R/S/I/P rx-low-threshold command in the config mode.

Clear the LO-RXPOWER Alarm

Procedure

Step 1

Verify the value of the receive power of NCS 1002 using the show controller optics R/S/I/P command.

Step 2

If the value is not within the low receive power threshold range, configure the low receive power threshold using the controller optics R/S/I/P rx-low-threshold command in the config mode. The range is from -400 to 300 in units of 0.1 dBm (that is, -40.0 dBm to +30.0 dBm).

Step 3

If the value is within the range of the low receive power threshold, use a standard power meter to verify that the optical input power exceeds the expected power threshold. Ensure that a proper threshold has been provisioned for the receive value. If an incorrect threshold has been set, adjust it to a value within the allowed limits.

Step 4

Verify that the Trunk-Rx port is cabled correctly, and clean the fiber connecting the faulty TXP/MXP to the drop port of the DWDM card.

Step 5

Determine whether a bulk attenuator is present and if so, verify that a proper fixed attenuation value has been used.

Step 6

Using a test set, check the optical power value of the drop port of the DWDM card connected to the faulty TXP/MXP. If the read value is different (+1 dBm or -1 dBm) from the ANS setpoint for Padd&drop-Drop power, move to the next step.

Step 7

Look for any alarm reported by the DWDM cards belonging to the OCHNC circuit whose destination is the faulty TXP/MXP, and troubleshoot that alarm first. Possible related alarms include amplifier gain alarms, Automatic Power Control (APC) alarms, and LOS-P alarms on the Add or Drop ports belonging to the OCHNC circuit.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


LO-TXPOWER Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: PPM

The Equipment Low Transmit Power (LO-TXPOWER) alarm is an indicator of the transmitted optical signal power of the NCS 1002. This alarm occurs on the client optics controller when the measured individual lane optical signal power of the transmitted signal falls below the default or user-defined threshold. This alarm occurs on the trunk optics controller when the total optical signal power of the transmitted signal falls below the default or user-defined threshold. You can configure the threshold value using the controller optics R/S/I/P tx-low-threshold command in the config mode. The range is from -400 to 300 in units of 0.1 dBm (that is, -40.0 dBm to +30.0 dBm).

Clear the LO-TXPOWER Alarm

Procedure

Step 1

Verify the value of the low transmit power threshold of NCS 1002 using the show controller optics R/S/I/P command.

Step 2

If the value is not within the low transmit power threshold range, configure the low transmit power threshold using the controller optics R/S/I/P tx-low-threshold command in the config mode. The range is from -400 to 300 in units of 0.1 dBm (that is, -40.0 dBm to +30.0 dBm).

Step 3

If the value is within the range of the low transmit power threshold, use a standard power meter to verify whether the optical output power is below the expected power threshold. If so, replace the pluggable module at the first opportunity.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


LOCAL-FAULT Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: Client

The LOCAL-FAULT alarm is raised on the NCS 1002 client port provisioned with 10 GE, 40 GE, or 100 GE payloads. This alarm occurs when a local fault character sequence is received in the incoming MAC stream as defined in IEEE 802.3.

LOM Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: TRUNK

The Optical Transport Unit (OTU) Loss of Multiframe (LOM) alarm is an OTN alarm for the trunk port and occurs when the Multi Frame Alignment Signal (MFAS) is corrupted. This alarm is raised when the MFAS overhead field is invalid for more than five frames and persists for more than 3 milliseconds.

Clear the LOM Alarm

Procedure

Step 1

Ensure that the fiber connector for the card is completely plugged in.

Step 2

If the bit error rate (BER) threshold is correct and at the expected level, use an optical test set to measure the power level of the line to ensure it is within guidelines.

Step 3

If the optical power level is good, verify that optical receive levels are within the acceptable range.

Step 4

If receive levels are good, clean the fibers at both ends according to site practice.

Step 5

If the condition does not clear, verify that single-mode fiber is used.

Step 6

If the fiber is of the correct type, verify that a single-mode laser is used at the far-end node.

Step 7

Clean the fiber connectors at both ends for a signal degrade according to site practice.

Step 8

Verify that a single-mode laser is used at the far end.

Step 9

If the problem does not clear, the transmitter at the other end of the optical line could be failing and might require replacement.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


LOS-P Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: TRUNK

The Loss of Signal Payload (LOS-P) alarm for the trunk layer indicates that the PPM or CFP does not get any incoming payload signal. The purpose of the LOS-P alarm is to alert the user that no optical power is being received from the fiber. A common fault condition signaled by this alarm is a fiber cut. In this case, neither the payload nor the overhead signals are received.

Clear the LOS-P Alarm

Procedure

Step 1

Verify that the trunk port is configured with the proper wavelength. For more information see the Cisco NCS 1000 Series Configuration Guide.

Step 2

Verify whether there is a loss of received optical power. Compare the actual power levels with the expected power range.

Step 3

Verify the fiber continuity to the port of the NCS 1002 and fix the fiber connection.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


LOWER-CTRL-FPGA-PCIE-ERROR Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: EQUIPMENT

The LOWER-CTRL-FPGA-PCIE-ERROR alarm is raised when a control FPGA of the lower board is unreachable because of a Peripheral Component Interconnect Express (PCIe) error.

MEA Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: PPM

The Missing Equipment Attributes (MEA) alarm for the PPM or QSFP is raised on a pluggable port of the NCS 1002 when there is a mismatch in the configured client data rate and the supported QSFP physical data rate. For example, if the configured client data rate is 10G and the supported data rate of the QSFP is 100G, then this alarm is raised.

Clear the MEA Alarm

Procedure

Step 1

Verify the supported physical data rate of the QSFP on the NCS 1002 using the show inventory command.

Step 2

Verify the configured client data rate on the NCS 1002 using the show hw-module slice command.

Step 3

If the above values do not match, insert the appropriate QSFP pluggable or configure the required client data rate using the hw-module location location slice slice_number client bitrate [ 10G | 40G | 100G] command.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


MIXED-AC-DC-PTS Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: EQUIPMENT

The MIXED-AC-DC-PTS alarm occurs when both the AC and DC power trays are detected in the chassis.

MKA-LOCALLY-SECURED Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: Client

The MACsec Key Agreement-Locally-Secured (MKA-LOCALLY-SECURED) alarm is raised when the MKA session becomes locally secured due to the loss of an active programmed peer.

A node loses its peer under the following conditions:

  • When the node shuts down its interface.

  • When the node issues the no macsec command more than six times.

MKA-START-FAIL-MISSING-PSK Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: Client

The MACsec Key Agreement-Start-Fail-Missing-Pre-Shared-Keys alarm is raised on the NCS 1002 interface when the MKA session cannot start because the applied MACsec keychain has no active keys.

Clear the MKA-START-FAIL-MISSING-PSK Alarm

The alarm clears after you configure the key lifetime so that the key is currently active and bring up the MKA session.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.

MKA-UNSECURE-MISSING-PSK Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: Client

The MACsec Key Agreement-UNSECURE-MISSING-Pre-Shared-Keys alarm is raised when the existing secured MKA session is brought down. The session becomes insecure because the keychain on the NCS 1002 interface contains only non-active keys.

Clear the MKA-UNSECURE-MISSING-PSK Alarm

The alarm clears after you configure an active key in the keychain and bring up the MKA session. The keys must also have overlapping lifetimes, so that the next key becomes active before the existing key expires.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.
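The requirement for overlapping keys can be checked before a keychain is applied. The following minimal Python sketch looks for periods in which no key is active, given a list of key lifetimes. The data layout and the function name `find_gaps` are hypothetical; the code does not read the device configuration.

```python
from datetime import datetime
from typing import List, Tuple

Lifetime = Tuple[datetime, datetime]  # (start, end) of one key's lifetime


def find_gaps(lifetimes: List[Lifetime]) -> List[Lifetime]:
    """Return the periods between the first start and the last end with no active key.

    A key that starts exactly when the previous coverage ends is also flagged,
    because the next key must become active before the existing key expires.
    """
    gaps: List[Lifetime] = []
    ordered = sorted(lifetimes)
    if not ordered:
        return gaps
    covered_until = ordered[0][1]
    for start, end in ordered[1:]:
        if start >= covered_until:
            gaps.append((covered_until, start))
        covered_until = max(covered_until, end)
    return gaps
```

An empty result indicates that each key in the list becomes active before the preceding coverage expires.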

MORE-THAN-ONE-FAN-TRAY-REMOVED Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: EQUIPMENT

The MORE-THAN-ONE-FAN-TRAY-REMOVED alarm occurs when more than one fan tray is removed from the chassis.

NM-I2C-ACCESS-ERROR Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: EQUIPMENT

The Node Management Inter-Integrated Circuit Access Error (NM-I2C-ACCESS-ERROR) alarm occurs when there is an I2C error.

NM-SHUTDOWN-CARD Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: EQUIPMENT

The NM-SHUTDOWN-CARD alarm occurs when a card shuts down due to a critical error.

NOT ABLE TO COMMUNICATE WITH PATCH-PANEL Alarm

Default Severity: Minor (MN)

Logical Objects: Software

This alarm is raised on the NCS 1002 when communication with the breakout patch-panel is lost. The alarm is also raised when the connected interface goes down or the patch panel reloads.

Clear the NOT ABLE TO COMMUNICATE WITH PATCH-PANEL Alarm

The alarm clears after establishing communication with the patch panel. Also, check whether the patch panel is reachable and is not reloaded.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.

OSNR Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: TRUNK

The optical signal-to-noise ratio (OSNR) alarm is an indicator of the OSNR of NCS 1002. The OSNR alarm occurs when the measured OSNR falls below the threshold. You can configure the threshold value using the controller optics R/S/I/P osnr-low-threshold command in the config mode.

Clear the OSNR Alarm

Procedure

Step 1

Verify the value of the minimum acceptable OSNR value of NCS 1002 using the show controller optics R/S/I/P command.

Step 2

If the value is not within the OSNR threshold range, configure the minimum acceptable OSNR value using the controller optics R/S/I/P osnr-low-threshold command in the config mode. The range is from 0 to 4000 in units of 0.01 dB (that is, 0 dB to 40.00 dB).

Step 3

If the value is within the range of the minimum acceptable OSNR, contact Cisco TAC.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


OTUK-AIS Alarm

Default Severity: Not Reported (NR), Non-Service-Affecting (NSA)

Logical Objects: TRUNK

An Alarm Indication Signal (AIS) communicates to the receiving node that the transmit node is not sending a valid signal. AIS is not an error. The OTUK-AIS alarm is raised by the receiving node on each input when it detects the AIS instead of a real signal.

OTUK-AIS is a generic AIS signal with a repeating AIS PN-11 sequence. This pattern is inserted by the card in the ITU-T G.709 frame (Trunk) when a faulty condition is present on the client side.

OTUK-BDI Alarm

Default Severity: Not Reported (NR), Non-Service-Affecting (NSA)

Logical Objects: TRUNK

The Optical Transport Unit Backward Defect Indication (OTUK-BDI) alarm is raised when there is a path termination error in the upstream data. This error is read as a BDI bit in the path monitoring area of the digital wrapper overhead.

Clear the OTUK-BDI Alarm

Procedure

Step 1

At the near-end node, use site practices to clean trunk transmitting fiber toward the far-end node and the client receiving fiber.

Step 2

At the far-end node, determine whether any OTUK-AIS condition is present on the Trunk-RX. If so, investigate the Trunk-TX side on the near-end card (the one alarmed for OTUK-BDI) as the root cause, because that is the section where the AIS bit is inserted.

Step 3

If there is no OTUK-AIS at the far-end node, continue to investigate the performance of the Trunk-RX. Look for other OTU-related alarms, such as the OTUK-LOF condition or OTUK-SD condition at the far-end Trunk-RX. If either is present, resolve the condition using the appropriate procedure in this chapter.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


OTUK-LOF Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: TRUNK

The Optical Transport Unit Loss of Frame (OTUK-LOF) alarm is raised when a frame loss is detected by an invalid frame alignment in the received frames. This alarm indicates that the card has lost frame delineation on the input data. Loss of frame occurs when the optical transport unit overhead frame alignment signal (FAS) area is invalid for more than five frames and the error persists for more than three milliseconds.

This alarm is also raised when the FEC settings on trunk ports of the source and destination cards are different.

OTUK-SD Alarm

Default Severity: Not Alarmed (NA), Non-Service-Affecting (NSA)

Logical Objects: TRUNK

The Optical Transport Unit Signal Degrade (OTUK-SD) alarm occurs when the signal quality is so poor that the bit error rate on the incoming optical line has exceeded the signal degrade threshold.

Clear the OTUK-SD Alarm

Procedure

Identify and correct the cause of the poor signal quality.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


OTUK-SF Alarm

Default Severity: Not Alarmed (NA), Non-Service-Affecting (NSA)

Logical Objects: TRUNK

Both hardware and software can generate the Optical Transport Unit Signal Fail (OTUK-SF) alarm based on the summarization of LOS, LOF, and LOM alarms.

Clear the OTUK-SF Alarm

Procedure

The OTUK-SF alarm clears when none of the LOS, LOF, or LOM defects is present.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).
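The summarization described above can be expressed as a simple logical OR. The following is a minimal sketch, assuming the three defects are available as boolean flags; the function name `otuk_sf_active` is hypothetical and does not correspond to any NCS 1002 software interface.

```python
def otuk_sf_active(los: bool, lof: bool, lom: bool) -> bool:
    """OTUK-SF is present while at least one of LOS, LOF, or LOM is present."""
    return los or lof or lom
```

With this model, the alarm clears only when all three inputs are false.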


OTUK-TIM Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: PORT

The OTUK-TIM alarm, also called the Trail Trace Identifier (TTI) Mismatch alarm, is raised on the NCS 1002 when the expected TTI string does not match the received section trace string.

Clear the OTUK-TIM Alarm

Procedure

Identify why the expected and received TTI strings differ, and correct the mismatch.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


PATCH-PANEL POWER REDUNDANCY LOST Alarm

Default Severity: Minor (MN)

Logical Objects: Power

The PATCH-PANEL POWER REDUNDANCY LOST alarm is raised when one of the power supply units present in the breakout patch-panel is not functional.

PEM-MODULE-INSERT Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: PEM

The PEM-MODULE-INSERT alarm is raised on the NCS 1002 when the Power Entry Module (PEM) is inserted into the chassis.

Clear the PEM-MODULE-INSERT Alarm

This alarm clears automatically.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).

PEM-MODULE-REMOVAL Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: PEM

The PEM-MODULE-REMOVAL alarm is raised on the NCS 1002 when one of the redundant PEMs is removed from the system.

POWER MODULE OUTPUT DISABLED Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: PEM

The POWER MODULE OUTPUT DISABLED alarm is raised on the NCS 1002 when the power supply is disabled on the active PEM.

PEM-PWR-TRAY-LVL-RED-LOST Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: PEM

The PEM-PWR-TRAY-LVL-RED-LOST alarm is raised on the NCS 1002 when one of the two active PEMs is removed.

PLL-FAIL-SLICE Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: SLICE

The PLL-FAIL-SLICE alarm is raised on the slice when a fault is detected in the Phase Lock Loop (PLL) device and it becomes inaccessible.

Clear the PLL-FAIL-SLICE Alarm

Procedure

Remove the slice configuration and reload Cisco IOS XR.

If the alarm does not clear, the cause is a hardware fault.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


PPM FAIL Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: PPM

The PPM FAIL alarm is raised on the physical pluggable when a fault is detected in the PPM and it becomes inaccessible.

PKG-K9SEC-REQ Alarm

Default Severity: Minor (MN)

Logical Objects: Chassis equipment

The PKG-K9SEC-REQ alarm is raised when an encrypted slice is configured without the k9sec package or the macsec_mka_aipc_driver library installed, or when the k9sec package is removed while an encrypted slice is available. The macsec_mka_aipc_driver library is contained in the k9sec package.

PSU FAIL Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: PSU

The PSU FAIL alarm is raised on the NCS 1002 when one of the power modules fails.

PSU INSERTED Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: PSU

The PSU INSERTED alarm is raised on the NCS 1002 when a power supply is inserted into the device.

PSU REDUNDANCY LOST Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: PSU

The PSU REDUNDANCY LOST alarm is raised on the NCS 1002 when a power supply is removed.

REMOTE-FAULT Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: CLIENT

The REMOTE-FAULT alarm is raised on the NCS 1002 when a remote fault character sequence is received in the incoming MAC stream as defined in IEEE 802.3.

Clear the REMOTE-FAULT Alarm

Procedure

Step 1

Verify and resolve the client port fault and remote fault errors on the remote or upstream node.

Step 2

Verify and resolve loss of signal synchronization error on the remote or upstream node.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


RM-DEVICE-OUT-OF-TOL Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: FT

The RM-DEVICE-OUT-OF-TOL alarm is raised on the NCS 1002 when a fan malfunctions.

Clear the RM-DEVICE-OUT-OF-TOL Alarm

Procedure

Replace the malfunctioning fan.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SIGLOSS Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: CLIENT

The Signal Loss on Data Interface alarm is raised on the client-side QSFP of the NCS 1002 when there is a loss of Ethernet signal.

Clear the SIGLOSS Alarm

Procedure

Step 1

Ensure that the port connection at the near end of the client peer router is operational.

Step 2

Verify fiber continuity to the port.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


SYNCLOSS Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: CLIENT

The Loss of Synchronization on Data Interface alarm is raised on the NCS 1002 client and trunk ports when there is a loss of signal synchronization on the port. This alarm is demoted by the SIGLOSS alarm.
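The demotion relationship above can be pictured as a filter over the set of active alarms. This is only a sketch: it assumes that "demoted" means the lower-priority alarm is not reported while the demoting alarm is active on the same port, and the names `DEMOTED_BY` and `reported_alarms` are hypothetical. Only the SIGLOSS/SYNCLOSS relationship comes from this chapter.

```python
# Maps an alarm to the set of alarms that demote it (illustrative table).
DEMOTED_BY = {
    "SYNCLOSS": {"SIGLOSS"},
}


def reported_alarms(active):
    """Return the alarms left after removing any that are demoted by
    another alarm currently active on the same port."""
    active = set(active)
    return {
        alarm
        for alarm in active
        if not (DEMOTED_BY.get(alarm, set()) & active)
    }
```

Under this assumption, a port with both SIGLOSS and SYNCLOSS active reports only SIGLOSS, and SYNCLOSS is reported once SIGLOSS is no longer active.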

Clear the SYNCLOSS Alarm

Procedure

Step 1

Ensure that the data port connection at the near end of the Ethernet link is operational.

Step 2

Verify the fiber continuity to the port. To do this, follow site practices.

Step 3

For 100 GE, verify that the FEC settings match between the router and NCS 1002.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


SQUELCHED Alarm

Default Severity: Not Alarmed (NA), Non-Service-Affecting (NSA)

Logical Objects: GE, CLIENT

The client signal squelched condition is raised by the NCS 1002 in the following situations:

  • Laser-squelch is configured under the client controller.


    Note


    Laser-squelch is supported only on QSFP+, QSFP28 LR4, and QSFP28 CWDM4 pluggables.


  • Laser-squelching occurs on a QSFP pluggable when all four lanes operating in the 10 GE client mode are turned off after the upstream receive facility has experienced a loss of signal (such as optical LOS or LOF).

  • An MXP or TXP client facility detects that an upstream receive facility has experienced a loss of signal (such as optical LOS or LOF). In response, the client facility transmit is turned off (SQUELCHED). The upstream receive facility is the trunk receive on the same slice as the client.

  • The client squelches if the upstream trunk receive (on the trunk port carrying the client payload) experiences an LOS or LOF (TRUNK) alarm.

  • The client squelches if either trunk port receives LOS or LOF when the client port is mapped to two trunk ports. For example, HundredGigECtrlr 0/0/0/9 is mapped to CoherentDSP0/0/0/12 and CoherentDSP0/0/0/13 for a 10 GE to 100 GE hw-module configuration.
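The trunk-related situations in the list above reduce to one rule: the client squelches if any trunk port it is mapped to reports LOS or LOF. The following sketch models that rule only. The function name and the dictionary input are assumptions for illustration; the controller names are taken from the example above.

```python
# Trunk receive defects that cause the client to squelch, per the list above.
TRUNK_SQUELCH_DEFECTS = {"LOS", "LOF"}


def client_should_squelch(trunk_alarms):
    """trunk_alarms maps each trunk port carrying the client payload to the
    set of alarms active on its receive side (hypothetical input format)."""
    return any(
        alarms & TRUNK_SQUELCH_DEFECTS for alarms in trunk_alarms.values()
    )


# HundredGigECtrlr 0/0/0/9 mapped to two trunk ports, one of which has LOF.
example = {
    "CoherentDSP0/0/0/12": set(),
    "CoherentDSP0/0/0/13": {"LOF"},
}
```

Calling `client_should_squelch(example)` evaluates to true because one of the two mapped trunk ports reports LOF.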

The local client raises a SQUELCHED condition if the local trunk raises one of the following alarms:

Far-end Laser Squelch

NCS 1002 supports far-end laser squelch. This feature relays the following:

  • Client input signal to the far-end client output signal, especially faults.

  • Near-end client-side faults such as SIGLOSS, SYNCLOSS, HIBER, and LF to the far-end.

    If a near-end QSFP has a fault on its RX side, the far-end QSFP laser is turned off.

The SQUELCHED alarm is raised on the far-end NCS 1002 if laser-squelch is configured on the far-end NCS 1002 and a fault on the near-end NCS 1002 is observed on the client RX side.


Note


Far-end client laser squelch does not work on headless events such as a reload, CPU OIR, or an mxp_driver restart.


Clear the SQUELCHED Alarm

Procedure

Determine whether the associated NCS 1002 trunk port reports an LOF or LOS alarm (for the client trunk). If it does, turn to the relevant section in this chapter and complete the troubleshooting procedure.


TEMPERATURE Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: EQUIPMENT

The temperature alarms are raised on the NCS 1002 when the temperature is not within the operating range. When this condition occurs, critical devices such as the DSP and ETNA automatically shut down.

The alarm can appear in one of the following formats:

  • [sensor name] : high temperature alarm

  • [sensor name] : low temperature alarm

The [sensor name] : high temperature alarm is raised when the temperature is high and not within the operating range.

The [sensor name] : low temperature alarm is raised when the temperature is low and not within the operating range.

All sensors managed at the CPU level have NM as the prefix. NM denotes Node management. All sensors managed at the rack level have RM as the prefix. RM denotes Rack management. Some alarms in this category carry alarm tags such as NM-TEMP and RM-TEMP.

Fan Speed and Chassis Inlet Temperature Thresholds

The table below lists the chassis inlet temperature threshold values for the different fan speeds.

Fan speed (rpm)    Rising Min Temperature (°C)    Rising Max Temperature (°C)    Falling Max Temperature (°C)    Falling Min Temperature (°C)
4800               -127                           28                             27                              -127
5500               29                             30                             29                              28
8500               31                             36                             35                              30
10500              37                             41                             40                              36
12500              42                             44                             43                              41
14500              45                             127                            127                             44
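One way to read the table is as a hysteresis scheme: the fan steps up when the inlet temperature rises above the Rising Max value of the current speed, and steps down when it falls below the Falling Min value of the current speed. The sketch below encodes that reading. The interpretation and the function name `next_fan_speed` are assumptions, not a description of the actual fan control software.

```python
# (fan speed rpm, rising min, rising max, falling max, falling min), all in °C,
# copied from the table above.
FAN_TABLE = [
    (4800, -127, 28, 27, -127),
    (5500, 29, 30, 29, 28),
    (8500, 31, 36, 35, 30),
    (10500, 37, 41, 40, 36),
    (12500, 42, 44, 43, 41),
    (14500, 45, 127, 127, 44),
]


def next_fan_speed(current_rpm, inlet_temp_c):
    """Return the fan speed after applying the assumed hysteresis rule."""
    speeds = [row[0] for row in FAN_TABLE]
    i = speeds.index(current_rpm)
    # Step up while the temperature exceeds the current row's rising max.
    while i < len(FAN_TABLE) - 1 and inlet_temp_c > FAN_TABLE[i][2]:
        i += 1
    # Step down while the temperature is below the current row's falling min.
    while i > 0 and inlet_temp_c < FAN_TABLE[i][4]:
        i -= 1
    return FAN_TABLE[i][0]
```

Under this reading, a fan running at 4800 rpm moves to 5500 rpm when the inlet reaches 29 °C, stays at 5500 rpm if the inlet then drops to 28 °C, and returns to 4800 rpm only at 27 °C or below.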

Clear the TEMPERATURE Alarm

This alarm clears when the temperature returns to the operating range.

Procedure

Step 1

Verify the temperature of the NCS 1002.

Step 2

Verify that the environmental temperature of the room is not abnormally high.

Step 3

If the room temperature is not abnormal, physically ensure that nothing prevents the fan-tray assembly from passing air through the system shelf. You must also check if any fan has failed.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.


TX-POWER-PROV-MISMATCH Alarm

Default Severity: Minor (MN), Non-Service-Affecting (NSA)

Logical Objects: PPM

The Provisioned Optics Transmit Power Not Supported (TX-POWER-PROV-MISMATCH) alarm is raised when the configured laser transmit power is not within the range of output power supported by the CFP2 pluggable. The alarm indicates that the configured Tx power is not supported by the pluggable, even though the configured power is applied to the Tx laser.

Clear the TX-POWER-PROV-MISMATCH Alarm

Procedure

Step 1

Verify the value of the configured TX power on NCS 1002 using the show controller optics R/S/I/P command.

Step 2

If the value is not within the output power range supported by the pluggable type, configure the Tx power using the controller optics R/S/I/P transmit-power command in the Optics Controller configuration mode.

The range is -11.5 dBm to -1.5 dBm for ONS-CFP2-WDM, -11.5 dBm to -1.5 dBm for ONS-CFP2-WDM-1KL, and -8 dBm to 2 dBm for ONS-CFP2-WDM-1KE PID types. The PID information can be obtained from the inventory details using the show inventory command.
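The range check in Step 2 can be sketched as a lookup by PID. The ranges below are the ones stated above. The function name `tx_power_supported` is hypothetical, and treating the range limits as inclusive is an assumption.

```python
# Supported Tx power range in dBm for each PID, as listed above.
TX_POWER_RANGE_DBM = {
    "ONS-CFP2-WDM": (-11.5, -1.5),
    "ONS-CFP2-WDM-1KL": (-11.5, -1.5),
    "ONS-CFP2-WDM-1KE": (-8.0, 2.0),
}


def tx_power_supported(pid, tx_power_dbm):
    """Return True if the configured Tx power lies within the range
    supported by the pluggable (limits assumed inclusive)."""
    low, high = TX_POWER_RANGE_DBM[pid]
    return low <= tx_power_dbm <= high
```

For example, a configured value of 0 dBm is within range for ONS-CFP2-WDM-1KE but outside the range for ONS-CFP2-WDM, which would correspond to the mismatch condition.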

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


UNC-WORD Alarm

Default Severity: Not Reported (NR), Non-Service-Affecting (NSA)

Logical Objects: TRUNK

The Uncorrected FEC Word condition indicates that the FEC capability cannot correct the frame.

Clear the UNC-WORD Alarm

Procedure

Step 1

Ensure that the fiber connector for the card is completely plugged in.

Step 2

Ensure that the ports on the far-end and near-end nodes have the same port rates and FEC settings.

Step 3

If the BER threshold is correct and at the expected level, use an optical test set to measure the power level of the line to ensure it is within guidelines. For specific procedures to use the test set equipment, consult the manufacturer.

Step 4

If the optical power level is good, verify that the optical receive levels are within the acceptable range.

Step 5

If receive levels are good, clean the fibers at both ends.

Step 6

If the condition does not clear, verify that a single-mode fiber is used.

Step 7

Verify if the fiber is of single-mode type.

Step 8

If a signal degrade persists, clean the fiber connectors at both ends.

Step 9

If the problem does not clear, the transmitter at the other end of the optical line could be failing and requires replacement.

If the condition does not clear, log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


UPPER-CTRL-FPGA-PCIE-ERROR Alarm

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: EQUIPMENT

The UPPER-CTRL-FPGA-PCIE-ERROR alarm is raised when a control FPGA of the upper board is unreachable because of a Peripheral Component Interconnect Express (PCIe) error.

VOLTAGE Alarms

Default Severity: Critical (CR), Service-Affecting (SA)

Logical Objects: EQUIPMENT

The voltage alarms are raised on the NCS 1002 when the voltage is not within the operating range. When this condition occurs, critical devices such as the DSP and ETNA automatically shut down.

The alarm can appear in one of the following formats:

  • [sensor name] : high voltage alarm

  • [sensor name] : low voltage alarm

The [sensor name] : high voltage alarm is raised when the voltage is high and not within the operating range.

The [sensor name] : low voltage alarm is raised when the voltage is low and not within the operating range.

All sensors managed at the CPU level have NM as the prefix. NM denotes Node management. All sensors managed at the rack level have RM as the prefix. RM denotes Rack management. Some alarms in this category carry alarm tags such as FAM-FAULT-NM-L, FAM-FAULT-NM-H, FAM-FAULT-RM-L, and FAM-FAULT-RM-H.
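When alarm text is collected by a script, the "[sensor name] : high/low ... alarm" format used by the temperature and voltage alarms can be split into its parts. The sketch below assumes the text follows exactly that format and that the sensor name starts with the NM or RM prefix. The sensor names in the example are hypothetical, and real alarm output may differ.

```python
import re

# Matches "<sensor> : high|low temperature|voltage alarm" (assumed format).
SENSOR_ALARM = re.compile(
    r"^\s*(?P<sensor>.+?)\s*:\s*(?P<level>high|low)\s+"
    r"(?P<kind>temperature|voltage)\s+alarm\s*$",
    re.IGNORECASE,
)

# NM denotes Node management, RM denotes Rack management.
SCOPE_BY_PREFIX = {"NM": "node", "RM": "rack"}


def parse_sensor_alarm(text):
    """Return a dict describing the alarm, or None if the text does not match."""
    match = SENSOR_ALARM.match(text)
    if match is None:
        return None
    sensor = match.group("sensor")
    return {
        "sensor": sensor,
        "level": match.group("level").lower(),
        "kind": match.group("kind").lower(),
        "scope": SCOPE_BY_PREFIX.get(sensor[:2].upper(), "unknown"),
    }


# Hypothetical sensor name, used only to show the call.
parsed = parse_sensor_alarm("RM_VOLT_1 : high voltage alarm")
```

Here `parsed` describes a high voltage alarm on a rack-level sensor. Text that does not follow the assumed format yields `None`, so a caller can skip lines it does not recognize.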

Clear the VOLTAGE Alarms

These alarms clear when the voltage returns to the operating range.

Verify the voltage of the power source. The voltage rating for AC power ranges from 200 V to 240 V, depending on the standards of the country. The voltage rating for DC power is 48 V, and the fuse rating must not exceed 60 A.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.

WVL-OUT-OF-LOCK Alarm

Default Severity: Major (MJ), Service-Affecting (SA)

Logical Objects: TRUNK

The Wavelength Out of Lock (WVL-OUT-OF-LOCK) alarm is raised when the trunk port detects that the optical input frequency is out of range.

Clear the WVL-OUT-OF-LOCK Alarm

Procedure

Step 1

Verify the wavelength configuration.

Step 2

Verify that the CFP is inserted properly.

If the alarm does not get cleared, you need to report a Service-Affecting (SA) problem. Log into the Technical Support Website at http://www.cisco.com/c/en/us/support/index.html for more information or log into http://www.cisco.com/c/en/us/support/web/tsd-cisco-worldwide-contacts.html to obtain a directory of toll-free Technical Support numbers for your country.