This document describes the causes of and solutions for input discards for the Cisco Nexus 9500-R EoR and Nexus 3000-R ToR. An input discard indicates the number of packets dropped in the input queue because of congestion. This number includes drops that are caused by tail drop and Weighted Random Early Detection (WRED).
If you experience random, sporadic, or historical (that is, no longer occurring) drops, contact Cisco TAC for further investigation. This walk-through is useful when Input Discards increment frequently.
The R-Series uses ingress VOQ architecture. VOQ architecture emulates egress queues in the ingress buffer with virtual queues. Each egress port has eight queues for unicast traffic and eight queues for multicast traffic. Traffic can be classified into traffic classes based on the Class-of-Service (CoS) or Differentiated Services Code Point (DSCP) value in the packets and then queued in the corresponding virtual queue for that traffic class.
The R-Series uses a distributed credit mechanism to transfer traffic over the fabric. Before a packet is scheduled to leave the VOQ, the ingress buffer scheduler requests a credit for the specific port and priority in the egress buffer. The credit is requested from the egress credit scheduler for the destination port and priority. If buffer space is available, the egress scheduler grants access and sends the credit grant to the ingress buffer scheduler. If no buffer space is available in the egress buffer, the egress scheduler does not grant a credit, and traffic is buffered in the VOQ until the next credit is available.
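The request/grant behavior described above can be illustrated with a toy model. This is only a sketch under loud assumptions: the class names, the per-packet credit granularity, and the fixed VOQ depth are hypothetical simplifications chosen for illustration and do not describe the actual ASIC implementation.

```python
from collections import deque


class EgressScheduler:
    """Toy egress credit scheduler: grants a credit only while buffer space remains."""

    def __init__(self, buffer_slots):
        self.free_slots = buffer_slots  # hypothetical per-packet buffer slots

    def request_credit(self):
        """Return True (credit granted) if egress buffer space is available."""
        if self.free_slots > 0:
            self.free_slots -= 1
            return True
        return False

    def drain(self, n=1):
        """Model n packets leaving the egress port, which frees buffer space."""
        self.free_slots += n


class IngressVoq:
    """Toy ingress VOQ with a finite depth; overflow is counted as a discard."""

    def __init__(self, max_depth):
        self.queue = deque()
        self.max_depth = max_depth
        self.discards = 0
        self.sent = 0

    def enqueue(self, pkt):
        """Queue a packet, or count a discard when the VOQ is already full."""
        if len(self.queue) >= self.max_depth:
            self.discards += 1
            return
        self.queue.append(pkt)

    def service(self, egress):
        """Send queued packets for as long as the egress scheduler grants credits."""
        while self.queue and egress.request_credit():
            self.queue.popleft()
            self.sent += 1


egress = EgressScheduler(buffer_slots=2)
voq = IngressVoq(max_depth=4)
for pkt in range(10):
    voq.enqueue(pkt)       # 4 packets fit in the VOQ, 6 are discarded
voq.service(egress)        # 2 credits are granted, 2 packets stay queued
print(voq.sent, len(voq.queue), voq.discards)  # prints: 2 2 6
```

In this model the discards appear on the ingress side even though the bottleneck is the egress buffer, which is the same pattern the rest of this article traces from an input discard back to a congested egress port.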
Below is the Packet Forwarding Pipeline for the -R platform. This article focuses on the Ingress Traffic Manager component. More details on the architecture are available at this link.
The ingress traffic manager (ITM) is a block in the ingress pipeline. It queues traffic into VOQs, schedules traffic for transmission over the fabric, and manages credits.
The ingress VOQ buffer block manages both the on-chip buffer and the off-chip packet buffer. Both buffers use VOQ architecture, and traffic is queued based on the information from the IRPP (Ingress Receiver Packet Processor). A total of 96,000 VOQs are available for unicast and multicast traffic.
Before a packet is transmitted from the ingress pipeline, the packet needs to be scheduled for transfer over the fabric. The ingress scheduler sends a credit request to the egress scheduler located in the egress traffic manager block. When the ingress traffic manager receives the credit, it starts sending traffic to the ingress transmit packet processor. If the egress buffer is full, traffic will be buffered in the dedicated queue represented by the egress port and traffic class.
Generally, input discards can be seen for the reasons below across various Nexus hardware:
PID
N9K-X9636C-R
N9K-X9636Q-R
N9K-X9636C-RX
N9K-X96136YC-R
N3K-C36180YC-R
N3K-C3636C-R
Throughout this article, the value of the "input discards" counter, and of any hardware-internal counter that references it, changes from one output to the next because the errors were incrementing during testing. For the same reason, the relevant commands must be captured live.
This step comes in handy later.
In our case, it is Queue 7, the default queue - There are 8 queues total on ingress:
Nexus-R# bcm-shell mod 1 "diag counters g"
| /|\ | J E R I C H O N E T W O R K I N T E R F A C E | \|/ |
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| NBI |
| RX_TOTAL_BYTE_COUNTER = 10,616,663,796 | TX_TOTAL_BYTE_COUNTER = 41,136 |
| RX_TOTAL_PKT_COUNTER = 10,659,301 | TX_TOTAL_PKT_COUNTER = 606 |
| RX_TOTAL_DROPPED_EOPS = 0 | |
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| IRE | EPNI |
| CPU_PACKET_COUNTER = 606 | |
| NIF_PACKET_COUNTER = 10,659,302 | EPE_BYTES_COUNTER = 41,136 |
| OAMP_PACKET_COUNTER = 0 | EPE_PKT_COUNTER = 606 |
| OLP_PACKET_COUNTER = 0 | EPE_DSCRD_PKT_CNT = 0 |
| RCY_PACKET_COUNTER = 0 | |
| IRE_FDT_INTRFACE_CNT = 0 | |
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| IDR | EGQ |
| | |
| MMU_IDR_PACKET_COUNTER = 10,659,302 | FQP_PACKET_COUNTER = 606 |
| IDR_OCB_INTERFACE_COUNTER = 0 | PQP_UNICAST_PKT_CNT = 606 |
| | PQP_DSCRD_UC_PKT_CNT = 0 |
| | PQP_UC_BYTES_CNT = 48,408 |
+-------------------------------------------+-------------------------------------------| PQP_MC_PKT_CNT = 0 |
| IQM | PQP_DSCRD_MC_PKT_CNT = 0 |
| | PQP_MC_BYTES_CNT = 0 |
| ENQUEUE_PKT_CNT = 1,403,078 | EHP_UNICAST_PKT_CNT = 606 |
| DEQUEUE_PKT_CNT = 1,403,078 | EHP_MC_HIGH_PKT_CNT = 0 |
| DELETED_PKT_CNT = 0 | EHP_MC_LOW_PKT_CNT = 0 |
| ENQ_DISCARDED_PACKET_COUNTER = 9,256,829 | DELETED_PKT_CNT = 0 |
| Rejects: PORT_AND_PG_STATUS | |
| | RQP_PKT_CNT = 606 |
| | RQP_DSCRD_PKT_CNT = 0 |
| | PRP_PKT_DSCRD_TDM_CNT = 0 |
| | PRP_SOP_DSCRD_UC_CNT = 0 |
| | PRP_SOP_DSCRD_MC_CNT = 0 |
| | PRP_SOP_DSCRD_TDM_CNT = 0 |
| | EHP_MC_HIGH_DSCRD_CNT = 0 |
| | EHP_MC_LOW_DSCRD_CNT = 0 |
| | ERPP_LAG_PRUNING_DSCRD_CNT = 0 |
| | ERPP_PMF_DISCARDS_CNT = 0 |
| | ERPP_VLAN_MBR_DSCRD_CNT = 0 |
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| | FDA |
| | CELLS_IN_CNT_P1 = 0 | CELLS_OUT_CNT_P1 = 0 |
| | CELLS_IN_CNT_P2 = 0 | CELLS_OUT_CNT_P2 = 0 |
+-------------------------------------------+-------------------------------------------| CELLS_IN_CNT_P3 = 0 | CELLS_OUT_CNT_P3 = 0 |
| IPT | CELLS_IN_TDM_CNT = 0 | CELLS_OUT_TDM_CNT = 0 |
| | CELLS_IN_MESHMC_CNT = 0 | CELLS_OUT_MESHMC_CNT = 0 |
| EGQ_PKT_CNT = 606 --> CELLS_IN_IPT_CNT = 606 | CELLS_OUT_IPT_CNT = 606 |
| ENQ_PKT_CNT = 1,403,084 | EGQ_DROP_CNT = 0 |
| FDT_PKT_CNT = 1,402,472 | EGQ_MESHMC_DROP_CNT = 0 |
| CRC_ERROR_CNT = 0 | EGQ_TDM_OVF_DROP_CNT = 0 |
| CFG_EVENT_CNT = 606 * | |
| CFG_BYTE_CNT = 48,408 | |
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| FDT | FDR |
| IPT_DESC_CELL_COUNTER = 5,609,892 | P1_CELL_IN_CNT = 0 |
| IRE_DESC_CELL_COUNTER = 0 | P2_CELL_IN_CNT = 0 |
| | P3_CELL_IN_CNT = 0 |
| TRANSMITTED_DATA_CELLS_COUNTER = 5,609,892 | CELL_IN_CNT_TOTAL = 0 |
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| /|\ | J E R I C H O F A B R I C I N T E R F A C E | \|/ |
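When a capture like the one above is saved to a file, the counters and the Rejects/Discards reasons can be pulled out with a short script instead of being read by eye. The following is a minimal sketch, assuming the output keeps the NAME = value and "Rejects: REASON" patterns seen in this article. The function name parse_diag_counters and the regular expressions are hypothetical helpers, not part of any Cisco or Broadcom tool.

```python
import re

# NAME = 1,234,567 pairs as they appear in the "diag counters g" capture.
COUNTER_RE = re.compile(r"([A-Z][A-Z0-9_]+)\s*=\s*(\d[\d,]*)")
# "Rejects: REASON" or "Discards: REASON_A REASON_B" annotations.
REASON_RE = re.compile(r"(Rejects|Discards):\s*([A-Z0-9_]+(?: [A-Z0-9_]+)*)")


def parse_diag_counters(text):
    """Return (counters, reasons) parsed from a saved capture.

    counters maps a counter name to a list of values, because some names
    (for example DELETED_PKT_CNT) appear in more than one block.
    reasons maps "Rejects" / "Discards" to the list of reason names.
    """
    counters = {}
    for name, value in COUNTER_RE.findall(text):
        counters.setdefault(name, []).append(int(value.replace(",", "")))
    reasons = {}
    for kind, names in REASON_RE.findall(text):
        reasons.setdefault(kind, []).extend(names.split())
    return counters, reasons


# A few rows copied from the capture in this article.
SAMPLE = """
| ENQUEUE_PKT_CNT = 1,403,078 | EHP_UNICAST_PKT_CNT = 606 |
| DEQUEUE_PKT_CNT = 1,403,078 | EHP_MC_HIGH_PKT_CNT = 0 |
| DELETED_PKT_CNT = 0 | EHP_MC_LOW_PKT_CNT = 0 |
| ENQ_DISCARDED_PACKET_COUNTER = 9,256,829 | DELETED_PKT_CNT = 0 |
| Rejects: PORT_AND_PG_STATUS | |
"""

counters, reasons = parse_diag_counters(SAMPLE)
print(counters["ENQ_DISCARDED_PACKET_COUNTER"], reasons)
```

Values are kept as lists rather than single numbers so that the IQM and EGQ instances of DELETED_PKT_CNT are not silently merged; the order in the list follows the order of appearance in the text.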
A QUEUE_DELETED_PACKET_COUNTER greater than zero would indicate that packets were DELETED by the IQM (Ingress Queueing Manager) after enqueue. This would be due to an active queue not receiving any credits, which would suggest a misconfiguration of the scheduling scheme. You would check this via bcm-shell mod X "getReg IQM_QUEUE_DELETED_PACKET_COUNTER"
ENQ_DISCARDED_PACKET_COUNTER means packets were discarded BEFORE enqueue. You can see this counter set in BCM as well (the counter is cleared on read):
You can also spot these quickly with show hardware internal errors module X (the counters are cleared on read):
Displaying Eth1/33 for this example. In an actual network, you won't know the congested egress port yet.
This command shows us details for the flow for ingress VoQ for a specific port. Additionally, it shows us the current credit balance of the VoQ.
The port's VOQ is derived in this way:
LCs are 0 based - Module 1 is 0, Module 2 is 1, etc
There are 256 System Port IDs per LC
System Port ID = (LC * 256) + FP number
Eth1/9 = (0 * 256) + 9 = 9
VOQ ID = 32 + (System Port ID * 8)
Eth1/9 = 32 + (9 * 8) = 104
Our VOQ for Eth1/9 will therefore be 104 which matches the output previously gathered
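The derivation above can be written as a small helper. This is a sketch that assumes the constants stated in this article (256 System Port IDs per line card, a base of 32, and eight queues per port) and a one-based module number. The function names are hypothetical and exist only for illustration.

```python
PORTS_PER_LC = 256     # System Port IDs per line card, as stated above
VOQ_BASE = 32          # offset used in the VOQ ID formula above
QUEUES_PER_PORT = 8    # eight queues per port


def system_port_id(module, front_port):
    """System Port ID for EthM/P, with line cards counted from 0 (Module 1 is 0)."""
    lc = module - 1
    return lc * PORTS_PER_LC + front_port


def voq_range(module, front_port):
    """Range of VOQ IDs owned by the port; the first value is the VOQ ID computed above."""
    base = VOQ_BASE + system_port_id(module, front_port) * QUEUES_PER_PORT
    return range(base, base + QUEUES_PER_PORT)


eth1_9 = voq_range(1, 9)
print(eth1_9[0], eth1_9[-1])  # prints: 104 111
```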
If the Queue is 303, recall that each port owns a range of eight consecutive queues, so 303 can be either the top of the range 296-303 or the bottom of the range 303-310. The question is which port has a VOQ range that matches.
Because Queue 7 on Eth1/9 is known to be congested, and Queue 7 is the highest queue in a range, 303 is the top of its range. The range 296-303 is therefore a well-educated guess.
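The same arithmetic can be run in reverse to go from a non-empty VOQ number to a candidate port. The sketch below inverts the formula VOQ ID = 32 + (System Port ID * 8) with 256 System Port IDs per line card. The function name port_from_voq is hypothetical, and the result is a first guess that should still be confirmed against the VOQ listing for the relevant ASIC.

```python
def port_from_voq(voq_id):
    """Return (module, front_port, queue) for a VOQ number.

    Inverts VOQ ID = 32 + (System Port ID * 8), where
    System Port ID = (LC * 256) + front-panel port and LC is 0-based.
    """
    offset = voq_id - 32
    if offset < 0:
        raise ValueError("VOQ IDs below 32 are outside the formula in this article")
    system_port, queue = divmod(offset, 8)        # queue is 0..7 within the port
    lc, front_port = divmod(system_port, 256)     # split into line card and port
    return lc + 1, front_port, queue              # module numbers are 1-based


print(port_from_voq(303))  # prints: (1, 33, 7)
```

For VOQ 303 this yields module 1, port 33, queue 7, which agrees with the 296-303 range guess.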
Display the same for asic 0 - Not shown here for brevity; you would notice under the Voq column that your range of interest is not in that ASIC
Notice a few things on the above output:
At this point, you have found the congested egress port. Determine whether something is wrongfully bursting into the network, whether you have configured SPAN with a 1G destination port that sources one or more 10G interfaces, or whether this is a bottleneck/design issue.
These are more advanced - Not needed to find Egress Congested port under normal scenarios.
attach module X
show hardware internal jer-usd tm_debug asic <slot> module <module>
show hardware internal jer-usd info voq [ asic <instance> ] [ port <port> ] [ ]
show hardware internal jer-usd info non-empty voq asic [ <instance> ] [ ]
show hardware internal jer-usd info voq-profile { QueueThreshold drop_p <dp> | OCBThreshold } [ asic <instance> ] [ port<port> ] [ ]
show hardware internal jer-usd info voq-connector front-port <port> [ ]
show hardware internal jer-usd stats vsq { front-port <port> | inband asic <slot> | recycle-port <port> asic <slot> }
show hardware internal jer-usd ingress-vsq buffer-occupancy front-port <port>
show hardware internal jer-usd info IQM { counter | rate } asic <instance> dst-port <port> [ interval <int> ] [ ]
show hardware internal jer-usd info SCH { counter | rate } asic <instance> dst-port <port> [ interval <int> ] [ ]
bcm-shell mod X
diag cosq print_flow_and_up dest_id=<flow_id>
diag cosq voq id=<voqid> detailed=1
diag cosq qpair e2e ps=<id>
cosq conn ing
cosq conn egr
dump IPS_CR_BAL_TABLE <voqID>
getReg IQM_QUEUE_MAXIMUM_OCCUPANCY_QUEUE_SIZE
Consider this topology wherein the Traffic Generator is sending 2G of traffic towards each Server:
Quickly check which Queues are not empty - Notice there are 4:
Determine what interfaces these Queues belong to - Check ASIC 0 first (it only demonstrates with one interface):
Repeat the same process for the other three Queue values: 247, 303 and 351.
Setting Eth1/33 as a SPAN destination port while setting Eth1/9 as a SPAN source port in the RX direction
Sending packets with SRC 10.10.10.10 and DEST 192.168.10.10 where Eth1/9 is in 10.10.10.1/24 - This does not result in an Input Discard; however, you do see this counter:
Nexus-R# bcm-shell mod 1 "diag counters g"
| /|\ | J E R I C H O N E T W O R K I N T E R F A C E | \|/ |
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
_PACKET_COUNTER = 0 | DELETED_PKT_CNT = 12,027,201 |
| | Discards: INVALID_OTM SRC_EQUAL_DEST
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
Send packets with SRC 10.10.10.10 and DEST 192.168.10.10 where Eth1/9 is in 10.10.10.1/24 and Eth1/33 is an L3 port in the 172.16.0.1/30 subnet - No drop counter, no input discards even when the destination is unknown.
Send packets where Eth1/9 is just a wide trunk (or access port) - This is registered as an Input Discard while the port transitions into an STP forwarding state.
Nexus-R(config)# int e1/9
Nexus-R(config-if)# switchport mode trunk
Nexus-R# bcm-shell mod 1 "diag counters g" | i i --|IQM|ENQ_DISCARD|Rejects
| PQP_MC_PKT_CNT = 1,678,949 |
| IQM | PQP_DSCRD_MC_PKT_CNT = 11,369,033 |
| ENQ_DISCARDED_PACKET_COUNTER = 1,289,182 | DELETED_PKT_CNT = 11,369,081 |
| Rejects: QUEUE_NOT_VALID_STATUS | Discards: SRC_EQUAL_DEST |
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
Nexus-R# show span int e1/9
Vlan Role Sts Cost Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
VLAN0001 Desg BLK 2 128.9 P2p
VLAN0010 Desg BLK 2 128.9 P2p
<snip>
QUEUE_NOT_VALID_STATUS is a drop due to the Packet Processor's (PP) decision to drop or an invalid destination received from the Packet Processor (PP) Blocks.
Sending 10G+ into Eth1/9 results in a different type of drop, because you are maxing out Eth1/9 in the first place. This still counts as an Input Discard:
bcm-shell.0> diag counters g
| /|\ | J E R I C H O N E T W O R K I N T E R F A C E | \|/ |
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| NBI |
| RX_TOTAL_BYTE_COUNTER = 53,913,106,009 | TX_TOTAL_BYTE_COUNTER = 1,164,231 |
| RX_TOTAL_PKT_COUNTER = 54,145,395 | TX_TOTAL_PKT_COUNTER = 17,029 |
| RX_TOTAL_DROPPED_EOPS = 0 | |
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| IRE | EPNI |
| CPU_PACKET_COUNTER = 17,010 | |
| NIF_PACKET_COUNTER = 54,145,476 | EPE_BYTES_COUNTER = 5,721,307 |
| OAMP_PACKET_COUNTER = 0 | EPE_PKT_COUNTER = 50,703 |
| OLP_PACKET_COUNTER = 0 | EPE_DSCRD_PKT_CNT = 0 |
| RCY_PACKET_COUNTER = 16,837 | |
| IRE_FDT_INTRFACE_CNT = 0 | |
+-------------------------------------------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| IDR | EGQ |
| | |
| MMU_IDR_PACKET_COUNTER = 54,128,577 | FQP_PACKET_COUNTER = 50,703 |
| IDR_OCB_INTERFACE_COUNTER = 0 | PQP_UNICAST_PKT_CNT = 50,683 |
| | PQP_DSCRD_UC_PKT_CNT = 0 |
| | PQP_UC_BYTES_CNT = 5,216,716 |
+-------------------------------------------+-------------------------------------------| PQP_MC_PKT_CNT = 20 |
| IQM | PQP_DSCRD_MC_PKT_CNT = 20 |
| | PQP_MC_BYTES_CNT = 2,079 |
| ENQUEUE_PKT_CNT = 5,463,323 | EHP_UNICAST_PKT_CNT = 50,683 |
| DEQUEUE_PKT_CNT = 5,594,400 | EHP_MC_HIGH_PKT_CNT = 20 |
| DELETED_PKT_CNT = 0 | EHP_MC_LOW_PKT_CNT = 0 |
| ENQ_DISCARDED_PACKET_COUNTER = 48,716,055 | DELETED_PKT_CNT = 40 |
| Rejects: VOQ_MX_QSZ_STATUS | |
<snip>
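The Rejects and Discards reasons seen across the scenarios in this article can be collected in a small lookup for quick reference while reading captures. This is only a sketch: the dictionary restates where each reason appears in this article and is not an authoritative or complete list of Jericho reject reasons. The names REASON_NOTES and explain are hypothetical.

```python
# Notes restate the scenarios in this article; they are not vendor definitions.
REASON_NOTES = {
    "PORT_AND_PG_STATUS": (
        "Rejects reason in the first capture of this article, taken during "
        "the congested egress port walk-through."
    ),
    "QUEUE_NOT_VALID_STATUS": (
        "Drop due to the Packet Processor decision to drop, or an invalid "
        "destination received from the Packet Processor blocks; seen while "
        "the port was in an STP blocking state."
    ),
    "VOQ_MX_QSZ_STATUS": (
        "Rejects reason seen when 10G+ was sent into Eth1/9; still counted "
        "as an Input Discard."
    ),
    "SRC_EQUAL_DEST": (
        "Discards reason seen next to DELETED_PKT_CNT in the routed-port "
        "test; that test did not register an Input Discard."
    ),
    "INVALID_OTM": (
        "Discards reason seen together with SRC_EQUAL_DEST in the "
        "routed-port test."
    ),
}


def explain(reason):
    """Return the note for a reason name, or a fallback when it is not covered here."""
    return REASON_NOTES.get(reason, "No note for this reason in this article.")


print(explain("VOQ_MX_QSZ_STATUS"))
```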