Troubleshoot eBGP Session Stuck In Active State

Updated: September 24, 2018

Document ID: 213710

Introduction

This document describes how to troubleshoot an eBGP (External Border Gateway Protocol) session that is stuck in the active state because of incorrect LPTS (Local Packet Transport Services) entries.

Contributed by William Xu, Cisco TAC Engineer.

Prerequisites

Requirements

Cisco recommends that you have knowledge of these topics:

BGP
TCP
LPTS for IOS XR

Components Used

The information in this document is based on ASR9000 (Aggregation Services Router) platforms.

The information in this document was created from devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, ensure that you understand the potential impact of any commands.

Problem

When you configure eBGP, the session can remain stuck in the active state indefinitely if:

No update-source command is configured
A topology change causes traffic to take a different path

These symptoms are present when this issue occurs:

The peer IP addresses are reachable
Both BGP peers remain stuck in the active state
A packet capture shows that the routers send many TCP resets
The output of show tcp trace error contains this error for the BGP sessions:

Feb 18 09:32:15.393 tcp/error 0/RSP0/CPU0 t9  Lpts set the drop flag for 179 -> 5368, drop packet (pak 0xb1cf80f3) and send a RST

In summary, the root cause of the issue is that the LPTS entries are not updated when routing and forwarding change. The entries therefore remain stale after the topology changes.

Enhancements have since been made to BGP to address this behavior. The two scenarios that follow describe the issue in more detail.

Note: iBGP (Internal Border Gateway Protocol) normally does not hit this issue because update-source is typically configured for iBGP sessions.

Scenario 1 - Multihop eBGP with Topology Change

Consider a multihop eBGP session between ASR9K-1 and ASR9K-3. The peer IP addresses are 172.123.1.1 and 172.123.2.2, which are configured on physical interfaces. No update-source command is configured. With the initial topology, the session stays in the active state. This is expected because both routers use the interface in subnet 172.123.3.0/24 as the egress interface, so the source address of each TCP connection attempt does not match the neighbor address configured on the peer.

Next, shut down the direct link between ASR9K-1 and ASR9K-3. The peer addresses are now reachable through ASR9K-2, which is the multihop path, so ping is successful. The source IP addresses now match at both ends, but the BGP session still remains in the active state.

When the BGP neighbors are configured, LPTS entries are created according to the CEF (Cisco Express Forwarding) table. On ASR9K-1, IP address 172.123.2.2 is reachable through the 172.123.3.0/24 subnet, so the LPTS entries are created with the local IP address 172.123.3.1. The first entry allows the BGP neighbor to connect to port 179 at local IP address 172.123.3.1. The second entry exists because ASR9K-1 tries to initiate a TCP session from local port 26036.

ASR9K-1:
========
ASR9K-1#show lpts ifib entry brief | inc "BGP"
...
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.1,179 172.123.2.2
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.1,26036 172.123.2.2,179

The output on ASR9K-3 is equivalent.

ASR9K-3:
========
ASR9K-3#show lpts ifib entry brief | inc "BGP"
...
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.2,11126 172.123.1.1,179
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.2,179 172.123.1.1
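When you compare these entries across several routers, it can help to turn each line into structured fields. The following is a minimal sketch and not a Cisco tool. It assumes that the last two whitespace-separated fields of each BGP4 line are the local address and port and the remote address with an optional port, as in the sample output above. The names LptsEntry and parse_lpts_brief_line are hypothetical.

```python
from typing import NamedTuple, Optional


class LptsEntry(NamedTuple):
    local_addr: str
    local_port: int
    remote_addr: str
    remote_port: Optional[int]  # None when the line shows no remote port


def parse_lpts_brief_line(line: str) -> Optional[LptsEntry]:
    """Parse one BGP4 line of 'show lpts ifib entry brief' output.

    Assumption: the last two fields are 'local-address,port' and
    'remote-address[,port]', as in the sample output in this document.
    Lines that do not fit this shape are skipped (None is returned).
    """
    fields = line.split()
    if len(fields) < 3 or fields[0] != "BGP4":
        return None
    local, remote = fields[-2], fields[-1]
    if "," not in local:
        return None
    local_addr, local_port = local.rsplit(",", 1)
    if not local_port.isdigit():
        return None
    remote_addr, _, remote_port = remote.partition(",")
    return LptsEntry(
        local_addr,
        int(local_port),
        remote_addr,
        int(remote_port) if remote_port.isdigit() else None,
    )


# Sample lines copied from the ASR9K-1 output in Scenario 1.
sample = """\
...
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.1,179 172.123.2.2
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.1,26036 172.123.2.2,179
"""

entries = [e for e in map(parse_lpts_brief_line, sample.splitlines()) if e]
# The entry with local port 179 is the one that admits inbound sessions.
listening = [e for e in entries if e.local_port == 179]
```

The local address of the port 179 entry is the address on which the router accepts inbound BGP connections from that peer, so it is the field to compare against the address that the peer actually targets.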

When the link between ASR9K-1 and ASR9K-3 goes down, the peers are reachable through ASR9K-2 with a new local source IP address. However, the topology change does not trigger an LPTS update. The original entry for port 179 keeps the original local IP address, which prevents the router from accepting ingress TCP requests to the new local IP address. As a result, the BGP session remains stuck in the active state at both ends.
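The effect of the stale entry can be illustrated with a toy model. This is only a sketch of the matching idea described above and not how LPTS is implemented. The Flow type and the lpts_permits function are hypothetical, the source port 40000 is an arbitrary example value, and the sketch assumes that ASR9K-1 now receives the connection on 172.123.1.1, as the configured peer addresses imply.

```python
from typing import NamedTuple, Optional, Set


class Flow(NamedTuple):
    local_addr: str
    local_port: int
    remote_addr: str
    remote_port: Optional[int]  # None matches any remote port


def lpts_permits(entries: Set[Flow], dst_addr: str, dst_port: int,
                 src_addr: str, src_port: int) -> bool:
    """Return True if an inbound TCP segment matches at least one entry."""
    for entry in entries:
        if (entry.local_addr == dst_addr
                and entry.local_port == dst_port
                and entry.remote_addr == src_addr
                and entry.remote_port in (None, src_port)):
            return True
    return False


# Entry installed on ASR9K-1 while the direct link was up (from the output above).
stale = {Flow("172.123.3.1", 179, "172.123.2.2", None)}

# After the link fails, ASR9K-3 sends its SYN from 172.123.2.2 to the
# configured neighbor address 172.123.1.1. The stale entry does not cover it.
syn_permitted = lpts_permits(stale, "172.123.1.1", 179, "172.123.2.2", 40000)

# An entry refreshed with the new local address would match the same SYN.
refreshed = {Flow("172.123.1.1", 179, "172.123.2.2", None)}
syn_permitted_after_refresh = lpts_permits(
    refreshed, "172.123.1.1", 179, "172.123.2.2", 40000)
```

In this model the SYN is rejected while the stale entry is in place and accepted once the entry carries the new local address. This corresponds to the drop and TCP reset reported by show tcp trace error.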

Scenario 2 - eBGP with Update Source Address Change

Consider an eBGP session between ASR9K-1 and ASR9K-3. The IP addresses are 172.123.3.1 and 172.123.3.2. As part of a new addressing plan, you change the IP addresses to 172.123.3.111 and 172.123.3.222. If you configure eBGP first and then update the IP addresses on the interfaces, the eBGP session is stuck in the active state.

The cause is the same as in Scenario 1. When you configure the eBGP session, the LPTS entries are generated according to the local egress interface address at that point.

ASR9K-1:
========
ASR9K-1#show lpts ifib entry brief | inc "BGP"
...
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.1,179 172.123.3.222
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.1,24067 172.123.3.222,179

ASR9K-3:
========
ASR9K-3#show lpts ifib entry brief | inc "BGP"
...
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.2,45091 172.123.3.111,179
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.2,179 172.123.3.111

Although the local IP addresses are changed later, the LPTS entries are not updated. The TCP request is blocked and the session remains stuck in the active state indefinitely.
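A quick way to spot this condition is to compare the local addresses in the LPTS output with the addresses currently configured on the router. The sketch below does that for text you have already collected. The function name stale_bgp_entries is hypothetical, and the line layout is assumed from the sample output in this document.

```python
from typing import Iterable, List


def stale_bgp_entries(lpts_lines: Iterable[str],
                      current_local_addrs: Iterable[str]) -> List[str]:
    """Return BGP4 LPTS lines whose local address is not configured locally.

    Assumption: the second-to-last field of each line is
    'local-address,port', as in the sample output in this document.
    """
    current = set(current_local_addrs)
    stale = []
    for line in lpts_lines:
        fields = line.split()
        if len(fields) < 3 or fields[0] != "BGP4":
            continue
        local_addr = fields[-2].rsplit(",", 1)[0]
        # Assumption: a wildcard local address, if present, appears as 'any'.
        if local_addr != "any" and local_addr not in current:
            stale.append(line.strip())
    return stale


# ASR9K-1 output from Scenario 2, captured after the address change.
asr9k1_output = """\
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.1,179 172.123.3.222
BGP4 default TCP any 0/RSP1/CPU0 172.123.3.1,24067 172.123.3.222,179
""".splitlines()

# The interface now holds 172.123.3.111, so both entries are flagged.
flagged = stale_bgp_entries(asr9k1_output, ["172.123.3.111"])
```

Any flagged line indicates that LPTS still references an address the router no longer owns.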

Solution

To solve this issue, you need to trigger an LPTS update. Any of these actions resolves the issue:

Shut down and then re-enable (shut/no shut) the BGP neighbors
Remove and reconfigure the BGP neighbors
Restart the BGP process
Configure update-source at both ends, which also prevents the issue

Enhancement in XR Release

Recent IOS XR releases include enhancements that address this issue.

CSCuz51103 - BGP session stuck in active

This enhancement was introduced in XR release 6.1.1. In this release, when BGP tries to re-establish the session, LPTS updates its entries with the new local IP address. The update time depends on the hold time configuration at both ends, so you might need to wait some time before the session comes up.

Even with this enhancement, a BGP session can still be stuck in the active state if passive mode is configured. In passive mode, BGP does not try to re-establish the session, so the local IP address is not checked and the LPTS entries are not updated.

A further enhancement for this situation was introduced in XR release 6.2.1.

CSCvb15128 - BGP session stuck in active while router has Passive BGP mode configured
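The release logic described in this section can be summarized in a short helper. This is a simplified reading of the text above and not an official compatibility check. It assumes purely numeric dotted release strings, ignores SMUs and other fixes, and uses the hypothetical function names parse_release and recovers_without_intervention.

```python
from typing import Tuple


def parse_release(release: str) -> Tuple[int, ...]:
    """Turn a release string such as '6.1.1' into a comparable tuple."""
    return tuple(int(part) for part in release.split("."))


def recovers_without_intervention(release: str, passive: bool) -> bool:
    """Estimate whether LPTS entries refresh on their own after the change.

    Based only on the two enhancements named in this document:
    CSCuz51103 (6.1.1, non-passive sessions) and
    CSCvb15128 (6.2.1, passive sessions).
    """
    version = parse_release(release)
    if passive:
        return version >= (6, 2, 1)
    return version >= (6, 1, 1)


# A non-passive neighbor on 6.1.1 is covered, a passive one is not.
active_on_611 = recovers_without_intervention("6.1.1", passive=False)
passive_on_611 = recovers_without_intervention("6.1.1", passive=True)
```

When the helper returns False, one of the manual actions from the Solution section is still required.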

Revision History

Revision	Publish Date	Comments
1.0	23-Sep-2018	Initial Release

This Document Applies to These Products

ASR 9000 Series Aggregation Services Routers