Information about Implementing BGP
To implement BGP, you need to understand the following concepts:
Prerequisites for Implementing BGP
The current Internet BGP table contains approximately 1.1 million IPv4 routes and 200,000 IPv6 routes. With an average of two paths per route, the BGP process typically requires around 5.5 GB of RAM to manage the full Internet BGP table. As the IPv6 Internet table continues to expand, the memory requirements for the BGP process are expected to increase. Therefore, Cisco recommends using the Service Edge (SE) version of Route Processor (RP) or Route Switch Processor (RSP) cards, or fixed chassis, on routers that will maintain a full BGP table
BGP Router Identifier
For BGP sessions between neighbors to be established, BGP must be assigned a router ID. The router ID is sent to BGP peers in the OPEN message when a BGP session is established.
BGP attempts to obtain a router ID in the following ways (in order of preference):
-
By means of the address configured using the bgp router-id command in router configuration mode.
-
By using the highest IPv4 address on a loopback interface in the system if the router is booted with saved loopback address configuration.
-
By using the primary IPv4 address of the first loopback address that gets configured if there are not any in the saved configuration.
If none of these methods for obtaining a router ID succeeds, BGP does not have a router ID and cannot establish any peering sessions with BGP neighbors. In such an instance, an error message is entered in the system log, and the show bgp summary command displays a router ID of 0.0.0.0. After BGP has obtained a router ID, it continues to use it even if a better router ID becomes available. This usage avoids unnecessary flapping for all BGP sessions. However, if the router ID currently in use becomes invalid (because the interface goes down or its configuration is changed), BGP selects a new router ID (using the rules described) and all established peering sessions are reset.
Note |
We strongly recommend that the bgp router-id command is configured to prevent unnecessary changes to the router ID (and consequent flapping of BGP sessions). |
BGP Route Distinguisher
In network design solutions where customer equipment is dual-homed and Fast Reroute is required, such as in EVPN and BGP PIC Edge solutions, the Route Distinguisher (RD) associated with each VRF must be unique per Provider Edge (PE) router. In other design scenarios, while it isn’t mandatory for the RD to be unique per PE, it is highly recommended to make it unique. This practice facilitates easier transitions to dual-homed solutions in the future.
There are few available options to keep unique RD per device:
-
Manual configuration: You must manually assign a unique value per device in the network. For example, in this scenario:
-
Leaf (ToR) = RD 1
-
Edge DCI Gateway = RD 2
-
Remote PE = RD 3
-
-
Use rd auto command under VRF. To assign a unique route distinguisher for each router, you must ensure that each router has a unique BGP router-id. If so, the rd auto command assigns a Type 1 route distinguisher to the VRF using the following format: ip-address:number. The IP address is specified by the BGP router-id statement and the number (which is derived as an unused index in the 0 to 65535 range) is unique across the VRFs.
Note |
In a DCI deployment, for route re-originate with stitching-rt for a particular VRF, using the same Route Distinguisher (RD) between edge DCI gateway and MPLS-VPN PE or same RD between edge DCI gateway and Leaf (ToR) is not supported. |
BGP Default Limits
BGP imposes maximum limits on the number of neighbors that can be configured on the router and on the maximum number of prefixes that are accepted from a peer for a given address family. This limitation safeguards the router from resource depletion caused by misconfiguration, either locally or on the remote neighbor. The following limits apply to BGP configurations:
-
The default maximum number of peers that can be configured is 4000. The default can be changed using the bgp maximum neighbor command. The limit range is 1–15000. Any attempt to configure additional peers beyond the maximum limit or set the maximum limit to a number that is less than the number of peers that are currently configured will fail.
- To prevent a peer from flooding BGP with advertisements, a limit is placed on the number of prefixes that are accepted from
a peer for each supported address family. The default limits can be overridden through configuration of the maximum-prefix
limit command for the peer for the appropriate address family. The following default limits are used if the user does not configure
the maximum number of prefixes for the address family:
-
512K (524,288) prefixes for IPv4 unicast
-
128K (131,072) prefixes for IPv6 unicast
-
512K (524,288) prefixes for VPNv4 unicast
A cease notification message is sent to the neighbor and the peering with the neighbor is terminated when the number of prefixes that are received from the peer for a given address family exceeds the maximum limit (either set by default or configured by the user) for that address family.
It is possible that the maximum number of prefixes for a neighbor for a given address family has been configured after the peering with the neighbor has been established and some prefixes have already been received from the neighbor for that address family. A cease notification message is sent to the neighbor and peering with the neighbor is terminated immediately after the configuration if the configured maximum number of prefixes is fewer than the number of prefixes that have already been received from the neighbor for the address family.
-
BGP Attributes and Operators
This table summarizes the BGP attributes and operators per attach points.
Attach Point |
Attribute |
Match |
Set |
---|---|---|---|
aggregation |
as-path |
in is-local length neighbor-is originates-from passes-through unique-length |
— |
as-path-length |
is, ge, le, eq |
— | |
as-path-unique-length |
is, ge, le, eq |
— | |
community |
is-empty matches-any matches-every |
set set additive delete in delete not in delete all |
|
destination |
in |
— | |
extcommunity cost |
— |
set set additive |
|
local-preference |
is, ge, le, eq |
set |
|
med |
is, eg, ge, le |
setset +set - |
|
next-hop |
in |
set |
|
origin |
is |
set |
|
source |
in |
— | |
suppress-route |
— |
suppress-route |
|
weight |
— |
set |
|
allocate-label |
as-path |
in is-local length neighbor-is originates-from passes-through unique-length |
— |
as-path-length |
is, ge, le, eq |
— | |
as-path-unique-length |
is, ge, le, eq |
— | |
community |
is-empty matches-any matches-every |
— | |
destination |
in |
— | |
label |
— |
set |
|
local-preference |
is, ge, le, eq |
— | |
med |
is, eg, ge, le |
— | |
next-hop |
in |
— | |
origin |
is |
— | |
source |
in |
— | |
clear-policy |
as-path |
in is-local length neighbor-is originates-from passes-through unique-length |
— |
as-path-length |
is, ge, le, eq |
— | |
as-path-unique-length |
is, ge, le, eq |
— | |
dampening |
as-path |
in is-local length neighbor-is originates-from passes-through unique-length |
— |
as-path-length |
is, ge, le, eq |
— | |
as-path-unique-length |
is, ge, le, eq |
— | |
community |
is-empty matches-any matches-every |
— | |
dampening |
—/ |
set dampening |
|
destination |
in |
— | |
local-preference |
is, ge, le, eq |
— | |
med |
is, eg, ge, le |
— | |
next-hop |
in |
— | |
origin |
is |
— | |
source |
in |
— | |
debug |
destination |
in |
— |
default originate |
med |
— |
set set + set - |
rib-has-route |
in |
— | |
neighbor-in |
as-path |
in is-local length NA neighbor-is originates-from passes-through unique-length |
prepend prepend most-recent remove as-path private-as replace |
as-path-length |
is, ge, le, eq |
— | |
as-path-unique-length |
is, ge, le, eq |
— | |
communitycommunity with ‘peeras’ |
is-empty matches-any matches-every |
set set additive delete-in delete-not-in delete-all |
|
destination |
in |
— | |
extcommunity cost |
— |
set set additive |
|
extcommunity rt |
is-empty matches-any matches-every matches-within |
set additive delete-in delete-not-in delete-all |
|
extcommunity soo |
is-empty matches-any matches-every matches-within |
— | |
local-preference |
is, ge, le, eq |
set |
|
med |
is, eg, ge, le |
set set + set - |
|
next-hop |
in |
set set peer address |
|
origin |
is |
set |
|
route-aggregated |
route-aggregated |
NA |
|
source |
in |
— | |
weight |
— |
set |
|
neighbor-out |
as-path |
in is-local length — neighbor-is originates-from passes-through unique-length |
prepend prepend most-recent remove as-path private-as replace |
as-path-length |
is, ge, le, eq |
— | |
as-path-unique-length |
is, ge, le, eq |
— | |
communitycommunity with ‘peeras’ |
is-empty matches-any matches-every |
set set additive delete-in delete-not-in delete-all |
|
destination |
in |
— | |
extcommunity cost |
— |
set set additive |
|
extcommunity rt |
is-empty matches-any matches-every matches-within |
set additive delete-in delete-not-in delete-all |
|
extcommunity soo |
is-empty matches-any matches-every matches-within |
— | |
local-preference |
is, ge, le, eq |
set |
|
med |
is, eg, ge, le |
set set + set - set max-unreachable set igp-cost |
|
next-hop |
in |
set set self |
|
origin |
is |
set |
|
path-type |
is |
— | |
rd |
in |
— | |
route-aggregated |
route-aggregated |
— |
|
source |
in |
— | |
unsuppress-route |
— |
unsuppress-route |
|
vpn-distinguisher |
— |
set |
|
neighbor-orf |
orf-prefix |
in |
n/a |
network |
as-path |
— |
prepend |
community |
— |
set set additive delete-in delete-not-in delete-all |
|
destination |
in |
— | |
extcommunity cost |
— |
set set additive |
|
mpls-label |
route-has-label |
— | |
local-preference |
— |
set |
|
med |
— |
set set+ set- |
|
next-hop |
in |
set |
|
origin |
— |
set |
|
route-type |
is |
— | |
tag |
is, ge, le, eq |
— | |
weight |
— |
set |
|
next-hop |
destination |
in |
— |
protocol |
is,in |
— | |
source |
in |
— | |
redistribute |
as-path |
— |
prepend |
community |
— |
set set additive delete in delete not in delete all |
|
destination |
in |
— | |
extcommunity cost |
— |
setset additive |
|
local-preference |
— |
set |
|
med |
— |
set set+ set- |
|
next-hop |
in |
set |
|
origin |
— |
set |
|
mpls-label |
route-has-label |
— | |
route-type |
is |
— | |
tag |
is, eq, ge, le |
— | |
weight |
— |
set |
|
retain-rt |
extcommunity rt |
is-empty matches-any matches-every matches-within |
— |
show |
as-path |
in is-local length neighbor-is originates-from passes-through unique-length |
— |
as-path-length |
is, ge, le, eq |
— | |
as-path-unique-length |
is, ge, le, eq |
— | |
community |
is-empty matches-any matches-every |
— | |
destination |
in |
— | |
extcommunity rt |
is-empty matches-any matches-every matches-within |
— | |
extcommunity soo |
is-empty matches-any matches-every matches-within |
— | |
med |
is, eg, ge, le |
— | |
next-hop |
in |
— | |
origin |
is |
— | |
source |
in |
— |
This table summarizes which operations are valid and where they are valid.
Command |
import |
export |
aggregation |
redistribution |
---|---|---|---|---|
prepend as-path most-recent |
eBGP only |
eBGP only |
n/a |
n/a |
replace as-path |
eBGP only |
eBGP only |
n/a |
n/a |
set med igp-cost |
forbidden |
eBGP only |
forbidden |
forbidden |
set weight |
n/a |
forbidden |
n/a |
n/a |
suppress |
forbidden |
forbidden |
n/a |
forbidden |
BGP Best Path Algorithm
BGP routers typically receive multiple paths to the same destination. The BGP best-path algorithm determines the best path to install in the IP routing table and to use for forwarding traffic. This section describes the Cisco IOS XR software implementation of BGP best-path algorithm, as specified in Section 9.1 of the Internet Engineering Task Force (IETF) Network Working Group draft-ietf-idr-bgp4-24.txt document.
The BGP best-path algorithm implementation is in three parts:
-
Part 1—Compares two paths to determine which is better.
-
Part 2—Iterates over all paths and determines which order to compare the paths to select the overall best path.
-
Part 3—Determines whether the old and new best paths differ enough so that the new best path should be used.
Note |
The order of comparison determined by Part 2 is important because the comparison operation is not transitive; that is, if three paths, A, B, and C exist, such that when A and B are compared, A is better, and when B and C are compared, B is better, it is not necessarily the case that when A and C are compared, A is better. This nontransitivity arises because the multi exit discriminator (MED) is compared only among paths from the same neighboring autonomous system (AS) and not among all paths. |
Comparing Pairs of Paths
Perform the following steps to compare two paths and determine the better path:
-
If either path is invalid (for example, a path has the maximum possible MED value or it has an unreachable next hop), then the other path is chosen (provided that the path is valid).
-
If the paths have unequal pre-bestpath cost communities, the path with the lower pre-bestpath cost community is selected as the best path.
-
If the paths have unequal weights, the path with the highest weight is chosen. Note
The weight is entirely local to the router, and can be set with the weight command or using a routing policy.
-
If the paths have unequal local preferences, the path with the higher local preference is chosen.
Note
If a local preference attribute was received with the path or was set by a routing policy, then that value is used in this comparison. Otherwise, the default local preference value of 100 is used. The default value can be changed using the bgp default local-preference command.
-
If one of the paths is a redistributed path, which results from a redistribute or network command, then it is chosen. Otherwise, if one of the paths is a locally generated aggregate, which results from an aggregate-address command, it is chosen.
Note
Step 1 through Step 4 implement the “Path Selection with BGP”of RFC 1268.
-
If the paths have unequal AS path lengths, the path with the shorter AS path is chosen. This step is skipped if bgp bestpath as-path ignore command is configured.
Note
When calculating the length of the AS path, confederation segments are ignored, and AS sets count as 1.
Note
eiBGP specifies internal and external BGP multipath peers. eiBGP allows simultaneous use of internal and external paths.
-
If the paths have different origins, the path with the lower origin is selected. Interior Gateway Protocol (IGP) is considered lower than EGP, which is considered lower than INCOMPLETE.
-
If appropriate, the MED of the paths is compared. If they are unequal, the path with the lower MED is chosen.
A number of configuration options exist that affect whether or not this step is performed. In general, the MED is compared if both paths were received from neighbors in the same AS; otherwise the MED comparison is skipped. However, this behavior is modified by certain configuration options, and there are also some corner cases to consider.
If the bgp bestpath med always command is configured, then the MED comparison is always performed, regardless of neighbor AS in the paths. Otherwise, MED comparison depends on the AS paths of the two paths being compared, as follows:
-
If a path has no AS path or the AS path starts with an AS_SET, then the path is considered to be internal, and the MED is compared with other internal paths.
-
If the AS path starts with an AS_SEQUENCE, then the neighbor AS is the first AS number in the sequence, and the MED is compared with other paths that have the same neighbor AS.
-
If the AS path contains only confederation segments or starts with confederation segments followed by an AS_SET, then the MED is not compared with any other path unless the bgp bestpath med confed command is configured. In that case, the path is considered internal and the MED is compared with other internal paths.
-
If the AS path starts with confederation segments followed by an AS_SEQUENCE, then the neighbor AS is the first AS number in the AS_SEQUENCE, and the MED is compared with other paths that have the same neighbor AS.
Note
If no MED attribute was received with the path, then the MED is considered to be 0 unless the bgp bestpath med missing-as-worst command is configured. In that case, if no MED attribute was received, the MED is considered to be the highest possible value.
-
-
If one path is received from an external peer and the other is received from an internal (or confederation) peer, the path from the external peer is chosen.
-
If the paths have different IGP metrics to their next hops, the path with the lower IGP metric is chosen.
-
If the paths have unequal IP cost communities, the path with the lower IP cost community is selected as the best path.
-
If all path parameters in Step 1 through Step 10 are the same, then the router IDs are compared. If the path was received with an originator attribute, then that is used as the router ID to compare; otherwise, the router ID of the neighbor from which the path was received is used. If the paths have different router IDs, the path with the lower router ID is chosen.
Note
Where the originator is used as the router ID, it is possible to have two paths with the same router ID. It is also possible to have two BGP sessions with the same peer router, and therefore receive two paths with the same router ID.
-
If the paths have different cluster lengths, the path with the shorter cluster length is selected. If a path was not received with a cluster list attribute, it is considered to have a cluster length of 0.
-
Finally, the path received from the neighbor with the lower IP address is chosen. Locally generated paths (for example, redistributed paths) are considered to have a neighbor IP address of 0.
Order of Comparisons
The second part of the BGP best-path algorithm implementation determines the order in which the paths should be compared. The order of comparison is determined as follows:
-
The paths are partitioned into groups such that within each group the MED can be compared among all paths. The same rules as in are used to determine whether MED can be compared between any two paths. Normally, this comparison results in one group for each neighbor AS. If the bgp bestpath med always command is configured, then there is just one group containing all the paths.
-
The best path in each group is determined. Determining the best path is achieved by iterating through all paths in the group and keeping track of the best one seen so far. Each path is compared with the best-so-far, and if it is better, it becomes the new best-so-far and is compared with the next path in the group.
-
A set of paths is formed containing the best path selected from each group in Step 2. The overall best path is selected from this set of paths, by iterating through them as in Step 2.
Best Path Change Suppression
The third part of the implementation is to determine whether the best-path change can be suppressed or not—whether the new best path should be used, or continue using the existing best path. The existing best path can continue to be used if the new one is identical to the point at which the best-path selection algorithm becomes arbitrary (if the router-id is the same). Continuing to use the existing best path can avoid churn in the network.
Note |
This suppression behavior does not comply with the IETF Networking Working Group draft-ietf-idr-bgp4-24.txt document, but is specified in the IETF Networking Working Group draft-ietf-idr-avoid-transition-00.txt document. |
The suppression behavior can be turned off by configuring the bgp bestpath compare-routerid command. If this command is configured, the new best path is always preferred to the existing one.
Otherwise, the following steps are used to determine whether the best-path change can be suppressed:
-
If the existing best path is no longer valid, the change cannot be suppressed.
-
If either the existing or new best paths were received from internal (or confederation) peers or were locally generated (for example, by redistribution), then the change cannot be suppressed. That is, suppression is possible only if both paths were received from external peers.
-
If the paths were received from the same peer (the paths would have the same router-id), the change cannot be suppressed. The router ID is calculated using rules in .
-
If the paths have different weights, local preferences, origins, or IGP metrics to their next hops, then the change cannot be suppressed. Note that all these values are calculated using the rules in .
-
If the paths have different-length AS paths and the bgp bestpath as-path ignore command is not configured, then the change cannot be suppressed. Again, the AS path length is calculated using the rules in .
-
If the MED of the paths can be compared and the MEDs are different, then the change cannot be suppressed. The decision as to whether the MEDs can be compared is exactly the same as the rules in , as is the calculation of the MED value.
-
If all path parameters in Step 1 through Step 6 do not apply, the change can be suppressed.
BGP Update Generation and Update Groups
The BGP Update Groups feature separates BGP update generation from neighbor configuration. The BGP Update Groups feature introduces an algorithm that dynamically calculates BGP update group membership based on outbound routing policies. This feature does not require any configuration by the network operator. Update group-based message generation occurs automatically and independently.
BGP Update Group
When a change to the configuration occurs, the router automatically recalculates update group memberships and applies the changes.
For the best optimization of BGP update group generation, we recommend that the network operator keeps outbound routing policy the same for neighbors that have similar outbound policies. This feature contains commands for monitoring BGP update groups.
BGP Cost Community Reference
The cost community attribute is applied to internal routes by configuring the set extcommunity cost command in a route policy. The cost community set clause is configured with a cost community ID number (0–255) and cost community number (0–4294967295). The cost community number determines the preference for the path. The path with the lowest cost community number is preferred. Paths that are not specifically configured with the cost community number are assigned a default cost community number of 2147483647 (the midpoint between 0 and 4294967295) and evaluated by the best-path selection process accordingly. When two paths have been configured with the same cost community number, the path selection process prefers the path with the lowest cost community ID. The cost-extended community attribute is propagated to iBGP peers when extended community exchange is enabled.
The following commands include the route-policy keyword, which you can use to apply a route policy that is configured with the cost community set clause:
-
aggregate-address
-
redistribute
-
network
BGP Next Hop Reference
-
Next hop becomes unreachable
-
Next hop becomes reachable
-
Fully recursed IGP metric to the next hop changes
-
First hop IP address or first hop interface change
-
Next hop becomes connected
-
Next hop becomes unconnected
-
Next hop becomes a local address
-
Next hop becomes a nonlocal address
Note |
Reachability and recursed metric events trigger a best-path recalculation. |
-
Critical events are related to the reachability (reachable and unreachable), connectivity (connected and unconnected), and locality (local and nonlocal) of the next hops. Notifications for these events are not delayed.
-
Noncritical events include only the IGP metric changes. These events are sent at an interval of 3 seconds. A metric change event is batched and sent 3 seconds after the last one was sent.
BGP is notified when any of the following events occurs:
-
Next hop becomes unreachable
-
Next hop becomes reachable
-
Fully recursed IGP metric to the next hop changes
-
First hop IP address or first hop interface change
-
Next hop becomes connected
-
Next hop becomes unconnected
-
Next hop becomes a local address
-
Next hop becomes a nonlocal address
Note |
Reachability and recursed metric events trigger a best-path recalculation. |
The next-hop trigger delay for critical and noncritical events can be configured to specify a minimum batching interval for critical and noncritical events using the nexthop trigger-delay command. The trigger delay is address family dependent.
The BGP next-hop tracking feature allows you to specify that BGP routes are resolved using only next hops whose routes have the following characteristics:
-
To avoid the aggregate routes, the prefix length must be greater than a specified value.
-
The source protocol must be from a selected list, ensuring that BGP routes are not used to resolve next hops that could lead to oscillation.
This route policy filtering is possible because RIB identifies the source protocol of route that resolved a next hop as well as the mask length associated with the route. The nexthop route-policy command is used to specify the route-policy.
Next Hop as the IPv6 Address of Peering Interface
BGP can carry IPv6 prefixes over an IPv4 session. The next hop for the IPv6 prefixes can be set through a nexthop policy. In the event that the policy is not configured, the nexthops are set as the IPv6 address of the peering interface (IPv6 neighbor interface or IPv6 update source interface, if any one of the interfaces is configured).
If the nexthop policy is not configured and neither the IPv6 neighbor interface nor the IPv6 update source interface is configured, the next hop is the IPv4 mapped IPv6 address.
Scoped IPv4/VPNv4 Table Walk
To determine which address family to process, a next-hop notification is received by first de-referencing the gateway context associated with the next hop, then looking into the gateway context to determine which address families are using the gateway context. The IPv4 unicast and VPNv4 unicast address families share the same gateway context, because they are registered with the IPv4 unicast table in the RIB. As a result, both the global IPv4 unicast table and the VPNv4 table are is processed when an IPv4 unicast next-hop notification is received from the RIB. A mask is maintained in the next hop, indicating if whether the next hop belongs to IPv4 unicast or VPNv4 unicast, or both. This scoped table walk localizes the processing in the appropriate address family table.
Reordered Address Family Processing
The software walks address family tables based on the numeric value of the address family. When a next-hop notification batch is received, the order of address family processing is reordered to the following order:
-
IPv4 tunnel
-
VPNv4 unicast
-
VPNv6 unicast
-
IPv4 labeled unicast
-
IPv4 unicast
-
IPv4 MDT
-
IPv6 unicast
-
IPv6 labeled unicast
-
IPv4 tunnel
-
VPNv4 unicast
-
IPv4 unicast
-
IPv6 unicast
New Thread for Next-Hop Processing
The critical-event thread in the spkr process handles only next-hop, Bidirectional Forwarding Detection (BFD), and fast-external-failover (FEF) notifications. This critical-event thread ensures that BGP convergence is not adversely impacted by other events that may take a significant amount of time.
show, clear, and debug Commands
The show bgp nexthops command provides statistical information about next-hop notifications, the amount of time spent in processing those notifications, and details about each next hop registered with the RIB. The clear bgp nexthop performance-statistics command ensures that the cumulative statistics associated with the processing part of the next-hop show command can be cleared to help in monitoring. The clear bgp nexthop registration command performs an asynchronous registration of the next hop with the RIB.
The debug bgp nexthop command displays information on next-hop processing. The out keyword provides debug information only about BGP registration of next hops with RIB. The in keyword displays debug information about next-hop notifications received from RIB. The out keyword displays debug information about next-hop notifications sent to the RIB.
BGP Nonstop Routing Reference
BGP NSR provides nonstop routing during the following events:
-
Route processor switchover
-
Process crash or process failure of BGP or TCP Note
BGP NSR is enabled by default. Use the nsr disable command to turn off BGP NSR. The no nsr disable command can also be used to turn BGP NSR back on if it has been disabled.
In case of process crash or process failure, NSR will be maintained only if nsr process-failures switchover command is configured. In the event of process failures of active instances, the nsr process-failures switchover configures failover as a recovery action and switches over to a standby route processor (RP) or a standby distributed route processor (DRP) thereby maintaining NSR. An example of the configuration command is RP/0/RSP0/CPU0:router(config) # nsr process-failures switchover
The nsr process-failures switchover command maintains both the NSR and BGP sessions in the event of a BGP or TCP process crash. Without this configuration, BGP neighbor sessions flap in case of a BGP or TCP process crash. This configuration does not help if the BGP or TCP process is restarted in which case the BGP neighbors are expected to flap.
When the l2vpn_mgr process is restarted, the NSR client (te-control) flaps between the Ready and Not Ready state. This is the expected behavior and there is no traffic loss.
During route processor switchover and In-Service System Upgrade (ISSU), NSR is achieved by stateful switchover (SSO) of both TCP and BGP.
NSR does not force any software upgrades on other routers in the network, and peer routers are not required to support NSR.
When a route processor switchover occurs due to a fault, the TCP connections and the BGP sessions are migrated transparently to the standby route processor, and the standby route processor becomes active. The existing protocol state is maintained on the standby route processor when it becomes active, and the protocol state does not need to be refreshed by peers.
Events such as soft reconfiguration and policy modifications can trigger the BGP internal state to change. To ensure state consistency between active and standby BGP processes during such events, the concept of post-it is introduced that act as synchronization points.
BGP NSR provides the following features:
-
NSR-related alarms and notifications
-
Configured and operational NSR states are tracked separately
-
NSR statistics collection
-
NSR statistics display using show commands
-
XML schema support
-
Auditing mechanisms to verify state synchronization between active and standby instances
-
CLI commands to enable and disable NSR
BGP Route Reflectors Reference
#concept_305B2C3244D5404F9E6207098D05CA9E__ illustrates a simple iBGP configuration with three iBGP speakers (routers A, B, and C). Without route reflectors, when Router A receives a route from an external neighbor, it must advertise it to both routers B and C. Routers B and C do not readvertise the iBGP learned route to other iBGP speakers because the routers do not pass on routes learned from internal neighbors to other internal neighbors, thus preventing a routing information loop.
With route reflectors, all iBGP speakers need not be fully meshed because there is a method to pass learned routes to neighbors. In this model, an iBGP peer is configured to be a route reflector responsible for passing iBGP learned routes to a set of iBGP neighbors. In #concept_305B2C3244D5404F9E6207098D05CA9E__ , Router B is configured as a route reflector. When the route reflector receives routes advertised from Router A, it advertises them to Router C, and vice versa. This scheme eliminates the need for the iBGP session between routers A and C.
The internal peers of the route reflector are divided into two groups: client peers and all other routers in the autonomous system (nonclient peers). A route reflector reflects routes between these two groups. The route reflector and its client peers form a cluster. The nonclient peers must be fully meshed with each other, but the client peers need not be fully meshed. The clients in the cluster do not communicate with iBGP speakers outside their cluster.
#concept_305B2C3244D5404F9E6207098D05CA9E__ illustrates a more complex route reflector scheme. Router A is the route reflector in a cluster with routers B, C, and D. Routers E, F, and G are fully meshed, nonclient routers.
When the route reflector receives an advertised route, depending on the neighbor, it takes the following actions:
-
A route from an external BGP speaker is advertised to all clients and nonclient peers.
-
A route from a nonclient peer is advertised to all clients.
-
A route from a client is advertised to all clients and nonclient peers. Hence, the clients need not be fully meshed.
Along with route reflector-aware BGP speakers, it is possible to have BGP speakers that do not understand the concept of route reflectors. They can be members of either client or nonclient groups, allowing an easy and gradual migration from the old BGP model to the route reflector model. Initially, you could create a single cluster with a route reflector and a few clients. All other iBGP speakers could be nonclient peers to the route reflector and then more clusters could be created gradually.
An autonomous system can have multiple route reflectors. A route reflector treats other route reflectors just like other iBGP speakers. A route reflector can be configured to have other route reflectors in a client group or nonclient group. In a simple configuration, the backbone could be divided into many clusters. Each route reflector would be configured with other route reflectors as nonclient peers (thus, all route reflectors are fully meshed). The clients are configured to maintain iBGP sessions with only the route reflector in their cluster.
Usually, a cluster of clients has a single route reflector. In that case, the cluster is identified by the router ID of the route reflector. To increase redundancy and avoid a single point of failure, a cluster might have more than one route reflector. In this case, all route reflectors in the cluster must be configured with the cluster ID so that a route reflector can recognize updates from route reflectors in the same cluster. All route reflectors serving a cluster should be fully meshed and all of them should have identical sets of client and nonclient peers.
By default, the clients of a route reflector are not required to be fully meshed and the routes from a client are reflected to other clients. However, if the clients are fully meshed, the route reflector need not reflect routes to clients.
As the iBGP learned routes are reflected, routing information may loop. The route reflector model has the following mechanisms to avoid routing loops:
-
Originator ID is an optional, nontransitive BGP attribute. It is a 4-byte attributed created by a route reflector. The attribute carries the router ID of the originator of the route in the local autonomous system. Therefore, if a misconfiguration causes routing information to come back to the originator, the information is ignored.
-
Cluster-list is an optional, nontransitive BGP attribute. It is a sequence of cluster IDs that the route has passed. When a route reflector reflects a route from its clients to nonclient peers, and vice versa, it appends the local cluster ID to the cluster-list. If the cluster-list is empty, a new cluster-list is created. Using this attribute, a route reflector can identify if routing information is looped back to the same cluster due to misconfiguration. If the local cluster ID is found in the cluster-list, the advertisement is ignored.
BGP Persistence
BGP persistence enables the local router to retain routes that it has learnt from the configured neighbor even after the neighbor session is down. BGP persistence is also referred as Long Lived Graceful Restart (LLGR). LLGR takes effect after graceful restart (GR) ends or immediately if GR is not enabled. LLGR ends either when the LLGR stale timer expires or when the neighbor sends the end-of-RIB marker after it has revised its routes. When LLGR for a neighbor ends, all routes from that neighbor that are still stale will be deleted. The LLGR capability is signaled to a neighbor in the BGP OPEN message if it has been configured for that neighbor.
Note |
You can disable GR helper-only for peer-group and neighbor, when there is no global GR helper-only configured. |
LLGR differs from graceful restart in the following ways.
-
It can be in effect for a much longer time than GR.
-
LLGR stale routes are least preferred during route selection (bestpath computation).
-
An LLGR stale route will be advertised with the LLGR_STALE community attached if it is selected as best path. It will not be advertised at all to routers that are not LLGR capable.
-
LLGR stale routes will not be deleted when the forwarding path to the neighbor is detected to be down
-
An LLGR stale route will not be deleted if the BGP session to the neighbor goes down multiple times even if that neighbor does not re-advertise the route.
-
Any route that has the NO_LLGR community will not be retained.
BGP will not pass the updates containing communities 65535:6, 65535:7 to its neighbors until the neighbors negotiate BGP persistence capabilities. The communities 65535:6 and 65535:7 are reserved for LLGR_STALE and NO_LLGR respectively, BGP behavior maybe unpredictable if you have configured these communities prior to release 5.2.2. We recommend not to configure the communities 65535:6 and 65535:7.
The BGP persistence feature is supported only on the following AFIs:
-
VPNv4 and VPNv6
-
RT constraint
-
Flow spec (IPv4, IPv6, VPNv4 and VPNv6)
-
Private IPv4 and IPv6 (IPv4/v6 address family inside VRF)
BGP Persistence Configuration: Example
This example sets long lived graceful restart (LLGR) stale-time of 16777215 on BGP neighbor 10.3.3.3.
router bgp 100
neighbor 10.3.3.3
remote-as 30813
update-source Loopback0
graceful-restart stalepath-time 150
address-family vpnv4 unicast
long-lived-graceful-restart capable
long-lived-graceful-restart stale-time send 16777215 accept 16777215
!
address-family vpnv6 unicast
long-lived-graceful-restart capable
long-lived-graceful-restart stale-time send 16777215 accept 16777215
iBGP Multipath Load Sharing Reference
When there are multiple border BGP routers having reachability information heard over eBGP, if no local policy is applied, the border routers will choose their eBGP paths as best. They advertise that bestpath inside the ISP network. For a core router, there can be multiple paths to the same destination, but it will select only one path as best and use that path for forwarding. iBGP multipath load sharing adds the ability to enable load sharing among multiple equi-distant paths. Configuring multiple iBGP best paths enables a router to evenly share the traffic destined for a particular site. The iBGP Multipath Load Sharing feature functions similarly in a Multiprotocol Label Switching (MPLS) Virtual Private Network (VPN) with a service provider backbone.
For multiple paths to the same destination to be considered as multipaths, the following criteria must be met:
-
All attributes must be the same. The attributes include weight, local preference, autonomous system path (entire attribute and not just length), origin code, Multi Exit Discriminator (MED), and Interior Gateway Protocol (iGP) distance.
-
The next hop router for each multipath must be different.
Note |
|
L3VPN iBGP PE-CE Reference
When BGP is used as the provider edge (PE) or the customer edge (CE) routing protocol, the peering sessions are configured as external peering between the VPN provider autonomous system (AS) and the customer network autonomous system. The L3VPN iBGP PE-CE feature enables the PE and CE devices to exchange Border Gateway Protocol (BGP) routing information by peering as internal Border Gateway Protocol (iBGP) instead of the widely-used external BGP peering between the PE and the CE. This mechanism applies at each PE device where a VRF-based CE is configured as iBGP. This eliminates the need for service providers (SPs) to configure autonomous system override for the CE. With this feature enabled, there is no need to configure the virtual private network (VPN) sites using different autonomous systems.
The neighbor internal-vpn-client command enables PE devices to make an entire VPN cloud act as an internal VPN client to the CE devices. These CE devices are connected internally to the VPN cloud through the iBGP PE-CE connection inside the VRF. After this connection is established, the PE device encapsulates the CE-learned path into an attribute called ATTR_SET and carries it in the iBGP-sourced path throughout the VPN core to the remote PE device. At the remote PE device, this attribute is assigned with individual attributes and the source CE path is extracted and sent to the remote CE devices.
+------------------------------+
| Attr Flags (O|T) Code = 128 |
+------------------------------+
| Attr. Length (1 or 2 octets) |
+------------------------------+
| Origin AS (4 octets) |
+------------------------------+
| Path attributes (variable) |
+------------------------------+
Origin AS is the AS of the VPN customer for which the ATTR_SET is generated. The minimum length of ATTR_SET is four bytes and the maximum is the maximum supported for a path attribute after taking into consideration the mandatory fields and attributes in the BGP update message. It is recommended that the maximum length is limited to 3500 bytes. ATTR_SET must not contain the following attributes: MP_REACH, MP_UNREACH, NEW_AS_PATH, NEW_AGGR, NEXT_HOP and ATTR_SET itself (ATTR_SET inside ATTR_SET). If these attributes are found inside the ATTR_SET, the ATTR_SET is considered invalid and the corresponding error handling mechanism is invoked.
Per VRF and Per CE Label for IPv6 Provider Edge
The per VRF and per CE label for IPv6 feature makes it possible to save label space by allocating labels per default VRF or per CE nexthop.
All IPv6 Provider Edge (6PE) labels are allocated per prefix by default. Each prefix that belongs to a VRF instance is advertised with a single label, causing an additional lookup to be performed in the VRF forwarding table to determine the customer edge (CE) next hop for the packet.
However, use the label-allocation-mode command with the per-ce keyword or the per-vrf keyword to avoid the additional lookup on the PE router and conserve label space.
Use per-ce keyword to specify that the same label be used for all the routes advertised from a unique customer edge (CE) peer router. Use the per-vrf keyword to specify that the same label be used for all the routes advertised from a unique VRF.
Note |
The label-allocation-mode command is deprecated from 7.4.1 release. The function of this command can be carried out using label mode command under configured address-family. |
IPv6 Unicast Routing
Cisco provides complete Internet Protocol Version 6 (IPv6) unicast capability.
An IPv6 unicast address is an identifier for a single interface, on a single node. A packet that is sent to a unicast address is delivered to the interface identified by that address. Cisco IOS XR software supports the following IPv6 unicast address types:
-
Global aggregatable address
-
Site-local address
-
Link-local address
-
IPv4-compatible IPv6 address
For more information on IPv6 unicast addressing, refer the IP Addresses and Services Configuration Guide.
Remove and Replace Private AS Numbers from AS Path in BGP
Private autonomous system numbers (ASNs) are used by Internet Service Providers (ISPs) and customer networks to conserve globally unique AS numbers. Private AS numbers cannot be used to access the global Internet because they are not unique. AS numbers appear in eBGP AS paths in routing updates. Removing private ASNs from the AS path is necessary if you have been using private ASNs and you want to access the global Internet.
Public AS numbers are assigned by InterNIC and are globally unique. They range from 1 to 64511. Private AS numbers are used to conserve globally unique AS numbers, and they range from 64512 to 65535. Private AS numbers cannot be leaked to a global BGP routing table because they are not unique, and BGP best path calculations require unique AS numbers. Therefore, it might be necessary to remove private AS numbers from an AS path before the routes are propagated to a BGP peer.
External BGP (eBGP) requires that globally unique AS numbers be used when routing to the global Internet. Using private AS numbers (which are not unique) would prevent access to the global Internet. The remove and replace private AS Numbers from AS Path in BGP feature allows routers that belong to a private AS to access the global Internet. A network administrator configures the routers to remove private AS numbers from the AS path contained in outgoing update messages and optionally, to replace those numbers with the ASN of the local router, so that the AS Path length remains unchanged.
The ability to remove and replace private AS numbers from the AS Path is implemented in the following ways:
-
The remove-private-as command removes private AS numbers from the AS path even if the path contains both public and private ASNs.
-
The remove-private-as command removes private AS numbers even if the AS path contains only private AS numbers. There is no likelihood of a 0-length AS path because this command can be applied to eBGP peers only, in which case the AS number of the local router is appended to the AS path.
-
The remove-private-as command removes private AS numbers even if the private ASNs appear before the confederation segments in the AS path.
-
The replace-as command replaces the private AS numbers being removed from the path with the local AS number, thereby retaining the same AS path length.
route-policy can be configured either with remove-private-as command or replace-as command, or combination of both.
Following considerations describes the preference of command configurations:
-
When route-policy is not configured, remove-private-as command is considered.
-
When remove-private-as is not configured, replace-private-as command is considered.
-
When remove-private-as and replace-private-as are configured under neighbor's afi-safi command, remove-private-as command is considered.
-
When route-policy is configured with both remove private AS and remove private AS and applied under neighbour afi, replace private AS command is considered.
The feature can be applied to neighbors per address family (address family configuration mode). Therefore, you can apply the feature for a neighbor in one address family and not on another, affecting update messages on the outbound side for only the address family for which the feature is configured.
Use show bgp neighbors and show bgp update-group commands to verify that the that private AS numbers were removed or replaced.
Configure Route Policy
Use the following configuration to configure route policy.
/*Configuration Example*/
Router# configure
Router(config)# route-policy rm_pv_as
Router(config-rpl)# remove as-path private-as entire-aspath
Router(config-rpl)# replace as-path private-as
Router(config-rpl)# commit
Use the following configuration to configure Route Policy with rm-pv-as .
Router# configure
Router(config)# route-policy rm_pv
Router(config-rpl)# replace as-path private-as
Router(config-rpl)# remove as-path private-as entire-aspath
Router(config-rpl# end-policy
Router(config)# router bgp 100
Router(config-bgp)# neighbor 192.168.23.3
Router(config-bgp-nbr)# remote-as 3
Router(config-bgp-nbrgrp)# address-family ipv4 unicast
Router(config-bgp-nbrgrp-af)# route-policy passall in
Router(config-bgp-nbrgrp-af)# route-policy rm_pv out
Use the following configuration to configure Route Policy without rm-pv-as .
Router# configure
Router(config)# route-policy rm_pv
Router(config-rpl)# replace as-path private-as
Router(config-rpl)# remove as-path private-as entire-aspath
Router(config-rpl# end-policy
Router(config)# router bgp 100
Router(config-bgp)# neighbor 192.168.23.3
Router(config-bgp-nbr)# remote-as 3
Router(config-bgp-nbrgrp)# address-family ipv4 unicast
Router(config-bgp-nbrgrp-af)# route-policy passall
Router(config-bgp-nbrgrp-af)# pass
Router(config-bgp-nbrgrp-af)# end-policy
Router(config-bgp)# neighbor 192.168.23.3
Router(config-bgp-nbr)# remote-as 3
Router(config-bgp-nbr)# address-family ipv4 unicast
Router(config-bgp-nbr-af)# route-policy passall in
Router(config-bgp-nbrgrp-af)# route-policy rm_pv outroute-policy passall out
Router(config-rpl)# replace-private-AS
Router(config-rpl)# remove-private-AS entire-aspath
Running Configuration:
Running configuration of three routers (R1, R2, R3) where the connection topology is: R2--R1--R3
/* Running configuration at R1*/
route-policy rm_pv_as
replace as-path private-as
remove as-path private-as entire-aspath
end-policy
!
route-policy passall
pass
end-policy
!
router bgp 2
bgp router-id 2.2.2.2
address-family ipv4 unicast
redistribute connected
redistribute static
!
address-family vpnv4 unicast
!
address-family ipv6 unicast
!
address-family vpnv6 unicast
!
!
!
neighbor 192.168.12.1
remote-as 64512
address-family ipv4 unicast
route-policy passall in
route-policy passall out
!
!
neighbor 192.168.23.3
remote-as 3
address-family ipv4 unicast
route-policy passall in
route-policy rm_pv_as out
!
!
!
/*Running configuration at R2*/
route-policy passall
pass
end-policy
!
router bgp 64512
bgp router-id 1.1.1.1
address-family ipv4 unicast
network 10.10.10.10/32
!
address-family vpnv4 unicast
!
address-family ipv6 unicast
!
address-family vpnv6 unicast
!
neighbor 192.168.12.2
remote-as 2
address-family ipv4 unicast
route-policy passall in
route-policy passall out
!
!
/*Running configuration at R3*/
router bgp 3
bgp router-id 3.3.3.3
address-family ipv4 unicast
!
address-family vpnv4 unicast
!
address-family ipv6 unicast
!
address-family vpnv6 unicast
!
neighbor 192.168.23.2
remote-as 2
address-family ipv4 unicast
route-policy passall in
route-policy passall out
!
!
!
Verification
/*validate the configuration */
Router# show bgp
BGP router identifier 3.3.3.3, local AS number 3
BGP generic scan interval 60 secs
Non-stop routing is enabled
BGP table state: Active
Table ID: 0xe0000000 RD version: 6
BGP main routing table version 6
BGP NSR Initial initsync version 6 (Reached)
BGP NSR/ISSU Sync-Group versions 0/0
BGP scan interval 60 secs
Status codes: s suppressed, d damped, h history, * valid, > best
i - internal, r RIB-failure, S stale, N Nexthop-discard
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 10.10.10.10/32 192.168.23.2 0 2 2 i
*> 192.168.12.0/24 192.168.23.2 0 0 2 ?
*> 192.168.23.0/24 192.168.23.2 0 0 2 ?
*> 192.168.122.0/24 192.168.23.2 0 0 2 ?
Replace BGP AS Path with Custom Values
Feature Name |
Release Information |
Feature Description |
---|---|---|
Replace BGP AS Path with Custom Values |
Release 7.5.2 |
You can now configure route policies to replace the Autonomous System (AS) Path in BGP with custom values to control the best path selection process. This feature introduces the replace as-path all command. |
BGP routers typically receive multiple paths to the same destination. The BGP best-path algorithm determines the best path to install in the IP routing table and to use for forwarding traffic. The overall best path is selected based on various attributes. .
AS path is one of the attributes used for best path selection. By default, BGP always prefers the route with shortest AS path as the best path. The best path selected by BGP might have traffic engineering issues, like heavy traffic that leads to congestion. In such cases, you can alter the best path by replacing the AS path with custom values.
The following are the custom values you can use to replace the AS path:
-
None: Use this option to modify an AS path as the shortest path in the network. When you choose this option, the AS path is replaced with a null or empty value. Use the replace as-path all none command to replace with none.
-
Auto: Use this option to advertise the local AS number or the neigbor's AS number as the AS path. When you choose this option, AS path is replaced based on the route policy:
-
For inbound route policy, AS path is replaced with AS path of BGP neighbor from where the prefix is received.
-
For outbound route policy, AS path is replaced with the local AS number.
Use the replace as-path all auto command to replace with auto.
-
-
'x': Use this option to replace AS path with any specified value. Use the replace as-path all 'x' command to replace with this option, where 'x' can be a single AS number or a sequence of AS numbers separated by space.
-
Optionally, you can repeat replacing the AS path for a specified number of times. This option is supported only for the auto and 'x' parameters. Use the replace as-path all {auto | 'x'} [n] command to enable the repeat option.
-
Optionally, you can use a parameter name along with the repeat option. The parameter name must be preceded with a “$.” You can attach the route policy with the parameter to a neighbor and specify the number of times the AS path replacement should be repeated. This opton allows you to apply the same route policy to different neighbors with different AS path values.
Use the replace as-path all {auto | 'x'} [n] [parameter] command to enable the parameter along with repeat option.
You can replace the AS path for inbound eBGP, outbound eBGP, and outbound iBGP paths.
Note |
For outbound eBGP paths, the AS number of the local router is always prepended to the replaced AS path. |
Interoperability with BGP Confederation
BGP confederation is a group of multiple autonomous systems that looks like a single autonomous system to the outside world. When confederation is configured on BGP peers, the AS path is replaced as follows:
-
When you replace the AS path in an outbound BGP router, which receives prefix from a BGP neighbor configured with confederation, the specified AS path value is appended to the confederation sequence.
-
When you replace the AS path in an inbound BGP router configured with confederation, the confederation sequence is replaced with the specified AS path value.
Deployment Scenario
Consider a BGP network configured with AS paths. By default, BGP selects the route with shortest AS path to reach the destination. You can alter the default route by using the replace BGP AS path feature.
In the following figure, the network consists of BGP routers configured with AS Path values. To reach Server B, Server A typically selects Path B (via S1_1, S0), as the AS path value of S0 is shorter.
You may want to use Path A to reach the destination (via S1_1, S2_1, S1_2, S0), for traffic engineering purpose. For example, Path A may be less congested and is better than Path B. To use Path A, you can replace the AS path values with one of the following options:
-
Replace AS path of Router S2_1 with a shorter value.
-
Replace AS path of Router S0 with a longer value.
Restrictions
-
The replace as-path all command isn't supported on inbound iBGP paths.
-
The replace as-path all command isn't supported on a route policy that is already configured with remove-private-as or replace as commands.
-
You can apply the route policy configured with replace as-path all only on neighbor-in or neighbor-out attach points.
Configuration Example
To replace BGP AS path with custom values, perform the following tasks on a BGP router:
This example shows how to replace AS path with null value.
/*Configure route policy to replace AS path with none*/
Router(config)#hw-module profile stats ?
Router(config)# route-policy aspath-none
Router(config-rpl)# replace as-path all none
Router(config-rpl)# end-policy
/* Apply route policy to BGP neighbor */
Router(config)# router bgp 65530
Router(config-bgp)# neighbor 111.0.0.1
Router(config-bgp-nbr)# address-family ipv4 unicast
Router(config-bgp-nbr-af)# route-policy aspath-none in
This example shows how to replace AS path with auto option.
/*Configure route policy to replace AS path with auto*/
Router(config)#route-policy aspath-auto
Router(config-rpl)# replace as-path all auto
Router(config-rpl)# end-policy
/* Apply route policy to BGP neighbor */
Router(config)# router bgp 65530
Router(config-bgp)# neighbor 111.0.0.1
Router(config-bgp-nbr)# address-family ipv4 unicast
Router(config-bgp-nbr-af)# route-policy aspath-auto out
This example shows how to replace AS path with a specified sequence of AS numbers. In this example, sequence '10 100 200 300' is used.
/*Configure route policy to replace AS path with 'x'*/
Router(config)# route-policy aspath-str
Router(config-rpl)# replace as-path all '10 100 200 300'
Router(config-rpl)# end-policy
/* Apply route policy to BGP neighbor */
Router(config)# router bgp 1
Router(config-bgp)# neighbor 111.0.0.1
Router(config-bgp-nbr)# address-family ipv4 unicast
Router(config-bgp-nbr-af)# route-policy aspath-str in
This example shows how to use replace as-path all command along with parameter to replace the AS path with specified sequence of values, repeated for specified number of times. In this example, AS path is replaced with sequence '45 55', repeated for 6 times.
/*Configure route policy to replace AS path with parameter ($n)*/
Router(config)# route-policy aspath-par($n)
Router(config-rpl)# replace as-path all '45 55' $n
Router(config-rpl)# end-policy
/* Apply route policy to BGP neighbor */
Router(config)# router bgp 1
Router(config-bgp)# neighbor 111.0.0.1
Router(config-bgp-nbr)# address-family ipv4 unicast
Router(config-bgp-nbr-af)# route-policy aspath-par(6) in
Verification
In the following output, AS path is replaced with null value.
Router# show bgp
Network Next Hop Metric LocPrf Weight Path
*> 192.168.3.0/24 192.168.3.1 0 0 i
In the following output, AS path is replaced with auto for an outbound path, where the AS path of local router is [40].
Router# show bgp
Network Next Hop Metric LocPrf Weight Path
*> 111.0.0.2/32 200.0.0.5 0 40 i
In the following output, AS path is replaced with the sequence '10 100 200 300'.
Router# show bgp
Network Next Hop Metric LocPrf Weight Path
*>111.0.0.2/32 200.0.0.5 0 10 100 200 300 i
In the following output, AS path is replaced with the sequence '45 55', repeated for 6 times.
Router# show bgp
Network Next Hop Metric LocPrf Weight Path
*>111.0.0.8/32 200.0.0.5 0 45 55 45 55 45 55 45 55 45 55 45 55 i
BGP Update Message Error Handling
The BGP UPDATE message error handling changes BGP behavior in handling error UPDATE messages to avoid session reset. Based on the approach described in IETF IDR I-D:draft-ietf-idr-error-handling, the Cisco IOS XR BGP UPDATE Message Error handling implementation classifies BGP update errors into various categories based on factors such as, severity, likelihood of occurrence of UPDATE errors, or type of attributes. Errors encountered in each category are handled according to the draft. Session reset will be avoided as much as possible during the error handling process. Error handling for some of the categories are controlled by configuration commands to enable or disable the default behavior.
According to the base BGP specification, a BGP speaker that receives an UPDATE message containing a malformed attribute is required to reset the session over which the offending attribute was received. This behavior is undesirable as a session reset would impact not only routes with the offending attribute, but also other valid routes exchanged over the session.
Discard Incoming BGP Update Message
Feature Name |
Release Information |
Feature Description |
Discard Incoming BGP Update Message |
Release 7.10.1 |
Introduced in this release on: NCS 5500 fixed port routers; NCS 5700 fixed port routers; NCS 5500 modular routers (NCS 5500 line cards; NCS 5700 line cards [Mode: Compatibility; Native]) You can now avoid the session reset when a BGP session encounters errors while parsing the received update message. This is made possible because the feature enables discarding the incoming update message as a withdraw message. The feature introduces these changes: CLI New Command: YANG Data Model
|
BGP Error Handling and Attribute Filtering Syslog Messages
When a router receives a malformed update packet, an ios_msg of type ROUTING-BGP-3-MALFORM_UPDATE is printed on the console. This is rate limited to 1 message per minute across all neighbors. For malformed packets that result in actions "Discard Attribute" (A5) or "Local Repair" (A6), the ios_msg is printed only once per neighbor per action. This is irrespective of the number of malformed updates received since the neighbor last reached an "Established" state.
%ROUTING-BGP-3-MALFORM_UPDATE : Malformed UPDATE message received from neighbor 13.0.3.50 - message length 90 bytes,
error flags 0x00000840, action taken "TreatAsWithdraw".
Error details: "Error 0x00000800, Field "Attr-missing", Attribute 1 (Flags 0x00, Length 0), Data []"
[4843.46]RP/0/0/CPU0:Aug 21 17:06:17.919 : bgp[1037]: %ROUTING-BGP-5-UPDATE_FILTERED :
One or more attributes were filtered from UPDATE message received from neighbor 40.0.101.1 - message length 173 bytes,
action taken "DiscardAttr".
Filtering details: "Attribute 16 (Flags 0xc0): Action "DiscardAttr"". NLRIs: [IPv4 Unicast] 88.2.0.0/17
[391.01]RP/0/0/CPU0:Aug 20 19:41:29.243 : bgp[1037]: %ROUTING-BGP-5-UPDATE_FILTERED :
One or more attributes were filtered from UPDATE message received from neighbor 40.0.101.1 - message length 166 bytes,
action taken "TreatAsWdr".
Filtering details: "Attribute 4 (Flags 0xc0): Action "TreatAsWdr"". NLRIs: [IPv4 Unicast] 88.2.0.0/17
BGP-RIB Feedback Mechanism for Update Generation
The Border Gateway Protocol-Routing Information Base (BGP-RIB) feedback mechanism for update generation feature avoids premature route advertisements and subsequent packet loss in a network. This mechanism ensures that routes are installed locally, before they are advertised to a neighbor.
BGP waits for feedback from RIB indicating that the routes that BGP installed in RIB are installed in forwarding information base (FIB) before BGP sends out updates to the neighbors. RIB uses the the BCDL feedback mechanism to determine which version of the routes have been consumed by FIB, and updates the BGP with that version. BGP will send out updates of only those routes that have versions up to the version that FIB has installed. This selective update ensures that BGP does not send out premature updates resulting in attracting traffic even before the data plane is programmed after router reload, LC OIR, or flap of a link where an alternate path is made available.
To configure BGP to wait for feedback from RIB indicating that the routes that BGP installed in RIB are installed in FIB, before BGP sends out updates to neighbors, use the update wait-install command in router address-family IPv4 or router address-family VPNv4 configuration mode. The show bgp , show bgp neighbors , and show bgp process performance-statistics commands display the information from update wait-install configuration.
Delay BGP Route Advertisements
Feature Name |
Release Information |
Feature Description |
Delay BGP Route Advertisements |
Release 7.5.3 |
You can now prevent traffic loss due to premature advertising of BGP routes and subsequent packet loss in a network. You can achieve this by setting the delay time of the BGP start-up in the router until the Routing Information Base (RIB) is synchronized with the Forward Information Base (FIB) in the routing table. This delays the BGP update generation and prevents traffic loss in a network. You can configure a minimum delay of 1 second and a maximum delay of 600 seconds. This feature introduces the update wait-install delay startup command. |
When BGP forwards traffic, it waits for feedback from the RIB until the RIB is ready to forward traffic. Once the RIB is ready, BGP sends the route updates to the BGP neighbors and peer-groups. Advertising routes before the RIB is synchronized in the FIB results in traffic loss. To avoid this problem, the router must delay the BGP start-up process to delay the BGP update generation so that no traffic loss happens.
To accomplish this, you must configure the update wait-install delay startup command to delay the generation of BGP updates. The show bgp process command displays the delay of the BGP process update since the last router reload.
This feature allows you to configure the minimum and maximum delay periods. The range of the delay is from 1 second to 600 seconds. As a result, network traffic loss is avoided.
Restrictions
This feature is applicable for the following Address Family Indicators (AFIs):
-
IPv4 unicast
-
IPv6 unicast
-
VPNv4 unicast
-
VPNv6 unicast
Configuration
-
Enter the IOS XR configuration mode.
Router# configure
-
Specify the BGP Autonomous System Number (AS Number).
Router(config)# router bgp 1
-
Specify the IP address from the address-family (Pv4, IPv6, VPNv4, or VPNv6) options.
Router(config-bgp)# address-family {ipv4| ipv6| vpnv4| vpn6} unicast
For example, Router(config-bgp)# address-family ipv4 unicast
-
Schedule the delay of the BGP process to prevent routes from being advertised to peers until RIB is synchronized.
Router(config-bgp-af)# update wait-install delay startup (time in seconds)
For example, Router(config-bgp-af)# update wait-install delay startup 10
-
Commit the changes.
Router(config-bgp-af)#commit
Note |
The delay time ranges from 1 second to 600 seconds. |
Running Configuration
configure
router bgp 1
address-family ipv4 unicast
update wait-install delay startup 10
!
Verification Example
The following command displays the delay of the BGP process update:
Router# show running-config router bgp 1
router bgp 1
address-family ipv4 unicast
update wait-install delay startup 10
Use-defined Martian Check
The solution allows disabling the Martian check for these IP address prefixes:
-
IPv4 address prefixes
-
0.0.0.0/8
-
127.0.0.0/8
-
224.0.0.0/4
-
-
IPv6 address prefixes
-
::
-
::0002 - ::ffff
-
::ffff:a.b.c.d
-
fe80:xxxx
-
ffxx:xxxx
-
BGP DMZ Aggregate Bandwidth
Feature Name |
Release Information |
Feature Description |
Removal of Link-Bandwidth Extended Community to iBGP Peers |
Release 7.3.2 |
The demilitarized zone (DMZ) link-bandwidth extended community allows BGP to send traffic over multiple internal BGP (iBGP) learned paths. The traffic that is sent is proportional to the bandwidth of the links that are used to exit the autonomous system. By default, iBGP propagates DMZ link-bandwidth community. The Removal of Link-Bandwidth Extended Community to iBGP Peers feature provides the flexibility to remove the DMZ link-bandwidth community to minimize the risk of exposure of the community parameters to networks zones where they are not recognized or unnecessary. |
BGP supports aggregating dmz-link bandwidth values of external BGP (eBGP) multipaths when advertising the route to interior BGP (iBGP) peer.
There is no explicit command to aggregate bandwidth. The bandwidth is aggregated if following conditions are met:
-
The network has multipaths and all the multipaths have link-bandwidth values.
-
The next-hop attribute set to next-hop-self. The next-hop attribute for all routes advertised to the specified neighbor to the address of the local router.
-
There is no out-bound policy configured that might change the dmz-link bandwidth value.
-
If the dmz-link bandwidth value is not known for any one of the multipaths (eBGP or iBGP), the dmz-link value for all multipaths including the best path is not downloaded to routing information base (RIB).
-
The dmz-link bandwidth value of iBGP multipath is not considered during aggregation.
-
The route that is advertised with aggregate value can be best path or add-path.
-
Add-path does not qualify for DMZ link bandwidth aggregation as next hop is preserved. Configuring next-hop-self for add-path is not supported.
-
For VPNv4 and VPNv6 afi, if dmz link-bandwidth value is configured using outbound route-policy, specify the route table or use the additive keyword. Else, this will lead to routes not imported on the receiving end of the peer.
extcommunity-set bandwidth dmz_ext
1:8000
end-set
!
route-policy dmz_rp_vpn
set extcommunity bandwidth dmz_ext additive <<< 'additive' keyword.
pass
end-policy
Removal of Link-Bandwidth Extended Community to iBGP Peers
The demilitarized zone (DMZ) link-bandwidth extended community allows BGP to send traffic over multiple internal BGP (iBGP) learned paths. The traffic that is sent is proportional to the bandwidth of the links that are used to exit the autonomous system. By default, iBGP propagates DMZ link-bandwidth community. The Removal of Link-Bandwidth Extended Community to iBGP Peers feature provides the flexibility to remove the DMZ link-bandwidth community to minimize the risk of exposure of the community parameters to networks zones where they are not recognized or unnecessary.
Configuration Example
Perform the following steps to allow users to be able to configure route-policy to remove the extended communities.
/* Delete all the extended communities. */
Router(config)# route-policy dmz_del_all
Router(config-rpl)# delete extcommunity bandwidth all
Router(config-rpl)# pass
Router(config-rpl)# end-policy
/* Delete only the extended communities that match an extended community mentioned in the list. */
Router(config)# route-policy dmz_CE1_del_non_match
Router(config-rpl)# if destination in (10.9.9.9/32) then
Router(config-rpl-if)# delete extcommunity bandwidth in (10:7000)
Router(config-rpl-if)# endif
Router(config-rpl)# pass
Router(config-rpl)# end-policy
/* Delete all the extended communities. */
Router(config)# route-policy dmz_del_param2($a,$b)
Router(config-rpl)# if destination in (10.9.9.9/32) then
Router(config-rpl-if)# delete extcommunity bandwidth in ($a:$b)
Router(config-rpl-if)# endif
Router(config-rpl)# pass
Router(config-rpl)# end-policy
Verification
Verify the configuration that allows the user to remove a particular extended community.
Router# show bgp 10.9.9.9/32
Fri Aug 27 13:15:05.833 EDT
BGP routing table entry for 10.9.9.9/32
Versions:
Process bRIB/RIB SendTblVer
Speaker 15 15
Last Modified: Aug 27 13:06:45.000 for 00:08:21
Paths: (3 available, best #1)
Advertised IPv4 Unicast paths to peers (in unique update groups):
13.13.13.5
Path #1: Received by speaker 0
Advertised IPv4 Unicast paths to peers (in unique update groups):
13.13.13.5
10
10.10.10.1 from 10.10.10.1 (192.168.0.1)
Origin incomplete, metric 0, localpref 100, valid, external, best, group-best, multipath
Received Path ID 0, Local Path ID 1, version 15
Extended community: LB:10:48
Origin-AS validity: (disabled)
Path #2: Received by speaker 0
Not advertised to any peer
10
11.11.11.3 from 11.11.11.3 (192.168.0.3)
Origin incomplete, metric 0, localpref 100, valid, external, multipath
Received Path ID 0, Local Path ID 0, version 0
Extended community: LB:10:48
Origin-AS validity: (disabled)
Path #3: Received by speaker 0
Not advertised to any peer
10
12.12.12.4 from 12.12.12.4 (192.168.0.4)
Origin incomplete, metric 0, localpref 100, valid, external, multipath
Received Path ID 0, Local Path ID 0, version 0
Extended community: LB:10:48
Origin-AS validity: (disabled)
22:35 30-09-2021
BGP Extended Route Retention
Feature Name |
Release Name |
Description |
---|---|---|
BGP Extended Route Retention |
Release 7.3.3 |
This feature allows you to maintain stale routing information from a failed BGP peer for longer periods of time than that is configured in the Graceful Restart atribute. However, this feature ensures that the BGP neighbor considers the stale routes as new routes. |
When a BGP peer fails, the BGP Extended Route Rention feature applies the route retention policy to the routes to modify the route attributes. This feature modifies the route attributes in addition to the modification that occur due to neighbor's inbound policy. This feature enables the use of route retention policy in place of LLGR, when the BGP hold timer expires or when the BGP session fails to reestablish as a receiving speaker within the configured graceful retart timer.
When you apply LLGR, you cannot remove the LLGR_STALE community when the stale route is advertised, and the route will treat it as the least preferred. Also, stale routes may be advertised to those neighbors that would not have advertised the LLGR capability under the following confitions:
-
The neighbors must be internal, either IBGP or confederation, neighbors.
-
The NO_EXPORT community must be attached to the stale routes.
-
The stale routes must have their LOCAL_PREF community set to zero.
This feature provides you the flexibility to advertise stale routes to eBGP neighbors and enable you to specify local preference values for any stale route that is retained within the iBGP system.
Restrictions
-
The neighbor should be capable of graceful restart.
-
When the BGP neighbor fails, the graceful retart functionality is applied till the graceful restart timer is valid.
-
The Extend Route Retention feature starts, when the graceful restart timer expires,
-
Soft-reconfiguration inbound configuration is a mandatory configuration. If required, configure the inbound policy.
-
The Extended Route Retention feature starts only when BGP peer goes down, that is, on the expiry of the hold-down timer.
-
For any other trigger, such as the expiry of a timer, the routes are not indicated as stale and the routes are purged.
-
The Extended Route Retention feature is applicable only in the following address-family modes:
-
IPv4 and IPv6 unicast address family mode
-
IPv4 and IPv4 labelled unicast address family mode
-
-
You cannot configure both LLGR and Extended Route Retention feature on the same neighbor.
-
When you configure the Extended Route Retention feature, the capablity attribute is not sent.
Configuration Example
Configure a route policy:
Router(config)# route-policy RRP_comm_no_export_local_pref_2500
Router(config-rpl)# set community RRP_comm_no_export additive
Router(config-rpl)# set local-preference 2500
Router(config-rpl)# end-policy
Router(config-rpl)# exit
Router(config)# route-policy comm_number_local_pref
Router(config-rpl)# set community comm_number
Router(config-rpl)# set local-preference 10000
Router(config-rpl)# end-policy
Apply route policy parameters to a neighbor:
Router(config)# router bgp 140
Router(config-bgp)# neighbor 10.1.1.1
Router(config-bgp-nbr)# remote-as 1
Router(config-bgp-nbr)# update-source Loopback 0
Router(config-bgp-nbr)# address-family ipv4 unicast
// Configure route-retention policy
Router(config-bgp-nbr-af)# route-policy RRP_comm_no_export_local_pref_2500 retention retention-time 2340
// Configure the inbound route policy
Router(config-bgp-nbr-af)# route-policy comm_number_local_pref in
// Enables the view of the peer's adj-rib-in table
Router(config-bgp-nbr-af)# soft-reconfiguration inbound always
Verification
Verify the configured route-retention policy:
Router# show bgp neighbor 10.1.1.1
Fri Oct 22 04:52:44.972 PDT
BGP neighbor is 10.1.1.1
Remote AS 1, local AS 1, internal link
Remote router ID 10.1.1.1
BGP state = Established, up for 00:03:03
…
For Address Family: IPv4 Unicast
BGP neighbor version 16172
Policy for incoming advertisements is comm_number_local_pref
Policy for Retention is RRP_comm_no_export_local_pref_2500
Configured route retention policy stale timer for routes is 2340 seconds
….
Verify the stale routes.
Router# show bgp ipv4 unicast 181.1.1.0/24
Fri Oct 22 04:56:15.906 PDT
….
Path #1: Received by speaker 0
Advertised IPv4 Unicast paths to peers (in unique update groups):
3.3.3.3
100, (Received from a RR-client), (long-lived/route-retention policy stale)
192.1.2.1 (metric 10) from 192.1.2.1 (10.1.1.1)
Origin IGP, metric 1221, localpref 2500, valid, internal, best, group-best, multipath
Received Path ID 0, Local Path ID 1, version 16243
Community: 1:100 no-export