This document describes Unified Multiprotocol Label Switching (MPLS), which is fundamentally about scaling. It provides a framework of technology solutions that delivers simple end-to-end traffic and services across a traditionally segmented infrastructure. It combines the scalability benefits of a hierarchical infrastructure with simplicity of network design.
There are no specific requirements for this document.
This document is not restricted to specific software and hardware versions.
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
When you look at the history of packet-based network services, a change in network business value can be observed. It started with discrete connectivity enhancements in order to make applications run as fluently as possible, then moved to collaboration technologies in order to support mobile collaboration. Finally, on-demand cloud services were introduced along with application services in order to optimize the tools used within an organization and improve stability and cost of ownership.
Figure 1
This continuous enhancement of network value and functionality results in a much more pervasive need for network simplicity, manageability, integration, and stability. Networks have been segmented into disjointed operational islands with no real end-to-end path control. Now there is a need to bring it all together with a single architecture which is easy to manage, scales to hundreds of thousands of nodes, and uses the current High Availability and Fast Convergence technologies. This is what Unified MPLS brings to the table: it joins the segmented network into a single control plane with end-to-end path visibility.
Modern Network Requirements
How can you simplify MPLS operations in increasingly larger networks with more complex application requirements?
Traditional MPLS Challenges with Different Access Technologies
The attraction of Unified MPLS can be summarized as follows. Unified MPLS adds extra features to classical/traditional MPLS in order to provide greater scalability, security, simplicity, and manageability. In order to deliver MPLS services end-to-end, an end-to-end Label Switched Path (LSP) is needed. The goal is to keep the MPLS services (MPLS VPN, MPLS L2VPN) as they are, but introduce greater scalability. In order to do this, move some of the IGP prefixes (the loopback prefixes of the Provider Edge (PE) routers) into Border Gateway Protocol (BGP), which then distributes the prefixes end-to-end.
Figure 2
Before the Cisco Unified MPLS architecture is discussed, it is important to understand the key features used in order to make this a reality.
It is a prerequisite to have a scalable method to exchange prefixes between network segments. You could simply merge the IGPs (Open Shortest Path First (OSPF), Intermediate System-to-Intermediate System (IS-IS), or Enhanced Interior Gateway Routing Protocol (EIGRP)) into a single domain. However, an IGP is not designed to carry hundreds of thousands of prefixes. The protocol of choice for that purpose is BGP. It is a well-proven protocol which supports the Internet with hundreds of thousands of routes and MPLS-VPN environments with millions of entries. Cisco Unified MPLS uses BGP-4 with label information exchange (RFC 3107). When BGP distributes a route, it can also distribute an MPLS label that is mapped to that route. The MPLS label mapping for the route is carried in the same BGP update message that contains the route. If the next hop is not changed, the label is preserved; the label changes if the next hop changes. In Unified MPLS, the next hop changes at the Area Border Routers (ABRs).
When you enable RFC 3107 on both BGP routers, the routers advertise to each other that they can send MPLS labels with the routes. If the routers successfully negotiate this capability, they add MPLS labels to all outgoing BGP updates.
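As a minimal sketch on Cisco IOS (the AS number and neighbor address are illustrative and not part of the example network later in this document), RFC 3107 label exchange is enabled per neighbor with the send-label keyword:
! Enable IPv4 + label (RFC 3107) exchange toward an iBGP neighbor
router bgp 100
 neighbor 192.0.2.1 remote-as 100
 neighbor 192.0.2.1 update-source Loopback0
 address-family ipv4
  neighbor 192.0.2.1 activate
  neighbor 192.0.2.1 send-label ! Negotiate the ability to send MPLS labels with routes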
The label exchange is needed in order to keep the end-to-end path information between segments. As a result, each segment remains small enough for operators to manage, while label information is distributed for path awareness between two different IP speakers.
How does it work?
Figure 3
In Figure 3 you can see that there are three segments with Label Distribution Protocol (LDP) Label Switched Paths (LSPs), and that the access network does not have LDP enabled. The objective is to join them together so that there is a single MPLS path (an Internal BGP (iBGP) hierarchical LSP) between the Pre-Aggregation (Pre-Agg) nodes. Because the network is a single BGP Autonomous System (AS), all sessions are iBGP sessions. Each segment runs its own IGP (OSPF, IS-IS, or EIGRP) and builds LDP LSPs within the IGP domain. Within Cisco Unified MPLS, the routers that join the segments must be BGP inline route-reflectors with Next-Hop-Self and RFC 3107 (IPv4 + label) configured on the sessions. Within the Cisco Unified MPLS architecture, these BGP speakers are referred to as ABRs.
Why are the ABRs inline route-reflectors?
One of the goals of Unified MPLS is to have a highly scalable end-to-end infrastructure, so each segment must be kept simple to operate. All peerings are iBGP peerings, which would normally require a full mesh of peerings between all iBGP speakers within the complete network. That results in a very impractical network environment if there are thousands of BGP speakers. If the ABRs are made route-reflectors, the number of iBGP peerings is reduced to the number of BGP speakers per segment instead of between all BGP speakers of the complete AS.
Why Next-Hop-Self?
BGP operates on the basis of recursive routing lookups. This is done in order to accommodate scalability within the underlying IGP. For the recursive lookup, BGP uses the Next-Hop attached to each BGP route entry. For example, if a Source-Node sends a packet towards a Destination-Node and the packet hits a BGP router, then the BGP router does a lookup in its BGP routing table. It finds a route toward the Destination-Node and, as a next step, finds the Next-Hop. This Next-Hop must be known by the underlying IGP. As the final step, the BGP router forwards the packet based upon the IP and MPLS label information attached to that Next-Hop.
In order to make sure that within each segment only the local Next-Hops need to be known by the IGP, the Next-Hop attached to the BGP entry must be within the network segment and not within a neighbor segment or a segment further away. Rewriting the BGP Next-Hop with the Next-Hop-Self feature ensures that the Next-Hop is within the local segment.
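As a hedged Cisco IOS illustration (the AS number and client address are hypothetical), an ABR that acts as an inline route-reflector combines route-reflector-client, next-hop-self, and RFC 3107 labels. Note that on IOS the all keyword is needed so that the next hop of reflected iBGP routes is also rewritten:
! Inline route-reflector sketch: reflect routes, rewrite the next hop, send labels
router bgp 100
 neighbor 192.0.2.2 remote-as 100
 address-family ipv4
  neighbor 192.0.2.2 activate
  neighbor 192.0.2.2 route-reflector-client
  neighbor 192.0.2.2 next-hop-self all ! 'all' also rewrites reflected iBGP routes
  neighbor 192.0.2.2 send-label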
Put It All Together
Figure 4
Figure 4 provides an example of how the exchange of L3VPN prefix 'A' and its labels operates, and how the MPLS label stack is built in order to provide end-to-end path information for the traffic flow between both PEs.
The network is partitioned into three independent IGP/LDP domains. The reduced size of the routing and forwarding tables on the routers enables better stability and faster convergence. LDP is used to build intradomain LSPs within each domain. RFC 3107 BGP IPv4 + label is used as the interdomain label distribution protocol in order to build hierarchical BGP LSPs across domains. BGP (RFC 3107) inserts one extra label into the forwarding label stack in the Unified MPLS architecture.
Intradomain - LDP LSP
Interdomain - BGP Hierarchical LSP
Figure 5
VPN prefix 'A' is advertised by PE31 to PE11 with L3VPN service label 30 and a next hop of PE31's loopback, via the end-to-end interdomain hierarchical BGP LSP. Now, look at the forwarding path for VPN prefix 'A' from PE11 to PE31.
When you look at the MPLS label stack, you can observe how the packet is switched between the source and destination devices based upon the prefix and label exchange described previously.
Figure 6
BGP Prefix Independent Convergence (PIC) is a Cisco technology used in BGP failure scenarios. With BGP PIC, the network converges without the traditional seconds of traffic loss during BGP reconvergence; in most failure scenarios, the reconvergence time is reduced to below 100 msec.
How is this done?
Traditionally, when BGP detects a failure, it recalculates the best path for each BGP entry. When the routing table holds thousands of route entries, this can take a considerable amount of time. In addition, the BGP router needs to distribute all those new best paths to each of its neighbors in order to inform them of the changed network topology and the changed best paths. As the final step, each of the recipient BGP speakers needs to perform its own best path calculation in order to find the new best paths.
From the moment the first BGP speaker detects a failure and starts its best path calculation until all of its neighbor BGP speakers have completed their recalculations, the traffic flow might be dropped.
Figure 7
The BGP PIC for IP and MPLS VPN feature improves BGP convergence after a network failure. This convergence is applicable to both core and edge failures and can be used in both IP and MPLS networks. The BGP PIC for IP and MPLS VPN feature creates and stores a backup/alternate path in the routing information base (RIB), forwarding information base (FIB), and Cisco Express Forwarding (CEF), so that when a failure is detected, the backup/alternate path can immediately take over, which enables fast failover.
With a single rewrite of the next-hop information, the traffic flow is restored. The BGP convergence of the network still happens in the background, but the traffic flows are no longer impacted. This rewrite happens within 50 msec. With this technology, network convergence is reduced from seconds to 50 msec plus the IGP convergence time.
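As a minimal Cisco IOS sketch (exact availability is platform and release dependent), the backup/alternate path installation described above is enabled per address family:
! Install a backup/alternate path into RIB/FIB/CEF for fast failover (BGP PIC)
router bgp 100
 address-family ipv4
  bgp additional-paths install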
BGP Add-Path is an improvement on how BGP entries are communicated between BGP speakers. Traditionally, if a BGP speaker has more than a single entry towards a certain destination, it only sends the entry which is its best path for that destination to its neighbors. The result is that no provisions are made in order to allow the advertisement of multiple paths for the same destination.
BGP Add-Path is a BGP feature that allows more than only the best path to be advertised: multiple paths for the same destination are sent without the new paths implicitly replacing any previous ones. This extension to BGP is particularly important as an aid to BGP PIC when BGP route-reflectors are used, so that the different BGP speakers within an AS have access to more BGP paths than just the best path as selected by the route-reflector.
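A minimal Cisco IOS sketch of Add-Path on a route-reflector follows (the neighbor address is illustrative); it computes and advertises a second-best path so that clients can pre-install a backup for BGP PIC:
! Advertise more than the single best path to route-reflector clients
router bgp 100
 address-family ipv4
  bgp additional-paths select best 2 ! Also compute the second-best path
  bgp additional-paths send receive ! Negotiate the Add-Path capability
  neighbor 192.0.2.2 advertise additional-paths best 2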
Operations to achieve 50-millisecond restoration after a link or node failure can be simplified dramatically with the introduction of a technology called loop-free alternates (LFAs). LFAs enhance the link-state routing protocols (IS-IS and OSPF) so that they find alternative routing paths in a loop-free manner. LFA allows each router to define and use a predetermined backup path if an adjacency (network node or link) fails. In order to deliver a 50 msec restoration time in case of link or node failures, MPLS TE FRR can also be deployed. However, this requires the addition of another protocol (Resource Reservation Protocol, or RSVP) for the setup and management of TE tunnels. While this might be necessary for bandwidth management, the protection and restoration operation does not require bandwidth management. Hence, the overhead associated with the addition of RSVP TE is considered high for simple protection of links and nodes.
LFA provides a simple technique for such scenarios without the deployment of RSVP TE. As a result, today's interconnected routers in large-scale networks can deliver 50 msec restoration for link and node failures with almost no configuration effort for the operator.
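As an illustrative Cisco IOS sketch (assuming the IS-IS process name used later in this document), per-prefix LFA is enabled under the IGP so that each router precomputes a loop-free backup path:
! Precompute a per-prefix loop-free alternate for all level-1 prefixes
router isis core-agg
 fast-reroute per-prefix level-1 all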
Figure 8
The LFA-FRR is a mechanism that provides local protection for unicast traffic in IP, MPLS, Ethernet Over MPLS (EoMPLS), Inverse Multiplexing over ATM (IMA) over MPLS, Circuit Emulation Service over Packet Switched Network (CESoPSN) over MPLS, and Structure-Agnostic Time Division Multiplexing over Packet (SAToP) over MPLS networks. However, some topologies (such as the ring topology) require protection that is not afforded by LFA-FRR alone. The Remote LFA-FRR feature is useful in such situations.
Remote LFA-FRR extends the basic behavior of LFA-FRR to any topology. It forwards the traffic around a failed node to a remote LFA that is more than one hop away. In Figure 9, if the link between C1 and C2 that is used to reach A1 fails, then C2 sends the packet over a directed LDP session to C5, which has reachability to A1.
Figure 9
In Remote LFA-FRR, a node dynamically computes its LFA node. After the alternate node is determined (and it is not directly connected), the node automatically establishes a directed Label Distribution Protocol (LDP) session to the alternate node. The directed LDP session exchanges labels for the particular Forwarding Equivalence Class (FEC).
When the link fails, the node uses label stacking in order to tunnel the traffic to the remote LFA node, which then forwards the traffic to the destination. The whole label exchange and tunneling mechanism is dynamic in nature; no preprovisioning or manual configuration is required.
For intradomain LSPs, remote LFA FRR is utilized for unicast MPLS traffic in ring topologies. Remote LFA FRR precalculates a backup path for every prefix in the IGP routing table, which allows the node to rapidly switch to the backup path when a failure is encountered. This provides recovery times on the order of 50 msec.
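Building on the LFA sketch above, a hedged Cisco IOS example of Remote LFA adds the mpls-ldp keyword so that the router automatically builds the directed LDP session to the remote alternate:
! Extend per-prefix LFA with a dynamically built tunnel to a remote LFA node
router isis core-agg
 fast-reroute per-prefix level-1 all
 fast-reroute remote-lfa level-1 mpls-ldp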
When all of the previous tools and features are put together within a network, the result is the Cisco Unified MPLS environment. This is the architecture example for large service providers.
Figure 10
Here is a simplified example of Unified MPLS.
Pre-Aggregation and Cell Site Gateway Routers - Cisco IOS
Figure 11
200:200 | MPC Community
300:300 | Aggregation Community
Core IGP Domain | ISIS Level 2
Aggregation IGP Domain | ISIS Level 1
Access IGP Domain | OSPF Area 0
Figure 12
! IGP Configuration
router isis core-agg
net 49.0100.1010.0001.0001.00
address-family ipv4 unicast
metric-style wide
propagate level 1 into level 2 route-policy drop-all ! Disable L1 to L2 redistribution
!
interface Loopback0
ipv4 address 10.10.10.1 255.255.255.255
passive
!
interface TenGigE0/0/0/0
!
interface TenGigE0/0/0/1
circuit-type level-2-only ! Core facing ISIS L2 Link
!
interface TenGigE0/0/0/2
circuit-type level-1 ! Aggregation facing ISIS L1 Link
!
route-policy drop-all
drop
end-policy
! BGP Configuration
router bgp 100
ibgp policy out enforce-modifications
bgp router-id 10.10.10.1
address-family ipv4 unicast
allocate-label all ! Send labels with BGP routes
!
session-group infra
remote-as 100
cluster-id 1001
update-source Loopback0
!
neighbor-group agg
use session-group infra
address-family ipv4 labeled-unicast
route-reflector-client
route-policy BGP_Egress_Filter out ! BGP Community based Egress filtering
next-hop-self
!
neighbor-group mpc
use session-group infra
address-family ipv4 labeled-unicast
route-reflector-client
next-hop-self
!
neighbor-group core
use session-group infra
address-family ipv4 labeled-unicast
next-hop-self
community-set Allowed-Comm
200:200,
300:300
end-set
!
route-policy BGP_Egress_Filter
if community matches-any Allowed-Comm then
pass
endif
end-policy
Figure 13
interface Loopback0
ipv4 address 10.10.9.9 255.255.255.255
!
interface Loopback100
ipv4 address 10.10.99.9 255.255.255.255
! Pre-Agg IGP Configuration
router isis core-agg
net 49.0100.1010.0001.9007.00
is-type level-1 ! ISIS L1 router
metric-style wide
passive-interface Loopback0 ! Core-agg IGP loopback0
!RAN Access IGP Configuration
router ospf 1
router-id 10.10.99.9
redistribute bgp 100 subnets route-map BGP_to_RAN ! iBGP to RAN IGP redistribution
network 10.9.9.2 0.0.0.1 area 0
network 10.9.9.4 0.0.0.1 area 0
network 10.10.99.9 0.0.0.0 area 0
distribute-list route-map Redist_from_BGP in ! Inbound filtering to prefer labeled BGP learnt prefixes
ip community-list standard MPC_Comm permit 200:200
!
route-map BGP_to_RAN permit 10 ! Only redistribute prefixes marked with MPC community
match community MPC_Comm
set tag 1000
route-map Redist_from_BGP deny 10
match tag 1000
!
route-map Redist_from_BGP permit 20
! BGP Configuration
router bgp 100
ibgp policy out enforce-modifications
bgp router-id 10.10.9.10
bgp cluster-id 909
neighbor csr peer-group
neighbor csr remote-as 100
neighbor csr update-source Loopback100 ! Cell Site Routers - RAN IGP loopback100 as source
neighbor abr peer-group
neighbor abr remote-as 100
neighbor abr update-source Loopback0 ! Core POP ABRs - core-agg IGP loopback0 as source
neighbor 10.10.10.1 peer-group abr
neighbor 10.10.10.2 peer-group abr
neighbor 10.10.13.1 peer-group csr
!
address-family ipv4
bgp redistribute-internal
network 10.10.9.10 mask 255.255.255.255 route-map AGG_Comm ! Advertise with Aggregation Community (300:300)
redistribute ospf 1 ! Redistribute RAN IGP prefixes
neighbor abr send-community
neighbor abr next-hop-self
neighbor abr send-label ! Send labels with BGP routes
neighbor 10.10.10.1 activate
neighbor 10.10.10.2 activate
exit-address-family
!
route-map AGG_Comm permit 10
set community 300:300
Figure 14
interface Loopback0
ip address 10.10.13.2 255.255.255.255
! IGP Configuration
router ospf 1
router-id 10.10.13.2
network 10.9.10.0 0.0.0.1 area 0
network 10.13.0.0 0.0.255.255 area 0
network 10.10.13.2 0.0.0.0 area 0
Figure 15
interface Loopback0
ip address 10.10.11.1 255.255.255.255
! IGP Configuration
router isis core-agg
is-type level-2-only ! ISIS L2 router
net 49.0100.1010.0001.1001.00
address-family ipv4 unicast
metric-style wide
! BGP Configuration
router bgp 100
ibgp policy out enforce-modifications
bgp router-id 10.10.11.1
address-family ipv4 unicast
network 10.10.11.1/32 route-policy MPC_Comm ! Advertise Loopback-0 with MPC Community
allocate-label all ! Send labels with BGP routes
!
session-group infra
remote-as 100
update-source Loopback0
!
neighbor-group abr
use session-group infra
address-family ipv4 labeled-unicast
next-hop-self
!
neighbor 10.10.6.1
use neighbor-group abr
!
neighbor 10.10.12.1
use neighbor-group abr
community-set MPC_Comm
200:200
end-set
!
route-policy MPC_Comm
set community MPC_Comm
end-policy
The loopback prefix of the Mobile Packet Gateway (MPG) is 10.10.11.1/32, so that prefix is of interest. Now, look at how packets are forwarded from CSG to MPG.
The MPC prefix 10.10.11.1 is known to the CSG router from the Pre-Agg node with route tag 1000, and it can be forwarded as a labeled packet with outgoing LDP label 31 (intradomain LDP LSP). The MPC community 200:200 was mapped to route tag 1000 on the Pre-Agg node during the redistribution into OSPF.
CSG#sh mpls forwarding-table 10.10.11.1 detail
Local Outgoing Prefix Bytes Label Outgoing Next Hop
Label Label or Tunnel Id Switched interface
34 31 10.10.11.1/32 0 Vl40 10.13.1.0
MAC/Encaps=14/18, MRU=1500, Label Stack{31}
In the Pre-Agg node, the MPC prefix is redistributed from BGP into the RAN access OSPF process with community-based filtering, and the OSPF process is redistributed into BGP. This controlled redistribution is necessary in order to achieve end-to-end IP reachability while each segment carries only the minimum required routes.
The 10.10.11.1/32 prefix is known via hierarchical BGP 100 with the MPC 200:200 community attached. The BGP (RFC 3107) label 16020 is received from the core Area Border Router (ABR), and the LDP label 22 is added on top for intradomain forwarding after the next-hop recursive lookup.
Pre-AGG1#sh ip route 10.10.11.1
Routing entry for 10.10.11.1/32
Known via "bgp 100", distance 200, metric 0, type internal
Redistributing via ospf 1
Advertised by ospf 1 subnets tag 1000 route-map BGP_TO_RAN
Routing Descriptor Blocks:
* 10.10.10.2, from 10.10.10.2, 1d17h ago
Route metric is 0, traffic share count is 1
AS Hops 0
MPLS label: 16020
Pre-AGG1#sh bgp ipv4 unicast 10.10.11.1
BGP routing table entry for 10.10.11.1/32, version 116586
Paths: (2 available, best #2, table default)
Not advertised to any peer
Local
<SNIP>
Local
10.10.10.2 (metric 30) from 10.10.10.2 (10.10.10.2)
Origin IGP, metric 0, localpref 100, valid, internal, best
Community: 200:200
Originator: 10.10.11.1, Cluster list: 0.0.3.233, 0.0.2.89
mpls labels in/out nolabel/16020
Pre-AGG1#sh bgp ipv4 unicast labels
Network Next Hop In label/Out label
10.10.11.1/32 10.10.10.1 nolabel/16021
10.10.10.2 nolabel/16020
Pre-AGG1#sh mpls forwarding-table 10.10.10.2 detail
Local Outgoing Prefix Bytes Label Outgoing Next Hop
Label Label or Tunnel Id Switched interface
79 22 10.10.10.2/32 76109369 Vl10 10.9.9.1
MAC/Encaps=14/18, MRU=1500, Label Stack{22}
Pre-AGG#sh mpls forwarding-table 10.10.11.1 detail
Local Outgoing Prefix Bytes Label Outgoing Next Hop
Label Label or Tunnel Id Switched interface
530 16020 10.10.11.1/32 20924900800 Vl10 10.9.9.1
MAC/Encaps=14/22, MRU=1496, Label Stack{22 16020}
The prefix 10.10.11.1 is known via the intradomain IGP (IS-IS Level 2) and, as per the MPLS forwarding table, is reachable through the LDP LSP.
ABR-Core2#sh ip route 10.10.11.1
Routing entry for 10.10.11.1/32
Known via "isis core-agg", distance 115, metric 20, type level-2
Installed Sep 12 21:13:03.673 for 2w3d
Routing Descriptor Blocks
10.10.1.0, from 10.10.11.1, via TenGigE0/0/0/0, Backup
Route metric is 0
10.10.2.3, from 10.10.11.1, via TenGigE0/0/0/3, Protected
Route metric is 20
No advertising protos.
For the distribution of the prefixes between the segmented areas, BGP with labels (RFC 3107) is utilized. What still needs to reside within the IGP of each segmented area are the loopbacks of the PEs and the addresses related to the central infrastructure.
The BGP routers that connect the different areas together are the ABRs, which act as BGP Route-Reflectors. These devices use the Next-Hop-Self feature so that the IGP of each segment only needs to carry the IP addresses of the local PEs and the central infrastructure, rather than all Next-Hops of the complete Autonomous System. Loop detection is performed based upon the BGP Cluster-IDs.
For network resilience, BGP PIC with the BGP Add-Path feature should be used on the BGP side, and LFA with the IGP. These features are not used in the previous example.
There is currently no specific troubleshooting information available for this configuration.