Pervasive Load Balancing

Pervasive Load Balancing for the Programmable Fabric

In a programmable fabric, the servers, the virtual machines (VMs), and the containers (specific to a given service) can be distributed across the fabric and attached to different ToR or leaf switches. The Pervasive Load Balancing (PLB) feature enables load balancing to the servers that are distributed across the fabric.

PLB enables the fabric to act as a massive load balancer and makes it capable of providing massive telemetry and analytics. When PLB is used as a load balancer, you can connect Layer 4 and Layer 7 appliances anywhere in the fabric, as shown in the figure Load Balancing across the Fabric.

Figure 1. Load Balancing across the Fabric

You may have a large number of clients (local and across the border leaf) that include database servers, application servers, web servers, firewalls, WAAS, IPS, IDS, and video caches. Information about the traffic flowing to each firewall, WAAS, IPS, IDS, and server from each device in the fabric, including when traffic is high or low, is very valuable.

PLB sits in the path between clients and servers or between Layer 4 and Layer 7 services, which makes PLB aware of traffic information. With this information, PLB provides valuable traffic analytics and telemetry.

In the load-balancing function, a virtual IP address (VIP) abstracts a service provided by a physical server farm distributed across the DC fabric. When different clients (local to the fabric or from a remote location) send requests for a given service, these requests are always destined for the VIP of these servers.

On the ToR or leaf switches, PLB matches the source IP address bits and mask, the destination IP address (Virtual IP address), and relevant Layer 3 or Layer 4 fields to load balance these requests among the servers.

PLB provides an infrastructure to configure a cluster of the servers (nodes) inside a device group. It segregates the client traffic based on the buckets (bit mask), and the tenant SVI configured under the PLB service. Based on the defined cluster of nodes (servers) and buckets, PLB automatically creates rules to match the client IP traffic into the buckets mask and redirects the matched traffic to a specific server node.


PLB also provides the infrastructure to periodically monitor the health of all server nodes and the status of their application services, such as TCP, UDP, and DNS, on a given VRF.

If a server becomes non-responsive or non-operational, PLB automatically switches the client traffic from the non-operational node to a single or a group of configured standby nodes. Traffic assignment is achieved by automatically changing flows to a standby node.

PLB currently uses the Direct Server Return (DSR) concept and functionality, so that server responses are sent directly to the client.

Figure 2. Direct Server Return

PLB is fabric agnostic, but it is currently supported with the VXLAN EVPN fabric.

PLB is currently supported on Cisco Nexus 9000 Series switches that support PBR over VXLAN.

High-level Overview of Designing PLB Topology

A high-level overview of designing Pervasive Load Balancing on the ToR switch is as follows (a combined configuration sketch appears after the list):

  • Identify load balancing servers and create a device group.

  • Create a PLB service instance for the group, and complete the following:

    • Associate a virtual IP address (VIP) for incoming PLB traffic. The VIP represents the servers in the device group.

    • Enable other load balancing configurations.
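
The detailed commands are described in the configuration sections later in this chapter. As a preview, the following is a minimal end-to-end sketch, assuming the example names used throughout this chapter (device group dg200, service srv200, VIP 200.200.200.200, tenant VRF vpn1, and ingress SVI Vlan2):

switch(config)# feature plb
switch(config)# plb device-group dg200
switch(config-device-group)# node ip 10.0.0.21
switch(config-dg-node)# exit
switch(config-device-group)# node ip 10.0.0.31
switch(config-dg-node)# exit
switch(config-device-group)# exit
switch(config)# plb srv200
switch(config-plb)# device-group dg200
switch(config-plb)# vrf vpn1
switch(config-plb)# virtual ip 200.200.200.200 255.255.255.255
switch(config-plb)# ingress interface Vlan2
switch(config-plb)# no shut

Standby nodes, probes, weights, load-balance tuning, and failaction are optional refinements covered in the sections that follow.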

The figure, An Example of PLB Topology, illustrates how PLB load balances east-west and north-south data center client traffic to multiple servers distributed across the fabric for a VIP.

Figure 3. An Example of PLB Topology

PLB operates at line rate and also provides health monitoring and fail-action handling capabilities.

Configuring PLB on Cisco Nexus 9000 Series Switches

Configuring PLB

Use the feature plb CLI to enable Pervasive Load Balancing (PLB) in global configuration mode. PLB must be enabled in the system before you configure the PLB device group (cluster of servers) and services.


Note

The feature pbr and feature sla sender CLIs are the prerequisites for configuring PLB.


Sample Configuration


switch(config)# feature plb
switch# show feature | grep plb
plb                    1          enabled 

The no feature plb CLI removes the PLB configuration from the system. By default, PLB is disabled in the system.
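
Because feature pbr and feature sla sender are prerequisites, a minimal sketch of the complete enablement sequence (assuming none of these features are enabled yet) is as follows:

switch(config)# feature pbr
switch(config)# feature sla sender
switch(config)# feature plb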

Configuring PLB Device Group

You can configure the PLB device groups using the plb device-group <dg-name> CLI. All the nodes (servers) are configured under a device group submode.

Sample Configuration


switch(config)# plb device-group dg200
switch(config-device-group)#
switch# show running-config plb-services 

!Command: show running-config plb-services
!Time: Mon Jul 24 11:48:10 2017

version 7.0(3)I6(1)
feature plb

plb device-group dg200

The no plb device-group <dg-name> CLI removes the PLB device group from the configuration.


Note

A specified device-group configuration cannot be removed if it is being used under the PLB service. It should be removed from the PLB service first before removing it globally.


Configuring Nodes (Servers)

The nodes (servers) can be configured using the node ip <ip-addr> CLI in device-group submode. A maximum of 64 nodes can be configured inside a device group.

Sample Configuration


switch(config)# plb device-group dg200
switch(config-device-group)# node ip 10.0.0.31 
switch(config-dg-node)#
switch# show running-config plb-services 

!Command: show running-config plb-services
!Time: Mon Mar 27 13:17:55 2017

version 7.0(3)I6(1)
feature plb

plb device-group dg200
   node ip 10.0.0.31

Use the no node ip <ip-addr> CLI to remove the node or server configuration under the device group.

Configuring Standby Nodes

Use the standby ip <ip-addr> CLI to configure the standby node for an active node.

A node-level standby can be associated with each node.

The standby value specifies the standby node information for the active node.

It is strongly recommended that the standby node for an active node be attached to the same local leaf as the active node.

Sample Configuration


switch(config)# plb device-group dg200
switch(config-device-group)# node ip 10.0.0.21 
switch(config-dg-node)# standby ip 20.0.0.22

switch# show running-config plb-services 

!Command: show running-config plb-services
!Time: Mon Mar 27 13:17:55 2017

version 7.0(3)I6(1)

feature plb

plb device-group dg200
   node ip 10.0.0.21
       standby ip 20.0.0.22
   node ip 10.0.0.31      
       standby ip 20.0.0.32

Use the no standby ip <ip-addr> CLI to remove the standby node configuration for an active node.

Using the Weight Option

Use the weight <value> CLI to configure the weight for a node.

The weight value specifies the proportionate weight of the node for weighted traffic distribution.

The weight can be assigned only to the active node.

The default weight is 1 for each node, and the maximum configurable value is 8.

Sample Configuration


switch(config)# plb device-group dg200
switch(config-device-group)# node ip 10.0.0.21 
switch(config-dg-node)# weight 2
switch# show running-config plb-services 

!Command: show running-config plb-services
!Time: Mon Mar 27 13:17:55 2017

version 7.0(3)I6(1)

feature plb

plb device-group dg200
   node ip 10.0.0.21
       standby ip 20.0.0.22
       weight 2
   node ip 10.0.0.31
       standby ip 20.0.0.32


Use the no weight <value> CLI to remove the weight for a node or a server.

Configuring Probe for a Node

Use the probe {icmp| tcp port <port-number>| udp port <port-number>| dns{<hostname> | <target-address>}} [frequency <seconds>] [[retry-down-count | retry-up-count] <number>] [timeout <seconds>] CLI to configure probe for a node or a server.

A node-level probe (ICMP/TCP/UDP) can be configured to monitor the health of the node. The probe value specifies the probe parameters to use for monitoring the health of this active node.


Note

The feature sla sender and feature sla responder CLIs should be enabled for the probe functionality to work.


The frequency field specifies the interval for the probe. The default frequency is 10 seconds.

The retry-down-count field specifies the consecutive number of times that the probe must have failed prior to the node being marked as operationally down. The default retry-down-count is 3.

The retry-up-count field specifies the consecutive number of times the probe must have succeeded prior to the node being marked as operationally up. The default retry-up-count is 3.

The timeout field specifies the number of seconds to wait for the probe response. The default timeout is 5 seconds.

Sample Configuration


switch(config)# plb device-group dg200 
switch(config-device-group)# node ip 10.0.0.21
switch(config-dg-node)# probe icmp retry-down-count 3 retry-up-count 3 timeout 60 frequency 60


switch# show running-config plb-services
!Command: show running-config plb-services
!Time: Mon Mar 27 13:17:55 2017
version 7.0(3)I6(1) 

feature plb

plb device-group dg200
  node ip 10.0.0.21 
    probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    standby ip 20.0.0.22 
      probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    weight 2 
  node ip 10.0.0.31
    probe icmp
    standby ip 20.0.0.22 
      probe icmp
       

Use the no probe {icmp| tcp port <port-number>| udp port <port-number>| dns{<hostname> | <target-address>}} [frequency <seconds>] [[retry-down-count | retry-up-count] <number>] [timeout <seconds>] CLI to remove the probe configuration for a node.


Note

When you configure a probe for an active node, you must also configure it for the standby node (if it is present).
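
The following sketch shows a probe applied to both an active node and its standby; the standby submode prompt shown here is an assumption, and the probe parameters are examples only:

switch(config)# plb device-group dg200
switch(config-device-group)# node ip 10.0.0.21
switch(config-dg-node)# probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
switch(config-dg-node)# standby ip 20.0.0.22
switch(config-dg-node-standby)# probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3 -> standby submode prompt assumed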


Configuring the PLB Service at the Global Level

Configure the PLB service using the plb <service-name> CLI.

Sample Configuration


switch(config)# plb srv200 
switch(config-plb)#

switch(config-plb)# show running-config plb-services
!Command: show running-config plb-services
!Time: Mon Mar 27 14:17:43 2017
version 7.0(3)I6(1) 

feature plb

plb device-group dg200
  node ip 10.0.0.21 
    probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    standby ip 20.0.0.22 
      probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    weight 2 
  node ip 10.0.0.31 
    probe icmp
    standby ip 20.0.0.22 
      probe icmp			     

plb srv200

Use the no plb <service-name> command to remove the PLB service.

Configuring Default Device Group under the PLB Service

Use the device-group <dg-name> CLI to associate a configured device group to the PLB service.

Sample Configuration


switch(config)# plb srv200
switch(config-plb)# device-group dg200 

switch(config-plb)# show running-config plb-services
!Command: show running-config plb-services
!Time: Mon Mar 27 14:17:43 2017
version 7.0(3)I6(1) 

feature plb

plb device-group dg200
  node ip 10.0.0.21 
    probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    standby ip 20.0.0.22 
      probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    weight 2 
  node ip 10.0.0.31 
    probe icmp
    standby ip 20.0.0.22 
      probe icmp			     
plb srv200  
  device-group dg200

Use the no device-group <dg-name> CLI to detach a device group from the service.

Configuring Virtual IP (VIP) for the PLB Service

Use the virtual ip <ipv4-address> <ipv4-network-mask> [tcp | udp {port-number | any}] [device-group <dg-name>] CLI to configure the virtual IP address for the PLB service for a cluster of nodes.

The same VIP cannot be used with different device groups within the same PLB service.

A maximum of 64 VIPs can be configured inside a PLB service.


Note

A device group can be associated with a VIP. Using this option, multiple device groups can be part of one service. The default device-group configuration is not required if a device group is specified with every VIP configured under the PLB service.


Sample Configuration

Example 1: Default device group configuration and VIP in separate commands



switch(config)# plb srv200
switch(config-plb)# device-group dg200
switch(config-plb)# virtual ip 200.200.200.200 255.255.255.255 

switch(config-plb)# show running-config plb-services
!Command: show running-config plb-services
!Time: Mon Mar 27 14:17:43 2017
version 7.0(3)I6(1) 

feature plb

plb device-group dg200
  node ip 10.0.0.21 
    probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    standby ip 20.0.0.22 
      probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    weight 2 
  node ip 10.0.0.31 
    probe icmp
    standby ip 20.0.0.22 
      probe icmp			     

plb srv200
device-group dg200    → default device group
virtual ip 200.200.200.200 255.255.255.255

Example 2: VIP with its device group specified in a single command


switch(config)# plb srv200
switch(config-plb)# virtual ip 200.200.200.201 255.255.255.255 device-group dg201

switch(config-plb)# show running-config plb-services
!Command: show running-config plb-services
!Time: Mon Mar 27 14:17:43 2017
version 7.0(3)I6(1) 

feature plb

plb device-group dg201
  node ip 10.0.0.21 
    probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    standby ip 20.0.0.22 
      probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    weight 2 
  node ip 10.0.0.31 
    probe icmp
    standby ip 20.0.0.22 
      probe icmp			     

plb srv200
virtual ip 200.200.200.201 255.255.255.255 device-group dg201

Note

Configure the VIP using either the approach in Example 1 or the approach in Example 2, but do not configure both sample configurations together.


Configuring VRF under the PLB Service

Use the vrf <vrf-name> CLI to configure VRF on the PLB service. The VRF configuration is required for the probes to work.

Sample Configuration


switch(config)# plb srv200
switch(config-plb)# vrf vpn1

switch(config-plb)# show running-config plb-services
!Command: show running-config plb-services
!Time: Mon Mar 27 14:17:43 2017
version 7.0(3)I6(1) 

feature plb

plb device-group dg200
  node ip 10.0.0.21 
    probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    standby ip 20.0.0.22 
      probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    weight 2 
  node ip 10.0.0.31 
    probe icmp
    standby ip 20.0.0.22 
      probe icmp			     

plb srv200 

vrf vpn1
virtual ip 200.200.200.200 255.255.255.255 device-group dg200

Use the no vrf <vrf-name> CLI to remove the VRF configuration from the PLB service.

Configuring Source Interface under the PLB Service

Use the source-interface loopback <loopback-id> CLI to configure the source interface for the PLB service.

The source-interface configuration is required for probes on vPC hosts, or when local VMs (servers) move from the local leaf to a remote leaf.

The prerequisite for configuring the source interface under the PLB service is that the loopback interface assigned to the PLB service must first be created in the same VRF as the PLB VRF (tenant VRF), and an IP address with a /32 mask must be assigned to it, before the loopback is used as the source interface of the PLB service.

Additionally, on vPC switches, a per-VRF IGP session should be created for the vPC host probe to work between the vPC peers.


Note

Alternatively, if a per-VRF IGP session is not created, use the advertise-pip command under the BGP instance. For more details, see the BGP configuration details.


The loopback and per VRF IGP session configuration examples are mentioned in the sample configuration.

An ingress interface should be in the same VRF as the PLB service.

Sample Configuration

The loopback configuration prior to the source interface configuration:

  1. Create a loopback interface as displayed in the example:

    
    switch# show running-config interface loopback30
    interface loopback30
      vrf member vpn1
      ip address 20.8.1.103/32 tag 12345
      ip pim sparse-mode
    
    
  2. Create a regular VLAN and a per-VRF SVI as displayed in the example (it is needed only for vPC):

    
    switch# show run vlan 3
    vlan 3
    
    switch# show running-config interface Vlan3
    interface Vlan3
      no shutdown
      vrf member cisco:foo-1
      ip address 11.11.11.11/30 tag 12345 (assign 11.11.11.12 for other VPC peer)
    
    
  3. Create a BGP neighbor association between the vPC peers under the BGP instance (it is needed only for vPC).

    
    switch#show run bgp
    router bgp 65101
    ..
    vrf cisco:foo-1
       ..
        neighbor 11.1.1.2 -> For per VRF IGP session, Ip from Interface Vlan 3.
          remote-as 65101
          address-family ipv4 unicast
    
    
  4. Make sure that the per-VRF IGP session is operationally up (it is needed only for vPC).

    
    switch#sh ip bgp vrf cisco:foo-1 summary
    Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
    11.1.1.1        4 65101      38      37     1504    0    0 00:31:37 9
    
    

See the following details for the source-interface configuration under the PLB service:


switch(config)# plb srv200
switch(config-plb)# source-interface loopback30 
switch(config-plb)# show running-config plb-services
!Command: show running-config plb-services
!Time: Mon Mar 27 14:17:43 2017
version 7.0(3)I6(1) 

feature plb

plb device-group dg200
  node ip 10.0.0.21 
    probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    standby ip 20.0.0.22 
      probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    weight 2 
  node ip 10.0.0.31 
    probe icmp
    standby ip 20.0.0.22 
      probe icmp			     

plb srv200 

vrf vpn1
source-interface loopback30
virtual ip 200.200.200.200 255.255.255.255 device-group dg200


Use the no source-interface loopback <id> CLI to remove the source interface from a service.

Configuring the Ingress Interface under the PLB Service

Use the ingress interface <interface> CLI to add an ingress interface or multiple interfaces to the PLB service.


Note

An ingress interface should be in the same VRF as the PLB service.


Sample Configuration


switch(config)# plb srv200
switch(config-plb)# ingress interface Vlan2
switch(config-plb)# show running-config plb-services 

!Command: show running-config plb-services
!Time: Mon Mar 27 14:17:43 2017

version 7.0(3)I6(1)
feature plb

plb device-group dg200
  node ip 10.0.0.21 
    probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    standby ip 20.0.0.22 
      probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    weight 2 
  node ip 10.0.0.31 
    probe icmp
    standby ip 20.0.0.22 
      probe icmp
plb srv200
vrf vpn1
source-interface loopback30
virtual ip 200.200.200.200 255.255.255.255 device-group dg200
ingress interface Vlan2 -> this is in vrf vpn1


Use the no ingress interface <interface> CLI to remove an interface or interfaces from a service.

Configuring the Load-Balancing Options for the PLB Service

Use the load-balance {method {src {ip | ip-l4port [tcp | udp] range <x> <y>} | dst {ip | ip-l4port [tcp | udp] range <x> <y>}} | buckets <count> | mask-position <position>} to configure the load-balancing for the PLB service using the following options:

  • Buckets—Specifies the number of buckets to be created. The number of buckets must be a power of two. (If the number is not specified, the system automatically allocates a power-of-two number of buckets based on the number of nodes.)

  • Mask-position—Specifies the starting mask position of the load-balance bucketing (where the bucket bits begin). If it is not specified, the default is the 0th position; if it is specified, the position is counted starting from the 8th bit of the IP address. If a value greater than 23 (up to 32) is configured, the position wraps around and restarts from the 0th position.

  • Method—Specifies load balancing based on the source IP address, the destination IP address, the source IP address and source port, or the destination IP address and destination port. (With respect to PLB, source IP address based load balancing is required.)

Sample Configuration


switch(config)# plb srv200
switch(config-plb)#   load-balance buckets 2 mask-position 11
switch(config-plb)# show running-config plb-services 

!Command: show running-config plb-services
!Time: Mon Mar 27 14:17:43 2017

version 7.0(3)I6(1)
feature plb

plb device-group dg200
  node ip 10.0.0.21 
    probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    standby ip 20.0.0.22 
      probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    weight 2 
  node ip 10.0.0.31 
    probe icmp
    standby ip 20.0.0.22 
      probe icmp

plb srv200
  device-group dg200
  virtual ip 200.200.200.200 255.255.255.255
  ingress interface Vlan2
  load-balance buckets 2 mask-position 11

Configuring Failaction Node Reassignment for the PLB Service

The failaction node reassign CLI configures failaction node reassignment for the PLB service. Failaction for PLB enables the traffic on a failed node to be reassigned to the first available active node that has a probe configured.


Note

Make sure that probes have been configured on at least one local or remote node to achieve failaction reassignment. Otherwise, if there are no active probes configured for the nodes in the device group, traffic might be affected during failaction if the reassigned node is not active.

Sample Configuration


switch(config)# plb srv200
switch(config-plb)#   failaction node reassign
switch(config-plb)# show running-config plb-services 

!Command: show running-config plb-services
!Time: Mon Mar 27 14:17:43 2017

version 7.0(3)I6(1)
feature plb
plb device-group dg200
  node ip 10.0.0.21 
    probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    standby ip 20.0.0.22 
      probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    weight 2 
  node ip 10.0.0.31 
    probe icmp
    standby ip 20.0.0.22 
      probe icmp

plb srv200
  device-group dg200
  virtual ip 200.200.200.200 255.255.255.255
  ingress interface Vlan2
  load-balance buckets 2 mask-position 11
  failaction node reassign

Shutting Down the PLB Service

Use the shut command to shut down the PLB service. Before modifying any configuration under the PLB service, the service should be shut down.

By default, the PLB service is in shut mode. Shutting down the PLB service may cause traffic disruption if traffic is already flowing.

The no shut CLI is used to unshut the PLB service and to enable the PLB functionality.

Sample Configuration

switch(config)# plb srv200
switch(config-plb)# no shut
switch(config-plb)# show running-config plb-services
!Command: show running-config plb-services
!Time: Mon Mar 27 14:17:43 2017
version 7.0(3)I6(1) 

feature plb

plb device-group dg200
  node ip 10.0.0.21 
    probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    standby ip 20.0.0.22 
      probe icmp frequency 60 timeout 60 retry-down-count 3 retry-up-count 3
    weight 2 
  node ip 10.0.0.31 
    probe icmp
    standby ip 20.0.0.22 
      probe icmp			     

plb srv200
vrf vpn1
source-interface loopback30
virtual ip 200.200.200.200 255.255.255.255 device-group dg200
ingress interface Vlan2
load-balance buckets 2 mask-position 11
failaction node reassign
no shut

Enabling PLB Analytics

Use the plb analytics <plb-service-name> CLI to enable PLB analytics for a given PLB service.


Note

If the PLB service is shut down after configuring the plb analytics <plb-service-name> command, the same command should be configured again to get the PLB analytics.


Sample Configuration


switch(config)# plb analytics srv200

Figures 4 through 8 show how PLB facilitates the collection of analytics information with multiple options and combinations.

Figure 4. PLB Analytics for Servers of Device Group DG1
Figure 5. PLB Leaf Analytics for Device Group DG1
Figure 6. Analytics for Servers of Different Device Groups from Different Tenant SVIs
Figure 7. Analytics for Various Applications using Different Device Groups
Figure 8. Apps (VIP) Traffic Analytics across Fabric

Verifying PLB Configuration

Use the following show commands to verify the PLB configuration (a brief usage sketch follows the list):

  • show running-config plb-services —Displays the running configuration of all PLB services on a VDC or ToR/leaf switch.

  • show tech-support plb [detail] —Displays the technical support information for PLB.

  • show plb <svc-name> [brief] —Displays the current state of the specified PLB service or all services.

  • show plb <vrf> <svc-name> —Displays the configured VRF of the specified PLB service or all services.

  • show plb analytics <svc-name> [brief] —Displays the buckets, servers, VIP, and PLB service loads analytics for the specified PLB service or all services.
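
For the service configured in the earlier examples, a brief verification sketch (device outputs omitted here) might look like the following:

switch# show running-config plb-services
switch# show plb srv200 brief
switch# show plb analytics srv200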

PLB Active Service Sample Outputs

The following figures show sample outputs for an active PLB service:

Figure 9. Active Service Sample
Figure 10. PLB Analytics Sample Output