Bandwidth Optimization (BWOpt)

The Bandwidth Optimization (BWOpt) function pack provides automated, SR policy-based tactical traffic engineering to detect and mitigate congestion in your network. It does this through a real-time view of the network topology overlaid with a demand matrix built from telemetry-based Segment Routing Traffic Matrix (SRTM) data. BWOpt compares the interface utilization threshold set by the user with the actual utilization in the network. When BWOpt detects interface congestion, it attempts to shift traffic away from hot spots using tactical traffic engineered SR policies, which are deployed to the network via SR-PCE. As network conditions (topology and/or traffic) change over time, BWOpt continues to monitor interface utilization and manage any tactical SR policies it has deployed, including changing their paths or removing them from the network when they are no longer necessary.

Important Notes and Limitations for BWOpt

Consider the following notes and limitations when using BWOpt:

  • Only traffic that is not in an SR policy, or that is in an existing BWOpt SR policy, can be rerouted to mitigate congested links. BWOpt will not shift traffic in existing SR policies that it did not create. This may prevent it from mitigating congestion if most of the traffic on the congested link is in non-BWOpt SR policies.

  • BWOpt relies on the PCC's autoroute feature to steer traffic into the tactical SR policies it creates. Autoroute is applied to these policies through the proper Profile ID option set in BWOpt (to align with the configuration on the PCC that associates that Profile ID with the autoroute feature). This is critical for tactical SR policies to shift traffic away from congested links.

  • BWOpt does not support multi-area or multi-level IGP (see "IGP and Inter-AS Support" in the Cisco Crosswork Optimization Engine Installation Guide). Autoroute will not properly steer traffic onto inter-area or inter-level tactical SR policies. So, although they can be provisioned, traffic will not use them. Therefore, BWOpt will be ineffective if enabled in this environment.

  • BWOpt uses simulated traffic based on measured SRTM data to determine link utilizations and when to mitigate congestion. The simulated interface utilization that BWOpt monitors should closely align with the SNMP-based interface utilization that is displayed in the Optimization Engine UI. However, due to various factors, including SNMP polling cadence and rate averaging techniques, they may differ at times. This can result in scenarios like a link appearing to be congested in the UI and BWOpt not reacting.

  • BWOpt only creates tactical SR policies on PCCs that are sources of SRTM telemetry data. Only these nodes (typically provider edge routers) provide the telemetry-based data needed to create simulated traffic demands in the internal model representing the traffic from that node to other PE nodes in the network.

  • Only solutions that produce interface utilization below the threshold (set across all interfaces) will be deployed. If BWOpt is unable to mitigate congestion across the entire network, it will not deploy any tactical SR policies and a “Network Congested. BWOpt unable to mitigate.” alarm is set. This alarm is unset when congestion either subsides on its own or can be addressed successfully through BWOpt tactical SR policy deployments.

  • BWOpt temporarily pauses operation whenever the system is unavailable due to a restart or a rebuild of the topology from Topology Services. When this occurs, an alarm indicating this condition is set by BWOpt. During this time, BWOpt will not evaluate congestion in the network. All currently deployed tactical SR policies are maintained, but will not be modified or deleted. As soon as the model becomes available, the alarm is cleared and BWOpt will resume normal operation.
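
The all-or-nothing deployment rule described in the notes above can be pictured with a minimal sketch. The function name, interface labels, and utilization figures are hypothetical and chosen only for illustration. This is a reading of the rule as stated, not a description of how BWOpt is implemented internally.

```python
def should_deploy(simulated_utilization, threshold_pct):
    """Return True only if every interface is below the threshold.

    simulated_utilization: dict mapping an interface label to the
    utilization (%) predicted for a candidate set of tactical SR policies.
    Treating "exactly at the threshold" as not acceptable is an assumption.
    """
    return all(util < threshold_pct for util in simulated_utilization.values())


# Hypothetical candidate solutions evaluated against a 90% threshold.
candidate_ok = {"P1->P2": 72.0, "P2->P3": 88.5, "P3->P4": 61.0}
candidate_bad = {"P1->P2": 72.0, "P2->P3": 93.0, "P3->P4": 61.0}

deploy_ok = should_deploy(candidate_ok, 90.0)    # every interface under 90%
deploy_bad = should_deploy(candidate_bad, 90.0)  # one interface still over 90%
```

In the second case nothing would be deployed, which corresponds to the situation in which the "Network Congested. BWOpt unable to mitigate." alarm is set.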

Configure Bandwidth Optimization

After Bandwidth Optimization is enabled, it monitors all interfaces in the network for congestion based on the configured utilization threshold. When the utilization threshold is exceeded, it automatically deploys tactical SR policies and moves traffic away from the congested links. When congestion is alleviated, Bandwidth Optimization automatically removes the tactical SR policies.

Do the following to enable and configure Bandwidth Optimization.

Before you begin

Bandwidth Optimization must be installed.


Note

Bandwidth Optimization should only be enabled (Enable option set to True) together with Bandwidth on Demand if the Bandwidth on Demand "Priority Mode" option is set to True. Otherwise, their actions may conflict resulting in unpredictable behavior.


Procedure


Step 1

From the main menu, choose Optimization Engine > Function Packs > Bandwidth Optimization.

Figure 1. Bandwidth Optimization Configuration Window
Step 2

From the Enable tile, toggle the slider to True.

Notice that each time a tile is updated it turns blue.

Step 3

Select one of the following Optimization Objectives:

  • Maximize Available Bandwidth—Leads to preferred paths that result in higher available bandwidth values on interfaces.

  • Minimize the IGP/TE/Delay—Leads to preferred paths that result in lower total IGP/TE or Delay metrics.

Step 4

In the Color tile, enter a color value to be assigned to Bandwidth Optimization SR policies.

Step 5

In the Utilization Threshold tile, enter a percentage that represents the interface utilization threshold for congestion. Traffic utilization on any interface exceeding this threshold will trigger Bandwidth Optimization to attempt to mitigate. To set thresholds for individual links, see Set Bandwidth Threshold for Links.

Step 6

In the Utilization Hold Margin tile, enter a percentage that represents the utilization below the threshold required of all interfaces to consider removing existing tactical SR policies. For example, if the Utilization Threshold is 90% and the Utilization Hold Margin is 5%, then tactical SR policies deployed by Bandwidth Optimization will only be removed from the network if all interface utilization is under 85% (90 - 5) without the tactical policy in the network. This serves as a dampening mechanism to prevent small oscillations in interface utilization from resulting in repeated deployment and deletion of tactical SR policies. The Utilization Hold Margin must be between 0 and the Utilization Threshold.

Step 7

In the Maximum Global Reoptimization Interval tile, enter the maximum time interval (in minutes) to reoptimize the existing tactical SR policies globally. During a global reoptimization, existing tactical policies may be rerouted or removed to produce a globally more optimal solution. Set to 0 to disable.

Step 8

From the Delete Tactical SR Policies when Disabled tile, toggle the slider to True if you want all deployed tactical SR policies deleted when Bandwidth Optimization is disabled.

Step 9

In the Profile ID tile, enter the profile ID that will be assigned to tactical SR policies that are created. Enter 0 if you do not wish to assign a profile ID.

Step 10

In the Max Number of Parallel Tactical Policies tile, enter the number of parallel tactical policies that Bandwidth Optimization can create between the same source and destination to bring utilization under the threshold. This is helpful when faced with large demands that cannot be moved in their entirety. The ability to create parallel tactical policies increases the chance that Bandwidth Optimization can mitigate congestion.

Step 11

Click the Advanced tab for more advanced configuration (see the following table for field descriptions).

Step 12

Click Commit Changes to save the configuration. Bandwidth Optimization begins to monitor network congestion based on the threshold that was configured.

Note 
  • You can easily turn Bandwidth Optimization on or off by toggling the Enable slider to True or False.

  • Click the Show Events icon to view events relating to the instantiation and removal of tactical SR policies created by Bandwidth Optimization.


Table 1. Advanced Bandwidth Optimization Fields
Field Description
Fix Tactical SR Policy Duration

The minimum time (in seconds) between the creation of a new tactical SR policy and when it can be removed or modified. This serves as a dampening factor to control the rate of change to deployed tactical SR policies.

Removal Suspension Interval

The time (in seconds) between any tactical SR policy change and when any tactical SR policy can be removed or modified. This allows SRTM to converge after a tactical SR policy creation, allowing traffic on the policy to be reported accurately.

Deployment Timeout

The maximum time (in seconds) to wait until deployment of tactical SR policies is confirmed.

The value assigned should be larger for larger networks to account for the increased processing time needed by SR-PCE to deploy an SR policy. Tactical SR policies not confirmed before this timeout are declared failed and Bandwidth Optimization will disable itself for troubleshooting.

Congestion Check Suspension Interval

The minimum duration in seconds after any tactical SR policy addition or deletion to suspend congestion detection or mitigation to allow model convergence.

Debug Optimizer

Debug Opt Max Plan Files

The maximum number of optimizer debug files written to disk.

Debug Opt

If True, optimizer debug files will be saved to disk in the /tmp directory of the Bandwidth Optimization container.
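
One way to read the two dampening timers in the table above (Fix Tactical SR Policy Duration and Removal Suspension Interval) is sketched below. The function name, the timestamps, and the timer values are hypothetical, and combining the two timers with a logical AND is an assumption drawn from the field descriptions rather than documented behavior.

```python
def can_modify_or_remove(now_s, created_at_s, last_change_at_s,
                         fix_duration_s, removal_suspension_s):
    """Return True if a tactical SR policy is eligible for change.

    now_s:                current time in seconds (any monotonic clock)
    created_at_s:         when this tactical SR policy was created
    last_change_at_s:     when any tactical SR policy last changed
    fix_duration_s:       Fix Tactical SR Policy Duration (seconds)
    removal_suspension_s: Removal Suspension Interval (seconds)
    """
    # The policy itself must have existed for at least the fix duration.
    policy_age_ok = (now_s - created_at_s) >= fix_duration_s
    # Enough time must have passed since the last change to any policy.
    network_settled = (now_s - last_change_at_s) >= removal_suspension_s
    return policy_age_ok and network_settled


# Hypothetical timer values, not product defaults.
FIX_DURATION_S = 600
REMOVAL_SUSPENSION_S = 180

# Policy created at t=0; another policy changed at t=650.
too_young = can_modify_or_remove(300, 0, 0, FIX_DURATION_S, REMOVAL_SUSPENSION_S)
recent_change = can_modify_or_remove(700, 0, 650, FIX_DURATION_S, REMOVAL_SUSPENSION_S)
eligible = can_modify_or_remove(900, 0, 650, FIX_DURATION_S, REMOVAL_SUSPENSION_S)
```

Here `too_young` fails the first check, `recent_change` fails the second, and `eligible` passes both.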

Set Bandwidth Threshold for Links

Networks have many different links (10G, 40G, 100G) that require different thresholds to be set. The Bandwidth Optimization Link Management feature allows a threshold value to be set per interface instead of just one value for the entire network.

Procedure


Step 1

From the main menu, choose Optimization Engine > Function Packs > Bandwidth Optimization.

Step 2

Click the Import icon. The Import Configuration File dialog box appears.

Step 3

Click the Download sample configuration file link.

Step 4

Open and edit the file with the node, interface, and threshold information that you want to set.

Step 5

Save the file with your changes and go back to the Import Configuration File dialog box.

Step 6

Click Browse and navigate to the CSV file you just edited.

Step 7

Click Import. Bandwidth Optimization checks the CSV node entries for validity. If valid, all the entries appear in the Link Management table.

Step 8

You can do the following from this table:

  • To delete all entries, click Delete All.

  • To export the entries as a CSV file, click the Export icon.
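
The effect of per-link thresholds can be illustrated with a short sketch. The CSV column names (`node`, `interface`, `threshold`) and the interface names are hypothetical; the actual layout is defined by the sample configuration file downloaded in Step 3. Falling back to the global Utilization Threshold for links without an entry is also an assumption.

```python
import csv
import io

# Hypothetical CSV content. Use the downloaded sample file for the
# real column names and value formats.
SAMPLE_CSV = """node,interface,threshold
P3_NCS5501,HundredGigE0/0/1/0,80
P4_NCS5501,TenGigE0/0/0/1,60
"""


def load_link_thresholds(csv_text):
    """Parse CSV text into a {(node, interface): threshold_pct} mapping."""
    overrides = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        overrides[(row["node"], row["interface"])] = float(row["threshold"])
    return overrides


def effective_threshold(node, interface, overrides, global_threshold_pct):
    """Use the per-link value if one exists, else the global threshold."""
    return overrides.get((node, interface), global_threshold_pct)


overrides = load_link_thresholds(SAMPLE_CSV)
per_link = effective_threshold("P3_NCS5501", "HundredGigE0/0/1/0", overrides, 90.0)
fallback = effective_threshold("P1_NCS5501", "TenGigE0/0/0/0", overrides, 90.0)
```

With these inputs, the first lookup uses the per-link value and the second uses the global value.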


Bandwidth Optimization Example

In this example, we have enabled Bandwidth Optimization and configured the following options in BWOpt:
Figure 2. Bandwidth Optimization Configuration
Below is a network with various devices and links that span the United States. Note that there are no SR policies listed in the SR Policies window.
Figure 3. Example: Current Network
Suppose the link between P3_NCS5501 and P4_NCS5501 goes down. Traffic moves to other links, causing congestion that exceeds the configured utilization threshold.
Figure 4. Example: Link Down Between P3 and P4 Nodes
Bandwidth Optimization recognizes the congestion and immediately calculates and deploys a tactical SR policy. This new tactical SR policy is listed in the SR Policies window.
Figure 5. Example: Tactical SR Policy Deployed

Bandwidth Optimization continually monitors the network. When the link between P3_NCS5501 and P4_NCS5501 is back up, Bandwidth Optimization detects that the congestion (based on the defined criteria) has been mitigated. When utilization falls under the set utilization threshold minus the utilization hold margin, the tactical SR policy is automatically removed from the network.
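
The removal condition (utilization threshold minus utilization hold margin) can be sketched as follows. The function names and utilization figures are hypothetical. Treating the allowed margin range as inclusive is an assumption, and the sketch reflects only the rule as described, not BWOpt internals.

```python
def validate_hold_margin(threshold_pct, hold_margin_pct):
    """The hold margin must lie between 0 and the utilization threshold.

    Whether the bounds are inclusive is an assumption made here.
    """
    if not 0 <= hold_margin_pct <= threshold_pct:
        raise ValueError("hold margin must be between 0 and the threshold")


def can_remove_tactical_policies(utilization_without_policies,
                                 threshold_pct, hold_margin_pct):
    """Return True if all interfaces would stay under threshold - margin.

    utilization_without_policies: dict of interface label -> utilization (%)
    predicted with the tactical SR policies taken out of the network.
    """
    validate_hold_margin(threshold_pct, hold_margin_pct)
    removal_level = threshold_pct - hold_margin_pct  # for example, 90 - 5 = 85
    return all(u < removal_level
               for u in utilization_without_policies.values())


# Threshold 90%, hold margin 5%: removal requires every interface under 85%.
calm = can_remove_tactical_policies({"P3->P4": 84.0, "P1->P2": 70.0}, 90.0, 5.0)
borderline = can_remove_tactical_policies({"P3->P4": 87.0, "P1->P2": 70.0}, 90.0, 5.0)
```

In the `borderline` case the link is no longer congested (87% is under 90%), but the policy stays in place because utilization has not dropped under 85%. This gap is what prevents repeated deployment and deletion when utilization hovers near the threshold.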

Troubleshoot BWOpt

BWOpt disables itself and issues an alarm when specific error conditions occur that hinder its ability to manage congestion properly and may lead to instability. The following table defines some of these conditions and possible causes to investigate. Additional details can be obtained for each error condition by referring to the BWOpt logs.

Table 2. Errors

Error Event Message

Possible Causes and Recommended Corrective Action

Optima Engine model error

The network model used by BWOpt from the Optimization Engine is corrupt or is missing key data that is needed to properly support BWOpt. Possible causes include network discovery issues or synchronization problems between the Optimization Engine and Topology Services. Try restarting the Optimization Engine pod to rebuild the model.

This error can also occur if the time required to deploy a tactical policy through SR-PCE, discover it, and add it to the model exceeds the Deployment Timeout option set for BWOpt. The default is 30 seconds, which should suffice for small to medium-sized networks. However, larger networks may require additional time.

PCE Dispatch unreachable

The deployment of a tactical policy to the network is not confirmed successful before the Deployment Timeout is exceeded. Increase the Deployment Timeout option to allow for additional time for deployments in larger networks.

Unable to deploy a tactical SR policy

A tactical SR policy deployment to SR-PCE was unsuccessful. There could be a variety of reasons for this. The BWOpt and/or PCE Dispatch logs can provide details about the failure. Confirm that basic SR policy provisioning to the PCC through one of the SR-PCE providers is working.