Cisco Network Insights for Resources Setup and Settings

Cisco Network Insights for Resources Topology

The following figure describes the leaf switch-spine switch topology for 1-HOP or 2-HOP flow telemetry correlation.

The following figure describes the leaf switch-spine switch topology for 3-HOP or 4-HOP flow telemetry correlation.

Supported scenarios: The Cisco Network Insights for Resources topology supports the following scenarios.

VXLAN

  • VPC on leaf switch

  • Border spine switch

  • Border leaf switch

  • IR or Multicast underlay

  • EBGP or IBGP

  • IPv4 underlay

  • IPv4 or IPv6 overlay

Legacy spine switch or leaf switch

  • VPC on leaf switch

  • IPv4 or IPv6

Cisco Network Insights for Resources Components in Cisco DCNM

Cisco Network Insights for Resources (Cisco NIR) is a real-time monitoring and analytics application.

The Cisco NIR app consists of the following components:

  • Data Collection—The operating system on the fabric nodes streams the telemetry data. Because each data source is different and streams its data in a different format, corresponding collectors translate the telemetry events from the nodes into data records that are stored in the data lake. The data in the data lake is stored in a format that the analytics pipeline can understand and work on.

    The following telemetry information is collected from various nodes in the fabric:

    • Resources Analytics—This includes monitoring software and hardware resources of fabric nodes on Cisco DCNM.

    • Environmental—This includes monitoring environmental statistics such as fan, CPU, memory, and power of the fabric nodes.

    • Statistics Analytics—This includes monitoring of nodes, interfaces, and protocol statistics on Cisco DCNM and fabric nodes.

    • Flow Analytics—This includes detecting anomalies in the flow such as average latency, packet drop indication, and flow move indication across the fabric.

  • Resource Utilization and Environmental Statistics—Resource analytics supports configuration, operational, and hardware resources. Environmental statistics cover CPU, memory, temperature, and fan speed. System analytics also covers anomalies and trending information for each resource, along with graphs of parameters that help network operators debug issues over a period of time.

  • Predictive Analytics and Correlation—The value-add of this platform is predicting failures in the fabric and correlating internal fabric failures with the failures that are visible to, or of interest to, the user.

  • Anomaly Detection—Involves understanding the behavior of each component while using different machine learning algorithms and raising anomalies when the resource behavior deviates from the expected pattern. Anomaly detector applications use different supervised and unsupervised learning algorithms to detect the anomalies in the resources and they log the anomalies in an anomaly database.
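
As a rough illustration of raising an anomaly when a resource deviates from its expected pattern, the following sketch flags samples that fall far from a rolling baseline. It is not the algorithm that Cisco NIR uses, which this document does not specify. The function name, the window size, the threshold, and the sample values are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indexes of samples that deviate strongly from the preceding window.

    Hypothetical helper: a plain z-score test against a rolling baseline,
    standing in for whatever learning algorithms the application really uses.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = pstdev(baseline)
        if sigma == 0:
            # Flat baseline: any change at all counts as a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up CPU utilization samples (percent) with one spike at index 7.
cpu = [20, 21, 19, 20, 22, 21, 20, 95, 21, 20]
print(find_anomalies(cpu))  # prints [7]
```

A production detector would also need to handle seasonality and avoid letting a spike distort the baseline that follows it; this sketch only shows the basic idea of comparing behavior against an expected pattern.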

Guidelines and Limitations

The following are the guidelines and limitations for the Cisco Network Insights for Resources (Cisco NIR) application in the Cisco Data Center Network Manager (Cisco DCNM):

  • After upgrading Cisco DCNM or the Cisco NIR app to a new version, and before starting the Cisco NIR app, make sure the following are set:

    • Navigate to Applications > Preferences from the Cisco DCNM Configuration page and modify Telemetry Network Configuration to the desired value.


      Note

      Out-Of-Band is the default value for the interface, which may not be what you set prior to the upgrade.


    • Click Submit.

  • If you did not modify the telemetry network configuration after upgrading Cisco DCNM or the Cisco NIR app, and you then enabled telemetry on any fabric from Cisco NIR, the application is not enabled and configured properly. To recover:

    • Log in to the Cisco DCNM active node as root using an SSH client. In case you are already logged in to the Cisco DCNM active node, change to root.

    • Execute the following command.
      curl -d '{"AppName": "NIR"}' http://127.0.0.1:9595/telemetry/force_cleanup_app
    • Execute the following command.
      curl -d '{"AppName": "NIR"}' http://127.0.0.1:9595/telemetry/force_cleanup_hw_app
    • After a few minutes, all the fabrics on the Cisco NIR Configuration page show a disabled state.

    • Once the status is disabled for all the fabrics, modify Telemetry Network Configuration in Cisco DCNM to the desired value.

    • Enable telemetry on the fabrics from the Cisco NIR Configuration page.

  • For Flow Telemetry, the Cisco NIR app captures the maximum anomaly score for a particular flow, for the entire cycle of the user-specified time range.

  • If Strict Config Compliance (SCC) is enabled in a fabric, you cannot deploy Cisco NIR on Cisco DCNM.

  • To enable telemetry on a monitored fabric through the Cisco NIR app, you must first delete all existing telemetry configurations on all the nodes in the monitored fabric before you enable the fabric from the Cisco NIR app. Telemetry then assigns receiver IPs to these nodes, which the Health page displays. Because the fabric is monitored, no telemetry configuration is pushed to the nodes. Therefore, you must check the receiver IPs on the Health page and configure the nodes manually.

  • After enabling telemetry in the Cisco NIR app, use one of the following methods to upgrade or downgrade a switch:

    • Remove the switch that needs the upgrade or downgrade from the fabric. Then upgrade or downgrade the switch image and add the switch back to the fabric.

    • Or, disable telemetry on the fabrics where switches need an upgrade or downgrade. Then upgrade or downgrade the switch image and enable telemetry on the fabrics again.

  • IPv6 is not supported for receiving telemetry data for Cisco NIR app.

  • The Cisco NIR application requires that the physical servers hosting Cisco DCNM computes as VMs are at least Cisco C220-M4 category. A compute must also be hosted on a data store with a dedicated hard disk of at least 500 GB. See Hardware Requirements.

  • If one or more fabrics do not recover from the disabling state, you must stop and restart the Cisco NIR application in Cisco DCNM. This recovers the fabrics from the failed disable state.
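
The two cleanup commands in the recovery steps earlier in this list can also be scripted. The following is a minimal sketch that builds the same two POST requests as the curl commands, using only the URLs and request body shown in this document. The function names are hypothetical, and the response format and status codes returned by the service are not documented here, so the sketch only prints whatever status comes back.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9595/telemetry"
# Same order as the two curl commands in the guideline.
CLEANUP_ENDPOINTS = ("force_cleanup_app", "force_cleanup_hw_app")


def build_cleanup_requests(app_name="NIR"):
    """Build the two cleanup requests without sending them (hypothetical helper)."""
    body = json.dumps({"AppName": app_name}).encode("utf-8")
    # Supplying data makes urllib issue a POST, as `curl -d` does.
    return [
        urllib.request.Request(f"{BASE_URL}/{endpoint}", data=body)
        for endpoint in CLEANUP_ENDPOINTS
    ]


def run_cleanup(timeout=30):
    """Send both requests. Run this only as root on the Cisco DCNM active node."""
    for request in build_cleanup_requests():
        with urllib.request.urlopen(request, timeout=timeout) as response:
            print(request.full_url, response.status)


# Inspect the requests first; call run_cleanup() only on the active node.
for request in build_cleanup_requests():
    print(request.get_method(), request.full_url, request.data.decode())
```

After the requests complete, continue with the remaining steps: wait for all fabrics to show a disabled state, correct the Telemetry Network Configuration, and then enable telemetry again.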

Cisco NIR App Initial Setup

The first time you launch the Cisco NIR app, you are greeted with a Welcome to Network Insights dialog. Follow these steps to complete the initial setup of Cisco NIR app:

Before you begin

Before you begin the initial setup of the Cisco NIR application in Cisco DCNM, make sure the following prerequisites are met:

  • The primary and standby hosts (HA) return a status of OK:

    1. In Cisco DCNM, click Administration.

    2. Under Cisco DCNM Server, click Native HA.

    3. Check the HA Status attribute as shown in the following image:


    Note

    It may take some time for both hosts to be recognized. Once the OK status is displayed, AMQP notifications can begin. Check the AMQP server status below.
  • The AMQP Server returns a status of OK:

    1. In Cisco DCNM, click Dashboard.

    2. In the Server Status tile, click Health Check.

    3. Check the status of the AMQP Server component as shown in the following image:
  • Precision Time Protocol (PTP) must be configured on all nodes you want to support with Cisco NIR. In both managed and monitored fabric modes, you must ensure that PTP is correctly configured on all nodes in the fabric. To ensure that PTP is set up correctly:

    • For details about PTP in an Easy Fabric, refer to Precision Time Protocol for Easy Fabric.

      Ensure PTP is enabled in the Cisco DCNM easy fabric setup. On the Advanced tab of the Cisco DCNM fabric setup, check the box for Enable Precision Time Protocol (PTP). For details, refer to Add/Edit Fabric.

Procedure


Step 1

On the welcome dialog, click Begin First Time Setup.

The Network Insights Setup window appears.

Step 2

On the Network Insights Setup window, click Configure to configure the Data Collection Setup.

The following steps enable the fabric to be monitored by the Cisco NIR application.

Step 3

In the list of available fabrics, choose a fabric you want to monitor with Cisco NIR.

Step 4

In the VXLAN / Classic column, choose the fabric type:

  • VXLAN: Identifies the fabric as a VXLAN fabric type.

    Note 
    If your network is a VXLAN fabric and you want to see VXLAN-specific information in the Cisco NIR application, you must select this option.
  • Classic: Identifies the fabric as a Classic LAN fabric.

Step 5

In the Mode column, choose the mode you want to use for the fabric selected:

  • Managed: Cisco DCNM monitors and manages the configuration of the nodes in the selected fabric. This option allows Cisco NIR app to push the telemetry configuration to the nodes in the chosen fabric.

  • Monitored: Cisco DCNM does not deploy configuration to the nodes. Cisco DCNM discovers the nodes and displays them in the topology (read-only). Cisco NIR app will not send telemetry configuration to the nodes.

    Note 
    If this option is chosen, telemetry must be configured directly on the nodes in order for Cisco NIR app to receive data. The following configuration must be added on the NX-OS switches to stream telemetry data to Cisco NIR app when the fabric is configured to be in Monitored mode:

Example:


configure terminal

feature nxapi
feature ntp
feature lldp
feature icam
feature telemetry

telemetry
  destination-profile
    use-vrf management
    
  destination-group 500
    ip address <receiver IP address> port 57500 protocol gRPC encoding GPB
  sensor-group 504
    data-source NX-API
    path "show vrf all" depth unbounded
    path "show nve vrf" depth unbounded
    path "show routing ip summary cached vrf all" depth unbounded
    path "show routing ipv6 summary cached vrf all" depth unbounded
    path "show ip mroute summary vrf all" depth unbounded
    path "show ipv6 mroute summary vrf all" depth unbounded
    path "show mac address-table count" depth unbounded
    path "show nve vni" depth unbounded
    path "show nve peers detail" depth unbounded
    path "show vlan summary" depth unbounded
    path "show vpc" depth unbounded
    path "show system internal icam app system internal access-list resource utilization" depth unbounded query-condition show-output-format=json
    path "show system internal icam app hardware internal forwarding table utilization" depth unbounded query-condition show-output-format=json
  sensor-group 507
    data-source DME
    path sys/cdp depth 1 query-condition query-target=subtree&target-subtree-class=cdpIf,cdpAdjEp,cdpIfStats
    path sys/bgp depth 1 query-condition query-target=subtree&target-subtree-class=bgpEntity,bgpInst,bgpDom,bgpDomAf,bgpPeer,bgpPeerEntry,bgpPeerEntryStats,bgpPeerEvents,bgpPeerAfEntry
    path sys/lldp depth 1 query-condition query-target=subtree&target-subtree-class=lldpIf,lldpAdjEp,lldpIfStats
  sensor-group 508
    data-source DME
    path sys/intf depth 1 query-condition query-target=subtree&target-subtree-class=pcAggrIf&query-target-filter=deleted()
  sensor-group 503
    data-source DME
    path sys/intf depth 0 query-condition query-target=subtree&target-subtree-class=eqptFcotLane,eqptFcotSensor
  sensor-group 501
    data-source NX-API
    path "show port-channel summary" depth unbounded
    path "show lacp counters detail" depth unbounded
    path "show lacp interface" depth unbounded
    path "show lldp traffic interface all" depth unbounded
  sensor-group 502
    data-source DME
    path sys/intf depth 0 query-condition query-target=subtree&target-subtree-class=ethpmPhysIf,rmonEtherStats,rmonIfIn,rmonIfOut,ethpmAggrIf,l1PhysIf,pcAggrIf
  sensor-group 505
    data-source NX-API
    path "show environment fan detail" depth unbounded
    path "show environment power" depth unbounded
    path "show system internal flash" depth unbounded
    path "show clock" depth unbounded
    path "show feature" depth unbounded
  sensor-group 506
    data-source NX-API
    path "show system routing mode" depth unbounded
  sensor-group 500
    data-source NX-API
    path "show module" depth unbounded
    path "show processes log" depth unbounded
    path "show icam scale" depth unbounded
    path "show environment temperature" depth unbounded
    path "show processes cpu" depth unbounded
    path "show processes memory physical" depth unbounded
    path "show system resources" depth unbounded
  subscription 500
    dst-grp 500
    snsr-grp 504 sample-interval 61000
    snsr-grp 507 sample-interval 65000
    snsr-grp 508 sample-interval 0
    snsr-grp 503 sample-interval 62000
    snsr-grp 501 sample-interval 60000
    snsr-grp 502 sample-interval 60000
    snsr-grp 505 sample-interval 300000
    snsr-grp 506 sample-interval 3600000
    snsr-grp 500 sample-interval 59000
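
Because a manually entered configuration of this length is easy to get wrong, it can help to check that every sensor group referenced by the subscription is actually defined. The following sketch is one way to do that offline against a saved copy of the configuration text. The function name and the shortened sample fragment are hypothetical. The sample deliberately references a group that it never defines, and the sketch assumes the standard NX-OS telemetry behavior in which sample-interval is given in milliseconds and a value of 0 makes the subscription event-driven.

```python
import re


def check_telemetry_config(config_text):
    """Compare defined sensor groups with those that subscriptions reference.

    Returns (missing, unused, intervals): IDs referenced but never defined,
    IDs defined but never referenced, and a map of ID -> sample interval.
    """
    defined = set()
    intervals = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        match = re.fullmatch(r"sensor-group (\d+)", line)
        if match:
            defined.add(int(match.group(1)))
            continue
        match = re.fullmatch(r"snsr-grp (\d+) sample-interval (\d+)", line)
        if match:
            intervals[int(match.group(1))] = int(match.group(2))
    referenced = set(intervals)
    return sorted(referenced - defined), sorted(defined - referenced), intervals


# Shortened, hypothetical fragment; group 504 is referenced but not defined.
sample = """
telemetry
  sensor-group 501
    data-source NX-API
    path "show port-channel summary" depth unbounded
  sensor-group 508
    data-source DME
  subscription 500
    dst-grp 500
    snsr-grp 501 sample-interval 60000
    snsr-grp 504 sample-interval 61000
    snsr-grp 508 sample-interval 0
"""

missing, unused, intervals = check_telemetry_config(sample)
print("referenced but not defined:", missing)
print("defined but not referenced:", unused)
for group, ms in sorted(intervals.items()):
    # Assumption: 0 means event-driven, anything else is a period in ms.
    print(group, "event-driven" if ms == 0 else f"every {ms / 1000:g} s")
```

The check merges all subscriptions into one set, which is enough for a single-subscription configuration like the one in this step.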

Step 6

Click Save.

Step 7

Click Done.

The second time you launch the Cisco NIR application, click Review First Time Setup to review the setup. To prevent the welcome dialog from appearing again on launch, check Do not show on launch.

Click Get Started to launch the application.


Cisco NIR App Settings

Once Cisco NIR app is installed, the following need to be checked off for the application to be fully set up:

  • NTP and Time Zone Configuration

If there are faults present in the application, they appear on the Faults tab. In the Settings menu, click Collection Status. Green circles in the table indicate the nodes from which information is being transmitted.

Property       Description

Time Range     Specify a time range. The tables below display the data collected during the specified interval.

Fabric         Choose a fabric containing the nodes from which to collect telemetry data.

Clicking on this icon allows you to alter the following:

  • Flow Collection Configuration—Enable or disable flow collection and choose a previously configured fabric. Create a VRF flow collection rule configuration per fabric:

    • Choose the Fabric from the drop-down.

    • Click the Plus icon and enter the VRF name.

    • Select the switch to create a flow collection rule.

    • Click Save.

  • System Status—Displays software, hardware, operational, and capacity usage of the Cisco NIR application on the compute cluster.

  • Collection Status—Displays data collection of System Metrics, and Events information per node.

  • NetworkInsights Setup—Lets the user configure the Cisco NIR application setup and enable or disable Flow Analytics.

  • About Network Insights—Displays the application version number.

The Flow Collection Configuration example.

Cisco NIR Service Instance Status

To view Cisco NIR app service instance status, exit the Cisco NIR app and click the gear image in the lower left corner of the Cisco NIR app icon in the Cisco DCNM application work pane.

Navigating Cisco NIR

The Cisco NIR app window is divided into two parts: the Navigation pane and the Work pane.

Navigation Pane

The Cisco NIR app navigation pane divides the collected data into three categories:

1 Dashboard: The main dashboard for the Cisco NIR app providing immediate access to anomalies.

2 System: Resource and environmental utilization.

3 Operations: Statistics information for interfaces and protocols.

Expanding System and/or Operations reveals additional functions:

1 Dashboard View icon: Provides immediate access to top usage or issues for the selected telemetry type.

2 Browse View icon: Provides a detailed view of returned data for the selected telemetry type and allows for filtering to further isolate problem areas.

Work Pane

The work pane is the main viewing location in the Cisco NIR app. All information tiles, graphs, charts, and lists appear in the work pane.

Dashboard Work Pane

In an information tile, you can usually click on a numeric value to switch to the Browse work pane:

1 Launches the Browse work pane with all of the items displayed from the graph in the information tile.

2 Launches the Browse work pane with only the selected items displayed from the number in the information tile.

Browse Work Pane

The Browse work pane isolates the data for the parameter chosen on the Dashboard. The Browse work pane displays a list of top nodes and graphs over time, and lists all the nodes in an order defined by the anomaly score.

Clicking on one of the nodes in the list opens the Details work pane for that selection.
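
The ordering that the Browse work pane applies can be pictured with a short sketch. The node names and scores below are made up, the function name is hypothetical, and the sketch assumes that a higher anomaly score ranks first, which this document does not state explicitly.

```python
def rank_nodes(scores, top_n=3):
    """Return (top, ordered): the top_n nodes and the full list, highest score first."""
    # sorted() is stable, so nodes with equal scores keep their input order.
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ordered[:top_n], ordered


# Made-up nodes and anomaly scores for illustration only.
scores = {"leaf-101": 12, "leaf-102": 87, "spine-201": 45, "leaf-103": 0}
top, ordered = rank_nodes(scores, top_n=2)
print("top nodes:", top)
print("all nodes:", ordered)
```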

Details Work Pane

The Details work pane provides resource details about the item selected in the event list on the Browse work pane. The Details work pane consists of:

  • General Information: Includes the anomaly score and the node name.

  • Resource Trends: Includes operational resources, configuration resources, and hardware resources.

  • Anomalies: Includes all anomalies for the node resource.