About Telemetry
Collecting data for analyzing and troubleshooting has always been an important aspect in monitoring the health of a network.
Cisco NX-OS provides several mechanisms such as SNMP, CLI, and Syslog to collect data from a network. These mechanisms have limitations that restrict automation and scale. One limitation is the use of the pull model, where the initial request for data from network elements originates from the client. The pull model does not scale when there is more than one network management station (NMS) in the network. With this model, the server sends data only when clients request it. To initiate such requests, continual manual intervention is required. This continual manual intervention makes the pull model inefficient.
A push model continuously streams data out of the network and notifies the client. Telemetry enables the push model, which provides near-real-time access to monitoring data.
Telemetry Components and Process
Telemetry consists of four key elements:
-
Data Collection — Telemetry data is collected from the Data Management Engine (DME) database in branches of the object model specified using distinguished name (DN) paths. The data can be retrieved periodically (frequency-based) or only when a change occurs in any object on a specified path (event-based). You can use the NX-API to collect frequency-based data.
-
Data Encoding — The telemetry encoder encapsulates the collected data into the desired format for transporting.
NX-OS encodes telemetry data in the Google Protocol Buffers (GPB) and JSON format.
-
Data Transport — NX-OS transports telemetry data using HTTP for JSON encoding and the Google remote procedure call (gRPC) protocol for GPB encoding. The gRPC receiver supports message sizes greater than 4 MB. (Telemetry data using HTTPS is also supported if a certificate is configured.)
Starting with Cisco NX-OS Release 9.2(1), telemetry now supports streaming to IPv6 destinations and IPv4 destinations.
Use the following command to configure the UDP transport to stream data using a datagram socket either in JSON or GPB: destination-group num ip address xxx.xxx.xxx.xxx port xxxx protocol UDP encoding {JSON | GPB }
Example for an IPv6 destination: destination-group 100 ipv6 address 10:10::1 port 8000 protocol gRPC encoding GPB
The UDP telemetry is with the following header: typedef enum tm_encode_ { TM_ENCODE_DUMMY, TM_ENCODE_GPB, TM_ENCODE_JSON, TM_ENCODE_XML, TM_ENCODE_MAX, } tm_encode_type_t; typedef struct tm_pak_hdr_ { uint8_t version; /* 1 */ uint8_t encoding; uint16_t msg_size; uint8_t secure; uint8_t padding; }__attribute__ ((packed, aligned (1))) tm_pak_hdr_t;
Use the first 6 bytes in the payload to process telemetry data using UDP, using one of the following methods:
-
Read the information in the header to determine which decoder to use to decode the data, JSON or GPB, if the receiver is meant to receive different types of data from multiple endpoints.
-
Remove the header if you are expecting one decoder (JSON or GPB) but not the other.
-
-
Telemetry Receiver — A telemetry receiver is a remote management system or application that stores the telemetry data.
The GPB encoder
stores data in a generic key-value format. The encoder requires metadata in the
form of a compiled
.proto
file to translate the data into GPB format.
In order to receive and decode the data stream correctly, the receiver requires the .proto
file that describes the encoding and the transport services. The encoding decodes the binary stream into a key value string
pair.
A telemetry
.proto
file that describes the GPB encoding and gRPC
transport is available on Cisco's GitLab: https://github.com/CiscoDevNet/nx-telemetry-proto
High Availability of the Telemetry Process
High availability of the telemetry process is supported with the following behaviors:
-
System Reload — During a system reload, any telemetry configuration and streaming services are restored.
-
Supervisor Failover — Although telemetry is not on hot standby, telemetry configuration and streaming services are restored when the new active supervisor is running.
-
Process Restart — If the telemetry process freezes or restarts for any reason, configuration and streaming services are restored when telemetry is restarted.