Key Performance Indicators (KPIs)
The following section describes KPIs.
ETCD/Cachepod Replication KPIs
The following table lists ETCD/Cachepod Replication KPIs.
KPI Name | Description | Labels | Possible Values |
---|---|---|---|
geo_replication_total | This KPI displays the total number of replication requests and responses for the various sync types and replication types. | ReplicationRequestType | Request / Response |
| | | ReplicationSyncType | Immediate / Deferred / Pull |
| | | ReplicationNode | ETCD / CACHE_POD / PEER |
| | | ReplicationReceiver | Local / Remote |
| | | status | True / False |
| | | status_code | Error code/description |
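As an illustration of how the labels in this table can be used, the following minimal sketch totals geo_replication_total samples by one label. It assumes the KPI is scraped in the Prometheus text exposition format, which this section does not state. The sample lines and their values are made up for the example, and the helper names parse_samples and totals_by are hypothetical.

```python
import re
from collections import Counter

# Hypothetical scrape output; the counts are invented for illustration only.
SAMPLE = """\
geo_replication_total{ReplicationRequestType="Request",ReplicationSyncType="Immediate",ReplicationNode="ETCD",ReplicationReceiver="Remote",status="True"} 120
geo_replication_total{ReplicationRequestType="Request",ReplicationSyncType="Deferred",ReplicationNode="CACHE_POD",ReplicationReceiver="Remote",status="False"} 3
geo_replication_total{ReplicationRequestType="Response",ReplicationSyncType="Pull",ReplicationNode="PEER",ReplicationReceiver="Local",status="True"} 45
"""

# One sample line: metric name, label set in braces, then the value.
LINE_RE = re.compile(r'^(?P<name>\w+)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text, metric):
    """Yield (labels, value) for every sample line of the given metric."""
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("name") == metric:
            labels = dict(LABEL_RE.findall(match.group("labels")))
            yield labels, float(match.group("value"))


def totals_by(text, metric, label):
    """Sum the sample values of a metric, grouped by one label."""
    totals = Counter()
    for labels, value in parse_samples(text, metric):
        totals[labels.get(label, "")] += value
    return dict(totals)


print(totals_by(SAMPLE, "geo_replication_total", "status"))
print(totals_by(SAMPLE, "geo_replication_total", "ReplicationNode"))
```

Grouping by status separates successful replication messages from failed ones. Grouping by ReplicationNode or ReplicationSyncType works the same way.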
Geo Rejected Role Change KPIs
The following table lists Geo Rejected Role Change KPIs.
KPI Name | Description | Labels | Possible Values |
---|---|---|---|
geo_RejectedRoleChanged_total | This KPI displays the total number of rejected requests/calls received for a STANDBY instance. After the count, the same instance is moved to PRIMARY. | RejectedCount | Number value indicating the rejected calls/requests received for the standby instance. |
| | | GRInstanceNumber | 1 / 2 |
Monitoring KPIs
The following table lists monitoring KPIs.
KPI Name | Description | Labels | Possible Values |
---|---|---|---|
geo_monitoring_total | This KPI displays the total number of success and failure messages of different kinds, such as heartbeat, remoteNotify, TriggerGR, and so on. | ControlActionType | AdminMonitoringActionType / AdminRemoteMessageActionType / AdminRoleChangeActionType |
| | | ControlActionNameType | MonitorPod / MonitorBfd / RemoteMsgHeartbeat / RemoteMsgNotifyFailover / RemoteMsgNotifyPrepareFailover / RemoteMsgGetSiteStatus / RemoteClusterPodFailure / RemoteSiteRoleMonitoring / TriggerGRApi / ResetRoleApi |
| | | AdminNode | Any string value. For example, GR Instance ID or instance key / pod name |
| | | Status Code | 0 / 1001 / 1002 / 1003 / 1004 / 1005 / 1006 / 1007 / 1008 / received error code (1206, 1219, 2404, …) |
| | | Status Message | Success (0) / STANDBY_ERROR => STANDBY/STANDBY => PRIMARY (0) / Pod Failure (0) / CLI (0) / BFD Failure (0) / Decode Failure (1001) / remote status unavailable (1002) / target role does not support (1002) / Pod Failure (1002) / CLI (1002) / BFD Failure (1002) / site is down (1003) / Pod Failure (1003) / CLI (1003) / BFD Failure (1003) / Traffic Hit (1004) / Pod Failure (1004) / CLI (1004) / BFD Failure (1004) / current role is not STANDBY_ERROR/STANDBY to reset role (1005) / resetRole: Key not found in etcd (1006) / monitoring threshold per pod is breached (1007) / Retry on heartbeat failure (1008) / received error message (No remote host available for this request / Selected remote host <remotehostname> has no client connection / Sla is expired for transaction / …) |
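The status codes in this table pair with status messages, so a dashboard or script can translate a code into readable text. The following is a minimal sketch of such a lookup, not part of the product. The table lists several messages for some codes (for example, Pod Failure, CLI, and BFD Failure also appear with 0 and 1002 through 1004), and the map below keeps only one representative message per code as a simplifying assumption. The names LOCAL_STATUS and describe_status are hypothetical.

```python
# One representative message per locally defined status code, taken from the
# table above. Codes such as 0 and 1002-1004 carry additional messages in the
# table (Pod Failure, CLI, BFD Failure) that this simplified map leaves out.
LOCAL_STATUS = {
    0: "Success",
    1001: "Decode Failure",
    1002: "remote status unavailable",
    1003: "site is down",
    1004: "Traffic Hit",
    1005: "current role is not STANDBY_ERROR/STANDBY to reset role",
    1006: "resetRole: Key not found in etcd",
    1007: "monitoring threshold per pod is breached",
    1008: "Retry on heartbeat failure",
}


def describe_status(code):
    """Return a readable description for a geo_monitoring_total status code."""
    if code in LOCAL_STATUS:
        return LOCAL_STATUS[code]
    # Any other value is an error code received from elsewhere; the table
    # gives 1206, 1219, and 2404 as examples without describing them.
    return f"received error code {code}"


def is_success(code):
    """Only code 0 is listed with success-type messages in the table."""
    return code == 0


print(describe_status(1006))
print(describe_status(1206))
```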
BFD KPIs
The following table lists BFD KPIs.
KPI Name | Description | Labels | Possible Values |
---|---|---|---|
bgp_speaker_bfd_status | This KPI displays BFD link status on BGP Speaker. | status | STATE_UP / STATE_DOWN |
geo_bfd_status | This KPI displays BFD link status on Geo POD. | status | STATE_UP / STATE_DOWN |

KPI Name | Description | Gauge |
---|---|---|
bgp_speaker_bfd_status | This KPI displays BFD link status on BGP Speaker. | 1 (UP) or 0 (DOWN) |
geo_bfd_status | This KPI displays BFD link status on Geo POD. | 1 (UP) or 0 (DOWN) |
Cross-rack-routing BFD Interface Monitoring
KPI Name | Description | Labels | Possible Values |
---|---|---|---|
geo_monitoring_total | This KPI displays the total number of Gateway Down or LocalBFDInterface down messages when the peer rack is down, with the details of the gateway IP or interface name. | ControlActionType | AdminMonitoringActionType |
| | | ControlActionNameType | MonitorGateway / MonitorLocalBfdInterface |
| | | AdminNode | gateway_ip / interface_name |
| | | status | gateway ip is down from all proto node / local bfd interface is down from all proto node |
| | | status_code | 1012 / 1013 |
bgp_bfd_Monitor_Interface_status (Type - Gauge) | This KPI indicates the status of each peer connection, that is, the connection between the configured BFD interface and its peers on the remote rack. | interface | <Local Rack Interface Name> |
| | | peer_address | <Remote Rack neighbor IP address> |
| | | type | Bfd-Peer |
bgp_bfd_Monitor_Remote_Rack_status (Type - Gauge) | This KPI indicates the status of the remote rack. The current rack interface and the remote rack peers are configured as part of BFD peering. The rack status is up if any connection from either proto node is up. If the connection is down at both proto nodes, this KPI indicates that the remote rack status is down. | status | BFD_Remote_Rack_STATUS |
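The remote rack rule described for bgp_bfd_Monitor_Remote_Rack_status (up if any connection from either proto node is up, down only when all are down) can be sketched as follows. This is an illustration of the stated rule, not the product's implementation. It assumes each connection reports 1 for up and 0 for down, as the BFD gauge KPIs do. The function name and the node, interface, and address values are hypothetical.

```python
def remote_rack_status(connections):
    """Return "UP" if any monitored BFD connection is up, otherwise "DOWN".

    `connections` maps (proto_node, interface, peer_address) to a gauge
    value, assumed here to be 1 for up and 0 for down.
    """
    return "UP" if any(value == 1 for value in connections.values()) else "DOWN"


# Hypothetical proto node, interface, and peer names, for illustration only.
one_side_up = {
    ("proto-1", "eth1", "192.0.2.10"): 0,
    ("proto-2", "eth1", "192.0.2.10"): 1,
}
all_down = {key: 0 for key in one_side_up}

print(remote_rack_status(one_side_up))  # UP
print(remote_rack_status(all_down))     # DOWN
```

With no connections configured, this sketch reports DOWN, because there is no connection that is up. How the product treats that case is not described here.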
Local Interface Monitoring
KPI Name | Description | Labels | Possible Values |
---|---|---|---|
geo_monitoring_total | This KPI displays the total number of local interface down cases with the details of interface name. | ControlActionType | AdminMonitoringActionType |
| | | ControlActionNameType | MonitorInterface |
| | | AdminNode | interface_name |
| | | status | Local interface is down from all proto node |
| | | status_code | 1014 |
GR Instance Information
KPI Name | Description | Labels | Possible Values |
---|---|---|---|
gr_instance_information (Type - Gauge) | This KPI displays the current role of the GR instance in the application. | gr_instance_id | Configured GR instance value (numerical value) |
Geo Maintenance Mode
KPI Name | Description | Labels | Possible Values |
---|---|---|---|
geo_MaintenanceMode_info (Type - Gauge) | This KPI displays the current state of maintenance mode for the rack. | MaintenanceMode | 0: false / 1: true |