Site Roles

Note
  • Cachepod/ETCD and CDL replication occurs in all the roles described in the following section.

  • If the GR links are down or the periodic heartbeat fails, GR triggers are suspended.

  • PRIMARY: Site is ready and actively taking traffic for the given instance.

  • STANDBY: Site is on standby. It is ready to take traffic but is not taking traffic for the given instance.

  • STANDBY_ERROR: Site has a problem. It is not active and not ready to take traffic for the given instance.

  • FAILOVER_INIT: Site has started to fail over and is not in a condition to take traffic. A buffer time of 2 seconds allows applications to complete their activity.

  • FAILOVER_COMPLETE: Site has completed the failover and attempted to inform the peer site about the failover for the given instance. Buffer time is 2 seconds.

  • FAILBACK_STARTED: A manual failover with a delay has been triggered from the remote site for the given instance.
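
The role list above can be sketched as a small lookup table. This is an illustrative model only, not the product's implementation: the names SiteRole, READY_FOR_TRAFFIC, BUFFER_SECONDS, and is_taking_traffic are hypothetical. Where a role description does not say whether the site is ready for traffic, the table records None rather than guessing.

```python
from enum import Enum
from typing import Dict, Optional


class SiteRole(Enum):
    """Role a site holds for one GR instance (names from the role list)."""
    PRIMARY = "PRIMARY"
    STANDBY = "STANDBY"
    STANDBY_ERROR = "STANDBY_ERROR"
    FAILOVER_INIT = "FAILOVER_INIT"
    FAILOVER_COMPLETE = "FAILOVER_COMPLETE"
    FAILBACK_STARTED = "FAILBACK_STARTED"


# Readiness as stated in the role descriptions.
# None means the description does not state readiness.
READY_FOR_TRAFFIC: Dict[SiteRole, Optional[bool]] = {
    SiteRole.PRIMARY: True,
    SiteRole.STANDBY: True,
    SiteRole.STANDBY_ERROR: False,
    SiteRole.FAILOVER_INIT: False,
    SiteRole.FAILOVER_COMPLETE: None,
    SiteRole.FAILBACK_STARTED: None,
}

# Buffer times, in seconds, for the two roles that document one.
BUFFER_SECONDS: Dict[SiteRole, int] = {
    SiteRole.FAILOVER_INIT: 2,
    SiteRole.FAILOVER_COMPLETE: 2,
}


def is_taking_traffic(role: SiteRole) -> bool:
    """Assumption: only PRIMARY is described as actively taking traffic."""
    return role is SiteRole.PRIMARY


if __name__ == "__main__":
    for role in SiteRole:
        print(role.value, is_taking_traffic(role), READY_FOR_TRAFFIC[role])
```

Keeping "not stated" distinct from "not ready" avoids reading more into the FAILOVER_COMPLETE and FAILBACK_STARTED descriptions than they say.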

For a fresh installation, the site boots up with:

  • Role PRIMARY for the local instance (each site has a local instance-id configured to identify its local instance). It is recommended not to configure the pods for monitoring during a fresh installation. Once the setup is ready, you can configure the pods for monitoring.

  • Role STANDBY for the other instances.

For upgrades, the site boots up with:

  • STANDBY_ERROR role for all the instances, because moving the traffic after an upgrade requires manual intervention.

  • ETCD stores instance roles.
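
The boot-time role assignment described above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the product's code: the function name boot_roles and its parameters are hypothetical, and the returned dictionary merely stands in for the role records that ETCD stores.

```python
from typing import Dict, Iterable


def boot_roles(local_instance_id: int,
               instance_ids: Iterable[int],
               is_upgrade: bool) -> Dict[int, str]:
    """Return the role each instance gets when the site boots up.

    Hypothetical helper: a plain dict stands in for the ETCD-backed store.
    """
    if is_upgrade:
        # After an upgrade, every instance starts in STANDBY_ERROR because
        # moving traffic requires manual intervention.
        return {i: "STANDBY_ERROR" for i in instance_ids}
    # Fresh installation: PRIMARY for the local instance, STANDBY for the rest.
    return {
        i: "PRIMARY" if i == local_instance_id else "STANDBY"
        for i in instance_ids
    }


if __name__ == "__main__":
    print(boot_roles(1, [1, 2], is_upgrade=False))
    print(boot_roles(1, [1, 2], is_upgrade=True))
```

For a two-instance deployment where instance 1 is local, a fresh installation yields PRIMARY for instance 1 and STANDBY for instance 2, while an upgrade yields STANDBY_ERROR for both.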

Note

Rolling upgrade or in-service upgrade isn’t supported.