Overview
Disaster recovery adds another layer of redundancy to safeguard against network downtime. It responds to a cluster failure by handing off network management duties to a connected cluster (referred to as a site going forward). Cisco DNA Center's disaster recovery implementation consists of three components: the main site, the recovery site, and the witness site. At any given time, the main and recovery sites are operating in either the active or standby role. The active site manages your network while the standby site maintains a continuously updated copy of the active site's data and managed services. Whenever an active site goes down, Cisco DNA Center automatically initiates a failover, completing the tasks necessary to designate the former standby site as the new active site.
Refer to the topics in this chapter for a description of how to set up and use disaster recovery in your production environment.
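The active/standby handoff described in this overview can be pictured with a small model. The following is a minimal sketch for illustration, not Cisco DNA Center code: the `Site` class, the `Role` values, and the `fail_over` function are hypothetical names, and the real failover involves many more tasks than a role swap.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    ACTIVE = "active"
    STANDBY = "standby"
    ISOLATED = "isolated"  # hypothetical label for a failed site separated from the system


@dataclass
class Site:
    name: str
    role: Role
    healthy: bool = True


def fail_over(active: Site, standby: Site) -> Site:
    """Promote the standby site when the active site is down.

    Returns whichever site is active after the call.
    """
    if active.healthy:
        return active  # nothing to do: the active site is still up
    if not standby.healthy:
        raise RuntimeError("no healthy standby site to promote")
    active.role = Role.ISOLATED   # former active site leaves the system
    standby.role = Role.ACTIVE    # former standby site takes over
    return standby


main = Site("main", Role.ACTIVE)
recovery = Site("recovery", Role.STANDBY)

main.healthy = False  # simulate the active site going down
new_active = fail_over(main, recovery)
print(new_active.name, new_active.role.value)
```

In this model the main and recovery labels never change; only the role each site currently holds does, which mirrors the distinction the chapter draws between site names and active/standby roles.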
Key Terms
The following terms are key for understanding Cisco DNA Center's disaster recovery implementation:
-
Main Site: The first site you configure when setting up your disaster recovery system. By default, it operates as the active site that manages your network. For information on how to configure the sites in your system, see Configure Disaster Recovery.
-
Recovery Site: The second site you configure when setting up your disaster recovery system. By default, it acts as your system's standby site.
-
Witness Site: The third site you configure when setting up your disaster recovery system. This site, which resides on a virtual machine or separate server, is not involved in the replication of data or managed services. Its role is to give the current active site the quorum it needs to carry out disaster recovery tasks. If a site fails, the witness prevents a split-brain scenario. This scenario can occur in a two-member system when the sites cannot communicate with each other: each site believes that it should become active, which results in two active sites. Cisco DNA Center uses the witness site to arbitrate between the active and standby sites, allowing only one active site at any given time. For a description of witness site requirements, see Prerequisites.
-
Register: To add a site to a disaster recovery system, you must first register it with the system by providing information such as your main site's VIP. When registering your recovery or witness site, you will also need to provide the token that is generated when you register your main site. For more information, see Configure Disaster Recovery.
-
Configure Active: The process of establishing a site as the active site, which involves tasks such as exposing the appropriate managed service ports.
-
Active site: The site that is currently managing your network. Cisco DNA Center continuously replicates its data to your standby site.
-
Configure Standby: The process of establishing a site as the standby site, which involves tasks such as configuring the replication of the active site's data and disabling the services which manage the network on the standby site.
-
Standby Ready: When an isolated site meets the prerequisites to become a standby site, Cisco DNA Center moves it to this state. To establish this site as your system's standby site, click Rejoin in the Action area.
-
Standby site: The site that maintains an up-to-date copy of your active site's data and managed services. In the event that your active site goes down, your system initiates a failover and your standby site takes over as the active site.
Note
A message will indicate when you are currently viewing your system's standby site. You need to initiate all disaster recovery tasks from the active site.
-
Failover: Cisco DNA Center supports two types of failover:
-
System-triggered: As soon as Cisco DNA Center recognizes that your active site has gone down, it automatically carries out the tasks required to establish your standby site as the new active site. You can monitor these tasks from the Event Timeline.
-
Manual: You can initiate a manual failover to designate the current standby site as the new active site. For more information, see Initiate a Manual Failover.
Important
-
After a failover, Assurance restarts and processes a fresh set of data on the new active site. Historical Assurance data from the former active site is not migrated over.
-
After a failover, the Cisco DNA Center inventory service triggers a full device sync. This can take anywhere from a few minutes to a few hours, depending on the number of devices that are managed. As with a regularly scheduled device sync, you cannot provision devices on the newly activated cluster until the sync triggered by the failover completes.
-
Isolate: During a failover, the former active site is separated from the disaster recovery system. Cisco DNA Center suspends its services and stops advertising its virtual IP address (VIP). From here, Cisco DNA Center completes the tasks necessary to establish the former standby site as the new active site.
-
Pause: Temporarily suspend your disaster recovery system in order to separate the sites that make up your system and stop data and service replication. For more information, see Pause Your Disaster Recovery System.
-
Rejoin: Click this button in the Action area to add a Standby Ready or Paused site back into a disaster recovery system as the new standby site (after a failover has taken place). You also click this button to restart a disaster recovery system that is currently paused.
-
Activate DR: User-initiated operation that creates your system's active and standby sites. This operation entails setting up intracluster communication, verifying that the sites meet disaster recovery prerequisites, and replicating data between the two sites.
-
Deregister: Click this button in the Action area to remove the three sites you have configured for your disaster recovery system. You must deregister before you can change any of the site settings you entered previously.
-
Retry: In the Action area, click this button in order to reinitiate any action that failed previously.
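To make the witness site's arbitration role concrete, here is a minimal sketch of majority voting across three members while the main and recovery sites cannot reach each other. It is a generic quorum illustration, not Cisco DNA Center's actual algorithm: the `Witness` class, `votes_needed`, and `claim_active` are hypothetical names, and "first requester wins" is an assumed tie-breaking rule.

```python
from typing import Optional

MEMBERS = 3  # main site, recovery site, witness site


def votes_needed(members: int = MEMBERS) -> int:
    """Smallest number of votes that forms a strict majority."""
    return members // 2 + 1


class Witness:
    """Hypothetical arbiter that backs at most one site at a time."""

    def __init__(self) -> None:
        self.backed_site: Optional[str] = None

    def request_vote(self, site: str) -> bool:
        # Assumed rule: the first site to ask keeps the witness's vote.
        if self.backed_site is None:
            self.backed_site = site
        return self.backed_site == site


def claim_active(site: str, witness: Witness, witness_reachable: bool) -> bool:
    """Decide whether a site may be active while it cannot reach its peer."""
    votes = 1  # the site's own vote
    if witness_reachable and witness.request_vote(site):
        votes += 1
    return votes >= votes_needed()


# The sites lose contact with each other, but both can still reach the witness.
witness = Witness()
main_active = claim_active("main", witness, witness_reachable=True)
recovery_active = claim_active("recovery", witness, witness_reachable=True)
print(main_active, recovery_active)
```

Without the third vote, each partitioned site would hold only its own vote and have no way to tell a peer failure from a link failure. With the witness, at most one site can collect two of the three votes, so only one site can be active.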
Data Replication Overview
The data replication process syncs data between your disaster recovery system's main site and recovery site. Its duration depends on three factors: the amount of data that needs to be replicated, your network's effective bandwidth, and the latency between the main and recovery sites. While disaster recovery is active for your Cisco DNA Center deployment, data replication does not impact operations or application use on the current active site (which is managing your network).
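As a rough illustration of how data volume, bandwidth, and latency interact, the sketch below estimates the transfer time for a single windowed stream, whose throughput is capped at window size divided by round-trip time. This is a back-of-the-envelope model under stated assumptions, not how Cisco DNA Center measures or schedules replication: the function names, the 4 MiB window, and the sample figures are all hypothetical.

```python
def effective_throughput_bps(link_bps: float, rtt_s: float, window_bytes: int) -> float:
    """Cap one windowed stream at window size / round-trip time (assumed model)."""
    if link_bps <= 0 or rtt_s <= 0 or window_bytes <= 0:
        raise ValueError("link_bps, rtt_s, and window_bytes must be positive")
    window_limit_bps = window_bytes * 8 / rtt_s
    return min(link_bps, window_limit_bps)


def replication_seconds(data_bytes: int, link_bps: float, rtt_s: float,
                        window_bytes: int = 4 * 1024 * 1024) -> float:
    """Rough transfer time for data_bytes over the link between the sites."""
    return data_bytes * 8 / effective_throughput_bps(link_bps, rtt_s, window_bytes)


# Illustrative numbers only: 500 GB of data over a 1 Gbps link.
data = 500 * 10**9
low_latency = replication_seconds(data, 1e9, rtt_s=0.005)
high_latency = replication_seconds(data, 1e9, rtt_s=0.200)
print(f"5 ms RTT:   {low_latency / 3600:.1f} h")
print(f"200 ms RTT: {high_latency / 3600:.1f} h")
```

The same link takes noticeably longer at the higher round-trip time because the window limit, rather than the link speed, becomes the bottleneck. Real replication may use multiple streams or compression, so treat the result as an order-of-magnitude estimate.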
Either a full or incremental replication of data takes place, depending on which of the following scenarios is applicable:
-
After initial activation: After the initial configuration and activation of your disaster recovery system, the recovery site does not have any data. In this scenario, a full replication of data between the main and recovery sites takes place.
-
After a failover: Whenever the current active site fails, the disaster recovery system triggers a failover. In this scenario, a full data replication between the main and recovery sites occurs after the failed site rejoins the system.
-
During normal operation: This is the scenario that will typically apply to your system. During its day-to-day operation, changes that take place on the current active site are continuously synced with the current standby site.
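The three scenarios above amount to a simple lookup from situation to replication type. The following sketch restates that mapping; the `Scenario` and `ReplicationMode` enums and the `replication_mode` function are hypothetical names used only for illustration.

```python
from enum import Enum


class Scenario(Enum):
    INITIAL_ACTIVATION = "after initial activation"
    REJOIN_AFTER_FAILOVER = "after a failover"
    NORMAL_OPERATION = "during normal operation"


class ReplicationMode(Enum):
    FULL = "full"
    INCREMENTAL = "incremental"


def replication_mode(scenario: Scenario) -> ReplicationMode:
    """Return the replication type that applies to a given scenario."""
    if scenario is Scenario.NORMAL_OPERATION:
        # Day-to-day changes are synced continuously to the standby site.
        return ReplicationMode.INCREMENTAL
    # A site that is empty or rejoining after a failure needs a complete copy.
    return ReplicationMode.FULL


for scenario in Scenario:
    print(f"{scenario.value}: {replication_mode(scenario).value}")
```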
Navigate the Disaster Recovery GUI
The following table describes the components that make up Cisco DNA Center's disaster recovery GUI and their function.
Callout | Description |
---|---|
1 | Monitoring tab: Click to do the following: |
2 | Show Detail Information link: Click to open the Disaster Recovery System slide-in pane. See View Disaster Recovery System Status for more information. |
3 | Topology: Displays either a logical or physical topology of your system that indicates the current status of your sites and their members. |
4 | Event Timeline: Lists every disaster recovery task that is currently in progress or has been completed for your system. For more information, see Monitor the Event Timeline. |
5 | Configure tab: Click to enter the settings necessary to establish a connection between your disaster recovery system's sites. See Configure Disaster Recovery for more information. |
6 | Logical and Physical tabs: Click the appropriate tab to toggle between a logical and physical topology of your system. |
7 | Status area: Indicates the current status of your system. To view a description of the possible system states, see System and Site States. |
8 | Legend: Indicates what the topology icons represent. To view the legend, click in the bottom right corner of the Disaster Recovery page. |
9 | Interactive Help button: Click to open a slide-in pane that provides links to walkthroughs that provide on-screen guidance to help you complete specific tasks in Cisco DNA Center. |
10 | Action area: Displays the disaster recovery tasks that are currently available for you to initiate. The tasks you can choose from vary, depending on whether you have configured your sites and your system's status. |
View Disaster Recovery System Status
The topology provides a graphical representation of your disaster recovery system's current status. If you want to view this information in a tabular format, you can do so in the Disaster Recovery System slide-in pane. To open this pane, do one of the following:
-
Click the Show Detail Information link. Then expand the site whose status you want to view in the slide-in pane.
-
In the topology, place your cursor over a site's Enterprise virtual IP address or a particular node's icon. In the popup window that opens, click the link in the bottom right-hand corner of the window.
The slide-in pane opens with the relevant site's information displayed.