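Introduction
This document describes a configuration-db restore failure in a vManage cluster disaster recovery (DR) setup.
Problem
Restore vManage NMS from backup: the configuration-db restore fails in a vManage cluster DR setup
In the CLI, use the request nms configuration-db restore path command. This command restores the configuration database from the file located at the specified path. In this example, the destination is the standby vManage NMS. Run the following command on the standby vManage NMS: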
vmanage-1# request nms configuration-db restore path /home/admin/cluster-backup.tar.gz
Configuration database is running in a cluster mode
!
!
!
line omitted
!
!
!
.................... 80%
.................... 90%
.................... 100%
Backup complete.
Finished DB backup from: 30.1.1.1
Stopping NMS application server on 30.1.1.1
Stopping NMS application server on 30.1.1.2
Stopping NMS application server on 30.1.1.3
Stopping NMS configuration database on 30.1.1.1
Stopping NMS configuration database on 30.1.1.2
Stopping NMS configuration database on 30.1.1.3
Reseting NMS configuration database on 30.1.1.1
Reseting NMS configuration database on 30.1.1.2
Reseting NMS configuration database on 30.1.1.3
Restoring from DB backup: /opt/data/backup/staging/graph.db-backup
cmd to restore db: sh /usr/bin/vconfd_script_nms_neo4jwrapper.sh restore /opt/data/backup/staging/graph.db-backup
Successfully restored DB backup: /opt/data/backup/staging/graph.db-backup
Starting NMS configuration database on 30.1.1.1
Waiting for 10s before starting other instances...
Starting NMS configuration database on 30.1.1.2
Waiting for 120s for the instance to start...
NMS configuration database on 30.1.1.2 has started.
Starting NMS configuration database on 30.1.1.3
Waiting for 120s for the instance to start...
NMS configuration database on 30.1.1.3 has started.
NMS configuration database on 30.1.1.1 has started.
Updating DB with the saved cluster configuration data
Successfully reinserted cluster meta information
Starting NMS application-server on 30.1.1.1
Waiting for 120s for the instance to start...
Starting NMS application-server on 30.1.1.2
Waiting for 120s for the instance to start...
Starting NMS application-server on 30.1.1.3
Waiting for 120s for the instance to start...
Removed old database directory: /opt/data/backup/local/graph.db-backup
Successfully restored database
vmanage-1#
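Step 1. A successful config-db restore produces the log output shown above. In some cases, however, the configuration-db restore fails with these error messages: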
vmanage-1# request nms configuration-db restore path /home/admin/cluster-backup.tar.gz
Configuration database is running in a cluster mode
!
!
line omitted
!
!
2020-08-09 17:13:48.758+0800 INFO [o.n.k.i.s.f.RecordFormatSelector] Selected RecordFormat:StandardV3_2[v0.A.8] record format from store /opt/data/backup/local/graph.db-backup
2020-08-09 17:13:48.759+0800 INFO [o.n.k.i.s.f.RecordFormatSelector] Format not configured. Selected format from the store: RecordFormat:StandardV3_2[v0.A.8]
.................... 10%
.................... 20%
.................... 30%
.................... 40%
.................... 50%
.................... 60%
.................... 70%
...............Checking node and relationship counts
.................... 10%
.................... 20%
.................... 30%
.................... 40%
.................... 50%
.................... 60%
.................... 70%
.................... 80%
.................... 90%
.................... 100% Backup complete.
Finished DB backup from: 30.1.1.1
Stopping NMS application server on 30.1.1.1
Stopping NMS application server on 30.1.1.2
Could not stop NMS application-server on 30.1.1.2
Failed to restore the database
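The difference between the two transcripts can be checked mechanically when the restore output is captured to a file or a string. The following is a minimal sketch, not a Cisco tool: the function name summarize_restore and the prefix list are hypothetical, and the matched strings are taken only from the sample outputs in this document, so other software releases may word these messages differently.

```python
# Sketch: classify captured output of
# "request nms configuration-db restore path ..." as success or failure.
# The strings below come from the transcripts in this document (assumption:
# other releases may print different messages).

FAILURE_PREFIXES = ("Could not ", "Failed to ")
SUCCESS_LINE = "Successfully restored database"


def summarize_restore(output: str) -> dict:
    """Return whether the restore succeeded and which lines report a failure."""
    lines = [line.strip() for line in output.splitlines()]
    # Collect every line that starts with a known failure prefix.
    failures = [line for line in lines if line.startswith(FAILURE_PREFIXES)]
    # Success requires the final success line and no failure lines.
    succeeded = SUCCESS_LINE in lines and not failures
    return {"succeeded": succeeded, "failures": failures}


if __name__ == "__main__":
    failed_run = (
        "Stopping NMS application server on 30.1.1.1\n"
        "Stopping NMS application server on 30.1.1.2\n"
        "Could not stop NMS application-server on 30.1.1.2\n"
        "Failed to restore the database\n"
    )
    print(summarize_restore(failed_run))
```

The first failure line names the node on which the service could not be stopped, which is the node to check in Step 2.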
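Step 2. When this failure occurs, open the cluster management page in vManage: navigate to Administration > Cluster Management > Select neighbor vManage (...) > Edit.
When you edit the vManage node in Cluster Management, this error is received: "Failed to get the list of configured IPs - Authentication failed"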
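Solution
During a config-db restore operation in a vManage cluster, services must be started and stopped on the remote nodes. This is done through NETCONF requests sent to the remote nodes in the cluster.
If control connections exist between the vManage nodes in the cluster, vManage authenticates the NETCONF request with the public key of the remote node, similar to control connections between controllers. If no control connection exists, it falls back to the credentials stored in the database table, which are the credentials that were used to form the cluster.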
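The selection between the two authentication paths can be summarized in a few lines. This is only an illustrative sketch of the decision described above, not vManage source code; RemoteNode, choose_auth_method, and the returned labels are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class RemoteNode:
    """Hypothetical view of a remote cluster member, for illustration only."""
    address: str
    has_control_connection: bool


def choose_auth_method(node: RemoteNode) -> str:
    """Pick how a NETCONF request to the remote node would be authenticated."""
    if node.has_control_connection:
        # Control connection present: authenticate with the node's public key.
        return "public-key"
    # No control connection: fall back to the credentials stored in the
    # database when the cluster was formed.
    return "stored-credentials"


if __name__ == "__main__":
    for node in (RemoteNode("30.1.1.2", True), RemoteNode("30.1.1.3", False)):
        print(node.address, choose_auth_method(node))
```

The second branch is the one that matters for this problem, because it depends on a stored password that can become outdated.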
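The issue in this case is that the password was changed through the CLI, but the cluster management password stored in the database was not updated. Therefore, whenever you change the password of the netadmin account that was originally used to create the cluster, you must also update the password through the Edit operation in Cluster Management. Follow these additional steps:
- Log in to the GUI of each vManage node.
- Navigate to Administration > Cluster Management > Select every vManage (...) > Edit, as shown in the image.
- Update the password so that it matches the one set through the CLI.
Note: In this scenario, rolling back the password from the CLI is not feasible.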
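The mismatch and its fix can be pictured with a toy model. This is a sketch under stated assumptions: ClusterNodeModel and its methods are hypothetical and do not correspond to any vManage API, and the model assumes the fallback path simply compares the stored cluster management password with the current account password.

```python
class ClusterNodeModel:
    """Toy model of one vManage node's two copies of the netadmin password."""

    def __init__(self, password: str) -> None:
        self.account_password = password       # what the account actually uses
        self.cluster_mgmt_password = password  # copy stored for cluster management

    def change_password_via_cli(self, new_password: str) -> None:
        # A CLI change updates the account only; the stored copy goes stale.
        self.account_password = new_password

    def edit_in_cluster_management(self, password: str) -> None:
        # The Edit operation in Cluster Management refreshes the stored copy.
        self.cluster_mgmt_password = password

    def stored_credentials_valid(self) -> bool:
        # The fallback authentication works only while both copies match.
        return self.cluster_mgmt_password == self.account_password


if __name__ == "__main__":
    node = ClusterNodeModel("initial-secret")
    node.change_password_via_cli("new-secret")
    print(node.stored_credentials_valid())   # stale stored copy
    node.edit_in_cluster_management("new-secret")
    print(node.stored_credentials_valid())   # copies match again
```

In this model the Edit operation is the only step that refreshes the stored copy, which is why it has to be repeated for every node in the cluster.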
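Best Practice
The best practice for changing the vManage password in a cluster is to navigate to Administration > Manage Users > Update Password.
This procedure updates the password on all three vManage nodes in the cluster, as well as the cluster management password.
Related Information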