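This document gives a high-level description of how to recover CNAT VMs, CUPS VMs, and 5G-UPF VMs.
Cisco recommends that you have knowledge of these topics:
The information in this document is based on these software and hardware versions:
The information in this document was created from devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, ensure that you understand the potential impact of any command.
UAME is a new Ultra Automation Services (UAS) software module, introduced to:
UAME provides deployment orchestration for:
ESC is the VNFM referred to in this document and is currently the only supported platform.
The VMs that host the cloud-native 5G SMI are in the error state in ESC.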
crucs502-cnat-cn_oam1_0_d7f90c1e-4401-4be9-87f6-f39ecf04ea3a VM_ERROR_STATE
crucs502-cnat-cn_master_0_05487525-c86f-47e1-a07e-fd33720d114f VM_ERROR_STATE
crucs502-4g-CRPC_CRPCF5_0_ee07bf60-a8f8-405f-9a0d-cfa7363e32e7 VM_ERROR_STATE
Check the VM status in both UAME and ESC. Start the recovery procedure from ESC. If ESC cannot recover the VM, proceed with a redeploy from UAME.
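The escalation order described above can be sketched as a small decision function. This is only an illustration of the workflow, not part of any Cisco tool: the `NextStep` enum, the `next_step` function, and the `WAIT_AND_RECHECK` outcome are hypothetical names introduced here, and the state and status strings are the ones that appear in the ESC output and logs in this document.

```python
from enum import Enum
from typing import Optional


class NextStep(Enum):
    # Hypothetical labels for the operator's next action.
    NOTHING = "VM is alive, no action needed"
    ESC_RECOVERY = "attempt a manual recovery from ESC"
    WAIT_AND_RECHECK = "recovery reported success, re-check the VM state"
    UAME_REDEPLOY = "redeploy the VNF from UAME"


def next_step(esc_vm_state: str, esc_recovery_status: Optional[str]) -> NextStep:
    """Pick the next action from the ESC VM state and the last recovery result.

    esc_recovery_status is None when no manual recovery was attempted yet,
    otherwise the Status of the VM_RECOVERY_COMPLETE notification.
    """
    if esc_vm_state == "VM_ALIVE_STATE":
        return NextStep.NOTHING
    if esc_recovery_status is None:
        return NextStep.ESC_RECOVERY
    if esc_recovery_status == "FAILURE":
        return NextStep.UAME_REDEPLOY
    return NextStep.WAIT_AND_RECHECK


print(next_step("VM_ERROR_STATE", None).value)
print(next_step("VM_ERROR_STATE", "FAILURE").value)
```

The function deliberately starts with ESC and only escalates to UAME after an explicit failure, which mirrors the order of the procedure.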
Log in to UAME, navigate to the confd CLI, and check the status as shown here.
ubuntu@crucs502-uame-1:~$ /opt/cisco/usp/uas/confd-6.3.8/bin/confd_cli -u admin -C
Enter Password for 'admin':
Welcome to the ConfD CLI
admin connected from 10.249.80.137 using ssh on crucs502-uame-1
crucs502-uame-1#
crucs502-uame-1#show vnfr state
VNFR ID STATE
---------------------------------
crucs502-4g-CRPCF504 alive
crucs502-4g-CRPCF505 alive
crucs502-4g-CRPCF506 alive
crucs502-4g-CRPCF507 error
crucs502-4g-CRPCF604 alive
crucs502-cnat-cnat error
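When the VNFR list is long, it can help to filter the output for the entries that need attention. The sketch below assumes the `show vnfr state` output was captured as plain text in the two-column layout shown above; the function name `vnfrs_in_state` is hypothetical.

```python
def vnfrs_in_state(cli_output: str, wanted: str = "error") -> list[str]:
    """Return the VNFR IDs whose STATE column equals `wanted`."""
    matches = []
    for line in cli_output.splitlines():
        parts = line.split()
        # Data rows have exactly two columns: VNFR ID and STATE.
        # The prompt line and the "VNFR ID STATE" header have three tokens,
        # and the dashed separator has one, so they are skipped.
        if len(parts) == 2 and parts[1] == wanted:
            matches.append(parts[0])
    return matches


sample = """\
crucs502-uame-1#show vnfr state
VNFR ID STATE
---------------------------------
crucs502-4g-CRPCF504 alive
crucs502-4g-CRPCF505 alive
crucs502-4g-CRPCF506 alive
crucs502-4g-CRPCF507 error
crucs502-4g-CRPCF604 alive
crucs502-cnat-cnat error
"""

print(vnfrs_in_state(sample))
# → ['crucs502-4g-CRPCF507', 'crucs502-cnat-cnat']
```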
Attempt a manual recovery from ESC.
Note: The recovery can take up to 900 seconds (15 minutes) to complete.
bootup_time 300
recovery_wait_time 600
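The 900-second figure in the note appears to be the sum of the two timers shown above. The following sketch spells out that arithmetic; treating the upper bound as a simple sum is an assumption made for illustration, and `recovery_budget_seconds` is a hypothetical helper name.

```python
def recovery_budget_seconds(bootup_time: int, recovery_wait_time: int) -> int:
    """Upper bound to wait before judging a recovery attempt (assumed to be the sum)."""
    return bootup_time + recovery_wait_time


budget = recovery_budget_seconds(bootup_time=300, recovery_wait_time=600)
print(f"{budget} seconds = {budget // 60} minutes")
# → 900 seconds = 15 minutes
```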
Log in to the active (MASTER) ESC, check its health, and then run the recovery command as shown here.
Last login: Wed May 13 02:07:42 2020 from 10.x.x.x
####################################################################
# ESC on crucs502-esc-vnf-esc-core-esc-1 is in MASTER state.
####################################################################
[admin@crucs502-esc-vnf-esc-core-esc-1 ~]$ health.sh
============== ESC HA (MASTER) with DRBD =================
vimmanager (pgid 14643) is running
monitor (pgid 14712) is running
mona (pgid 14768) is running
drbd (pgid 0) is master
snmp is disabled at startup
etsi is disabled at startup
pgsql (pgid 15119) is running
keepalived (pgid 14070) is running
portal is disabled at startup
confd (pgid 15016) is running
filesystem (pgid 0) is running
escmanager (pgid 15254) is running
=======================================
ESC HEALTH PASSED
/opt/cisco/esc/esc-confd/esc-cli/esc_nc_cli recovery-vm-action DO crucs502-cnat-cn_oam1_0_d7f90c1e-4401-4be9-87f6-f39ecf04ea3a
tail -50f /var/log/esc/yangesc.log
2020-05-05 02:29:01.534 WARN ===== SEND NOTIFICATION STARTS =====
2020-05-05 02:29:01.534 WARN Type: VM_RECOVERY_COMPLETE
2020-05-05 02:29:01.534 WARN Status: SUCCESS
2020-05-05 02:29:01.534 WARN Status Code: 200
2020-05-05 02:29:01.534 WARN Status Msg: Recovery: Successfully recovered VM [crucs502-cnat-cn_oam1_0_d7f90c1e-4401-4be9-87f6-f39ecf04ea3a].
2020-05-05 02:29:01.534 WARN Tenant: core
2020-05-05 02:29:01.534 WARN Deployment name: crucs502-cnat-cnat-core
2020-05-05 02:29:01.534 WARN VM group name: oam1
<output trimmed>
/opt/cisco/esc/esc-confd/esc-cli/esc_nc_cli recovery-vm-action DO crucs502-cnat-cn_master_0_05487525-c86f-47e1-a07e-fd33720d114f
tail -50f /var/log/esc/yangesc.log
2020-05-05 02:12:51.512 WARN ===== SEND NOTIFICATION STARTS =====
2020-05-05 02:12:51.512 WARN Type: VM_RECOVERY_COMPLETE
2020-05-05 02:12:51.512 WARN Status: SUCCESS
2020-05-05 02:12:51.512 WARN Status Code: 200
2020-05-05 02:12:51.512 WARN Status Msg: Recovery: Successfully recovered VM [crucs502-cnat-cn_master_0_05487525-c86f-47e1-a07e-fd33720d114f].
2020-05-05 02:12:51.512 WARN Tenant: core
2020-05-05 02:12:51.512 WARN Deployment name: crucs502-cnat-cnat-core
<output trimmed>
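Check the yangesc log (tail -50f /var/log/esc/yangesc.log) and look for the notification type and status of the recovery, as shown above. If the recovery succeeded, navigate to the confd CLI and verify the VM state.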
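Scanning the log by eye works for one VM, but the check can also be scripted. The sketch below assumes the log lines follow the `<date> <time> WARN <key>: <value>` layout seen in the excerpts above and that each notification begins with a `SEND NOTIFICATION STARTS` line. The names `parse_notifications` and `recovery_result` are hypothetical.

```python
import re
from typing import Optional

# Matches "<date> <time> WARN <key>: <value>" as seen in the log excerpts.
FIELD = re.compile(r"^\S+ \S+ WARN (?P<key>[^:]+): (?P<value>.*)$")


def parse_notifications(log_text: str) -> list[dict[str, str]]:
    """Group the key/value lines of each notification into a dict."""
    notifications = []
    current = None
    for line in log_text.splitlines():
        if "SEND NOTIFICATION STARTS" in line:
            current = {}
            notifications.append(current)
            continue
        if "SEND NOTIFICATION ENDS" in line:
            current = None
            continue
        match = FIELD.match(line)
        if match and current is not None:
            current[match.group("key").strip()] = match.group("value").strip()
    return notifications


def recovery_result(log_text: str, vm_name: str) -> Optional[str]:
    """Status of the latest VM_RECOVERY_COMPLETE for vm_name, or None if absent."""
    result = None
    for note in parse_notifications(log_text):
        if note.get("Type") == "VM_RECOVERY_COMPLETE" and vm_name in note.get("Status Msg", ""):
            result = note.get("Status")
    return result


sample_log = """\
2020-05-05 02:29:01.534 WARN ===== SEND NOTIFICATION STARTS =====
2020-05-05 02:29:01.534 WARN Type: VM_RECOVERY_COMPLETE
2020-05-05 02:29:01.534 WARN Status: SUCCESS
2020-05-05 02:29:01.534 WARN Status Code: 200
2020-05-05 02:29:01.534 WARN Status Msg: Recovery: Successfully recovered VM [crucs502-cnat-cn_oam1_0_d7f90c1e-4401-4be9-87f6-f39ecf04ea3a].
2020-05-05 02:29:01.534 WARN Tenant: core
"""

vm = "crucs502-cnat-cn_oam1_0_d7f90c1e-4401-4be9-87f6-f39ecf04ea3a"
print(recovery_result(sample_log, vm))
# → SUCCESS
```

A result of `None` means that no completion notification for that VM was found in the text, which is different from a `FAILURE` status.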
[admin@crucs502-esc-vnf-esc-core-esc-1 ~]$ /opt/cisco/esc/confd/bin/confd_cli -u admin -C
admin connected from 10.249.80.137 using ssh on crucs502-esc-vnf-esc-core-esc-1
crucs502-esc-vnf-esc-core-esc-1# show esc_datamodel opdata tenants tenant | select deployments state_machine
NAME DEPLOYMENT NAME STATE VM NAME STATE
-------------------------------------------------------------------------------------------------------------------------------------------
<truncated output>
crucs502-cnat-cn_etcd2_0_7263c87c-ee62-4b81-8e1e-a0f5c463a5b5 VM_ALIVE_STATE
crucs502-cnat-cn_etcd3_0_512ef3c0-96a2-4a10-83b0-4c7d13805856 VM_ALIVE_STATE
crucs502-cnat-cn_master_0_05487525-c86f-47e1-a07e-fd33720d114f VM_ALIVE_STATE
crucs502-cnat-cn_master_0_8cf66daa-9dfe-4c7e-817e-36624f9c98c2 VM_ALIVE_STATE
crucs502-cnat-cn_master_0_dff4ad36-7982-4131-a737-ccb6c8eae348 VM_ALIVE_STATE
crucs502-cnat-cn_oam1_0_d7f90c1e-4401-4be9-87f6-f39ecf04ea3a VM_ALIVE_STATE
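To confirm that the specific VMs you recovered are alive, the state machine output can be checked against a list of expected VM names. This sketch assumes that the VM name and its state are the last two whitespace-separated columns of each data row, as in the output above; `vms_needing_attention` and the `NOT_FOUND` marker are hypothetical.

```python
def vms_needing_attention(cli_output: str, expected_vms: list[str]) -> dict[str, str]:
    """Map each expected VM to its state if it is missing or not VM_ALIVE_STATE."""
    seen = {}
    for line in cli_output.splitlines():
        parts = line.split()
        # Only rows that end in a VM_..._STATE token are data rows.
        if len(parts) >= 2 and parts[-1].startswith("VM_") and parts[-1].endswith("_STATE"):
            seen[parts[-2]] = parts[-1]
    problems = {}
    for vm in expected_vms:
        state = seen.get(vm, "NOT_FOUND")
        if state != "VM_ALIVE_STATE":
            problems[vm] = state
    return problems


opdata = """\
crucs502-cnat-cn_master_0_05487525-c86f-47e1-a07e-fd33720d114f VM_ALIVE_STATE
crucs502-cnat-cn_oam1_0_d7f90c1e-4401-4be9-87f6-f39ecf04ea3a VM_ALIVE_STATE
"""

recovered = [
    "crucs502-cnat-cn_oam1_0_d7f90c1e-4401-4be9-87f6-f39ecf04ea3a",
    "crucs502-cnat-cn_master_0_05487525-c86f-47e1-a07e-fd33720d114f",
]
print(vms_needing_attention(opdata, recovered))
# → {}
```

An empty result means every expected VM was found in VM_ALIVE_STATE.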
When ESC shows VM_ALIVE_STATE, verify the status in UAME.
crucs502-uame-1#show vnfr state
VNFR ID STATE
---------------------------------
crucs502-4g-CRPCF504 alive
crucs502-4g-CRPCF505 alive
crucs502-4g-CRPCF506 alive
crucs502-4g-CRPCF507 alive
crucs502-4g-CRPCF604 alive
crucs502-4g-CRPCF605 alive
crucs502-4g-CRPCF606 alive
crucs502-4g-CRPCF607 alive
crucs502-4g-CRPGW502 alive
crucs502-4g-CRPGW503 alive
crucs502-4g-CRPGW608 alive
crucs502-4g-CRPGW609 alive
crucs502-4g-CRPGW610 alive
crucs502-4g-CRPGW611 alive
crucs502-4g-CRPGW612 alive
crucs502-4g-CRPGW613 alive
crucs502-4g-CRPGW614 alive
crucs502-4g-CRPGW615 alive
crucs502-4g-CRSGW606 alive
crucs502-4g-CRSGW607 alive
crucs502-4g-CRSGW608 alive
crucs502-4g-CRSGW609 alive
crucs502-4g-CRSGW610 alive
crucs502-4g-CRSGW611 alive
crucs502-5g-upf-CRUPF014 alive
crucs502-5g-upf-CRUPF015 alive
crucs502-5g-upf-CRUPF016 alive
crucs502-5g-upf-CRUPF017 alive
crucs502-5g-upf-CRUPF018 alive
crucs502-5g-upf-CRUPF019 alive
crucs502-5g-upf-CRUPF020 alive
crucs502-5g-upf-CRUPF021 alive
crucs502-5g-upf-CRUPF022 alive
crucs502-5g-upf-CRUPF023 alive
crucs502-5g-upf-CRUPF024 alive
crucs502-5g-upf-CRUPF025 alive
crucs502-5g-upf-CRUPF026 alive
crucs502-5g-upf-CRUPF027 alive
crucs502-cnat-cnat alive
crucs502-cnat-smi-cm alive
crucs502-esc-vnf-esc alive
Verify the same in OpenStack (source the correct overcloud rc file first).
(crucs502) [stack@crucs502-ospd ~]$ nova list --fields name,status,host |egrep "CRPCF507|cnat"
<truncated output>
| 3eb43fe7-9f41-42d8-afe4-80f6fd62c385 | crucs502-4g-CRPCF507-core-CRPCF5071 | ACTIVE | crucs502-compute-11.localdomain |
| cc678283-2967-4404-a714-e4dd78000e82 | crucs502-cnat-cnat-core-etcd1 | ACTIVE | crucs502-osd-compute-0.localdomain |
| 711d6fcd-b816-49d4-a702-e993765757b0 | crucs502-cnat-cnat-core-master3 | ACTIVE | crucs502-osd-compute-3.localdomain |
| 46f64bde-a8db-48f2-bf3d-fe3b01295f2f | crucs502-cnat-cnat-core-oam1 | ACTIVE | crucs502-osd-compute-3.localdomain |
| f470ba3d-813e-434b-aac8-78bc646fda22 | crucs502-cnat-cnat-core-oam2 | ACTIVE | crucs502-osd-compute-2.localdomain |
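The `nova list` table can also be checked programmatically. The sketch below assumes the four-column layout produced by `--fields name,status,host` (ID, name, status, host) as shown above; `parse_nova_rows` and `not_active` are hypothetical helper names.

```python
def parse_nova_rows(table_text: str) -> list[dict[str, str]]:
    """Turn the data rows of a 4-column nova table into dicts."""
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        # Border lines start with "+", and truncation markers do not start with "|".
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 4 or cells[0] == "ID":
            continue
        rows.append(dict(zip(("id", "name", "status", "host"), cells)))
    return rows


def not_active(rows: list[dict[str, str]]) -> list[str]:
    """Names of the instances whose status is anything other than ACTIVE."""
    return [row["name"] for row in rows if row["status"] != "ACTIVE"]


table = """\
| 3eb43fe7-9f41-42d8-afe4-80f6fd62c385 | crucs502-4g-CRPCF507-core-CRPCF5071 | ACTIVE | crucs502-compute-11.localdomain |
| 46f64bde-a8db-48f2-bf3d-fe3b01295f2f | crucs502-cnat-cnat-core-oam1 | ACTIVE | crucs502-osd-compute-3.localdomain |
"""

rows = parse_nova_rows(table)
print(len(rows), not_active(rows))
# → 2 []
```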
This example shows a case where recovery from ESC fails. In that case, the VM is redeployed from UAME.
/opt/cisco/esc/esc-confd/esc-cli/esc_nc_cli recovery-vm-action DO crucs502-4g-CRPC_CRPCF5_0_ee07bf60-a8f8-405f-9a0d-cfa7363e32e7
This output shows the failure messages in yangesc.log.
tail -50f /var/log/esc/yangesc.log
2020-05-05 02:57:21.143 WARN ===== SEND NOTIFICATION STARTS =====
2020-05-05 02:57:21.143 WARN Type: VM_RECOVERY_INIT
2020-05-05 02:57:21.143 WARN Status: SUCCESS
2020-05-05 02:57:21.143 WARN Status Code: 200
2020-05-05 02:57:21.143 WARN Status Msg: Recovery event for VM Generated ID [crucs502-4g-CRPC_CRPCF5_0_ee07bf60-a8f8-405f-9a0d-cfa7363e32e7] triggered.
2020-05-05 02:57:21.143 WARN Tenant: core
2020-05-05 02:57:21.143 WARN Deployment name: crucs502-4g-CRPCF507-core
2020-05-05 02:57:21.143 WARN VM group name: CRPCF5071
<output trimmed>
2020-05-05 02:57:21.144 WARN ===== SEND NOTIFICATION ENDS =====
2020-05-05 03:09:21.655 WARN
2020-05-05 03:09:21.655 WARN ===== SEND NOTIFICATION STARTS =====
2020-05-05 03:09:21.655 WARN Type: VM_RECOVERY_REBOOT
2020-05-05 03:09:21.655 WARN Status: SUCCESS
2020-05-05 03:09:21.655 WARN Status Code: 200
2020-05-05 03:09:21.655 WARN Status Msg: VM Generated ID [crucs502-4g-CRPC_CRPCF5_0_ee07bf60-a8f8-405f-9a0d-cfa7363e32e7] is rebooted.
2020-05-05 03:09:21.655 WARN Tenant: core
2020-05-05 03:09:21.655 WARN Deployment name: crucs502-4g-CRPCF507-core
2020-05-05 03:09:21.655 WARN VM group name: CRPCF5071
<output trimmed>
2020-05-05 03:09:21.656 WARN ===== SEND NOTIFICATION ENDS =====
2020-05-05 03:14:22.079 WARN
2020-05-05 03:14:22.079 WARN ===== SEND NOTIFICATION STARTS =====
2020-05-05 03:14:22.079 WARN Type: VM_RECOVERY_COMPLETE
2020-05-05 03:14:22.079 WARN Status: FAILURE
2020-05-05 03:14:22.079 WARN Status Code: 500
2020-05-05 03:14:22.079 WARN Status Msg: Recovery: Recovery completed with errors for VM: [crucs502-4g-CRPC_CRPCF5_0_ee07bf60-a8f8-405f-9a0d-cfa7363e32e7]
2020-05-05 03:14:22.079 WARN Tenant: core
2020-05-05 03:14:22.079 WARN Deployment name: crucs502-4g-CRPCF507-core
2020-05-05 03:14:22.079 WARN VM group name: CRPCF5071
<output trimmed>
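In ESC, the recovery policy is reboot only. The failure above therefore indicates that the VM could not be recovered by a reboot and must be redeployed.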
crucs502-esc-vnf-esc-core-esc-1# show running-config | include recovery_policy
recovery_policy recovery_type AUTO
recovery_policy action_on_recovery REBOOT_ONLY
recovery_policy max_retries 1
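The policy lines above can be read into a dictionary to make the reasoning explicit: with REBOOT_ONLY, ESC does not redeploy on its own, so a failed recovery has to be followed by a redeploy from UAME. The sketch assumes the three-token `recovery_policy <key> <value>` layout shown above; both function names are hypothetical.

```python
def parse_recovery_policy(config_text: str) -> dict[str, str]:
    """Collect "recovery_policy <key> <value>" lines into a dict."""
    policy = {}
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "recovery_policy":
            policy[parts[1]] = parts[2]
    return policy


def failed_recovery_needs_redeploy(policy: dict[str, str]) -> bool:
    """True when ESC only reboots, so a failed recovery needs a UAME redeploy."""
    return policy.get("action_on_recovery") == "REBOOT_ONLY"


running_config = """\
recovery_policy recovery_type AUTO
recovery_policy action_on_recovery REBOOT_ONLY
recovery_policy max_retries 1
"""

policy = parse_recovery_policy(running_config)
print(policy)
# → {'recovery_type': 'AUTO', 'action_on_recovery': 'REBOOT_ONLY', 'max_retries': '1'}
print(failed_recovery_needs_redeploy(policy))
# → True
```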
Reconfirm the VM status in UAME.
ubuntu@crucs502-uame-1:~$ /opt/cisco/usp/uas/confd-6.3.8/bin/confd_cli -u admin -C
Enter Password for 'admin':
Welcome to the ConfD CLI
admin connected from 10.249.80.137 using ssh on crucs502-uame-1
crucs502-uame-1#
crucs502-uame-1#
crucs502-uame-1#show vnfr state
VNFR ID STATE
---------------------------------
crucs502-4g-CRPCF504 alive
crucs502-4g-CRPCF505 alive
crucs502-4g-CRPCF506 alive
crucs502-4g-CRPCF507 error
crucs502-4g-CRPCF604 alive
crucs502-uame-1# recover nsd-id crucs502-4g vnfd CRPCF507 recovery-action redeploy
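If you recover several VNFs, composing the command string from its two arguments avoids typing mistakes. The sketch below only formats the exact command syntax shown above; it does not run anything on UAME. The function `build_redeploy_command` is hypothetical, and the NSD ID and VNFD values must come from your own deployment.

```python
def build_redeploy_command(nsd_id: str, vnfd: str) -> str:
    """Format the UAME confd CLI redeploy command for one VNF."""
    for label, value in (("nsd_id", nsd_id), ("vnfd", vnfd)):
        # Reject empty values and values with whitespace, which would
        # change how the CLI splits the command into arguments.
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"{label} must be a non-empty token without spaces")
    return f"recover nsd-id {nsd_id} vnfd {vnfd} recovery-action redeploy"


print(build_redeploy_command("crucs502-4g", "CRPCF507"))
# → recover nsd-id crucs502-4g vnfd CRPCF507 recovery-action redeploy
```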
Monitor the UAME logs and the ESC logs. The whole process can take up to 15 minutes.
UAME logs:
tail -50f /var/log/upstart/uame.log
<truncated output>
2020-05-06 08:57:22,252 - | VM_RECOVERY_DEPLOYED | CRPCF5071 | SUCCESS | Waiting for: VM_RECOVERY_COMPLETE|
2020-05-06 08:57:22,255 - Timing out in 143 seconds
2020-05-06 08:57:48,227 - | VM_RECOVERY_COMPLETE | crucs502-4g-CRPC_CRPCF5_0_ee07bf60-a8f8-405f-9a0d-cfa7363e32e7 | SUCCESS | (1/1)
2020-05-06 08:57:48,229 - NETCONF transaction completed successfully!
2020-05-06 08:57:48,231 - Released lock: esc_vnf_req
2020-05-06 08:57:48,347 - Deployment recover-vnf-deployment: crucs502-4g succeeded
2020-05-06 08:57:48,354 - Send Deployment notification for: crucs502-4g-CRPCF507
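The UAME log uses a different layout from yangesc.log: recovery events appear as pipe-separated rows. A sketch for pulling those rows out and checking for the final success line follows. It assumes the row layout and the "Deployment recover-vnf-deployment: <nsd-id> succeeded" wording seen in the excerpt above; `uame_events` and `redeploy_succeeded` are hypothetical names.

```python
import re

# Matches rows such as "| VM_RECOVERY_COMPLETE | <target> | SUCCESS |".
EVENT_ROW = re.compile(
    r"\|\s*(?P<event>VM_[A-Z_]+)\s*\|\s*(?P<target>\S+)\s*\|\s*(?P<status>[A-Z]+)\s*\|"
)


def uame_events(log_text: str) -> list[tuple[str, str, str]]:
    """Return (event, target, status) for each recovery event row in the log."""
    return [(m["event"], m["target"], m["status"]) for m in EVENT_ROW.finditer(log_text)]


def redeploy_succeeded(log_text: str, nsd_id: str) -> bool:
    """True if the log contains the success line for this NSD."""
    return f"Deployment recover-vnf-deployment: {nsd_id} succeeded" in log_text


uame_log = """\
2020-05-06 08:57:22,252 - | VM_RECOVERY_DEPLOYED | CRPCF5071 | SUCCESS | Waiting for: VM_RECOVERY_COMPLETE|
2020-05-06 08:57:22,255 - Timing out in 143 seconds
2020-05-06 08:57:48,227 - | VM_RECOVERY_COMPLETE | crucs502-4g-CRPC_CRPCF5_0_ee07bf60-a8f8-405f-9a0d-cfa7363e32e7 | SUCCESS | (1/1)
2020-05-06 08:57:48,347 - Deployment recover-vnf-deployment: crucs502-4g succeeded
"""

for event in uame_events(uame_log):
    print(event)
print(redeploy_succeeded(uame_log, "crucs502-4g"))
```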
ESC logs:
tail -50f /var/log/esc/yangesc.log
2020-05-06 08:58:01.454 WARN Type: VM_RECOVERY_COMPLETE
2020-05-06 08:58:01.454 WARN Status: SUCCESS
2020-05-06 08:58:01.454 WARN Status Code: 200
2020-05-06 08:58:01.454 WARN Status Msg: Recovery: Successfully recovered VM [crucs502-4g-CRPC_CRPCF5_0_ee07bf60-a8f8-405f-9a0d-cfa7363e32e7].
2020-05-06 08:58:01.454 WARN Tenant: core
2020-05-06 08:58:01.454 WARN Deployment ID: 4f958c43-dfa4-45d4-a69d-76289620c337
2020-05-06 08:58:01.454 WARN Deployment name: crucs502-4g-CRPCF507-core
2020-05-06 08:58:01.454 WARN VM group name: CRPCF5071
<output trimmed>
Verify the status of the VM. Follow the procedure in Step 3.