Introduction
This document describes the steps required to replace both faulty Hard Disk Drives (HDDs) in a server in an Ultra-M setup that hosts StarOS Virtual Network Functions (VNFs).
Background Information
Ultra-M is a pre-packaged and validated virtualized mobile packet core solution designed to simplify the deployment of VNFs. OpenStack is the Virtualized Infrastructure Manager (VIM) for Ultra-M and consists of these node types:
- Compute
- Object Storage Disk - Compute (OSD - Compute)
- Controller
- OpenStack Platform - Director (OSPD)
The high-level architecture of Ultra-M and the components involved are depicted in this image:
UltraM Architecture
This document is intended for Cisco personnel who are familiar with the Cisco Ultra-M platform. It details the steps to be carried out at the OpenStack and StarOS VNF level when both HDDs in a server are replaced.
Note: The procedures in this document are defined against the Ultra M 5.1.x release.
Abbreviations
VNF | Virtual Network Function
CF | Control Function
SF | Service Function
ESC | Elastic Service Controller
MOP | Method of Procedure
OSD | Object Storage Disks
HDD | Hard Disk Drive
SSD | Solid State Drive
VIM | Virtual Infrastructure Manager
VM | Virtual Machine
EM | Element Manager
UAS | Ultra Automation Services
UUID | Universally Unique IDentifier
Both HDDs Failure
Each bare-metal server is provisioned with two HDDs that act as the boot disk in a RAID 1 configuration. If a single HDD fails, RAID 1 redundancy allows the faulty HDD to be hot swapped. However, when both HDDs fail, the server goes down and you lose access to it. To restore access to the server and its services, replace both HDDs and add the server back to the existing overcloud stack.
For the procedure to replace a faulty component on the UCS C240 M4 server, refer to Replacing the Server Components.
When both HDDs fail, replace only the two faulty HDDs in the same UCS 240M4 server. A BIOS upgrade is not required after you install the new disks.
In an OpenStack-based Ultra-M solution, a UCS 240M4 bare-metal server can take one of these roles: Compute, OSD-Compute, Controller, or OSPD. The steps to handle the failure of both HDDs in each of these server roles are described in the sections that follow.
Note: If both HDDs are healthy but some other hardware in the UCS 240M4 server is faulty, replace the UCS 240M4 with new hardware and re-use the same HDDs. In the scenario that this document covers, only the HDDs are faulty, so re-use the same UCS 240M4 and replace the faulty HDDs with new ones.
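The decision described in this section can be sketched as a small Python function. This is only an illustration of the logic in the text, not a Cisco tool: the names `Action` and `choose_action` are hypothetical, and combinations that the text does not cover (for example, a failed HDD together with other faulty hardware) are not modeled.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for the outcomes described in this section."""
    NONE = "no HDD action required"
    HOT_SWAP = "hot swap the faulty HDD (RAID 1 keeps the server up)"
    REPLACE_BOTH = "replace both HDDs and re-add the server to the overcloud stack"
    REPLACE_SERVER = "replace the server hardware and re-use the existing HDDs"


def choose_action(failed_hdds: int, other_hardware_faulty: bool = False) -> Action:
    """Map the boot-disk state of one server to the action from the text.

    Assumption: other_hardware_faulty is consulted only when both HDDs are
    healthy, because that is the only such case the document describes.
    """
    if failed_hdds not in (0, 1, 2):
        raise ValueError("the boot RAID 1 pair has exactly two HDDs")
    if failed_hdds == 2:
        return Action.REPLACE_BOTH
    if failed_hdds == 1:
        return Action.HOT_SWAP
    if other_hardware_faulty:
        return Action.REPLACE_SERVER
    return Action.NONE


if __name__ == "__main__":
    print(choose_action(2).value)
```

The two-HDD case is checked first because it is the only one that takes the server down and requires it to be added back to the overcloud stack.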
Both HDDs Failure on Compute Server
If both HDDs fail in a UCS 240M4 that acts as a compute node, follow the replacement procedure given in the Compute Server Replacement Procedure.
Both HDDs Failure on Controller Server
If both HDDs fail in a UCS 240M4 that acts as a controller node, follow the replacement procedure as given in the .
Because the controller server with both failed HDDs is not reachable via Secure Shell (SSH), log in to another controller node to perform the graceful shutdown procedure listed in the linked procedure.
Both HDDs Failure on OSD-Compute Server
If both HDDs fail in a UCS 240M4 that acts as an OSD-Compute node, follow the replacement procedure as given in the .
In that procedure, the Ceph storage graceful shutdown cannot be performed, because the failure of both HDDs makes the server unreachable. Therefore, skip those steps.
Both HDDs Failure on OSPD Server
If both HDDs fail in a UCS 240M4 that acts as an OSPD node, follow the replacement procedure as given in the .
In this case, the previously stored OSPD backup is required for restoration after the HDDs are replaced. Without it, the recovery amounts to a complete stack redeployment.
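The role-specific caveats from the four sections above can be collected into a lookup, shown here as a minimal Python sketch. The names `ROLE_NOTES` and `notes_for` and the `ospd_backup_available` flag are hypothetical and exist only for illustration. The notes paraphrase this document and do not replace the linked procedures.

```python
from typing import Dict, List

# Caveats per server role when both boot HDDs have failed, paraphrased from
# this document. The compute role has no extra caveat in the text.
ROLE_NOTES: Dict[str, List[str]] = {
    "compute": [],
    "controller": [
        "Run the graceful shutdown steps from another controller node; "
        "the failed controller is not reachable over SSH."
    ],
    "osd-compute": [
        "Skip the Ceph storage graceful shutdown steps; "
        "the failed server is unreachable."
    ],
    "ospd": [
        "Restore from the previously stored OSPD backup after the HDDs are replaced."
    ],
}


def notes_for(role: str, ospd_backup_available: bool = True) -> List[str]:
    """Return the caveats for a role; the role name is matched case-insensitively."""
    key = role.strip().lower()
    if key not in ROLE_NOTES:
        raise ValueError("unknown role: " + role)
    notes = list(ROLE_NOTES[key])  # copy so callers cannot mutate the table
    if key == "ospd" and not ospd_backup_available:
        notes.append(
            "No OSPD backup: expect the equivalent of a complete stack redeployment."
        )
    return notes


if __name__ == "__main__":
    for line in notes_for("OSPD", ospd_backup_available=False):
        print("-", line)
```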