Managing HX Storage Clusters

Changing the Cluster Access Policy Level

Procedure


Step 1

The storage cluster must be in a healthy state prior to changing the Cluster Access Policy to strict.

Step 2

From the command line of a storage controller VM in the storage cluster, type:

# stcli cluster get-cluster-access-policy

# stcli cluster set-cluster-access-policy --name {strict,lenient}


Rebalancing the Cluster

The storage cluster is rebalanced on a regular schedule. Rebalancing realigns the distribution of stored data across changes in available storage and restores storage cluster health. When new nodes are added to an existing cluster, they take on new writes as soon as they join the cluster. The cluster automatically rebalances if required (usually within 24 hours). If the overall storage utilization is low, a new node may initially show less storage utilization than the existing converged nodes. If the current storage utilization is high, data is rebalanced onto the new node's drives over a period of time after the node is added to the cluster.


Note

Forcing a manual rebalance can cause interference with regular User IO on the cluster and increase the latency. Therefore, the HyperFlex system initiates a rebalance only when required in order to minimize performance penalties.


Procedure


Verify rebalancing status from the storage controller VM.

  1. Enter the following on the command line:

    # stcli rebalance status
    rebalanceStatus:
    rebalanceState: cluster_rebalance_ongoing
    percentComplete: 10
    rebalanceEnabled: True

  2. Reenter the command to confirm that the process completes:

    # stcli rebalance status
    rebalanceStatus:
    rebalanceState: cluster_rebalance_not_running
    rebalanceEnabled: True

    This sample indicates that rebalance is enabled, and ready to perform a rebalance, but is not currently rebalancing the storage cluster.
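For scripted checks, the status output can be parsed into key/value pairs. The following is a minimal Python sketch, assuming the output keeps the `key: value` layout shown in the samples above. The function names `parse_stcli_output` and `rebalance_in_progress` are hypothetical helpers and are not part of stcli.

```python
def parse_stcli_output(text):
    """Parse 'key: value' lines into a dict; header lines with no value are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def rebalance_in_progress(text):
    """Return True when rebalanceState reports an ongoing rebalance."""
    state = parse_stcli_output(text).get("rebalanceState")
    return state == "cluster_rebalance_ongoing"


# Sample text copied from the second output sample in this procedure.
SAMPLE = """rebalanceStatus:
rebalanceState: cluster_rebalance_not_running
rebalanceEnabled: True
"""

if __name__ == "__main__":
    print(rebalance_in_progress(SAMPLE))
```

In practice the text would come from running `stcli rebalance status` on a controller VM. The sketch only inspects text, so it can be tried against saved output first.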


Checking Cluster Rebalance and Self-Healing Status

The storage cluster is rebalanced on a regular schedule and whenever the amount of available storage in the cluster changes. This is an automatic self-healing function.


Important

Rebalance typically occurs only when a single disk usage exceeds 50% or cluster aggregate disk usage is greater than 50%.
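The threshold rule in this note can be expressed as a small check. This is an illustrative sketch only: the function name `rebalance_likely` and the input format (used and total capacity per disk) are assumptions, and the note says "typically", so the result is a hint rather than a prediction of what the cluster will do.

```python
THRESHOLD_PERCENT = 50.0  # value taken from the note above


def rebalance_likely(disks):
    """disks: list of (used, capacity) tuples in the same unit, one per disk.

    Returns True if any single disk, or the cluster aggregate, is above 50% used.
    """
    if not disks:
        return False
    any_disk_over = any(
        used / capacity * 100 > THRESHOLD_PERCENT for used, capacity in disks
    )
    total_used = sum(used for used, _ in disks)
    total_capacity = sum(capacity for _, capacity in disks)
    aggregate_over = total_used / total_capacity * 100 > THRESHOLD_PERCENT
    return any_disk_over or aggregate_over
```

For example, `rebalance_likely([(60, 100), (10, 100)])` is true because one disk is at 60%, even though the aggregate is 35%.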


You can check rebalance status through the HX Data Platform plug-in or through the storage controller VM command line.

Procedure


Step 1

Check the rebalance status through HX Data Platform plug-in.

  1. From the vSphere Web Client Navigator, select vCenter Inventory Lists > Cisco HyperFlex Systems > Cisco HX Data Platform > cluster > Summary.

    The Status portlet lists the Self-healing status.

  2. Expand the 'Resiliency Status' to see the 'Self-healing status' section. The Self-healing status field lists the rebalance activity or N/A, when rebalance is not currently active.

Step 2

Check the rebalance status through the storage controller VM command line.

  1. Log in to a controller VM using SSH.

  2. From the controller VM command line, run the following command:

    # stcli rebalance status

    The following output indicates that rebalance is not currently running on the storage cluster.

    rebalanceStatus:
    percentComplete: 0
    rebalanceState: cluster_rebalance_not_running
    rebalanceEnabled: True

    The Recent Tasks tab in the HX Data Platform plug-in displays a status message.


Handling Out of Space Errors

If your system displays an Out of Space error, you can either add a node to increase free capacity or delete existing unused VMs to release space.

When there is an Out of Space condition, the VMs are unresponsive.


Note

Do not delete storage controller VMs. Storage controller VM names have the prefix stCtlVM.


Procedure


Step 1

To add a node, use the Expand Cluster feature of the HX Data Platform Installer.

Step 2

To delete unused VMs, complete the following:

  1. Determine which guest VMs you can delete. You can consider factors such as disk space used by the VM or naming conventions.

  2. Go to vCenter > Virtual Machines to display the virtual machines in the inventory.

  3. Double-click a VM that you want to delete.

  4. Select Summary > Answer Questions to display a dialog box.

  5. Click the Cancel radio button and click OK.

  6. Power off the VM.

  7. Delete the VM.

Step 3

After the Out of Space condition is cleared, complete the following:

  1. Go to vCenter > Virtual Machines to display the VM in the inventory.

  2. Double-click a VM that you want to use.

  3. Select Summary > Answer Questions to display a dialog box.

  4. Click the Retry radio button and click OK.


Checking Cleaner Schedule

The stcli cleaner command typically runs continuously in the background. The cleaner goes into sleep mode when it is not needed and wakes when policy-defined conditions are met. For example, if your storage cluster is experiencing an ENOSPC condition, the cleaner automatically runs at High Priority.

Do not expand the cluster while the cleaner is running. Check the cleaner schedule or adjust the schedule, as needed.

Procedure


Step 1

Log in to any controller VM in the storage cluster. Run the listed commands from the controller VM command line.

Step 2

View the cleaner schedule.

# stcli cleaner get-schedule --id ID | --ip NAME

Parameter    Description
--id ID      ID of storage cluster node
--ip NAME    IP address of storage cluster node


Moving the Storage Cluster from a Current vCenter Server to a New vCenter Server

Before you begin

  • If your HX Cluster is running HX Data Platform version older than 1.8(1c), upgrade before attempting to reregister to a new vCenter.

  • Perform this task during a maintenance window.

  • Ensure that the cluster is healthy and the upgrade state is OK. You can view the state using the stcli command from the controller VM command line.

    # stcli cluster info

    Check response for:

    upgradeState: ok
    healthState: healthy

  • Ensure that vCenter is up and running.

  • Snapshot schedules are not moved with the storage cluster when you move the storage cluster between vCenter clusters.
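The health precondition above can be checked from saved `stcli cluster info` output. The following Python sketch assumes the two fields appear as `upgradeState: ok` and `healthState: healthy`, as in the sample response. The function name `cluster_ready_for_move` is hypothetical, and only the first occurrence of each field is examined.

```python
import re


def cluster_ready_for_move(cluster_info_text):
    """Return True only if upgradeState is ok and healthState is healthy."""

    def field(name):
        # First line of the form "<name>: <value>", ignoring indentation.
        match = re.search(
            rf"^\s*{name}:\s*(\S+)", cluster_info_text, re.MULTILINE
        )
        return match.group(1).lower() if match else None

    return field("upgradeState") == "ok" and field("healthState") == "healthy"
```

A missing field is treated as not ready, so truncated or empty output does not pass the check.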

Procedure


Step 1

From the current vCenter, delete the cluster.

This is the vCenter cluster specified when the HX storage cluster was created.

Step 2

On the new vCenter, create a new cluster using the same cluster name.

Step 3

Add ESX hosts to new vCenter in the newly created cluster.


What to do next

Proceed to Unregistering a Storage Cluster from a vCenter Cluster.

Unregistering a Storage Cluster from a vCenter Cluster

This step is optional. It is recommended to leave the HX Data Platform Plug-in registration in place in the old vCenter.

Before you begin

As part of the task to move a storage cluster from one vCenter server to another vCenter server, complete the steps in Moving the Storage Cluster from a Current vCenter Server to a New vCenter Server.


Note

  • If multiple HX clusters are registered to the same vCenter, do not attempt this procedure until all HX clusters have been fully migrated to a different vCenter. Running this procedure is disruptive to any existing HX clusters registered to the vCenter.


Procedure


Step 1

Complete the steps in Unregistering and Removing EAM Extensions.

Note 

Newly deployed HX clusters starting with HyperFlex Release 4.0(1a) no longer leverage the vSphere ESX Agent Manager (EAM) for the HyperFlex Storage Controller VMs. HX clusters built prior to HX 4.0(1a) will continue to utilize EAM. If that cluster is migrated to a new vCenter, however, the EAM integration will not be configured.

This is the step that removes (unregisters) the HX cluster from the old vCenter server.

Also, if there are more ESX agencies than the number of HX clusters installed on the given vSphere server, it is likely there are stale EAM configurations that need cleanup.

Step 2

Complete the steps in Removing HX Data Platform Files from the vSphere Client.

Step 3

Complete the steps in Verifying HX Cluster is Unregistered from vCenter.


What to do next

Proceed to Registering a Storage Cluster with a New vCenter Cluster.

Unregistering and Removing EAM Extensions

If you have partially installed or uninstalled HX Data Platform, or unregistered an HX cluster where there are more agencies than the number of HX clusters installed on the given vSphere, a stale ESX Agent Manager (EAM) extension for HX Data Platform sometimes remains. Remove stale extensions using the Managed Object Browser (MOB) extension manager.

Before you begin

  • Download the vSphere ESX Agent Manager SDK, if you have not already done so.

  • If multiple HX clusters are registered to the same vCenter, do not attempt this procedure until all HX clusters have been fully migrated to a different vCenter. Running this procedure is disruptive to any existing HX clusters registered to the vCenter.

  • Remove the datacenter from your vSphere cluster.


Note

Newly deployed HX clusters starting with HyperFlex Release 4.0 no longer leverage the vSphere ESX Agent Manager (EAM) for the HyperFlex Storage Controller VMs. HX clusters built prior to HX 4.0 will continue to utilize EAM. If that cluster is migrated to a new vCenter, however, the EAM integration will not be configured.


Procedure

Step 1

Identify the HX cluster UUID.

Every agency has a field cluster_domain_id which refers to the underlying vSphere extension. This extension ID uses a Managed Object ID (moid).

If you have multiple HyperFlex clusters, ensure that you select the correct cluster ID to unregister.

From a storage controller VM command line, run the command:

# stcli cluster info | grep vCenterClusterId:
vCenterClusterId: domain-c26

Step 2

To unregister the storage cluster extension, log in to the vCenter server MOB extension manager.

First, unregister the HyperFlex cluster.

  1. In a browser, enter the path and command.

    https://vcenter_server/mob/?moid=ExtensionManager

    vcenter_server is the IP address of the vCenter where the storage cluster is currently registered.

  2. Enter administrator login credentials.

Step 3

Locate the HX storage cluster extensions with the cluster IDs. Scroll through the Properties > extensionList to locate the storage cluster extensions:

com.springpath.sysmgmt.cluster_domain_id and com.springpath.sysmgmt.uuid.cluster_domain_id.

Copy each of these strings into your clipboard. Exclude the double quotes (") on either end of string, if there are any.
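To avoid copy errors, the two extension keys can be derived from the cluster domain ID reported in Step 1. This Python sketch assumes the key patterns given in Step 3 (`com.springpath.sysmgmt.cluster_domain_id` and `com.springpath.sysmgmt.uuid.cluster_domain_id`). The function names are hypothetical, and the generated strings should still be compared against the entries in extensionList before anything is unregistered.

```python
import re

# Prefix taken from the extension names listed in Step 3.
EXTENSION_PREFIX = "com.springpath.sysmgmt"


def extract_cluster_domain_id(cluster_info_text):
    """Pull the vCenterClusterId value out of 'stcli cluster info' output."""
    match = re.search(r"vCenterClusterId:\s*(\S+)", cluster_info_text)
    if match is None:
        raise ValueError("vCenterClusterId not found in the supplied text")
    return match.group(1)


def extension_keys(cluster_domain_id):
    """Build the two extension keys expected for one HX cluster."""
    return [
        f"{EXTENSION_PREFIX}.{cluster_domain_id}",
        f"{EXTENSION_PREFIX}.uuid.{cluster_domain_id}",
    ]


if __name__ == "__main__":
    domain_id = extract_cluster_domain_id("vCenterClusterId: domain-c26")
    for key in extension_keys(domain_id):
        print(key)
```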

Step 4

Unregister each storage cluster extension.

  1. From the Methods table click UnregisterExtension.

  2. In the UnregisterExtension popup, enter an extension key value, com.springpath.sysmgmt.cluster_domain_id.

    For example: com.springpath.sysmgmt.domain-c26

  3. Click Invoke Method.

Step 5

To remove stale EAM extensions, log in to the vCenter server MOB ESX agencies extension manager.

Second, remove stale EAM extensions that were associated with the HyperFlex cluster.

  1. In a browser, enter the path and command.

    https://vcenter_server/eam/mob/

    vcenter_server is the IP address of the vCenter where the storage cluster is currently registered.

  2. Enter administrator login credentials.

Step 6

Locate the stale HX storage cluster ESX agency extensions with the cluster IDs.

  1. Scroll through the Properties > agency > Value.

  2. Click an agency value.

  3. In the Agency window, check the Properties > solutionID > Value extension. Verify that it has the correct cluster_domain_id.

    For example: com.springpath.sysmgmt.domain-c26

Step 7

Remove stale ESX agency extensions.

  1. From the Methods table in the Agency window, select a method.

    Stale ESX agencies can be removed using either the destroyAgency or the uninstall method.

  2. In the method popup, click Invoke Method.

Step 8

Refresh the ExtensionManager tab and verify that the extensionList entry does not include com.springpath.sysmgmt.cluster_domain_id extensions.

Step 9

Restart the vSphere Client services.

The HX Data Platform extensions are removed when the vSphere Client services are restarted. Restarting the vSphere client service temporarily disables access to vCenter through the browser.


Removing HX Data Platform Files from the vSphere Client

This task is a step in unregistering a HX Storage Cluster from vCenter.

Procedure

Remove the HX Data Platform files from the vSphere Client. Select a method.

Linux vCenter

  1. Log in to the Linux vCenter server using SSH as a root user.

  2. Change to the folder containing the HX Data Platform Plug-in folder.

    For vCenter 6.0

    # cd /etc/vmware/vsphere-client/vc-packages/vsphere-client-serenity/

    For vCenter 5.5

    # cd /var/lib/just/vmware/vsphere-client/vc-packages/vsphere-client-serenity/

  3. Remove the HX Data Platform Plug-in folder and files.

    # rm -rf com.springpath*

  4. Restart the vSphere Client.

    # service vsphere-client restart

Windows vCenter

  1. Log in to the Windows vCenter system command line using Remote Desktop Protocol (RDP).

  2. Change to the folder containing the HX Data Platform Plug-in folder.

    # cd "%PROGRAMDATA%\VMware\vSphere Web Client\vc-packages\vsphere-client-serenity"

  3. Remove the HX Data Platform Plug-in folder and files.

    # for /d %i in (com.springpath*) do rmdir /s /q "%i"

  4. Open the Service screen.

    # services.msc

  5. Restart the vSphere Web Client to log out of vCenter.

    # serviceLogout

Verifying HX Cluster is Unregistered from vCenter

This task is a step in unregistering a HX Storage Cluster from vCenter.

Verify that the HX cluster is no longer on the old vCenter.

Before you begin

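Complete the steps in Unregistering and Removing EAM Extensions and in Removing HX Data Platform Files from the vSphere Client.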

Procedure

Step 1

Clear your cache before logging back into vCenter.

Step 2

Log out of the old vCenter.

Step 3

Log in again to the old vCenter and verify the HX Data Platform Plug-in has been removed.


Registering a Storage Cluster with a New vCenter Cluster

Before you begin

Before attempting to register the HyperFlex cluster to vCenter, you must disable ESXi Lockdown mode on all ESXi hosts, and ensure SSH service is enabled and running.

As part of the task to move a storage cluster from one vCenter server to another vCenter server, complete the steps in Unregistering a Storage Cluster from a vCenter Cluster.

Procedure

Step 1

Log in to a controller VM.

Step 2

Run the stcli cluster reregister command.

stcli cluster reregister [-h] --vcenter-datacenter NEWDATACENTER --vcenter-cluster NEWVCENTERCLUSTER --vcenter-url NEWVCENTERURLIP [--vcenter-sso-url NEWVCENTERSSOURL] --vcenter-user NEWVCENTERUSER

Apply additional listed options as needed.

Syntax Description

Option                                Required or Optional   Description
--vcenter-cluster NEWVCENTERCLUSTER   Required               Name of the new vCenter cluster.
--vcenter-datacenter NEWDATACENTER    Required               Name of the new vCenter datacenter.
--vcenter-sso-url NEWVCENTERSSOURL    Optional               URL of the new vCenter SSO server. If not specified, this is inferred from --vcenter-url.
--vcenter-url NEWVCENTERURLIP         Required               URL of the new vCenter, <vcentername>, where <vcentername> can be the IP address or FQDN of the new vCenter.
--vcenter-user NEWVCENTERUSER         Required               User name of the new vCenter administrator.

Enter vCenter administrator password when prompted.

Example response:

Reregister StorFS cluster with a new vCenter ...
Enter NEW vCenter Administrator password:
Waiting for Cluster creation to finish ... 
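When the reregister command is assembled by a script, building it as an argument list avoids quoting mistakes in names that contain spaces. This is a minimal Python sketch that uses only the options listed in the Syntax Description. The function name `build_reregister_command` and all example values are hypothetical. The password is deliberately not included, because the command prompts for it.

```python
import shlex


def build_reregister_command(datacenter, cluster, url, user, sso_url=None):
    """Return the stcli cluster reregister argument list for the new vCenter."""
    required = {
        "--vcenter-datacenter": datacenter,
        "--vcenter-cluster": cluster,
        "--vcenter-url": url,
        "--vcenter-user": user,
    }
    missing = [flag for flag, value in required.items() if not value]
    if missing:
        raise ValueError("missing required options: " + ", ".join(missing))

    command = ["stcli", "cluster", "reregister"]
    for flag, value in required.items():
        command += [flag, value]
    if sso_url:
        # Optional: inferred from --vcenter-url when omitted.
        command += ["--vcenter-sso-url", sso_url]
    return command


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    example = build_reregister_command(
        "NewDatacenter",
        "New HX Cluster",
        "vcenter.example.com",
        "administrator@vsphere.local",
    )
    print(shlex.join(example))
```

`shlex.join` (Python 3.8 and later) quotes the cluster name so that the printed line can be pasted into the controller VM shell.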

If, after your storage cluster is re-registered, your compute-only nodes fail to register with EAM, or are not present in the EAM client and not under the resource pool in vCenter, run the command below to re-add the compute-only nodes:

# stcli node add --node-ips <computeNodeIP> --controller-root-password <ctlvm-pwd> --esx-username <esx-user> --esx-password <esx-pwd>

Contact TAC for assistance if required.

Step 3

Re-enter your snapshot schedules.

Snapshot schedules are not moved with the storage cluster when you move the storage cluster between vCenter clusters.

Step 4

(Optional) Once registration is successful, re-enable ESXi Lockdown mode if you disabled it prior to registering the HyperFlex cluster to vCenter.


Renaming Clusters

After you create a HX Data Platform storage cluster, you can rename it without disrupting any processes.


Note

These steps apply to renaming the HX Cluster, not the vCenter cluster.


Procedure


Step 1

From the vSphere Web Client Navigator, select vCenter Inventory Lists > Cisco HyperFlex Systems > Cisco HX Data Platform > cluster to rename.

Step 2

Open the Rename Cluster dialog box. Either right-click on the storage cluster or click the Actions drop-down list at the top of the tab.

Step 3

Select Rename Cluster.

Step 4

Enter a new name for the storage cluster in the text field.

HX cluster names cannot exceed 50 characters.

Step 5

Click OK to apply the new name.


Replacing Self-Signed Certificate

Replacing Self-Signed Certificate with External CA Certificate on a vCenter Server

Procedure


Set the certMgmt mode in vCenter to Custom to add ESXi hosts with third party certificates to vCenter.

Note 

By default, the certMgmt mode is vmca. In the default vmca mode, you can add only ESX hosts with self-signed certificates. If you try to add an ESX host with a CA certificate to vCenter, the host cannot be added unless the CA certificate is replaced with a self-signed certificate.

To update the certMgmt mode:

  1. Select the vCenter server that manages the hosts and click Settings.

  2. Click Advanced Settings, and click Edit.

  3. In the Filter box, enter certmgmt to display only certificate management keys.

  4. Change the value of vpxd.certmgmt.mode to custom and click OK.

  5. Restart the vCenter server service.

    To restart services, enter the following link in a browser and then press Enter:

    https://<VC URL>:5480/ui/services


Note

The behavior of host addition in vCenter varies according to the certificate and certMgmt mode.

  • When the host has a self-signed certificate and the certMgmt mode is set to the default value of vmca in vCenter:

    • Only ESX host with self-signed certificate can be added.

    • The addition of ESX with third party CA certificate is not allowed.

    • If you try to add an ESX to a vCenter after replacing the self-signed certificate with a third party CA certificate, the system will prompt you to replace third party CA certificate with self-signed certificate. You can add the ESX host after replacing CA certificate with self-signed certificate.

  • When the host has self-signed certificate with the certMgmt mode set to custom in vCenter:

    • If you try to add an ESX to a vCenter after replacing the self-signed certificate with a third party CA certificate, the system throws an error: ssl thumbprint mismatch and add host fails. In this case, do the following to replace the third party CA certificate with the self-signed certificate:

      1. Place the host in the maintenance mode (MM mode).

      2. Replace the certified rui.crt and rui.key files with the backed up previous key and certificate.

      3. Restart the hostd and vpxa service. The CA certificate comes up in the new node.

      4. Right-click the host and connect to vCenter. The host removes the CA certificate and replaces it with a self-signed certificate.

  • When the host has a third party CA certificate and the certMgmt mode is set to the default value of vmca in vCenter:

    • ESX host with self-signed certificate can be added.

    • The addition of ESX with third party CA certificate is not allowed.

  • When the host has third party CA certificate with the certMgmt mode set to custom in vCenter:

    • ESX host with self-signed certificate cannot be added.

    • The self-signed certificate in ESX host needs to be replaced with a CA certificate of vCenter.


Replacing Self-Signed Certificate with External CA Certificate on an ESXi Host

Procedure


Step 1

Generate the host certificate (rui.crt) and key (rui.key) files and send the files to the certificate authority.

Note 

Ensure that a proper hostname or FQDN of the ESX host is provided while generating the rui.key and rui.crt files.

Step 2

Replace the certified host certificate (rui.crt) and key (rui.key) files in the /etc/vmware/ssl directory on each ESXi host after taking backup of the original host certificate (rui.crt) and key (rui.key) files.

Note 

Replace the host certificate (rui.crt) and key (rui.key) files in a rolling fashion: place only one host in maintenance mode, replace its files, wait for the cluster to become healthy, and then repeat for the next node.

  1. Log in to the ESXi host from an SSH client with administrator privileges.

  2. Place the host in the maintenance mode (MM mode).

  3. Take a backup of the previous key and certificate to the rui.bak file in the /etc/vmware/ssl/ directory.

  4. Upload the new certified rui.crt and rui.key files to the /etc/vmware/ssl/ directory.

  5. Restart the hostd and vpxa service, and check the running status using the following commands:

    /etc/init.d/hostd restart
    /etc/init.d/vpxa restart
    /etc/init.d/hostd status
    /etc/init.d/vpxa status

  6. Reconnect the host to vCenter and exit the maintenance mode.

Note 

Repeat the same procedure on all the nodes. You can verify the certificate of each node by accessing it through web.
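As an alternative to checking each node in a browser, the certificate that a host presents can be fetched and fingerprinted with the Python standard library. This is a sketch under stated assumptions: the host serves its certificate on port 443, and you have the SHA-256 fingerprint of the issued certificate to compare against. The function names `format_fingerprint` and `fetch_fingerprint` are hypothetical.

```python
import hashlib
import ssl


def format_fingerprint(der_bytes):
    """Return the SHA-256 digest of DER bytes as colon-separated uppercase hex."""
    digest = hashlib.sha256(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fetch_fingerprint(host, port=443):
    """Fetch the certificate presented by host:port and fingerprint it.

    ssl.get_server_certificate does not validate the certificate here;
    it only retrieves it, which is what a comparison check needs.
    """
    pem = ssl.get_server_certificate((host, port))
    return format_fingerprint(ssl.PEM_cert_to_DER_cert(pem))
```

Calling `fetch_fingerprint` with an ESXi host name before and after the replacement should show the fingerprint changing to that of the CA-issued certificate.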


Reregistering a HyperFlex cluster

After replacing the certificate files and adding all the hosts to vCenter, reregister the HX cluster to vCenter using the following command:

stcli cluster reregister 

Note

Before attempting to register the HyperFlex cluster to vCenter, you must disable ESXi Lockdown mode on all ESXi hosts, and ensure SSH service is enabled and running. Once registration is successful, you may re-enable Lockdown mode.


Recreating a Self-Signed Certificate

If you face any issue with the host certificate after replacing external CA certificate, you can recreate the self-signed certificate by executing the following procedure:

  1. Log in to the ESXi host from an SSH client.

  2. Delete the rui.key and rui.crt files from the /etc/vmware/ssl/ directory.

  3. Recreate the self-signed certificate for the host using the following command:

    /sbin/generate-certificates

  4. Restart the hostd and vpxa services using the following commands:

    /etc/init.d/hostd restart
    /etc/init.d/vpxa restart

Boost Mode

Boost Mode allows the Cisco HyperFlex cluster to deliver higher IOPS by increasing the storage controller VM CPU resources by 4 vCPUs. Enabling Boost Mode takes additional CPU resources from user VMs for the HX Data Platform, and it should be enabled only in deployments where support has determined that the benefit of additional CPUs outweighs the impact to the sizing of your deployment. For more information about the CPUs supported by Boost Mode, see the Spec Sheets for Cisco HyperFlex HX220c M6 All NVMe, All Flash and Hybrid Server Nodes and Cisco HyperFlex HX240C M6 All NVMe, All Flash and Hybrid Server Nodes.

Configuring Boost Mode

Perform the following steps for each cluster you want to enable Boost Mode on:

Before you begin

Boost Mode Support is limited to the following configurations:

  • Supported Hardware:

    • All NVMe

    • All Flash C240

    • All Flash C220

  • Hypervisor: ESX only

  • Boost Mode number of controller VM vCPUs:

    • All NVMe: 16

    • All Flash C240: 12

    • All Flash C220: 12

  • Cluster expansion requires you to apply Boost Mode to the new nodes.

  • Boost Mode is supported in Cisco HX Release 4.0(2a) and later.

  • Boost Mode should be enabled after support has determined that your deployment will benefit from the additional CPUs.


Note

CPU: The number of physical cores must be at least equal to the new number of controller vCPUs. To verify the number of physical cores in the vSphere Client, click host > Configure > Hardware > Processors > Processor cores per socket.
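The vCPU counts and the core requirement can be captured in a small lookup for pre-checks. This Python sketch takes the Boost Mode values from the list above and the non-Boost values from the Disabling Boost Mode procedure. The names `BOOST_VCPUS`, `DEFAULT_VCPUS`, and `can_enable_boost` are hypothetical, and the physical core count is assumed to be supplied by the caller.

```python
# Controller VM vCPU counts per hardware type, as listed in this section.
BOOST_VCPUS = {"All NVMe": 16, "All Flash C240": 12, "All Flash C220": 12}
# Counts restored when Boost Mode is disabled.
DEFAULT_VCPUS = {"All NVMe": 12, "All Flash C240": 8, "All Flash C220": 8}


def can_enable_boost(hardware, physical_cores):
    """Return True if the host has at least as many cores as Boost Mode vCPUs."""
    if hardware not in BOOST_VCPUS:
        raise ValueError(f"Boost Mode is not listed for hardware: {hardware}")
    return physical_cores >= BOOST_VCPUS[hardware]
```

For example, `can_enable_boost("All Flash C220", 10)` is false because 12 controller vCPUs are required. In every listed case the Boost value is the default plus 4 vCPUs.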


Procedure


Step 1

From the vCenter, right-click one controller VM and Shut Down Guest OS.

Step 2

Increase the number of controller VM vCPUs to 16 for All NVMe, or to 12 for All Flash C220 and All Flash C240. In the vSphere Client, click Edit Settings for the VM and change the value of the CPU field in the first line.

Note 

Boost Mode number of controller VM vCPUs:

  • All NVMe: 16

  • All Flash C240: 12

  • All Flash C220: 12

Step 3

Click OK to apply the configuration changes.

Step 4

Power up the controller VM.

Step 5

Log in to HX Connect and wait for the cluster to become healthy.

Step 6

Repeat the process for each host (or node) in the cluster.


Disabling Boost Mode

To disable Boost Mode, perform the following steps:

Procedure


Step 1

From the vCenter, right-click one controller VM and Shut Down Guest OS.

Step 2

Decrease the number of controller VM vCPUs back to 12 for All NVMe, or to 8 for All Flash C220 and All Flash C240. In the vSphere Client, click Edit Settings for the VM and change the value of the CPU field in the first line.

Step 3

Click OK to apply the configuration changes.

Step 4

Power up the controller VM.

Step 5

Log in to HX Connect and wait for the cluster to become healthy.

Step 6

Repeat the process for each host (or node) in the cluster.