Install and Upgrade Issues
Deploy IP Addresses Page Lists Duplicate Servers
Description
During HX Data Platform deploy, the IP addresses page lists the same servers twice.
Action: Select only one from each pair
This might occur when UCS Manager configuration is skipped and HX Data Platform references both UCS Manager and the imported JSON file. Select only one of each pair of IP addresses.
Installation Fails when FIs Are Manually Rebooted
Description
Installation fails when FIs are manually rebooted during deploy.
Action: Reboot the HX Data Platform Installer
Procedure
Step 1 |
Reboot the HX Data Platform installer VM. |
Step 2 |
Restart the deployment. |
During UCS Manager Only Upgrade, Controller VM Might Not Power On
Description
During a UCS Manager only upgrade, the controller VM might not power on after the node exits maintenance mode.
Action: Restart the EAM service on vCenter
The VMware vCenter EAM service will not automatically power on the controller VM. The controller VM will be outside of the EAM resource pool.
Note |
Newly deployed HX clusters starting with HyperFlex Release 4.0(1a) no longer leverage the vSphere ESX Agent Manager (EAM) for the HyperFlex Storage Controller VMs. HX clusters built prior to HX 4.0(1a) will continue to utilize EAM. However, if that cluster is migrated to a new vCenter the EAM integration will not be configured. Users should contact TAC for help removing EAM. |
-
Restart the EAM service on the vCenter by running
/etc/init.d/vmware-eam restart
EAM should re-scan all the EAM agent VMs and resolve all issues on these VMs, including powering on the Controller VM.
Deploy or Upgrade Fails with Error: 'NoneType' object has no attribute 'scsiLun'
Description
Deployment or upgrade fails with error: 'NoneType' object has no attribute 'scsiLun'
Action: Disconnect and reconnect
This is a VMware issue. Disconnect the hosts from vCenter and reconnect them.
Important |
Do not remove the node from the cluster. This is a disconnect only. |
Upgrade Fails to Enter Maintenance Mode
Description
Upgrade fails because a node failed to enter maintenance mode.
Actions: Restart vmware-vpxd service
If all other validations are successful then this might be a VMware issue, where the VMware VPXD crashed.
Procedure
Step 1 |
Ensure that VPXD restarted, and if not, manually restart it from the ESX command line.
|
Step 2 |
Retry the upgrade. Enter Maintenance Mode should succeed. |
Upgrade Fails at vMotion Compatibility Validation
Description
Retrying the upgrade fails at the vMotion compatibility validation.
Action: Rescan storage system from host
This is due to a sync issue between vCenter and ESXi.
Rescan the storage system on the ESX host using a vCenter client.
See VMware article, Perform Storage Rescan in the vSphere Client, at
https://docs.vmware.com/en/VMware-vSphere/6.0/com.vmware.vsphere.hostclient.doc/GUID-FA49E8EF-A3DC-46B8-AA5B-051C80762642.html
Upgrade VM Power On Error: No compatible host was found
Description
While attempting an upgrade, VM fails to power on with error: No compatible host was found
Action: Manually power on the VM
Procedure
Step 1 |
From ESX command line, power on the VM. |
Step 2 |
Using controller VM command line, run command
|
During Upgrade when two Nodes Fail, Controller VMs Power On Fail
Description
If, during an upgrade, two nodes fail, the upgrade fails because controller VMs do not power on.
Action: Restart EAM Service
Procedure
Step 1 |
Restart the vCenter EAM service. From the ESX command line:
|
Step 2 |
Proceed with the upgrade. |
Upgrade with Pre-6.5 vCenter Groups Some Controller VMs
Description
After upgrading HX Data Platform using a vCenter version older than 6.5, some controller VMs are listed in a resource pool labeled ESX Agents.
Action: None required
No action is required. There is no functionality impact. All the virtual machines, including controller VMs are EAM registered and remain in the HX Cluster. All HX Cluster operations work as expected.
If you need to perform group operations, from the vCenter interface, drag and drop the controller VMs into ESX Agents resource pools.
A Node Fails to Upgrade due to vCenter Issues
Description
Sometimes during an online upgrade the vCenter daemon crashes on a node. When this happens, the node cannot enter HX maintenance mode. Without entering HX maintenance mode, the node cannot complete the upgrade. All other nodes, with properly functioning vCenter, complete the upgrade.
Action: Re-run the Upgrade on the Affected Node
Procedure
Step 1 |
Correct the vCenter issue. |
Step 2 |
Restart the upgrade from any node in the cluster. HX Data Platform skips any node that is already upgraded and moves on to complete the upgrade on any node that was previously missed. |
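The retry behavior in Step 2 can be pictured with a minimal Python sketch. It only illustrates the skip-and-continue idea and is not HX Data Platform's actual implementation; the node names, version strings, and the `upgrade_node` callback are hypothetical.

```python
from typing import Callable, Dict, List


def resume_upgrade(
    node_versions: Dict[str, str],
    target_version: str,
    upgrade_node: Callable[[str], None],
) -> List[str]:
    """Upgrade only the nodes not yet at target_version; return the nodes touched."""
    upgraded_now = []
    for node, version in node_versions.items():
        if version == target_version:
            # Already upgraded in the earlier run: skip it.
            continue
        upgrade_node(node)  # hypothetical per-node upgrade step
        node_versions[node] = target_version
        upgraded_now.append(node)
    return upgraded_now
```

Because nodes already at the target version are skipped, running the function a second time on the same cluster state does nothing, which is why the retry can safely be started from any node.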
HX Data Platform Installer Shows Host Managed by Different vCenter
Description
HX Data Platform Installer shows that a host is managed by a different vCenter.
When a host is removed from vCenter, this typically removes the managementServerIP from the host summary information.
If host services were not running when the host was removed, vCenter continues to show the host.
Action: Reboot vCenter
Reboot vCenter and the host should no longer be listed with vCenter.
Configuration Settings Differ Between HX Data Platform and UCS Manager
Description
During the installation, upgrade, and storage cluster expansion processes, the HX Data Platform installer verifies the configuration settings entered against the settings in UCS Manager. Mismatches can occur, for example, in the following scenarios:
-
Sometimes by the time the task is ready to apply the validations and configurations, a previously unassociated server is no longer unassociated. These servers need to be disassociated.
-
You are using servers that were previously associated with an HX Data Platform storage cluster. These servers need to be disassociated.
-
Manually entered existing storage cluster configuration information is prone to errors. The information, such as VLAN IDs and LAN configuration, needs to match the information listed in UCS Manager. Use a previously saved configuration file to import the configuration.
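The verification described above amounts to comparing two sets of key/value settings. The following Python sketch shows one way to list mismatches before a deployment is attempted. The setting names (`mgmt_vlan_id`, `data_vlan_id`) are hypothetical placeholders, and the sketch does not read from UCS Manager or from the installer; both dictionaries are assumed to be supplied by the user.

```python
from typing import Any, Dict, Tuple


def find_mismatches(
    entered: Dict[str, Any], ucsm: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (entered_value, ucsm_value)} for every setting that differs."""
    mismatches = {}
    for key in sorted(set(entered) | set(ucsm)):
        # A key missing on one side is reported with None for that side.
        if entered.get(key) != ucsm.get(key):
            mismatches[key] = (entered.get(key), ucsm.get(key))
    return mismatches


if __name__ == "__main__":
    entered = {"mgmt_vlan_id": 10, "data_vlan_id": 20}  # hypothetical values
    ucsm = {"mgmt_vlan_id": 10, "data_vlan_id": 21}     # hypothetical values
    for key, (mine, theirs) in find_mismatches(entered, ucsm).items():
        print(f"{key}: entered={mine} ucsm={theirs}")
```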
Action: Import existing configuration
When installation, upgrade, or storage cluster expansion completes, an option to Save Configuration is available. Use this option to save the cluster configuration information, then import the file with the saved configuration details when you need to make changes to the storage cluster.
Action: Disassociate the Server
See the Cisco HyperFlex Systems Getting Started Guide for steps on disassociating a server through UCS Manager. Briefly, the steps include:
Procedure
Step 1 |
From the UCS Manager, select . |
Step 2 |
Confirm the node is disassociating, select . The transition state is removing. |
Step 3 |
Confirm the node completes the disassociation. Wait until the Assoc State is none. Do not select a node whose Assoc State is removing. |
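The wait in Step 3 is a simple polling loop. The Python sketch below shows the idea under stated assumptions: `get_assoc_state` is a hypothetical callable that returns the current Assoc State as text (how it obtains the value from UCS Manager is left open), and the timeout and poll interval are arbitrary defaults, not values from Cisco documentation.

```python
import time
from typing import Callable


def wait_for_disassociation(
    get_assoc_state: Callable[[], str],
    timeout_s: float = 600.0,
    poll_interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the Assoc State is 'none'; return False if the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_assoc_state().strip().lower()
        if state == "none":
            return True  # disassociation finished; the node can be selected
        if time.monotonic() >= deadline:
            return False  # still 'removing' (or another state) at timeout
        sleep(poll_interval_s)
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.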
Cluster Creation Fails with DNS Error due to FQDN
Description
Sometimes when a Fully Qualified Domain Name (FQDN) is provided to identify objects in the storage cluster, cluster creation fails. This is typically because the Domain Name Service (DNS) server specified is unavailable.
This applies to all possible domain name objects that are entered for any HX Data Platform Installer object that is identified by the domain name or IP address. This can include: vCenter server, ESX servers, controller VM addresses, storage cluster management or data network addresses, DNS servers, NTP servers, mail servers, or SSO servers.
Action: Verify the DNS server
Procedure
Step 1 |
Log in to the command line of the HX Data Platform Installer VM. For example, use
|
Step 2 |
Verify that the DNS servers provided work. |
Step 3 |
Verify each object needed to create the cluster can be resolved from the provided DNS server. These objects are provided through either a JSON file or the HX DP Installer GUI fields. |
Step 4 |
If either Step 2 or Step 3 cannot be verified, then use IP addresses only instead of Fully Qualified Domain Names (FQDN) in the HX Data Platform Installer GUI. |
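Step 3 can be scripted as a batch name-resolution check. The sketch below is a minimal Python example that uses the standard `socket.getaddrinfo` call, which queries whatever resolver the machine running the script is configured with; it does not target a specific DNS server, so it assumes the script runs on a host that uses the same DNS servers given to the installer. The host names in the usage example are placeholders.

```python
import socket
from typing import Callable, Dict, Iterable, List, Optional


def resolve(name: str) -> Optional[str]:
    """Resolve name with the system resolver; return an address string or None."""
    try:
        # Each result is (family, type, proto, canonname, sockaddr).
        return socket.getaddrinfo(name, None)[0][4][0]
    except socket.gaierror:
        return None


def check_names(
    names: Iterable[str],
    resolver: Callable[[str], Optional[str]] = resolve,
) -> Dict[str, Optional[str]]:
    """Map every name to its resolved address, or None if resolution failed."""
    return {name: resolver(name) for name in names}


def unresolved(results: Dict[str, Optional[str]]) -> List[str]:
    """Names that failed to resolve; if this is non-empty, Step 4 applies."""
    return sorted(name for name, addr in results.items() if addr is None)


if __name__ == "__main__":
    # Placeholder names; substitute the objects from the JSON file or GUI fields.
    results = check_names(["vcenter.example.com", "esx1.example.com"])
    print(unresolved(results))
```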
Offline Upgrade Cluster Start Command Error: Node Not Available
Description
After an offline upgrade, due to a VMware EAM issue, sometimes not all of the controller VMs restart. The stcli cluster start command returns an error: Node not available.
Action: Manually power on the controller VMs, then start the storage cluster.
Procedure
Step 1 |
Manually power on the controller VMs. |
Step 2 |
Restart the storage cluster. |
vSphere Replication Plug-in Fails after HX Plug-in Deployed
Description
This error occurs when the vSphere Replication plug-in is installed after the HX Data Platform plug-in. The recommended order is to install the vSphere Replication plug-in first, then install the HX Data Platform plug-in.
Action: Unregister the HX Data Platform plug-in
This task removes the HX extensions from the vCenter Managed Object Browser (MOB).
Before you begin
-
Remove the vSphere Replication plug-in from the vCenter MOB.
-
Remove the vSphere Replication virtual machine from the vCenter inventory.
-
Remove the HX vCenter cluster from the vCenter datacenter.
Procedure
Step 1 |
Download the vSphere ESX Agent Manager SDK, if you have not already done so. |
Step 2 |
Remove the HyperFlex cluster object from vCenter. |
Step 3 |
Log in to the vCenter server MOB extension manager. |
Step 4 |
From the vCenter server MOB extension manager, view the MOB and the extension associated with the removed cluster. |
Step 5 |
Unregister the extension from the ExtensionManager page. |
Step 6 |
If the removed cluster was the CIP that vCenter used for communicating with the HX Data Platform plug-in, restart the vsphere-client services. |
Step 7 |
If the CIP located in the previous step is associated with the cluster that you removed from vCenter, the extension needs to be cleaned up. |
Step 8 |
Log out of all sessions and log back in. |
What to do next
-
Recreate the datacenter cluster. Add the hosts to the HX vCenter cluster, one at a time.
-
Re-register the vSphere Replication virtual machine from the datastore.
-
From the vSphere Replication appliance web front end, recreate the vSphere Replication plug-in. Verify the vSphere Replication plug-in is available in vCenter.
-
From the HX Data Platform Installer, reinstall the HX Data Platform plug-in and recreate the storage cluster.
Upgrade Fails but Reports All Nodes Up-to-Date
Description
This is due to a RemoteException sent by vCenter and is most likely caused by intermittent network connectivity between the HX storage cluster and vCenter.
Action: Retry the upgrade
Restarting Online Upgrade Fails
Description
In some rare cases, restarting an online upgrade on an HX storage cluster whose previous upgrade is in a failed state might fail again, even though the HX cluster recovered from the failure and is in a healthy state.
Action: Retry the upgrade
When retrying the upgrade from the CLI, use the -f or --force option with the stcli cluster upgrade command, or use the HX Data Platform Plug-in to retry the upgrade.
Controller VM Fails to Power On During Cisco UCS Upgrade
Description
Sometimes when vSphere is exiting maintenance mode, all the VMs on the server do not power on. This can include the storage controller VM.
Action: Manually restart the controller VM
This is a known VMware issue. For more information, see VMware KB article - Auto-Start Is Not Run When Manually Restarting a Host in Maintenance Mode.
Firmware Upgrade Fails from Server Storage Controller with Unsupported Board
Description
Upgrading the UCS firmware failed. A possible reason is the use of an unsupported board in the HX server.
Action: Decommission then recommission the board.
Procedure
Step 1 |
Decommission and then recommission the referenced board. |
Step 2 |
Verify that the server is healthy. |
Step 3 |
Retry the firmware upgrade. |
Step 4 |
If this does not resolve the issue, contact Cisco TAC for more assistance. |
Upgrade Stalls Waiting for Node to Return to Healthy State
Description
If your LSI version is older than version 9, sometimes the disks are not found during an upgrade on the node. If the node is not healthy, the upgrade cannot proceed.
LSI version 9 is associated with UCS firmware version 2.2(6f) and 2.2(7c).
Action: Reboot the node manually.
Procedure
Step 1 |
Log in to the controller VM command line. For example, using |
Step 2 |
Verify the disks are showing. Run the
|
Step 3 |
Reboot the node manually. |
Cluster Expansion Error: No cluster found
Description
From the HX Data Platform Expand Cluster wizard, the HX storage cluster was not discovered.
Action: Manually enter cluster IP address
Manually enter the HX storage cluster management IP address in the Management IP Address field of the Expand Cluster wizard.
To locate the cluster IP address:
Procedure
Step 1 |
From the vSphere Web Client, select . |
Step 2 |
Click-select the storage cluster name. From the Action Menu at the top of the panel, select Summary. |
Step 3 |
Locate the Cluster Management IP Address in the Summary display. |
Cluster Expansion Fails when DNS Server is Not Found
Description
Expanding a storage cluster requires a DNS server, even if you specify the new node using an IP address and are not using a FQDN. The HX Data Platform Installer checks for any DNS servers that were provided during cluster creation.
-
If any of the previously provided DNS servers are not reachable, cluster expansion fails.
-
If you did not specify a DNS server when you installed HX Data Platform, cluster expansion fails.
If either of these conditions apply, perform the corrective action.
Action: Identify and Provide Correct DNS Servers
Procedure
Step 1 |
Log in to the command line of any HX controller VM. For example, use |
Step 2 |
Identify the DNS servers configured for the storage cluster.
Sample Response
If no DNS addresses are listed, skip to Step 4. |
Step 3 |
Remove all DNS servers that are no longer available to the storage cluster.
|
Step 4 |
Add any DNS servers that are new to the storage cluster. If you did not specify a DNS server when you created the storage cluster, add a fake DNS server.
|
Step 5 |
Verify each object needed to create the cluster can be resolved from the provided DNS server. These objects are provided through either a JSON file or the HX DP Installer GUI fields. |
Step 6 |
Verify that the DNS servers provided work. |
Step 7 |
Repeat Step 5 and Step 6 to verify every added DNS server is valid and every HXDP object can be resolved through each DNS server. |
Step 8 |
Return to the HX Data Platform Installer and proceed with the storage cluster expansion. |
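Steps 3 and 4 reduce to computing two lists: the servers to remove and the servers to add. The Python sketch below shows that bookkeeping only. It does not call any HX command; `is_reachable` is a hypothetical predicate that the caller supplies (for example, a wrapper around a DNS query), and the addresses in the usage example are placeholders.

```python
from typing import Callable, Iterable, List, Tuple


def plan_dns_changes(
    configured: Iterable[str],
    desired: Iterable[str],
    is_reachable: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Return (to_remove, to_add) for the cluster's DNS server list."""
    configured = list(configured)
    # Step 3: drop configured servers that are no longer available.
    to_remove = [server for server in configured if not is_reachable(server)]
    # Step 4: add servers that are new to the storage cluster.
    to_add = [server for server in desired if server not in configured]
    return to_remove, to_add


if __name__ == "__main__":
    reachable = {"10.1.1.1", "10.1.1.3"}  # placeholder result of earlier checks
    print(plan_dns_changes(
        ["10.1.1.1", "10.1.1.2"],
        ["10.1.1.1", "10.1.1.3"],
        reachable.__contains__,
    ))
```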
Expand Cluster Fails with Stale HX Installer
Description
An expand cluster node is added to an incorrect cluster. This happens when the same HX Data Platform Installer is used to create multiple clusters, then that same HX DP installer is used to expand one of the clusters. The HX DP installer defaults to adding the node to the most recent cluster.
Action: Redeploy HX Data Platform Installer OVA
Procedure
Step 1 |
Redeploy the HX Data Platform Installer OVA. |
Step 2 |
Use the new HX Data Platform Installer to expand the cluster. |
Installation Fails when Secure Boot is Enabled
Description
Installation fails when ESXi is redeployed on nodes with secure boot enabled. During install or expansion, if the nodes are a mix of ESXi version 7.0 U2 and earlier versions, the hypervisor configuration phase may fail with the following error:
Action
Upgrade all nodes in your cluster to ESXi version 7.0 U2 or higher and retry.
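To find which nodes still need the ESXi upgrade, the version strings can be compared programmatically. The following Python sketch assumes versions are written in the "major.minor Uupdate" form used in this section (for example, "7.0 U2"); other formats, such as build numbers, are not handled. The node names are hypothetical, and the version list is assumed to be collected by the user.

```python
import re
from typing import Dict, List, Tuple

# Matches strings such as "7.0 U2" or "6.7"; the update part is optional.
_VERSION_RE = re.compile(r"^\s*(\d+)\.(\d+)(?:\s*U(\d+))?", re.IGNORECASE)


def parse_esxi_version(text: str) -> Tuple[int, int, int]:
    """Turn a string such as '7.0 U2' into a comparable (major, minor, update) tuple."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized ESXi version: {text!r}")
    major, minor, update = match.groups()
    return int(major), int(minor), int(update or 0)


def nodes_below(versions: Dict[str, str], minimum: str = "7.0 U2") -> List[str]:
    """Return the nodes whose ESXi version is older than minimum."""
    floor = parse_esxi_version(minimum)
    return sorted(
        node for node, version in versions.items()
        if parse_esxi_version(version) < floor
    )
```

An empty result means every listed node is already at the minimum version or later, so the mixed-version condition described above does not apply.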