Blade Server Hardware Management

Blade Server Management

You can manage and monitor all blade servers in a Cisco UCS domain through Cisco UCS Manager. You can perform some blade server management tasks, such as changes to the power state, from the server and service profile.

The remaining management tasks can only be performed on the server.

The power supply units go into power save mode when a chassis has two blades or fewer. When a third blade is added to the chassis and is fully discovered, the power supply units return to regular mode.
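The power save rule can be written as a simple predicate. The following is a minimal sketch for illustration only; the function name `psu_mode` and the constant are hypothetical and are not part of any Cisco UCS Manager interface.

```python
# Illustrative model of the power save rule; not a Cisco UCS Manager API.
POWER_SAVE_MAX_BLADES = 2  # two blades or fewer keeps the PSUs in power save mode


def psu_mode(discovered_blades: int) -> str:
    """Return the expected PSU mode for a chassis, given the number of
    fully discovered blades it contains."""
    if discovered_blades < 0:
        raise ValueError("blade count cannot be negative")
    if discovered_blades <= POWER_SAVE_MAX_BLADES:
        return "power-save"
    return "regular"
```

For example, `psu_mode(2)` yields `"power-save"`, and `psu_mode(3)` yields `"regular"` once the third blade is fully discovered.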

If a blade server slot in a chassis is empty, Cisco UCS Manager provides information, errors, and faults for that slot. You can also re-acknowledge the slot to resolve server mismatch errors and to have Cisco UCS Manager rediscover the blade server in the slot.

Guidelines for Removing and Decommissioning Blade Servers

Consider the following guidelines when deciding whether to remove or decommission a blade server using Cisco UCS Manager:

Decommissioning a Blade Server

If you want to temporarily take a physically present and connected blade server out of service, you can decommission it, which temporarily removes it from the configuration. A portion of the server's information is retained by Cisco UCS Manager for future use, in case the blade server is recommissioned.

Removing a Blade Server

Removal applies when you physically take a blade server out of the Cisco UCS domain by disconnecting it from the chassis. You cannot remove a blade server from Cisco UCS Manager if it is physically present and connected to a chassis. After the physical removal of the blade server is completed, the configuration for that blade server can be removed in Cisco UCS Manager.

During removal, active links to the blade server are disabled, all entries from databases are removed, and the server is automatically removed from any server pools that it was assigned to during discovery.


Note

Only servers added to a server pool automatically during discovery are removed automatically. Servers that were manually added to a server pool must be removed manually.


To add a removed blade server back to the configuration, it must be reconnected, then rediscovered. When a server is reintroduced to Cisco UCS Manager, it is treated as a new server and is subject to the deep discovery process. For this reason, it is possible for Cisco UCS Manager to assign the server a new ID that might be different from the ID that it held before.
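The removal rules above can be sketched as a small model. This is an illustration under stated assumptions: the `BladeServer` record and both helper functions are hypothetical and do not mirror Cisco UCS Manager objects.

```python
from dataclasses import dataclass, field


@dataclass
class BladeServer:
    """Hypothetical record used only for this illustration."""
    physically_present: bool
    auto_pools: set = field(default_factory=set)    # pools joined during discovery
    manual_pools: set = field(default_factory=set)  # pools joined by an administrator


def can_remove(server: BladeServer) -> bool:
    """A server can be removed from the configuration only after it has
    been physically disconnected from the chassis."""
    return not server.physically_present


def pools_after_removal(server: BladeServer) -> set:
    """Return the pool memberships that remain after removal. Automatic
    memberships are cleaned up; manual ones must be removed by hand."""
    if not can_remove(server):
        raise ValueError("server is still present; decommission it instead")
    return set(server.manual_pools)
```

A server that is still seated in the chassis fails `can_remove`, which corresponds to the case where decommissioning is the appropriate action.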

Recommendations for Avoiding Unexpected Server Power Changes

If a server is not associated with a service profile, you can use any available means to change the server power state, including the physical Power or Reset buttons on the server.

If a server is associated with, or assigned to, a service profile, you should only use the following methods to change the server power state:

  • In Cisco UCS Manager GUI, go to the General tab for the server or the service profile associated with the server and select Boot Server or Shutdown Server from the Actions area.

  • In Cisco UCS Manager CLI, scope to the server or the service profile associated with the server and use the power up or power down commands.


Important

Do not use any of the following options on an associated server that is currently powered off:

  • Reset in the GUI

  • cycle cycle-immediate or reset hard-reset-immediate in the CLI

  • The physical Power or Reset buttons on the server


If you reset, cycle, or use the physical power buttons on a server that is currently powered off, the server's actual power state might become out of sync with the desired power state setting in the service profile. If the communication between the server and Cisco UCS Manager is disrupted or if the service profile configuration changes, Cisco UCS Manager might apply the desired power state from the service profile to the server, causing an unexpected power change.
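As a rough illustration of the warning above, the following sketch flags the combinations that can desynchronize the power state. The action labels are hypothetical names chosen for this example; they are not Cisco UCS Manager identifiers.

```python
# Hypothetical labels for the options listed under "Important".
RISKY_WHEN_POWERED_OFF = {
    "gui-reset",
    "cli-cycle-immediate",
    "cli-hard-reset-immediate",
    "physical-button",
}


def may_desync(associated: bool, powered_on: bool, action: str) -> bool:
    """Return True if the action could leave the actual power state out of
    sync with the desired power state in the service profile."""
    if not associated:
        return False  # unassociated servers have no desired power state
    if powered_on:
        return False  # the warning applies only to powered-off servers
    return action in RISKY_WHEN_POWERED_OFF
```

Only the combination of an associated server, a powered-off state, and one of the listed options returns `True`.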

Power synchronization issues can lead to an unexpected server restart, as shown below:

Desired Power State     Current Server     Server Power State After
in Service Profile      Power State        Communication Is Disrupted

Up                      Powered Off        Powered On

Down                    Powered On         Powered On

Note

Running servers are not shut down regardless of the desired power state in the service profile.
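The table can be modeled as a small function. This is a sketch, not Cisco UCS Manager logic: the two rows shown in the table are taken from the text, while the remaining combinations (desired Up with a running server, desired Down with a powered-off server) are assumptions that simply leave the server in its current state.

```python
def power_state_after_resync(desired: str, current: str) -> str:
    """Return the server power state ("on" or "off") after the desired
    power state from the service profile is reapplied.

    desired: "up" or "down" (service profile setting)
    current: "on" or "off" (actual server state)
    """
    if desired not in ("up", "down") or current not in ("on", "off"):
        raise ValueError("unexpected power state value")
    if current == "on":
        return "on"  # running servers are not shut down
    # Powered-off server: it is powered on only if the desired state is up.
    return "on" if desired == "up" else "off"
```

The first branch reflects the note that running servers are never shut down; the second reflects the unexpected power-on in the first table row.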

Booting a Blade Server

If the Boot Server link is dimmed in the Actions area, you must shut down the server first.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server that you want to boot.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Boot Server.

Step 6

If a confirmation dialog box displays, click Yes.


After the server boots, the Overall Status field on the General tab displays an OK status.

Booting a Server from the Service Profile

Procedure


Step 1

In the Navigation pane, click Servers.

Step 2

Expand Servers > Service Profiles.

Step 3

Expand the node for the organization that contains the service profile.

If the system does not include multitenancy, expand the root node.

Step 4

Choose the service profile that requires the associated server to boot.

Step 5

In the Work pane, click the General tab.

Step 6

In the Actions area, click Boot Server.

Step 7

If a confirmation dialog box displays, click Yes.

Step 8

Click OK in the Boot Server dialog box.

After the server boots, the Overall Status field on the General tab displays an ok status or an up status.


Determining the Boot Order of a Blade Server


Tip

You can also view the boot order tabs from the General tab of the service profile associated with a server.


Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Click the server for which you want to determine the boot order.

Step 4

In the Work pane, click the General tab.

Step 5

If the Boot Order Details area is not expanded, click the Expand icon to the right of the heading.

Step 6

To view the boot order assigned to the server, click the Configured Boot Order tab.

Step 7

To view what will boot from the various devices in the physical server configuration, click the Actual Boot Order tab.

Note 

The Actual Boot Order tab always shows "Internal EFI Shell" at the bottom of the boot order list.


Shutting Down a Blade Server

When you use this procedure to shut down a server with an installed operating system, Cisco UCS Manager triggers the OS into a graceful shutdown sequence.

If the Shutdown Server link is dimmed in the Actions area, the server is not running.


Note

When a blade server that is associated with a service profile is shut down, the VIF down alerts F0283 and F0479 are automatically suppressed.


Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server that you want to shut down.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Shutdown Server.

Step 6

If a confirmation dialog box displays, click Yes.


After the server has been successfully shut down, the Overall Status field on the General tab displays a power-off status.

Shutting Down a Server from the Service Profile

When you use this procedure to shut down a server with an installed operating system, Cisco UCS Manager triggers the OS into a graceful shutdown sequence.

If the Shutdown Server link is dimmed in the Actions area, the server is not running.

Procedure


Step 1

In the Navigation pane, click Servers.

Step 2

Expand Servers > Service Profiles.

Step 3

Expand the node for the organization that contains the service profile.

If the system does not include multitenancy, expand the root node.

Step 4

Choose the service profile that requires the associated server to shut down.

Step 5

In the Work pane, click the General tab.

Step 6

In the Actions area, click Shutdown Server.

Step 7

If a confirmation dialog box displays, click Yes.


After the server successfully shuts down, the Overall Status field on the General tab displays a down status or a power-off status.

Resetting a Blade Server

When you reset a server, Cisco UCS Manager sends a pulse on the reset line. You can choose to gracefully shut down the operating system. If the operating system does not support a graceful shutdown, the server is power cycled. The option to have Cisco UCS Manager complete all management operations before it resets the server does not guarantee the completion of these operations before the server is reset.


Note

If you are trying to boot a server from a power-down state, you should not use Reset.

If you continue the power-up with this process, the desired power state of the servers becomes out of sync with the actual power state, and the servers might unexpectedly shut down at a later time. To safely reboot the selected servers from a power-down state, click Cancel, then select the Boot Server action.


Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server that you want to reset.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Reset.

Step 6

In the Reset Server dialog box, do the following:

  1. Click the Power Cycle option.

  2. (Optional) Check the check box if you want Cisco UCS Manager to complete all management operations that are pending on this server.

  3. Click OK.


The reset may take several minutes to complete. After the server has been reset, the Overall Status field on the General tab displays an ok status.

Resetting a Blade Server to Factory Default Settings

You can reset a blade server to its factory settings. By default, the factory reset operation does not affect storage drives and FlexFlash drives, which prevents any loss of data. However, you can choose to reset these devices to a known state as well.


Important

Resetting storage devices will result in loss of data.


Perform the following procedure to reset the server to factory default settings.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server that you want to reset to its factory default settings.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Server Maintenance.

Step 6

In the Maintenance dialog box, do the following:

  1. Click Reset to Factory Default.

  2. Click OK.

Step 7

From the Maintenance Server dialog box that appears, select the appropriate options:

  • To delete all storage, check the Scrub Storage checkbox.

  • To place all disks into their initial state after deleting all storage, check the Create Initial Volumes checkbox.

    You can check this checkbox only if you check the Scrub Storage checkbox. For servers that support JBOD, the disks will be placed in a JBOD state. For servers that do not support JBOD, each disk will be initialized with a single R0 volume that occupies all the space in the disk.

    Important 

    Do not check the Create Initial Volumes box if you want to use storage profiles. Creating initial volumes when you are using storage profiles may result in configuration errors.

  • To delete all flexflash storage, check the Scrub FlexFlash checkbox.

Cisco UCS Manager resets the server to its factory default settings.
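The dependencies between the options in Step 7 can be sketched as a validation routine. This is an illustration only: the `FactoryResetOptions` class, its field names, and the `uses_storage_profiles` flag are hypothetical and are not part of Cisco UCS Manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactoryResetOptions:
    """Hypothetical stand-in for the check boxes in the dialog box."""
    scrub_storage: bool = False
    create_initial_volumes: bool = False
    scrub_flexflash: bool = False
    uses_storage_profiles: bool = False  # assumption: known to the caller


def validate(options: FactoryResetOptions) -> list:
    """Return a list of problems with the selected options (empty if none)."""
    problems = []
    if options.create_initial_volumes and not options.scrub_storage:
        problems.append("Create Initial Volumes requires Scrub Storage")
    if options.create_initial_volumes and options.uses_storage_profiles:
        problems.append("Create Initial Volumes may conflict with storage profiles")
    return problems


def initial_disk_state(supports_jbod: bool) -> str:
    """Describe the disk state after initial volumes are created."""
    return "JBOD" if supports_jbod else "single R0 volume per disk"
```

An empty list from `validate` means the selection is consistent with the constraints described in this procedure.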


Reacknowledging a Blade Server

Perform the following procedure to rediscover the server and all endpoints in the server. For example, you can use this procedure if a server is stuck in an unexpected state, such as the discovery state.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server that you want to acknowledge.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Server Maintenance.

Step 6

In the Maintenance dialog box, click Re-acknowledge, then click OK.

Cisco UCS Manager disconnects the server and then builds the connections between the server and the fabric interconnect or fabric interconnects in the system. The acknowledgment may take several minutes to complete. After the server has been acknowledged, the Overall Status field on the General tab displays an OK status.


Removing a Server from a Chassis

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server that you want to remove from the chassis.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Server Maintenance.

Step 6

In the Maintenance dialog box, click Decommission, then click OK.

The server is removed from the Cisco UCS configuration.

Step 7

Go to the physical location of the chassis and remove the server hardware from the slot.

For instructions on how to remove the server hardware, see the Cisco UCS Hardware Installation Guide for your chassis.


What to do next

If you physically re-install the blade server, you must re-acknowledge the slot for Cisco UCS Manager to rediscover the server.

For more information, see Reacknowledging a Server Slot in a Chassis.

Deleting the Inband Configuration from a Blade Server

This procedure removes the inband management IP address configuration from a blade server. If this action is greyed out, no inband configuration was completed.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers > Server Name.

Step 3

In the Work area, click the Inventory tab.

Step 4

Click the CIMC subtab.

Step 5

In the Actions area, click Delete Inband Configuration.

Step 6

Click Yes in the Delete confirmation dialog box.

The inband configuration for the server is deleted.

Note 

If an inband service profile is configured in Cisco UCS Manager with a default VLAN and pool name, the server CIMC automatically gets an inband configuration from the inband profile approximately one minute after you delete the inband configuration here.


Decommissioning a Blade Server

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server that you want to decommission.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Server Maintenance.

Step 6

In the Maintenance dialog box, do the following:

  1. Click Decommission.

  2. Click OK.

The server is removed from the Cisco UCS configuration.


What to do next

If you physically re-install the blade server, you must re-acknowledge the slot for Cisco UCS Manager to rediscover the server.

For more information, see Reacknowledging a Server Slot in a Chassis.

Removing a Non-Existent Blade Server Entry

Perform the following procedure after decommissioning the server and physically removing the server hardware. This procedure removes the stale entry of a non-existent blade server from the Decommissioned tab.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

In the Work pane, click the Decommissioned tab.

Step 3

On the row for each blade server that you want to remove from the list, check the check box in the Recommission column, then click Save Changes.

Step 4

If a confirmation dialog box displays, click Yes.


Recommissioning a Blade Server

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand the Equipment node.

Step 3

Click the Chassis node.

Step 4

In the Work pane, click the Decommissioned tab.

Step 5

On the row for each blade server that you want to recommission, check the check box in the Recommission column, then click Save Changes.

Step 6

If a confirmation dialog box displays, click Yes.

Step 7

(Optional) Monitor the progress of the server recommission and discovery on the FSM tab for the server.


Reacknowledging a Server Slot in a Chassis

Perform the following procedure if you decommissioned a blade server without removing the physical hardware from the chassis, and you want Cisco UCS Manager to rediscover and recommission the server.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server whose slot you want to reacknowledge.

Step 4

If Cisco UCS Manager displays a Resolve Slot Issue dialog box, do one of the following:

  • The here link in the Situation area: Click this link and then click Yes in the confirmation dialog box. Cisco UCS Manager reacknowledges the slot and discovers the server in the slot.

  • OK: Click this button if you want to proceed to the General tab. You can use the Reacknowledge Slot link in the Actions area to have Cisco UCS Manager reacknowledge the slot and discover the server in the slot.


Removing a Non-Existent Blade Server from the Configuration Database

Perform the following procedure if you physically removed the server hardware without first decommissioning the server. You cannot perform this procedure if the server is physically present.

If you want to physically remove a server, see Removing a Server from a Chassis.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server that you want to remove from the configuration database.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Server Maintenance.

Step 6

In the Maintenance dialog box, click Remove, then click OK.

Cisco UCS Manager removes all data about the server from its configuration database. The server slot is now available for you to insert new server hardware.


Turning the Locator LED for a Blade Server On and Off

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server for which you want to turn the locator LED on or off.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click one of the following:

  • Turn on Locator LED: Turns on the LED for the selected server.
  • Turn off Locator LED: Turns off the LED for the selected server.
  • Turn on Master Locator LED: For the Cisco UCS B460 M4 blade server, turns on the LED for the master node.
  • Turn off Master Locator LED: For the Cisco UCS B460 M4 blade server, turns off the LED for the master node.
  • Turn on Slave Locator LED: For the Cisco UCS B460 M4 blade server, turns on the LED for the slave node.
  • Turn off Slave Locator LED: For the Cisco UCS B460 M4 blade server, turns off the LED for the slave node.

Turning the Local Disk Locator LED on a Blade Server On and Off

Before you begin

  • Ensure that the server on which the disk is located is powered on. If the server is off, you cannot turn the local disk locator LED on or off.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server for which you want to turn the local disk locator LED on or off.

Step 4

In the Work pane, click the Inventory > Storage > Disks tabs.

The Storage Controller inventory appears.

Step 5

Click a disk.

The disk details appear.

Step 6

In the Details area, click Toggle Locator LED.

If the Locator LED state is On, it will turn Off. If the Locator LED state is Off, it will turn On.

Step 7

Click Save Changes.


Resetting the CMOS for a Blade Server

Sometimes, troubleshooting a server might require you to reset the CMOS. Resetting the CMOS is not part of the normal maintenance of a server.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server for which you want to reset the CMOS.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Recover Server.

Step 6

In the Recover Server dialog box, click Reset CMOS, then click OK.


Resetting the CIMC for a Blade Server

Troubleshooting firmware issues on a server might sometimes require you to reset the CIMC. Resetting the CIMC is not part of the normal maintenance of a server. After you reset the CIMC, the CIMC reboots with the running version of the firmware for that server.

If the CIMC is reset, the power monitoring functions of Cisco UCS become briefly unavailable until the CIMC reboots. Typically, the reset takes only 20 seconds; however, the peak power cap might be exceeded during that time. To avoid exceeding the configured power cap in a low power-capped environment, consider staggering the rebooting or activation of CIMCs.
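One way to plan a staggered sequence is to assign each server a start offset so that only one CIMC is resetting at a time. The sketch below is an assumption-laden illustration: the function name is hypothetical, the 20-second gap reuses the typical reset time mentioned above, and a real environment may need a larger margin.

```python
def staggered_reset_offsets(servers, gap_seconds=20.0):
    """Return (server, start_offset_seconds) pairs so that consecutive CIMC
    resets begin at least gap_seconds apart.

    servers: iterable of server identifiers, for example "1/1" for
             chassis 1, slot 1 (the identifier format is illustrative).
    """
    if gap_seconds <= 0:
        raise ValueError("gap_seconds must be positive")
    return [(server, index * gap_seconds) for index, server in enumerate(servers)]
```

The offsets can then be fed to whatever scheduling mechanism you use; this sketch does not itself trigger any reset.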

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server for which you want to reset the CIMC.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Recover Server.

Step 6

In the Recover Server dialog box, click Reset CIMC (Server Controller), then click OK.


Clearing TPM for a Blade Server

You can clear TPM only on Cisco UCS M4 and higher blade and rack-mount servers that include support for TPM.


Caution

Clearing TPM is a potentially hazardous operation. The OS may stop booting. You may also see loss of data.


Before you begin

TPM must be enabled.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server for which you want to clear TPM.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Recover Server.

Step 6

In the Recover Server dialog box, click Clear TPM, then click OK.


Viewing the POST Results for a Blade Server

You can view any errors collected during the Power On Self-Test process for a server and its adapters.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server for which you want to view the POST results.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click View POST Results.

The POST Results dialog box lists the POST results for the server and its adapters.

Step 6

(Optional) Click the link in the Affected Object column to view the properties of that adapter.

Step 7

Click OK to close the POST Results dialog box.


Issuing an NMI from a Blade Server

Perform the following procedure if the system remains unresponsive and you need Cisco UCS Manager to issue a Non-Maskable Interrupt (NMI) to the BIOS or operating system from the CIMC. This action creates a core dump or stack trace, depending on the operating system installed on the server.

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server to which you want to issue the NMI.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click Server Maintenance.

Step 6

In the Maintenance dialog box, do the following:

  1. Click Diagnostic Interrupt.

  2. Click OK.

Cisco UCS Manager sends an NMI to the BIOS or operating system.


Viewing Health Events for a Blade Server

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Choose the server for which you want to view health events.

Step 4

In the Work pane, click the Health tab.

The health events triggered for this server appear. The fields in this tab are:

Name

Description

Health Summary area

Health Qualifier field

Comma-separated names of all the health events that are triggered for the component.

Health Severity field

Highest severity of all the health events that are triggered for the component. This can be one of the following:

  • critical

  • major

  • minor

  • warning

  • info

  • cleared

Note 

The severity levels listed here are from highest to lowest severity.

Health Details area

Severity column

Severity of the health event. This can be one of the following:

  • critical

  • major

  • minor

  • warning

  • info

  • cleared

Note 

The severity levels listed here are from highest to lowest severity.

Name column

Name of the health event.

Description column

Detailed description of the health event.

Value column

Current value of the health event.

Details area

The Details area displays the Name, Description, Severity, and Value details of any health event that you select in the Health Details area.
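The relationship between the Health Details rows and the Health Summary fields can be sketched as follows. This is an illustration only: the function and the event tuples are hypothetical, and returning `None` for the severity when no events exist is an assumption.

```python
# Severity levels from highest to lowest, as listed in the field descriptions.
SEVERITY_ORDER = ["critical", "major", "minor", "warning", "info", "cleared"]


def health_summary(events):
    """Derive (health_qualifier, health_severity) from health events.

    events: list of (name, severity) tuples, one per triggered health event.
    """
    if not events:
        return "", None  # assumption: nothing to summarize
    # Health Qualifier: comma-separated names of all triggered events.
    qualifier = ",".join(name for name, _ in events)
    # Health Severity: the highest severity among the triggered events.
    severity = min((sev for _, sev in events), key=SEVERITY_ORDER.index)
    return qualifier, severity
```

An unknown severity string raises `ValueError`, which keeps typos from being silently ranked.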


Health LED Alarms

The blade health LED is located on the front of each Cisco UCS B-Series blade server. Cisco UCS Manager allows you to view the sensor faults that cause the blade health LED to change color from green to amber or blinking amber.

The health LED alarms display the following information:

Name Description

Severity column

The severity of the alarm. This can be one of the following:

  • Critical—The blade health LED is blinking amber. This is indicated with a red dot.

  • Minor—The blade health LED is amber. This is indicated with an orange dot.

Description column

A brief description of the alarm.

Sensor ID column

The ID of the sensor that triggered the alarm.

Sensor Name column

The name of the sensor that triggered the alarm.
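The mapping between alarm severity and LED state can be sketched as a lookup. The function name is hypothetical, and giving Critical precedence over Minor when both are active is an assumption, not something stated above.

```python
def expected_led(alarm_severities):
    """Return the expected blade health LED state for the active alarms.

    alarm_severities: iterable of severity strings such as "Critical" or "Minor".
    """
    severities = {s.lower() for s in alarm_severities}
    if "critical" in severities:
        return "blinking amber"  # shown with a red dot in the alarm list
    if "minor" in severities:
        return "amber"           # shown with an orange dot in the alarm list
    return "green"               # no sensor faults affecting the LED
```

With no active alarms the sketch returns `"green"`, the color from which the LED changes when a sensor fault occurs.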

Viewing Health LED Alarms

Procedure


Step 1

In the Navigation pane, click Equipment.

Step 2

Expand Equipment > Chassis > Chassis Number > Servers.

Step 3

Click the server for which you want to view health LED alarms.

Step 4

In the Work pane, click the General tab.

Step 5

In the Actions area, click View Health LED Alarms.

The View Health LED Alarms dialog box lists the health LED alarms for the selected server.

Step 6

Click OK to close the View Health LED Alarms dialog box.


Smart SSD

Beginning with release 3.1(3), Cisco UCS Manager supports monitoring SSD health. This feature is called Smart SSD. It provides statistical information about properties such as wear status in days and percentage of life remaining. For every property, a minimum, a maximum, and an average value are recorded and displayed. The feature also allows you to set threshold limits for the properties.
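The per-property statistics can be pictured with a short sketch. This is an illustration of the minimum, maximum, and average idea only; the function name and sample values are hypothetical, and how Cisco UCS Manager samples or aggregates the values internally is not described here.

```python
from statistics import mean


def summarize(samples):
    """Return the minimum, maximum, and average of the recorded samples
    for one SSD property."""
    if not samples:
        raise ValueError("at least one sample is required")
    return {"min": min(samples), "max": max(samples), "avg": mean(samples)}
```

For instance, three hypothetical readings of 90, 80, and 70 summarize to a minimum of 70, a maximum of 90, and an average of 80.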


Note

The Smart SSD feature is supported only for a selected range of SSDs. It is not supported for any HDDs.


Supported SATA SSDs are from the following manufacturers:

  • Intel

  • Samsung

  • Micron

Supported SAS SSDs are from the following manufacturers:

  • Toshiba

  • SanDisk

  • Samsung

  • Micron


Note

  • Power-On Hours and Power Cycle Count are not available on SAS SSDs.

  • The Smart SSD feature is supported only on M4 servers and later.


Monitoring SSD Health

Procedure


Step 1

Navigate to Equipment > Rack-Mounts > Servers > Server Number > Inventory > Storage.

Step 2

Click the controller component for which you want to view the SSD health.

Step 3

In the Work pane, click the Statistics tab.

Step 4

Click the SSD for which you want to view the health properties.

You can view the values for the following properties:

  • PercentageLifeLeft: Displays the remaining life of the SSD as a percentage, so that action can be taken when required.

  • PowerCycleCount: Displays the number of times the SSD has been power cycled across server reboots.

  • PowerOnHours: Displays the duration for which the SSD is on. You can replace or turn the SSD off based on the requirement.

    Note 

    If there is a change in any other property, the updated PowerOnHours value is displayed.

  • WearStatusInDays: Provides guidance about the SSD wear based on the workload characteristics run at that time.

    Note 

    These values are updated on an hourly basis.

    You can specify threshold limits for these values. A fault is raised when a value reaches or exceeds its threshold limit. The Smart SSD feature also tracks temperature: when the temperature crosses the threshold limit (90°C), it raises a fault and moves the disk to the degraded state, reporting the reason for the degradation.
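The threshold behavior can be sketched as two small checks. This is a minimal illustration under stated assumptions: the function names and the `"operable"` label are hypothetical, and treating "crosses" as strictly greater than 90°C is an interpretation rather than documented behavior.

```python
TEMPERATURE_LIMIT_C = 90  # temperature threshold limit given in the text


def threshold_faults(values, thresholds):
    """Return the sorted names of properties whose current value reaches or
    exceeds the configured threshold limit.

    values:     dict of property name -> current value
    thresholds: dict of property name -> threshold limit
    """
    return sorted(
        name
        for name, limit in thresholds.items()
        if name in values and values[name] >= limit
    )


def disk_state(temperature_c):
    """Return "degraded" once the temperature crosses the limit."""
    return "degraded" if temperature_c > TEMPERATURE_LIMIT_C else "operable"
```

Properties without a configured threshold are ignored, so only the limits you set can raise a fault in this model.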