Troubleshooting Server Disk Drive Detection and Monitoring

Support for Local Storage Monitoring

The type of monitoring supported depends upon the Cisco UCS server.

Supported Cisco UCS Servers for Local Storage Monitoring

Through Cisco UCS Manager, you can monitor local storage components for the following servers:

  • Cisco UCS B200 M3 blade server

  • Cisco UCS B420 M3 blade server

  • Cisco UCS B22 M3 blade server

  • Cisco UCS B200 M4 blade server

  • Cisco UCS B260 M4 blade server

  • Cisco UCS B460 M4 blade server

  • Cisco UCS C460 M2 rack server

  • Cisco UCS C420 M3 rack server

  • Cisco UCS C260 M2 rack server

  • Cisco UCS C240 M3 rack server

  • Cisco UCS C220 M3 rack server

  • Cisco UCS C24 M3 rack server

  • Cisco UCS C22 M3 rack server

  • Cisco UCS C220 M4 rack server

  • Cisco UCS C240 M4 rack server

  • Cisco UCS C460 M4 rack server


Note


Not all servers support all local storage components. For Cisco UCS rack servers, the onboard SATA RAID 0/1 controller integrated on the motherboard is not supported.


Supported Cisco UCS Servers for Legacy Disk Drive Monitoring

Only legacy disk drive monitoring is supported through Cisco UCS Manager for the following servers:

  • Cisco UCS B200 M1/M2 blade server

  • Cisco UCS B250 M1/M2 blade server


Note


For Cisco UCS Manager to monitor the disk drives, the 1064E storage controller must run a firmware level that is contained in a Cisco UCS bundle with a package version of 2.0(1) or higher.


Prerequisites for Local Storage Monitoring

These prerequisites must be met for local storage monitoring or legacy disk drive monitoring to provide useful status information:

  • The drive must be inserted in the server drive bay.

  • The server must be powered on.

  • The server must have completed discovery.

  • The BIOS POST complete result must be TRUE, meaning that the BIOS power-on self-test (POST) has finished.
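
The prerequisites above can be treated as a simple checklist. The following Python sketch is illustrative only: the `ServerState` class and its field names are hypothetical and are not part of any Cisco UCS Manager API. It assumes you have already collected the four values by some other means.

```python
from dataclasses import dataclass


@dataclass
class ServerState:
    # Hypothetical container; field names are assumptions for illustration.
    drive_inserted: bool       # drive is seated in the server drive bay
    powered_on: bool           # server is powered on
    discovery_complete: bool   # server has completed discovery
    post_complete: bool        # BIOS POST complete result is TRUE


def unmet_prerequisites(state: ServerState) -> list[str]:
    """Return a description of each prerequisite that is not satisfied."""
    checks = [
        (state.drive_inserted, "drive is not inserted in the server drive bay"),
        (state.powered_on, "server is not powered on"),
        (state.discovery_complete, "server has not completed discovery"),
        (state.post_complete, "BIOS POST complete is not TRUE"),
    ]
    return [message for ok, message in checks if not ok]


if __name__ == "__main__":
    state = ServerState(drive_inserted=True, powered_on=True,
                        discovery_complete=True, post_complete=False)
    for problem in unmet_prerequisites(state):
        print("Monitoring status may not be useful:", problem)
```

An empty list means that all four prerequisites are met and the reported drive status should be meaningful.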

Viewing the Status of a Disk Drive

Viewing the Status of Local Storage Components in the Cisco UCS Manager GUI

Procedure
    Step 1   In the Navigation pane, click the Equipment tab.
    Step 2   On the Equipment tab, expand Equipment > Chassis > Chassis Number > Servers.
    Step 3   Click the server for which you want to view the status of your local storage components.
    Step 4   In the Work pane, click the Inventory tab.
    Step 5   Click the Storage subtab to view the status of your RAID controllers and any FlexFlash controllers.
    Step 6   Click the down arrows to expand the Local Disk Configuration Policy, Actual Disk Configurations, Disks, and Firmware bars and view additional status information.

    Viewing the Status of a Disk Drive in the Cisco UCS Manager CLI

    Procedure
      Step 1   Command: UCS-A# scope chassis chassis-num
               Purpose: Enters chassis mode for the specified chassis.

      Step 2   Command: UCS-A /chassis # scope server server-num
               Purpose: Enters server chassis mode.

      Step 3   Command: UCS-A /chassis/server # scope raid-controller raid-contr-id {sas | sata}
               Purpose: Enters RAID controller server chassis mode.

      Step 4   Command: UCS-A /chassis/server/raid-controller # show local-disk [local-disk-id | detail | expand]

      The following example shows the status of a disk drive:

      UCS-A# scope chassis 1
      UCS-A /chassis # scope server 6
      UCS-A /chassis/server # scope raid-controller 1 sas
      UCS-A /chassis/server/raid-controller # show local-disk 1
      
      Local Disk:
          ID: 1
          Block Size: 512
          Blocks: 60545024
          Size (MB): 29563
          Operability: Operable
          Presence: Equipped
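
If you capture this output in a script, the `Key: Value` lines are straightforward to parse. The following Python sketch is a minimal example under the assumption that the output has the same layout as the sample above; the function name `parse_local_disk` is hypothetical. It also checks one observation that holds for this sample: the reported Size (MB) equals Blocks multiplied by Block Size, divided by 1,048,576.

```python
def parse_local_disk(output: str) -> dict[str, str]:
    """Parse 'Key: Value' lines from captured show local-disk output."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        value = value.strip()
        # Skip blank lines and section headers such as "Local Disk:".
        if sep and value:
            fields[key.strip()] = value
    return fields


# Sample text copied from the example above.
SAMPLE = """\
Local Disk:
    ID: 1
    Block Size: 512
    Blocks: 60545024
    Size (MB): 29563
    Operability: Operable
    Presence: Equipped
"""

disk = parse_local_disk(SAMPLE)
size_mb = int(disk["Blocks"]) * int(disk["Block Size"]) // (1024 * 1024)

if __name__ == "__main__":
    print(disk["Operability"], disk["Presence"])
    print("Computed size (MB):", size_mb)
```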
      

      Interpreting the Status of a Monitored Disk Drive

      Cisco UCS Manager displays the following properties for each monitored disk drive:

      • Operability—The operational state of the drive.

      • Presence—The presence of the disk drive, and whether it can be detected in the server drive bay, regardless of its operational state.

      You need to look at both properties to determine the status of the monitored disk drive. The following table shows the likely interpretations of the combined property values.

      Operability Status   Presence Status   Interpretation

      Operable             Equipped          No fault condition. The disk drive is in the server and can be used.

      Inoperable           Equipped          Fault condition. The disk drive is in the server, but one of the following could be causing an operability problem:
                                               • The disk drive is unusable due to a hardware issue such as bad blocks.
                                               • There is a problem with the IPMI link to the storage controller.

      N/A                  Missing           Fault condition. The server drive bay does not contain a disk drive.

      N/A                  Equipped          Fault condition. The disk drive is in the server, but one of the following could be causing an operability problem:
                                               • The server is powered off.
                                               • The storage controller firmware is the wrong version and does not support disk drive monitoring.
                                               • The server does not support disk drive monitoring.


      Note


      The Operability field might show an incorrect status for several reasons, such as when the disk is part of a broken RAID set or when the BIOS power-on self-test (POST) has not completed.
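
The combinations in the interpretation table can be expressed as a lookup keyed on both properties. The following Python sketch is illustrative only: the names `INTERPRETATION` and `interpret` are hypothetical, the summary strings are condensed from the table, and the fallback message for combinations that the table does not list is an assumption.

```python
# Keys are (Operability, Presence) pairs exactly as listed in the table.
INTERPRETATION = {
    ("Operable", "Equipped"):
        "No fault: the disk drive is in the server and can be used.",
    ("Inoperable", "Equipped"):
        "Fault: check for a drive hardware issue such as bad blocks, "
        "or a problem with the IPMI link to the storage controller.",
    ("N/A", "Missing"):
        "Fault: the server drive bay does not contain a disk drive.",
    ("N/A", "Equipped"):
        "Fault: check whether the server is powered off, the storage "
        "controller firmware is the wrong version, or the server does "
        "not support disk drive monitoring.",
}


def interpret(operability: str, presence: str) -> str:
    """Return the likely interpretation for a pair of property values."""
    return INTERPRETATION.get(
        (operability, presence),
        "Combination not listed in the table; investigate further.",
    )


if __name__ == "__main__":
    print(interpret("Inoperable", "Equipped"))
```

Because the Operability field can be misleading before POST completes or when a RAID set is broken, treat the result as a starting point for investigation.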


      HDD Metrics Not Updated in Cisco UCS Manager GUI

      Problem—After hot-swapping, removing, or adding a hard drive, the updated hard disk drive (HDD) metrics do not appear in the Cisco UCS Manager GUI.

      Possible Cause—This problem can occur because Cisco UCS Manager gathers HDD metrics only during a system boot. If a hard drive is added or removed after a system boot, the Cisco UCS Manager GUI does not update the HDD metrics.

      Procedure
      Reboot the server.

      Disk Drive Fault Detection Tests Fail

      Problem—The fault LED is illuminated or blinking on the server disk drive, but Cisco UCS Manager does not indicate a disk drive failure.

      Possible Cause—The disk drive fault detection tests failed due to one or more of the following conditions:

      • The disk drive did not fail, and a rebuild is in progress.

      • Drive predictive failure

      • Selected drive failure on Disk 2 of a B200, B230 or B250 blade

      • Selected drive failure on Disk 1 of a B200, B230 or B250 blade

      Procedure
        Step 1   Monitor the fault LEDs of each disk drive in the affected server(s).
        Step 2   If a fault LED on a server turns any color, such as amber, or blinks for no apparent reason, create a technical support file for each affected server and contact Cisco TAC.

        Cisco UCS Manager Reports More Disks in Server than Total Slots Available

        Problem—Cisco UCS Manager reports that a server has more disks than the total disk slots available in the server. For example, Cisco UCS Manager reports three disks for a server with two disk slots as follows:

        RAID Controller 1:
                   Local Disk 1:
                       Product Name: 73GB 6Gb SAS 15K RPM SFF HDD/hot plug/drive sled mounted
                       PID: A03-D073GC2
                       Serial: D3B0P99001R9
                       Presence: Equipped
                   Local Disk 2: 
                       Product Name:
                       Presence: Equipped
                       Size (MB): Unknown
                   Local Disk 5:
                       Product Name: 73GB 6Gb SAS 15K RPM SFF HDD/hot plug/drive sled mounted
                       Serial: D3B0P99001R9
                       HW Rev: 0
                       Size (MB): 70136
        

        Possible Cause—This problem is typically caused by a communication failure between Cisco UCS Manager and the server that reports the inaccurate information.

        Procedure
          Step 1   Upgrade the Cisco UCS domain to the latest release of Cisco UCS software and firmware.
          Step 2   Decommission the server.
          Step 3   Recommission the server.
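
Before you decommission a server, it can help to confirm the symptom from captured inventory data. The following Python sketch is a minimal example with assumptions: the function `find_disk_anomalies` and its input format (a mapping of local disk ID to serial number, with `None` where no serial is reported) are hypothetical and do not correspond to a Cisco UCS Manager API. It flags two signs visible in the sample output above: more reported disks than slots, and the same serial number reported under more than one disk ID.

```python
from collections import defaultdict
from typing import Optional


def find_disk_anomalies(disks: dict[int, Optional[str]],
                        slot_count: int) -> list[str]:
    """Flag inventory that reports extra disks or repeated serial numbers."""
    findings = []
    if len(disks) > slot_count:
        findings.append(
            f"{len(disks)} disks reported for {slot_count} slots")

    # Group disk IDs by serial number, ignoring disks with no serial.
    ids_by_serial = defaultdict(list)
    for disk_id, serial in sorted(disks.items()):
        if serial:
            ids_by_serial[serial].append(disk_id)

    for serial, ids in ids_by_serial.items():
        if len(ids) > 1:
            findings.append(f"serial {serial} reported by disks {ids}")
    return findings


# Values taken from the sample output above for a server with two slots.
reported = {1: "D3B0P99001R9", 2: None, 5: "D3B0P99001R9"}
findings = find_disk_anomalies(reported, slot_count=2)

if __name__ == "__main__":
    for finding in findings:
        print(finding)
```

For the sample data, both checks trigger, which is consistent with the inventory being inaccurate rather than the server containing an additional physical drive.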