This document describes several command-line interface (CLI) commands, as well as other troubleshooting techniques, that can help troubleshoot hard disk drive (HDD) issues. The best method for troubleshooting HDD issues is to use the LEDs, GUI, BIOS, LSI Option ROM / MegaRAID GUI, and logs. However, these options are not always available. When they are not, you can use the CLI.
There are no specific requirements for this document.
This document is not restricted to specific software and hardware versions.
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
Refer to Cisco Technical Tips Conventions for more information on document conventions.
Note: Some of the commands listed in this document require an LSI MegaRAID controller. Not all of them are supported by the LSI 1064/1068E controllers.
Enter the show pci-adapter command in order to view the product name. This example shows an LSI 1064e adapter.
    ucs-c2xx-m1 /chassis #show pci-adapter
    Slot Vendor ID Device ID SubVendor ID SubDevice ID Product Name
    ---- --------- --------- ------------ ------------ ------------------------
    M    0x1000    0x0056    0x152d       0x896d       Cisco LSI 1064E Mezzan...
Enter the show hdd command in order to view the status of the HDDs.
    ucs-c2xx-m1 /chassis #show hdd
    Name                 Status
    -------------------- --------------------
    HDD_01_STATUS        present
    HDD_02_STATUS        absent
    HDD_03_STATUS        absent
    HDD_04_STATUS        absent
Enter the scope storageadapter command in order to enter the storage adapter context, and then enter the show virtual-drive command in order to view the status of the virtual drives. This command is useful because you do not have to shut down the server and enter the BIOS in order to view the information.
    ucs-c210-m2/chassis #scope storageadapter SLOT-5
    ucs-c210-m2/chassis/storageadapter #show virtual-drive
    Virtual Drive  Status             Name                   Size      RAID Level
    -------------- ------------------ ---------------------- --------- ----------
    0              Optimal                                   139236 MB RAID 1
    1              Degraded                                  974652 MB RAID 5
Enter the show physical-drive command in order to view the status of the physical drives.
    ucs-c210-m2 /chassis/storageadapter #show physical-drive
                                                      Predictive
    Slot                                              Failure    Drive    Coerced
    Number Controller Status Manufacturer Model       Count      Firmware Size      Type
    ------ ---------- ------ ------------ ----------- ---------- -------- --------- ----
    0      SLOT-5
    1      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
    2      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
    3      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
    4      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
    5      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
    6      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
    7      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
    9      SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
    10     SLOT-5     online SEAGATE      ST9146852SS 0          0005     139236 MB HDD
Enter the show error-counters command in order to view the number of correctable and uncorrectable errors.
    ucs-c210-m2 /chassis/storageadapter #show error-counters
    PCI Slot SLOT-5:
        Memory Correctable Errors: 0
        Memory Uncorrectable Errors: 0
Enter the show hw-config command in order to view the RAID controller configuration.
    ucs-c210-m2 /chassis/storageadapter #show hw-config
    PCI Slot SLOT-5:
        SAS Address 0: 500e004aaaaaaa3f
        SAS Address 1: 0000000000000000
        SAS Address 2: 0000000000000000
        SAS Address 3: 0000000000000000
        SAS Address 4: 0000000000000000
        SAS Address 5: 0000000000000000
        SAS Address 6: 0000000000000000
        SAS Address 7: 0000000000000000
        BBU Present: true
        NVRAM Present: true
        Serial Debugger Present: true
        Memory Present: true
        Flash Present: true
        Memory Size: 512 MB
        Cache Memory Size: 394 MB
        Number of Backend Ports: 8
Enter the show physical-drive-count command in order to view the number of HDDs.
    ucs-c210-m2 /chassis/storageadapter #show physical-drive-count
    PCI Slot SLOT-5:
        Physical Drive Count: 9
        Critical Physical Drive Count: 0
        Failed Physical Drive Count: 0
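If you capture this output to a file (for example, from a logged terminal session), a short script can check the counters for you. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that the captured output has one "Key: Value" pair per line, as the CLI prints it.

```python
def parse_key_values(output: str) -> dict:
    """Collect 'Key: Value' lines; lines with no value (such as 'PCI Slot SLOT-5:') are skipped."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def drives_needing_attention(output: str) -> int:
    """Return the sum of the critical and failed physical drive counts."""
    fields = parse_key_values(output)
    return (int(fields["Critical Physical Drive Count"])
            + int(fields["Failed Physical Drive Count"]))


# Sample copied from the show physical-drive-count output in this document.
SAMPLE = """\
PCI Slot SLOT-5:
    Physical Drive Count: 9
    Critical Physical Drive Count: 0
    Failed Physical Drive Count: 0
"""

if __name__ == "__main__":
    print("Drives needing attention:", drives_needing_attention(SAMPLE))
```

The same parse_key_values helper should also work on other "Key: Value" style output, such as show error-counters or show hw-config, under the same one-pair-per-line assumption.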
In the event that you do not have access to the CLI, you can view the technical support file (/tmp/tech_support) in order to obtain information about the status of the HDDs. Here is an excerpt from the technical support file that shows the HDDs from the Intelligent Platform Management Interface (IPMI) sensors:
    Querying All IPMI Sensors:
    Sensor Name | Reading | Unit     | Status | LNR | LC | LNC | UNC | UC | UNR
    HDD0_INFO   | 0x0     | discrete | 0x2181 | na  | na | na  | na  | na | na
    HDD1_INFO   | 0x0     | discrete | 0x2181 | na  | na | na  | na  | na | na
    HDD2_INFO   | 0x0     | discrete | 0x2181 | na  | na | na  | na  | na | na
    HDD3_INFO   | 0x0     | discrete | 0x2181 | na  | na | na  | na  | na | na
    HDD4_INFO   | 0x0     | discrete | 0x2181 | na  | na | na  | na  | na | na
    HDD5_INFO   | 0x0     | discrete | 0x2181 | na  | na | na  | na  | na | na
    HDD6_INFO   | na      | discrete | na     | na  | na | na  | na  | na | na
    HDD7_INFO   | na      | discrete | na     | na  | na | na  | na  | na | na
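When the technical support file is large, it can help to extract only the HDD sensor rows. The following Python sketch is a hedged illustration, not a supported tool: the function name is hypothetical, and it assumes that each sensor appears on its own line with pipe-separated fields in the column order shown in the excerpt.

```python
def parse_hdd_sensors(text: str) -> dict:
    """Map each HDD sensor name to its status word, or to None when the status is 'na'."""
    statuses = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Expect at least: Sensor Name | Reading | Unit | Status
        if len(fields) < 4 or not fields[0].startswith("HDD"):
            continue
        name, status = fields[0], fields[3]
        statuses[name] = None if status == "na" else int(status, 16)
    return statuses


# Two rows copied from the excerpt, plus the header row (which is skipped).
SAMPLE = """\
Sensor Name | Reading | Unit | Status | LNR | LC | LNC | UNC | UC | UNR
HDD0_INFO | 0x0 | discrete | 0x2181 | na | na | na | na | na | na
HDD6_INFO | na | discrete | na | na | na | na | na | na | na
"""

if __name__ == "__main__":
    for name, status in parse_hdd_sensors(SAMPLE).items():
        print(name, "no reading" if status is None else hex(status))
```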
Here is an excerpt from the technical support file that shows a breakdown of the HDD status:
    Bit[15:10] - Unused
    Bit[9:8]   - Fault
    Bit[7:4]   - LED Color
    Bit[3:0]   - LED State

    Fault:
        0x100 - On Line
        0x200 - Degraded
    LED Color:
        0x10 - GREEN
        0x20 - AMBER
        0x40 - BLUE
        0x80 - RED
    LED State:
        0x01 - OFF
        0x02 - ON
        0x04 - FAST BLINK
        0x08 - SLOW BLINK
Here is an excerpt from the technical support file that shows the HDD status (with a status code of 0x2181):
    0x2181
    Fault:     0x100 --- HDD is On Line
    LED Color: 0x80  --- RED
    LED State: 0x01  --- OFF
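The bit layout from the technical support file can be applied with simple bit masks. The following Python sketch decodes a status word into its three fields. The lookup tables come from the excerpt in this document; the function name and the "unknown" fallback text are illustrative assumptions.

```python
# Field values as listed in the technical support file excerpt.
FAULT = {0x100: "On Line", 0x200: "Degraded"}
LED_COLOR = {0x10: "GREEN", 0x20: "AMBER", 0x40: "BLUE", 0x80: "RED"}
LED_STATE = {0x01: "OFF", 0x02: "ON", 0x04: "FAST BLINK", 0x08: "SLOW BLINK"}


def decode_hdd_status(status: int) -> dict:
    """Split an HDD status word into fault, LED color, and LED state."""
    fault = status & 0x0300   # Bit[9:8]
    color = status & 0x00F0   # Bit[7:4]
    state = status & 0x000F   # Bit[3:0]
    # Bit[15:10] is unused and is ignored.
    return {
        "fault": FAULT.get(fault, f"unknown (0x{fault:x})"),
        "led_color": LED_COLOR.get(color, f"unknown (0x{color:x})"),
        "led_state": LED_STATE.get(state, f"unknown (0x{state:x})"),
    }


if __name__ == "__main__":
    print(decode_hdd_status(0x2181))
```

For 0x2181, the masks yield 0x100, 0x80, and 0x01, which matches the On Line, RED, and OFF breakdown in the excerpt.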
You have the option to use a battery backup unit (BBU) with some server deployments. The BBU is an intelligent battery backup unit that protects disk write cache data on the RAID controller for up to 72 hours during a power loss.
This example shows how to use the MegaCli utility in order to check the status of the BBU:
    bash$ sudo /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -a0 -NoLog
    Password:
    . . .
    Battery Replacement required : Yes
    . . .
    Relative State of Charge: 99 %
    Absolute State of charge: 76 %
    . . .
    Date of Manufacture: 11/08, 2008
    Design Capacity: 700 mAh
    Design Voltage: 3700 mV
    Specification Info: 33
    Serial Number: 243
    Pack Stat Configuration: 0x6cb0
    Manufacture Name: LSI113000G
    Device Name: 2970700
    Device Chemistry: LION
    Battery FRU: N/A
This example shows how to use the CLI in order to check the status of the BBU:
    ucs-c200-m2 /chassis/storageadapter #show bbu detail
    Controller SLOT-7:
        Battery Type: iBBU
        Battery Present: true
        Voltage: 4.023 V
        Current: 0.000 A
        Charge: 100%
        Charging State: fully charged
        Temperature: 34 degrees C
        Voltage Low: false
        Temperature High: false
        Learn Cycle Requested: false
        Learn Cycle Active: false
        Learn Cycle Failed: false
        Learn Cycle Timeout: false
        I2C Errors Detected: false
        Battery Replacement Required: true
        Remaining Capacity Low: true
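In the show bbu detail output, the fields that report a problem are easy to miss among the informational ones. The following Python sketch lists the flags that are set to true in captured output. The function name is hypothetical, and the choice of which fields count as warnings is an assumption made for this illustration, not an official list.

```python
# Assumption: these boolean fields indicate a condition worth investigating.
WARNING_FLAGS = (
    "Voltage Low",
    "Temperature High",
    "Learn Cycle Failed",
    "Learn Cycle Timeout",
    "I2C Errors Detected",
    "Battery Replacement Required",
    "Remaining Capacity Low",
)


def bbu_warnings(output: str) -> list:
    """Return the warning flags reported as 'true', in the order they appear."""
    raised = []
    for line in output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() in WARNING_FLAGS and value.strip().lower() == "true":
            raised.append(key.strip())
    return raised


# Abbreviated sample taken from the show bbu detail output in this document.
SAMPLE = """\
Controller SLOT-7:
    Battery Present: true
    Charge: 100%
    Voltage Low: false
    Learn Cycle Failed: false
    Battery Replacement Required: true
    Remaining Capacity Low: true
"""

if __name__ == "__main__":
    for flag in bbu_warnings(SAMPLE):
        print("BBU warning:", flag)
```

Note that in this sample the charge reads 100% while Battery Replacement Required is true, so the charge value alone is not enough to judge the health of the BBU.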
Revision | Publish Date | Comments |
---|---|---|
1.0 | 07-Dec-2012 | Initial Release |