Hello everybody! I was changing disks on the server and ran into a problem. The new disks worked for a week and then failed with the error: "Disk 11 in Backplane 1 of Integrated RAID Controller 1 is not functioning correctly. The RAID Controller may not be able to read/write data to the physical disk drive indicated in the message. This may be due to a failure with the physical disk drive or because the physical disk drive was removed from the system." The replacement disks do not appear in the logs and are not available in storage. The power indicator on the disks is lit. What could be the problem?
Also, some disks are marked as connected in the logs but are not displayed in storage ("Drive 7 is installed in disk drive bay 1").
Server: PowerEdge R730xd.
OS: VMware ESXi 6.0.0 build-3568940
BIOS Version: 2.1.5
Firmware version (and Lifecycle Controller Firmware too): 18.104.22.168
I believe this is a compatibility issue: I don't see those drives listed as supported on any PowerEdge servers beyond the 12th generation, and the R730xd is a 13th-generation server. Also, when you replaced the drives, what method did you use?
Let me know.
If I understand you correctly, by "method" you mean how we installed the replacement drives? During installation we did not shut down or restart the server (and have not yet tried to restart it); we just added the new disks to free slots. The disks in the screenshots were detected and installed correctly, but they only worked for a week. The new disks we are trying to install (SATA interface) are not detected and do not appear in storage (there are only log entries). I can collect the information for the undetected disks and upload it here. Where can I find the list of supported disks for the PowerEdge R730xd?
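For reference, here is how I plan to collect the drive details from the ESXi host over SSH. These are standard esxcli commands; the exact output fields and device names will vary by host, so treat this as a sketch:

```shell
# List all storage devices seen by the host: vendor, model,
# revision (drive firmware), and capacity per device.
esxcli storage core device list

# Confirm the PERC controller itself is visible to ESXi.
esxcli storage core adapter list
```

If the new SATA drives show up here but not in the datastore views, the issue is more likely at the RAID-controller/virtual-disk layer than in ESXi itself.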
Thank you. Just to clarify: were you adding these drives, or replacing the existing drives with them? I asked about the procedure because, if you replaced existing drives, I wanted to confirm whether you used the Replace Member feature.
New information about the problem. Yesterday, ESXi crashed with a purple diagnostic screen (PSOD). After a forced restart of the server, all disks are now displayed. The disks that had errors (one Removed and one Failed) now show the statuses Online and Foreign, respectively. One new 8 TB storage disk is in the Blocked status, and I plan to reseat it. I have no idea yet why the error occurred or why all disks are displayed correctly again. Do you have any thoughts on this? I'll check the ESXi logs for errors. The specifications of the installed disks (which now work) are attached.
On the question of adding disks: disk 3 was installed to replace a damaged Dell disk, disk 11 was placed in a free slot as a hot spare for the RAID, and disks 2, 5, and 8 were placed in free slots.
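For the log check, I'll be grepping the ESXi kernel log for SCSI/device errors around the time of the crash. The live log on the host is /var/log/vmkernel.log; the sketch below runs against a made-up sample excerpt just to show the kind of lines to look for (the log lines here are illustrative, not from my host):

```shell
# Write a sample vmkernel.log excerpt (illustrative lines only).
cat > /tmp/vmkernel.sample <<'EOF'
2016-08-01T10:00:01Z cpu4:33290)ScsiDeviceIO: 2338: Cmd(0x439dc0b7a8c0) 0x28, CmdSN 0x8000002c failed H:0x5 D:0x0 P:0x0
2016-08-01T10:00:02Z cpu4:33290)NMP: nmp_ThrottleLogForDevice:3298: last error status from device naa.600508e0 repeated
EOF

# Count I/O commands that failed with a host-status error.
grep -c "failed H:" /tmp/vmkernel.sample
```

On the real host I'd replace /tmp/vmkernel.sample with /var/log/vmkernel.log and also grep for the naa.* identifiers of the affected disks.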
Issues can occur if the drives are non-Dell drives whose firmware is not compatible with the RAID controller. A reboot of the server normally kick-starts the Lifecycle Controller to force-detect and re-inventory the drives, which can result in the drives being displayed properly in iDRAC, as shown in your screenshots.
If you have a chance to upgrade ESXi, do check the VMware HCL for any mismatched firmware.
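To see how the controller itself (rather than ESXi) views the physical disks, including the Foreign and Blocked states you mentioned, you can query iDRAC with racadm. This is a sketch for iDRAC8-era firmware; verify the exact syntax against your racadm version before running it:

```shell
# From a management station with racadm installed (or via SSH to the iDRAC):
# list all physical disks with their full set of properties, including
# State/Status, so Foreign and Blocked drives are visible.
racadm storage get pdisks -o
```

If disks show as Foreign after the reboot, importing or clearing the foreign configuration is done from the controller side (iDRAC storage pages or the PERC BIOS/HII menu); take care with that step, as clearing a foreign config discards the RAID metadata on those drives.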