azkerm
1 Nickel

Sudden disk failure on an R720xd server [RAID 10]

Hi There,

I recently bought a new R720xd server, and all the disks were performing without any issues. I created the RAID, installed the OS, and set up the necessary services without a hassle. However, the server has recently started reporting a failure on one of the disks. I'm pretty sure there wasn't a power failure or anything similar. My RAID configuration is:

12x 1TB disks at 7.2k (3.5") --- on RAID 10 (every two disks are grouped)

2x 250GB disks at 10k (2.5" rear) -- on RAID 1 (mirrored)

---------------

There are 6 spans, numbered 0 to 5, and the failure is on span 2. I'm not sure what to do, so I'm seeking support from the experts. Meanwhile, I have some questions to clarify as well:

  • Are these hot-swap disks (meaning they can be removed while the server is running)?
  • Can the same disk be re-inserted? If so, how do I rebuild the RAID, and can I use the OpenManage utility to do this?
  • Do I need to take a complete backup before doing anything? (Usually all the data is backed up except for the VMs, which are on Hyper-V.)
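As a rough illustration of the layout described in this post, here is a minimal Python sketch of a 12-disk RAID 10 built from two-disk mirrored spans. The disk numbering (0 to 11, assigned to spans in order) and all names such as `Raid10Layout` are assumptions made for illustration only; the real slot-to-span mapping has to be read from OpenManage.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Raid10Layout:
    """Hypothetical model of a RAID 10 array made of mirrored spans."""
    disk_count: int
    disk_size_tb: float
    disks_per_span: int = 2  # "every two disks are grouped"

    @property
    def span_count(self) -> int:
        return self.disk_count // self.disks_per_span

    @property
    def usable_tb(self) -> float:
        # Each mirrored span contributes the capacity of a single disk.
        return self.span_count * self.disk_size_tb

    def span_of(self, disk_index: int) -> int:
        # Assumption: disks are assigned to spans in consecutive order.
        return disk_index // self.disks_per_span

    def survives(self, failed_disks: Iterable[int]) -> bool:
        # The array stays available while every span keeps one healthy disk.
        failed = set(failed_disks)
        for span in range(self.span_count):
            first = span * self.disks_per_span
            members = range(first, first + self.disks_per_span)
            if all(disk in failed for disk in members):
                return False
        return True


layout = Raid10Layout(disk_count=12, disk_size_tb=1.0)
print(layout.span_count, layout.usable_tb)  # 6 spans, 6.0 TB usable
print(layout.survives({4}))                 # one disk of span 2 failed
print(layout.survives({4, 5}))              # both disks of span 2 failed
```

Under these assumptions, a single failed disk in span 2 leaves the virtual disk degraded but still available, while losing the second disk of the same span before a rebuild completes would take the whole array offline.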

Please advise if I'm missing anything. Below are some screenshots taken from OpenManage.

4 Replies
Moderator

RE: Sudden disk failure on an R720xd server [RAID 10]

Azkerm,

You will definitely want to start with a good and complete backup whenever you work with the RAID controller or HDDs. Secondly, if the drives are hot-swappable, then you can reseat the drive to start a rebuild. You can also select Rebuild in OMSA from the Virtual Disk drop-down menu.

Do you know if the drive is showing YES under Predicted Failure in OMSA? If so, please refrain from trying to rebuild that drive any further, as that can cause bad blocks to be transferred from the drive to the Virtual Disks.
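The order of checks in this advice can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name, its parameters, and the state strings are hypothetical and are not part of OMSA; the values would be read by hand from the OMSA screens.

```python
def recommend_action(state: str, predicted_failure: bool, has_backup: bool) -> str:
    """Hypothetical helper mirroring the advice above; not an OMSA API."""
    # A complete backup comes before any work on the controller or drives.
    if not has_backup:
        return "take a complete backup first"
    # A drive flagged with Predicted Failure should not be rebuilt again,
    # since bad blocks could be carried over to the virtual disk.
    if predicted_failure:
        return "replace the drive; do not rebuild onto it"
    # A failed drive without that flag can be reseated (if hot-swappable)
    # or rebuilt from OMSA.
    if state.strip().lower() == "failed":
        return "reseat the drive or start a rebuild from OMSA"
    return "no action needed"


print(recommend_action("Failed", predicted_failure=False, has_backup=True))
print(recommend_action("Failed", predicted_failure=True, has_backup=True))
```

The backup check is placed first on purpose, so that no rebuild or reseat is suggested until the data is safe.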

Chris Hawk

Dell | Social Outreach Services - Enterprise

azkerm
1 Nickel

RE: Sudden disk failure on an R720xd server [RAID 10]

Thank you for the reply.

I'm taking backups of the VMs and Hyper-V images just to make sure I don't lose anything. Also, how do I check whether the drives are hot-swappable?

OMSA doesn't give much information, as the drive itself shows as Failed instead of Online. However, I'm not going to try rebuilding with the failed disk, so I'm ordering a few drives as spares to avoid any future failures.

I neglected to keep some hot spares, which was a big mistake on my part.

Thank you once again for your advice.

theflash1932
6 Indium

RE: Sudden disk failure on an R720xd server [RAID 10]

The 720 only has hot-swappable drives. Technically all SAS and SATA drives are hot-swappable (including cabled/internal drives), but the most practical test is this: if the drives are accessible in trays/carriers/caddies from outside the system (front), then they are "hot-swappable".

azkerm
1 Nickel

RE: Sudden disk failure on an R720xd server [RAID 10]

Thanks for the replies. I'm waiting for my replacement disks and will hot-swap the failed disk once I have received them.

I will post back once everything has gone smoothly.
