
Unsolved

1 Rookie

27 Posts

August 16th, 2022 05:00

R610 RAID1 on LSI SAS1068E

We have an R610 with a PERC 6 (?) running ESXi 6.5.  The OS drives are RAID1 (drives 3 and 4), and the partition scheme is the default configuration for VMFS.

I've noticed a drive fault listed for drive 4 in the iDRAC, and one of my VMs shows filesystem defects and has remounted its / partition as read-only.

Do failing drives in RAID1 replicate the bad sectors?  Is there an fsck for ESXi?  I'm worried this might destroy the .vmdk file somehow rendering the VM useless.  Is there a way to move the .vmdk data on the failing drive to good sectors, recover the "good data" from the other drive, and make sure my VMs and ESXi machine are ok?  Should I shut down the VM until I have a replacement for Drive 4 to replicate the good drive back?  Or should I shut down the VMs only?

I have seen something similar at my old job, where one of the drives in RAID 1 was failing in mdadm and destroyed the RAID1 OS layout of an old PE2900 I had running for NIS services, refusing to boot.

Moderator

8.8K Posts

August 16th, 2022 09:00

HpcTech,

 

To start, would you confirm the status of the physical disk, specifically whether it is online or offline, and whether it is flagged as a Predicted Failure? If the drive is showing as a Predicted Failure and is still online, then there are steps we need to take in order to replace the drive without transferring the bad blocks across the virtual disk. If it is not showing a Predicted Failure and is offline, then there shouldn't be any issue with replacing the drive, and no risk of the bad blocks transferring.
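To make the two cases above easier to follow, here is a rough sketch in Python. It is only an illustration of the decision described in this post, not a Dell tool or API: the `DiskStatus` type and the `replacement_plan` function are hypothetical names, and the two flags are assumed to be read by hand from whatever the controller or iDRAC reports.

```python
from dataclasses import dataclass


@dataclass
class DiskStatus:
    """Hypothetical summary of what the controller reports for one disk."""
    online: bool             # disk is still an active member of the virtual disk
    predicted_failure: bool  # disk is flagged as a Predicted Failure


def replacement_plan(status: DiskStatus) -> str:
    """Map the reported disk state to the two cases described above."""
    if status.predicted_failure and status.online:
        # Case 1: failing but still online, so bad blocks could be copied.
        return "extra steps needed before the swap to avoid transferring bad blocks"
    if not status.predicted_failure and not status.online:
        # Case 2: already offline and not flagged, so a plain swap is expected to be safe.
        return "replace the drive; no expected risk of bad blocks transferring"
    # Any other combination is not covered by the two cases above.
    return "state not covered here; confirm the disk status first"


if __name__ == "__main__":
    print(replacement_plan(DiskStatus(online=True, predicted_failure=True)))
```

Any combination other than those two falls through to the last branch, which is why confirming the disk status comes first.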

As far as the VMs, if the server is using hot-swappable drives on a backplane, then you should be able to do all this while the server and VMs are online. When you replace the drive and it starts rebuilding, you will see about a 30% performance drop from the controller until the rebuild completes.

 

As far as the data, with the VD being a RAID 1, all the data is mirrored on the drives, so there is no need to move anything.

 

Let me know if this helps and what you are seeing.

 

 

 

1 Rookie

27 Posts

August 16th, 2022 14:00

Chris,

Not sure - system is ESXi.  I would guess predictive, as it's still online and the VMs show filesystem issues as if it were a real drive, so something's going on.

In the iDRAC, it just shows "Drive 4 reported a fault" two times, 24 hours apart.  That was almost a month ago, and there hasn't been anything else in dmesg on the CentOS VM.

I was told to simply replace it.  I was also told to check something and disable the consistency check, so it wouldn't mess up the member drive.  This is uncharted territory for me, so I appreciate the walk-throughs.

Regards.

Moderator

4.1K Posts

August 16th, 2022 20:00

Hi, would this help at all?

 

https://dell.to/3QtwUY0

1 Rookie

27 Posts

August 17th, 2022 06:00

Thanks for that.  Unfortunately, we don't want a consistency check, as this might destroy the .vmdk on the ESXi host.  I was looking for a way to disable that check from running automatically.

I am just going to replace the faulty drive and hope the rebuild fixes the spots that were messed up from the faulty drive.

 
