March 5th, 2018 04:00

MD3200i - Failed disk, hot spare didn't swap in + unreadable sectors

Hello all,

 

I have an out of warranty MD3200i with a failed disk. The hot spare didn't swap in, and I'm getting unreadable sectors.

There's a single disk group, with 8 virtual disks, and one of the virtual disks is showing as degraded. There's a reconstruction in progress, but it's failing continuously.

The following disks have a non-optimal status in MDSM:

Slot 6 - Replaced - this was previously the hot spare

Slot 7 - Failed.

Slot 10 - Impending failure.

The RAID level of the disk group is 5.

 

I've collected the support data.

 

Brian

Moderator

 • 

8.5K Posts

March 5th, 2018 11:00

Hi,

Can you private message me the service tag so we can get some additional information?

 

If the rebuild is failing because of the drive with bad sectors, you are going to need to replace the bad drives, create a new array, and restore from a backup. http://www.dell.com/support/article/us/en/4/SLN111497

March 5th, 2018 12:00

Hi Josh,

 

PM with service tag sent.

Thanks,

Brian

Moderator

 • 

8.5K Posts

March 5th, 2018 12:00

Thanks. Do you have a backup of the data?

March 5th, 2018 13:00

The virtual disk is a cluster shared volume in a Hyper-V cluster. I have backups at the guest OS level, but not at the LUN level.

It's still operational, so I can assign a drive letter on the cluster host and take a backup of the virtual disk data that way once all the VMs are shut down.

Recovery Guru is only showing one virtual disk as degraded. I was unsure whether other virtual disks were affected (i.e. do I need to back up the full storage array?).

Can I get a quote for a replacement disk, or disks?

Is there any alternative to deleting the virtual disk?

 

Is it possible to tell whether the disk will continue to degrade to the point where the virtual disk will go offline?

 

Thanks,

Brian

Moderator

 • 

8.5K Posts

March 5th, 2018 14:00

If it is not rebuilding, there is no way around recreating the array, since the bad blocks on the drive are preventing the rebuild. Unfortunately, I am not able to provide quotes; those have to come from the sales department.
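
To illustrate why bad blocks on a surviving drive stop a RAID 5 rebuild, here is a minimal sketch of XOR parity reconstruction. It is a simplified model, not the MD3200i controller's actual implementation; the function names and the tiny two-byte blocks are made up for illustration. A missing block can be recomputed only when every other block in the same stripe is readable.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(stripe, missing):
    """Recompute the block at index `missing` from the rest of the stripe.

    `stripe` holds the data blocks plus one parity block.
    None marks a block that cannot be read (hypothetical convention).
    """
    survivors = [block for i, block in enumerate(stripe) if i != missing]
    if any(block is None for block in survivors):
        raise ValueError("unreadable sector on a surviving drive")
    return xor_blocks(survivors)

# Three data blocks and their parity form one stripe.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
stripe = data + [xor_blocks(data)]

# One failed drive: the block is recoverable from the others.
rebuilt = rebuild_missing(stripe, 1)

# Failed drive plus an unreadable sector elsewhere in the same stripe.
damaged = list(stripe)
damaged[2] = None
try:
    rebuild_missing(damaged, 1)
    rebuild_failed = False
except ValueError:
    rebuild_failed = True
```

In this model the first rebuild returns the original block, while the second cannot complete, which matches the behaviour described above: the stripes that contain unreadable sectors are the ones that cannot be reconstructed.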

March 5th, 2018 15:00

Thanks Josh,

 

>Recovery guru is only showing one virtual disk as degraded - I was unsure if other virtual disks were affected (i.e. did I need to back up the full storage array).

 

Is it possible that only one virtual disk is affected?

 

Moderator

 • 

8.5K Posts

March 5th, 2018 15:00

Yes, it depends on where the bad blocks are located; they could all be in one virtual disk.
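
As a rough illustration of that point, the sketch below maps bad block addresses to virtual disks. The layout is entirely hypothetical (names, offsets, and sizes are invented, and the real controller's mapping across physical drives is more involved); it only shows that a cluster of bad blocks can fall inside a single virtual disk's address range.

```python
from bisect import bisect_right

# Hypothetical disk group layout: (virtual disk name, first block, block count).
LAYOUT = [("vd1", 0, 1000), ("vd2", 1000, 500), ("vd3", 1500, 2000)]

def affected_virtual_disks(bad_blocks, layout=LAYOUT):
    """Return the sorted names of virtual disks that contain any bad block."""
    starts = [start for _, start, _ in layout]
    hit = set()
    for block in bad_blocks:
        # Find the last virtual disk whose start is <= block.
        i = bisect_right(starts, block) - 1
        if i >= 0:
            name, start, count = layout[i]
            if block < start + count:
                hit.add(name)
    return sorted(hit)

clustered = affected_virtual_disks([1100, 1200, 1499])
scattered = affected_virtual_disks([10, 1600])
```

With the clustered addresses only the hypothetical `vd2` is reported, while the scattered addresses touch two virtual disks.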

March 6th, 2018 07:00

Hi Josh,

 

Would it be possible for someone to have a look at the support bundle so that I can be sure that the correct drives are replaced?

 

Thanks,

Brian

Moderator

 • 

8.5K Posts

March 6th, 2018 09:00

Yes, I sent you a PM.

March 6th, 2018 12:00

Support bundle sent as requested.

 

Thanks,

Brian

 

Moderator

 • 

8.5K Posts

March 6th, 2018 13:00

Drive 7 is showing failed and drive 10 is impending failure, so those drives should be replaced. The other virtual disks look fine.

March 6th, 2018 14:00

Hi Josh,

 

OK, so just to confirm the process:

 

1) Copy my data off the degraded virtual disk

2) Delete the degraded virtual disk

3) Insert a new disk in slot 7 and assign it as the global hot spare

4) Fail the disk in slot 10 and the hot spare in slot 7 should replace it

5) Insert a new disk in slot 10

6) Create a new virtual disk

7) Copy my data back

 

I'm worried about the data on the other virtual disks. I thought that all virtual disks were striped across all drives (excluding the hot spare), so I can't have more than two failed drives at any one time.

 

There is currently a reconstruction operation in progress (since I swapped in the original hot spare). Can that operation be cancelled?

 

Thanks,

Brian

Moderator

 • 

8.5K Posts

March 6th, 2018 15:00

It is one big RAID 5 for the disk group, so not just that virtual disk will be affected by the replacement of the drives; they all will be. Sorry if I didn't make that clear. The reason only that one virtual disk was affected is that the bad blocks just happened to be located within the same virtual disk. So you will have to recreate all of the virtual disks and restore all of the data.

 

Since the drive is currently reconstructing, you could let it finish and then replace the drive that has the impending failure. Once that finishes rebuilding, delete just that one virtual disk, recreate it, and restore the backup.

March 6th, 2018 16:00

Hi Josh,

 

The recovery operation has been running continuously for the past 36 hours, and I've never seen the status pass 30%.

 

>It is one big raid 5 for the disk group, so not just that virtual disk will be affected by the replacement of the drives, they all will be.

OK, this makes more sense to me.

 

I have a spare MD1200 with sufficient capacity that I can connect to the MD3200i.

 

Do you know if the virtual disk copy feature could be used? Will it fail on bad sectors?

 

I have a software utility that I can use on the host, but the copy time for all data is going to be over 12 hours, so 24 hours in total - and that's assuming that there's not too much corrupt data, which really slows the copy time. (One vhd file that I tested with had 1.8 MB of bad data out of 20 GB total and the overall data transfer rate was 1/3 of the rate for a good 20 GB file).
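
For what it's worth, here is a back-of-the-envelope sketch of that estimate. It assumes 12 hours each way for clean data and that any file containing bad sectors copies at 1/3 of the normal rate (based on my single 20 GB test, so treat it as a guess). The function name and the damaged fraction are made up for illustration.

```python
def total_copy_hours(clean_hours_one_way, damaged_fraction, slowdown=3.0):
    """Estimate hours to copy data off and back again.

    damaged_fraction: share of the data that sits in files with bad sectors.
    slowdown: how many times longer those files take to copy (assumed 3x).
    """
    if not 0.0 <= damaged_fraction <= 1.0:
        raise ValueError("damaged_fraction must be between 0 and 1")
    # Copy off: healthy files at full speed, damaged files at reduced speed.
    copy_off = clean_hours_one_way * (
        (1.0 - damaged_fraction) + damaged_fraction * slowdown
    )
    # Copy back: the source is now a clean copy, so full speed is assumed.
    copy_back = clean_hours_one_way
    return copy_off + copy_back

best_case = total_copy_hours(12, 0.0)
with_damage = total_copy_hours(12, 0.1)
```

If 10% of the data were in affected files, the copy off would stretch from 12 hours to roughly 14.4, so the total would be a bit over 26 hours rather than 24.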

 

Thanks,

Brian

 

 
