January 3rd, 2018 07:00

Failed Disk Replaced But New Disk Not Rebuilding

We have RAID 10 on a PowerEdge R720 and one of the 8 disks failed. We took out the failed disk as usual and hot-swapped in a new disk, but the new disk is not rebuilding. In OpenManage Server Administrator, we only see the Blink option and no option to rebuild the disk or bring it online. In the log, we see a message that the new disk has also failed. How can we rebuild this new disk? There is a Replace Member Disk option, but we are not sure what it does.

Moderator

8.7K Posts

January 3rd, 2018 09:00

Hi,

It looks like the controller is seeing 3 disks in span 3 instead of 2 (see 0:1:7 and 1:1:7). Try removing the new disk, check whether one of those drives disappears, and then reinsert it. Is the PERC firmware up to date?

16 Posts

January 3rd, 2018 09:00

Earlier, in normal operation, span 3 had only 2 disks. One of them failed, so we took it out, waited 5-6 minutes, and then inserted the new disk. But the new one did not rebuild, and we now see three disks there. We will remove the new disk and see what the span shows then.

PERC firmware is not updated.

16 Posts

January 3rd, 2018 21:00

Hi. We removed the new hard disk, but OMSA is still showing 3 disks in span 3. Do we need to clear the OMSA logs? What should we do now?

Moderator

8.7K Posts

January 4th, 2018 08:00

Are you able to do a reboot? You may want to do that, go into the controller and then try to put the drive in. 

16 Posts

January 4th, 2018 09:00

Yes, we can reboot, but do we insert the new hard disk before or after the reboot? Also, what do you mean by "go into the controller and then try to put the drive in"? Can you please explain it in more detail?

Moderator

8.7K Posts

January 4th, 2018 10:00

During the reboot, press Ctrl+R when it prompts for the PERC; that should bring you to the controller configuration. While you are in there, put the drive in again.

16 Posts

January 4th, 2018 20:00

I don't want to reboot the server. Can we just restart the OMSA services to see if it shows the correct data?

16 Posts

January 4th, 2018 23:00

Rebooted the server. Put the drive in and it automatically started rebuilding. Should I exit the configuration utility now, or wait until the rebuild is complete and the drive shows as online?

16 Posts

January 5th, 2018 00:00

We have RAID 10 with 8 physical disks and 1 virtual disk. We recently replaced a failed disk and are now getting the message "Virtual Disk Has Bad Blocks". How can we clear these bad blocks? Should we run "Check Consistency" or "Clear Virtual Disk Bad Blocks"?

2 Posts

May 4th, 2018 04:00

The "Virtual Disk Has Bad Blocks" warning means that during the rebuild there were areas where the controller couldn't read the data it needed from the other disks, either due to read failures or because the sectors on the other disk were already marked as bad blocks. Since you have RAID 10, it's the other disk in the span (the mirror pair) that it had issues with.

If you need to know which sectors are affected, the only way I know of is to read the entire virtual disk via the OS and see which sectors it reports failures on.
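A minimal sketch of that full read in Python, assuming a Linux or other Unix host where the virtual disk appears as a block device. The function name `find_unreadable_ranges` and the device path in the usage note are illustrative, not anything from Dell tooling. It reads the device in large chunks and, when a chunk fails, retries it sector by sector to narrow down the failing offsets. Reads may be served from the OS cache, so treat the result as a rough map rather than a definitive one.

```python
import os


def find_unreadable_ranges(path, chunk_size=1024 * 1024, sector_size=512):
    """Read `path` from start to end and return (offset, length) byte
    ranges for which the OS reported a read error."""
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size for regular files and,
        # on Linux, for block devices as well.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(chunk_size, size - offset)
            try:
                os.pread(fd, length, offset)
            except OSError:
                # The chunk contains at least one unreadable area:
                # retry it one sector at a time to locate it.
                for sector in range(offset, offset + length, sector_size):
                    n = min(sector_size, offset + length - sector)
                    try:
                        os.pread(fd, n, sector)
                    except OSError:
                        bad.append((sector, n))
            offset += length
    finally:
        os.close(fd)
    return bad
```

Usage would look something like `find_unreadable_ranges("/dev/sdX")`, run as root, where `/dev/sdX` is a placeholder for whatever device node the virtual disk has on your system. The sector size of 512 is an assumption; some disks use 4096.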

AFAIK the only way forward is to run "Clear Virtual Disk Bad Blocks", which will clear the failed flag on those blocks. Be aware that the data contained in those sectors (could be one, could be many) is permanently LOST, but at least you can then access those areas without read failures.

The next step is to run either Check Consistency or a manual Patrol Read. This is to clear out any real read failures on the original disk: any read failure should cause a rewrite, which either fixes the sector (temporary failure) or reassigns a spare sector (permanent failure, done unless the disk has run out of spare sectors).
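If you prefer the command line to the OMSA GUI, the two steps above can be sketched roughly as follows. The `omconfig storage vdisk` actions `clearvdbadblocks` and `checkconsistency` are written from my memory of the OMSA CLI guide, so please verify them against the guide for your OMSA version before running anything. The helper names and the controller and vdisk IDs of 0 are assumptions for illustration only. By default the script only prints the commands (dry run), because clearing bad blocks is irreversible.

```python
import subprocess


def omconfig_vdisk(action, controller=0, vdisk=0):
    """Build an `omconfig storage vdisk` command line as an argument list.
    Action names are assumptions taken from the OMSA CLI guide."""
    return [
        "omconfig", "storage", "vdisk",
        f"action={action}",
        f"controller={controller}",
        f"vdisk={vdisk}",
    ]


def clear_bad_blocks_then_check(controller=0, vdisk=0, dry_run=True):
    """Clear the virtual disk bad block flags, then start a consistency
    check. With dry_run=True the commands are printed, not executed."""
    steps = [
        omconfig_vdisk("clearvdbadblocks", controller, vdisk),
        omconfig_vdisk("checkconsistency", controller, vdisk),
    ]
    for cmd in steps:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)
    return steps
```

Calling `clear_bad_blocks_then_check()` with the defaults just shows what would be run; pass `dry_run=False` only after confirming the IDs with `omreport storage vdisk`.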

This process may trigger the original disk to go into Failed status, if so it should obviously be replaced ASAP. If not, check if there's any warning in the detailed status on the "span partner".

If your system is set up to do regular Patrol Reads (the default), it should be extremely rare to end up in this situation unless you leave it running with one or more failed disks for an extended period. While the VD is healthy, Patrol Read should either get bad sectors repaired or cause the disk to fail; that protection is lost once the VD is already degraded.

As a result, some people would replace the original disk (the span partner of the one you replaced earlier) out of an abundance of caution even if there are no further signs of issues. You'll have to weigh the cost against the small additional safety.

If you do decide to do a pre-emptive replacement of the span partner and you have a free slot in the backplane, you can use the "Replace Member Disk" option to replace it without additional risk. If you don't have a free slot, the only option is the standard replacement method, which is fine as long as the other disk in the pair doesn't fail during the rebuild (so make sure it has been scanned a few times first).
