Unsolved

December 19th, 2017 01:00

10 months' worth of files vanished after replacing a dead drive in a RAID 5 configuration in an R320 with a PERC H310

An R320 with 4 x 2TB HDs in a RAID 5 configuration shut down on its own on Sunday Dec 18, 2017 and would not even boot past the Windows 2012 R2 logo (the screen goes blank), which got me panicking.  I called Tech Support and paid $300 for nothing!!!  They wanted me to try reinstalling the OS, which I knew was not the first choice.  I ran Dell's diagnostics on my own and found out one of the drives was dead.  Early Mon morning I bought another 2TB HD and got the bad one replaced, and much to my relief the server booted up and got me to Windows, only to find out that every file/folder created after Feb 8, 2017 is no longer there!!!  This is just shocking!  This is just a colossal failure of some sort, and I cannot even begin to think where....  I know no one has the answer, but if you have experienced the same thing, please share it here.....

If you decide to write, please make sure to read it carefully, as many people love to ask stupid questions.... no, there was no restoring done from any backup, ... no, there was no reinstall/restore of the OS done whatsoever.

Moderator

8.5K Posts

December 20th, 2017 05:00

Itssi,

With the Virtual Disk being a RAID 5, there shouldn't have been any data lost from replacing a single drive, as all that data would be redundant. One thing I can think of that could cause this would be Predicted Failure drives (HDDs that exceed the bad block threshold) whose bad blocks had transferred across the Virtual Disk, which could explain the information loss.
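To illustrate why a single failed drive should not cost any data, here is a minimal sketch of the XOR parity idea behind RAID 5. It is a toy model, not how the PERC H310 is actually implemented: the block contents, the three-data-plus-one-parity stripe layout, and the helper name `xor_blocks` are all assumptions made for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical 4-drive RAID 5:
# three data blocks plus one parity block (made-up contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the drive that held the second data block.
survivors = [data[0], data[2], parity]

# XOR of everything that survived gives back the missing block.
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

The same reconstruction only works if every surviving block reads back correctly, which is why bad blocks on the remaining drives matter so much during a rebuild.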

Do you have, or did you have, drives flashing back and forth from Amber to Green? If so, that is an indication of a Predicted Failure drive. If you do have flashing amber/green drives, or drives shown as YES under Predicted Failure in OpenManage, then I suggest you back up all the data immediately, as further data loss is possible.

If this does appear to be the case then we would have to delete the Virtual Disk, then recreate it and initialize the drives, then reinstall the OS and restore the data from backup. 

To really be able to tell we would need additional information, such as logs and a TSR report. 

Let me know what you see.

7 Technologist

16.3K Posts

December 20th, 2017 12:00

Ok, then I won't ask any stupid questions:

  • Did the drives come with the server from Dell? (I ask because non-certified drives are prone to random failures and corruption.)
  • What make/model of drive did you use to replace the failed one? (I ask for the same basic reason as above, but mixing them can be even worse.)
  • Did you power down to replace the drive, or did you hot-swap it? (I ask because the controller checks disks for configs to import when it starts, and there are instances where the wrong config can be imported, either automatically by the controller or through user error; and you should never power down to replace a hot-swap drive.)
  • After replacing the drive, what did you do? Rebuild (and/or assign as hot-spare)? Set as Online? Import? (I ask because what you did could have radically different outcomes on your data.)
  • Did more than one disk fail? And what was the date of the failure? Are you 100% SURE that there was not already a failed (or pred fail) disk before this went down on Sunday?

"This is just shocking!  This is just a colossal failure of some sort that I cannot even begin to think where"

You are a victim of the everyday chance we all take when using technology. It is not a plot or egregiously defective hardware, it is life with technology. Technology will eventually fail, at which point, it is our processes that make the difference.

  • RAID 5 should never be used on drives that large (if over 500-750GB, don't use RAID 5; use RAID 10 or RAID 6 instead), as the risk of data loss during a drive rebuild in a RAID 5 with 2TB disks is very high.
  • I know you said you did not restore from backup, and you did not say you didn't have them, but this is the type of scenario where you would need your backups. RAID is not a backup and there may be times when you need to use your backups to recover from RAID failures (just like motherboard failures, flood, fire, etc.).
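The concern about large drives in RAID 5 can be put into rough numbers. The sketch below estimates the chance of hitting at least one unrecoverable read error (URE) while reading every surviving drive in full during a rebuild. The URE rate of 1 in 10^14 bits is an assumption (a figure often quoted on consumer drive spec sheets, not something stated in this thread), errors are treated as independent, and `p_read_error` is a hypothetical helper name.

```python
import math

def p_read_error(surviving_drives, drive_bytes, ure_per_bit):
    """Probability of at least one URE when reading all surviving drives in full."""
    bits_read = surviving_drives * drive_bytes * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

ASSUMED_URE_PER_BIT = 1e-14  # assumption: typical consumer-drive spec, not from this thread

# 4-drive RAID 5 with one drive failed: 3 surviving drives must be read end to end.
p_2tb = p_read_error(3, 2e12, ASSUMED_URE_PER_BIT)
p_500gb = p_read_error(3, 5e11, ASSUMED_URE_PER_BIT)

print(f"2TB drives:   {p_2tb:.0%}")
print(f"500GB drives: {p_500gb:.0%}")
```

Under these assumptions the risk with 2TB drives is substantial and several times higher than with 500GB drives, though real drives can do better or worse than their spec-sheet rate, so treat the output as an order-of-magnitude illustration only.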

4 Operator

1.8K Posts

December 21st, 2017 06:00

I agree with Flash on all his questions/suggestions; it is best to answer them.

This may or may not be RAID related; perhaps the OS had issues going back to February, compounded by a failed disk. I cannot remember a RAID issue where data was selectively lost by date, size or name. RAID metadata is not concerned with such data.
