February 5th, 2015 09:00

R610 SAS6IR with SSD RAID 1 File System Corruption

Hello everyone!  I have been working on an ongoing issue with some refurbished Dell R610 servers with the SAS6IR controller running a pair of Kingston 256GB SSDs in a RAID 1 configuration.  I have 5 of these servers, all identical, and all have the same issue.  That is why I am leaning, at this point anyway, toward a settings issue rather than hardware: I have swapped the SSDs around among the servers and also tested them in an R710.  The issue arises in all of the R610s, but the R710 using the same drives does not exhibit it.

So, on to the actual issue: file system corruption after the system has been shut down and powered off by the OS, not by a power failure or the power button.

I have tried Ubuntu Linux, RHEL7, and Windows Server 2k8, and have seen the issue in the Linux OSs but not in Windows.  Even XenServer worked flawlessly after a 4 day weekend shutdown, which leads me to think it is Linux related.  Everything shows up fine in the main BIOS and the 6IR BIOS as far as drives and RAID volumes go, and running diagnostics on the drives shows they are all in good health.  I can install and reboot until the cows come home and there is no issue.  The problem appears when I shut the system down completely (this is a test lab environment, not yet production) using the shutdown option in the OS, so that it shuts down gracefully and powers off but stays plugged into the power outlet/UPS.  When I power back on, usually after the system has been off overnight, though the same happens after an hour or so, everything POSTs fine and starts the boot sequence.  The RAID array and volume are found and the OS starts booting.  It is then that the system will either come to a command prompt instead of the GUI or halt with a file system corruption error saying there are deleted inodes.  I am in the process of downloading the Dell Diagnostic DVD and will see what it comes up with, but this issue has been frustrating me for almost 2 months now as I struggle to find the cause.
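For anyone trying to reproduce this, one way to see exactly which files change across a power-off is to record checksums before shutting down and compare them after the next boot. The following is only a rough Python sketch, not something run in this thread: the function names are made up for illustration, and it assumes the manifest is saved somewhere off the suspect RAID volume (a USB stick, for example).

```python
import hashlib
import json
import os


def build_manifest(root):
    """Map each regular file under root (relative path) to its SHA-256 digest.

    Files that cannot be read are recorded as None instead of aborting the scan.
    """
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            rel = os.path.relpath(full, root)
            try:
                digest = hashlib.sha256()
                with open(full, "rb") as handle:
                    for chunk in iter(lambda: handle.read(1 << 20), b""):
                        digest.update(chunk)
                manifest[rel] = digest.hexdigest()
            except OSError:
                manifest[rel] = None
    return manifest


def save_manifest(manifest, path):
    """Write the manifest as JSON (hypothetical helper; keep it off the array)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)


def load_manifest(path):
    """Read a manifest previously written by save_manifest."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def diff_manifests(before, after):
    """Return (missing, changed): paths that vanished and paths whose digest differs."""
    missing = sorted(set(before) - set(after))
    changed = sorted(p for p in before if p in after and before[p] != after[p])
    return missing, changed
```

Run `build_manifest` on a directory that is not written to during normal operation, save the result, shut down, and after the next power-on compare a fresh manifest with `diff_manifests`. Anything listed as missing or changed was altered while the machine was off or during the shutdown itself.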

Any further info needed, please let me know.  Thank you in advance.


Scott

February 11th, 2015 07:00

I found out that the issue was indeed the RAID controllers in the R610.  The particular version of the PERC 6 was not compatible with the Ubuntu kernel and was thus not handling the SSDs properly.  After replacing them with the PERC 6/i with the battery backup, everything survived an entire weekend shutdown.  Thanks everyone for the suggestions and help.

Moderator

February 5th, 2015 12:00

Hi Scott,

Does the system boot after getting the corruption errors? Have you tried with non-SSD drives? It sounds like there may be an issue between the drives and Linux where the system is being shut off before the filesystems have been unmounted, so the file buffers are not being written to disk. If the drives are standalone rather than in a RAID 1, does the issue still occur?
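One way to probe the unwritten-buffer theory is to write a small marker file, explicitly force it to disk, power off, and check it on the next boot. Below is a minimal Python sketch under stated assumptions: `write_marker` and `verify_marker` are illustrative names, not part of any Dell or distribution tool, and the directory `fsync` step assumes a POSIX system such as Linux. Note that `os.fsync` only asks the kernel to push data to the device; a controller or drive with a volatile write cache can still acknowledge a write before the data is durable.

```python
import hashlib
import os


def write_marker(path, payload):
    """Write payload prefixed by its SHA-256 and force it to stable storage."""
    digest = hashlib.sha256(payload).hexdigest().encode("ascii")
    with open(path, "wb") as handle:
        handle.write(digest + b"\n" + payload)
        handle.flush()             # Python buffer -> kernel page cache
        os.fsync(handle.fileno())  # kernel page cache -> storage device
    # Also fsync the containing directory so the new entry itself is durable.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def verify_marker(path):
    """Return True only if the file exists and its payload matches the stored digest."""
    try:
        with open(path, "rb") as handle:
            digest, _sep, payload = handle.read().partition(b"\n")
    except OSError:
        return False
    return hashlib.sha256(payload).hexdigest().encode("ascii") == digest
```

If a marker written this way right before shutdown fails `verify_marker` after power-on, that would suggest the data is being lost below the filesystem (controller or drive cache) rather than by an unclean unmount.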

 

Are the SAS6/iR firmware and the system BIOS up to date?

BIOS: http://www.dell.com/support/home/us/en/19/Drivers/DriversDetails?driverId=C6MRW&fileId=3287843619&osCode=RH60&productCode=poweredge-r610&languageCode=EN&categoryId=BI

SAS6/iR: http://www.dell.com/support/home/us/en/19/Drivers/DriversDetails?driverId=C1NFP&fileId=3176100186&osCode=RH60&productCode=poweredge-r610&languageCode=EN&categoryId=SF

February 5th, 2015 13:00

The system would sometimes boot as far as a command prompt, but getting into the GUI of either Linux flavor was a no go.  I tried the SSD drives in an R710 to confirm the drives were good and had no issue with them.  Further, all 5 of the R610s are identical and I have swapped parts among them.  I did manage to find the firmware update for the 6iR and applied it this morning.  I will know in the morning if the firmware did the trick, as that is usually when I would get the error after the servers have been powered down overnight.

February 6th, 2015 09:00

OK: Update.

I managed to figure out which of the 6/iR controllers were in my R610s and got the firmware update to go through.  After being powered off for a night, everything is looking promising as the servers came up without incident.  However, my hope is tempered by the knowledge that I have seen this before only to have it reappear after being off for the weekend.  I will know Monday if this is actually fixed or not.

February 6th, 2015 09:00

One question does come to mind regarding these R610s compared to the R710s we have: is it possible to use the 6/iR that have the battery backup plugged into them like our 710s do?  I know this would mean replacing the current cards in the 610s, but I am not sure if it is a good idea to pull one of the 710 cards and try it out.  Advice?

February 9th, 2015 07:00

OK, well Monday is here and I powered on the servers.  RHEL 7 booted no problem.  Zentyal/Ubuntu still has the file system corruption, so I am wondering if it is something in Zentyal/Ubuntu that is causing the issue.  That is why I asked whether the battery backup supported RAID controller in my R710 would be a better way to go.  I tried the Zentyal/Ubuntu installation with the SSDs in one of the 710s and did not have an issue, so the RAID controller in the R610 not having a battery backup like the ones in the R710 may be the cause.
