thenodemaster
6 Posts
February 11th, 2015 07:00
I found out that the issue was indeed the RAID controllers in the R610. The particular version of the PERC 6 was not compatible with the Ubuntu kernel and was thus not handling the SSDs properly. After replacing them with the PERC 6/i with the battery backup, everything survived an entire weekend shutdown. Thanks, everyone, for the suggestions and help.
DELL-Josh Cr
Moderator
9.4K Posts
February 5th, 2015 12:00
Hi Scott,
Does the system boot after getting the corruption errors? Have you tried with non-SSD drives? It sounds like there may be an issue between the drives and Linux, where the system is being shut off before the filesystems have been unmounted, so the file buffers are not being written to disk. If the drives are standalone rather than in a RAID 1, does the issue still occur?
Are the SAS 6/iR firmware and the system BIOS up to date?
BIOS: http://www.dell.com/support/home/us/en/19/Drivers/DriversDetails?driverId=C6MRW&fileId=3287843619&osCode=RH60&productCode=poweredge-r610&languageCode=EN&categoryId=BI
SAS 6/iR: http://www.dell.com/support/home/us/en/19/Drivers/DriversDetails?driverId=C1NFP&fileId=3176100186&osCode=RH60&productCode=poweredge-r610&languageCode=EN&categoryId=SF
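To illustrate the point about file buffers not reaching the disk before power-off, here is a minimal Python sketch of how an application can ask the kernel to flush its buffered writes. The function name `write_durably` is hypothetical and only for illustration, and the sketch assumes a Linux system. Note that this only covers the operating system's buffers: a controller or drive with a volatile write cache may still hold the data after `os.fsync` returns, which is one reason a battery-backed cache can matter.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write data and ask the kernel to push it to the storage device.

    Hypothetical helper for illustration; assumes a Linux/POSIX system.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # move Python's userspace buffer into the kernel page cache
        os.fsync(f.fileno())   # ask the kernel to write the cached pages to the device

    # Also sync the containing directory so the new directory entry is persisted.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If data written this way still shows up corrupted after a clean shutdown, that would point away from the filesystem layer and toward the controller, firmware, or drive cache.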
thenodemaster
6 Posts
February 5th, 2015 13:00
The system would sometimes boot as far as a command prompt, but getting into the GUI of either Linux flavor was a no-go. I tried the SSD drives in an R710 to confirm the drives were good and had no issue with them. Further, all five of the R610s are identical, and I have swapped parts between them. I did manage to find the firmware update for the 6/iR and applied it this morning. I will know in the morning if the firmware did the trick, as that is usually when I get the error, after the servers have been powered down overnight.
thenodemaster
6 Posts
February 6th, 2015 09:00
OK: Update.
I managed to figure out which of the 6/iR controllers were in my R610s and got the firmware update to go through. After being powered off for a night, everything is looking promising as the servers came up without incident. However, my hope is tempered by the knowledge that I have seen this before only to have it reappear after being off for the weekend. I will know Monday if this is actually fixed or not.
thenodemaster
6 Posts
February 6th, 2015 09:00
One question does come to mind regarding these R610s compared to the R710s we have: is it possible to use a 6/iR that has the battery backup plugged into it, like the ones in our 710s? I know this would mean replacing the current cards in the 610s, but I am not sure if it is a good idea to pull one of the 710 cards and try it out. Advice?
thenodemaster
6 Posts
February 9th, 2015 07:00
OK, well, Monday is here and I powered on the servers. RHEL 7 booted with no problem. Zentyal/Ubuntu still has the file system corruption. I am wondering if it is something with Zentyal/Ubuntu that is having the issue. That is why I am wondering if the battery-backed RAID controller in my R710 would be a better way to go. I tried the Zentyal/Ubuntu installation with the SSDs in one of the 710s and did not have an issue, so I suspect that the RAID controller in the R610 not having a battery backup, unlike the ones in the R710, may be the cause.