lmpit
6 Indium

T310 PERC S100 RAID-5 showing wrong capacity/size after HDD replacement

Hello.

Today I replaced a failed HDD, in a RAID-5 array, with a new HDD.

As far as I can tell, everything is working fine. The new HDD has been assigned to the array, and the array is being rebuilt while the system is online and seemingly working fine. Yet something is troubling me: OMSA is reporting the virtual disk size as 8,187.84GB (8TB) instead of the previous and actual 930.48GB. This is crazy: the array is composed of 3x 500GB HDDs, and I can't understand why this is happening. Can this be a corruption of OMSA itself, or is the RAID array at risk? At the moment the rebuild is about 50% complete, but I doubt that the outcome will change, and it is worrying me.

Windows is reporting the correct disk capacity though. Not sure what to make of it anyway. I hope someone can give me some suggestions as to why OMSA is reporting this ludicrous disk capacity which has no correlation whatsoever with the hardware.
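
For reference, the expected size can be worked out by hand. The following is only a rough sketch in Python, assuming the "GB" figures shown by OMSA and Windows are really binary gibibytes and that each disk is exactly 500 × 10^9 bytes; the function names are made up for illustration. RAID-5 uses one disk's worth of space for parity, so 3 disks give the capacity of 2, which comes to about 931 GiB.

```python
GIB = 2**30  # assumption: the tools label binary gibibytes as "GB"


def raid5_usable_bytes(disk_count: int, disk_bytes: int) -> int:
    """Usable RAID-5 capacity: one disk's worth of space holds parity."""
    if disk_count < 3:
        raise ValueError("RAID-5 needs at least 3 disks")
    return (disk_count - 1) * disk_bytes


def is_plausible(reported_gib: float, expected_gib: float,
                 tolerance: float = 0.02) -> bool:
    """True if the reported size is within a fractional tolerance of the expected size."""
    return abs(reported_gib - expected_gib) <= tolerance * expected_gib


# 3x 500GB disks, as in the array described above
expected_gib = raid5_usable_bytes(3, 500 * 10**9) / GIB
print(f"expected usable capacity: {expected_gib:.2f} GiB")

# The two figures OMSA has shown for this virtual disk
for reported in (930.48, 8187.84):
    verdict = "plausible" if is_plausible(reported, expected_gib) else "not plausible"
    print(f"{reported} -> {verdict}")
```

The 2% tolerance is an arbitrary allowance for controller metadata and rounding. With it, 930.48 passes and 8,187.84 does not, which matches the intuition that the larger figure cannot come from the physical disks.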

Thank you!

15 Replies
Moderator

RE: T310 PERC S100 RAID-5 showing wrong capacity/size after HDD replacement

Hello

The error and status reporting features of the S-series controllers are very limited. If any changes are made to the virtual disk, they may not be reported correctly in OMSA. Restarting OMSA or the server usually refreshes the information.

I would suggest restarting the OMSA services once the rebuild completes. If the issue persists then I would restart the server during your next maintenance window. If you filter your services by description there should be up to four services that start with DSM SA. Those are the OMSA services.
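
As a rough sketch of the service restart described above, the Python snippet below builds a PowerShell command that selects services by display name and restarts them. It assumes a Windows host with PowerShell on the PATH and an elevated prompt. The helper names are hypothetical, and `-Force` is added on the assumption that some of the services may have dependents.

```python
import platform
import subprocess


def build_restart_command(display_prefix: str = "DSM SA") -> list[str]:
    """Build a PowerShell invocation that restarts services by display-name prefix."""
    # Get-Service -DisplayName accepts wildcards; the matches are piped to Restart-Service.
    script = f"Get-Service -DisplayName '{display_prefix}*' | Restart-Service -Force"
    return ["powershell", "-NoProfile", "-Command", script]


def restart_omsa_services() -> None:
    """Run the restart; requires Windows and administrator rights."""
    if platform.system() != "Windows":
        raise RuntimeError("this sketch only applies to Windows")
    subprocess.run(build_restart_command(), check=True)


if __name__ == "__main__":
    # Only print the command here; call restart_omsa_services() to actually run it.
    print(" ".join(build_restart_command()))
```

Printing the command first makes it easy to review before running it on a production server.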

Thanks

Daniel Mysinger
Dell EMC, Enterprise Engineer

lmpit
6 Indium

RE: T310 PERC S100 RAID-5 showing wrong capacity/size after HDD replacement

Thanks for the reply.

Unfortunately that didn't solve the issue. I've restarted the services, rebooted the server, uninstalled OMSA, rebooted and then reinstalled OMSA (v8.5) again, but it still reports ~8TB instead of ~1TB. By the time it finished rebuilding the array I had left the premises, so I can't boot into the controller config now and make sure that the reported space there is correct. At least in Windows it is still correct and everything seems fine, but I'm not sure if I can rely on the array in this state. Still, I am trying to avoid having to back up the whole system, delete/recreate the array and restore; it would take me a whole day.

I suspect that part of the problem is that I'm using a different drive, which seems to work correctly but must have some peculiar difference from the others to produce this result. I'm attaching a few screenshots with more details. But at this point it would be fair to assume that I have no choice but to delete the entire thing and create it from scratch, right?

[Screenshot: Alerts from the log]

[Screenshot: List of disks in array (0:0 replaced)]

[Screenshot: Array state and capacity (before/after)]


8 Krypton

RE: T310 PERC S100 RAID-5 showing wrong capacity/size after HDD replacement

I would guess this to just be a communication glitch between OMSA and the AWFUL S100 controller. Restart OMSA, restart the server, update server hardware/drivers, update OMSA would be my recommended path to fix the displayed information. I do not think this is an issue any greater than that.

8 Krypton

RE: T310 PERC S100 RAID-5 showing wrong capacity/size after HDD replacement

Hmm ... for some reason I didn't see your reply. 

What make/model is the disk you put in?

lmpit
6 Indium

RE: T310 PERC S100 RAID-5 showing wrong capacity/size after HDD replacement

First of all, thanks for chiming in, I appreciate the help.

I'm not sure why you didn't see my previous reply, but it may be because of the way moderation works in this forum. Any post, or update to a post, is subject to moderation before being visible. I wasn't sure if it was related to my account being very recent (in fact, it was: I just made this post and it was not subject to moderation).

Regarding your first reply, and as I mentioned previously, I already restarted/uninstalled/reinstalled OMSA and restarted the server - no chance. The currently installed OMSA is the latest version (v8.5) found in the download section for this server's model. As for drivers, the PERC S100 is using the latest v2.0.0.162, but I'm not sure if something else should be updated. Now, the problem is possibly the replacement HDD, which might not be completely supported by the RAID controller. I've checked the Nautilus release notes and the model of the replacement HDD is mentioned nowhere, so I have little faith that there is any official support/firmware for the new disk, despite it being more than appropriate for the purpose. If you check my previous reply, I have a few screenshots with the OMSA information table. To see them in full resolution, right-click and open in a new tab, because I made the mistake of ticking the "lightbox" option when inserting the images, which makes them display in a barely visible resolution.

Here is the same screenshot:

Basically I replaced

0:0 WDC WD5003ABYX-18WERA0 r01.01S03

with

0:0 WDC WD5003ABYZ-011FA0  r01.01S03

I'm not very experienced with RAIDs in general, and I know that using a disk that is not 100% certified is a rookie mistake, but Dell also pushed me a bit too much. The server is currently out of warranty, but 2 years ago, almost to the day, Dell replaced disk 0:0, which is exactly the same disk that failed this week. When I asked for a quote from Dell they presented me with a $250 HDD, and when I asked if it was new or refurbished they replied: "might be new, might be refurbished, there is no way of knowing for sure". And so I decided to buy a brand new WD RE4 of the same family as the existing ones and hope for the best. Let's say that I'm 50% sure that I did the right or wrong thing, depending how you look at it.

Now, I'm not exactly sure what I'm looking at here. If the problem is just the numbers that OMSA shows me, I can live with that. But I worry that the problem may be more deeply rooted and cause me serious trouble down the line. As far as OMSA is concerned, the array is in perfect condition.

8 Krypton

RE: T310 PERC S100 RAID-5 showing wrong capacity/size after HDD replacement

Looks like I had opened it a while ago and never got around to responding and saw that tab first 🙂

A couple of things:

WD5003ABYX looks like a WD RE4, which was validated on the PERCs BY Western Digital. I could be wrong, but I'm not aware of Dell ever certifying models they would have shipped for the PERC controllers specifically.

WD5003ABYZ looks like a WD RE (newer version than the 4), which was NOT validated on OEM hardware at all by WD.

Without testing by either Dell or WD, these become hit/miss on hardware of all types where the device manufacturer did not validate them on their systems (Intel, Synology, maybe some RAID controller mfg's, etc.).

> I asked if it was new or refurbished they replied: "might be new, might be refurbished, there is no way of knowing for sure"

Front-line agents don't usually know where they come from, because they NEVER interact with dispatched hardware or parts in general. Because they come in bulk packaging from the various manufacturers, Dell cannot sell them as "new". In some cases, parts are pulled from overstocked systems, but no drive they sell or send as a replacement was ever in extended use in anyone else's system.

> I know that using a disk that is not 100% certified is a rookie mistake, but Dell also pushed me a bit too much

Don't fall into the trap of boycotting Dell parts after they make you unhappy, either with untrained/unskilled support or too-high prices. You have a Dell server. Do yourself a favor and outfit it with the proper hardware. You don't have to buy Dell parts directly FROM Dell! Buy Dell parts from suppliers and resellers - they are MUCH cheaper and are the same validated and certified parts you would buy from Dell. Xbyte and ServerSupply are two good places to start with. Many people say "I'm not spending $350 on a 2TB drive from Dell!" Then turn around and buy a desktop or NAS drive for $150 and have lots of issues with it, either immediately or down the road, when they could have bought a certified drive from a supplier for $170 and been problem free.

The S100 is an extremely low-end RAID "solution" and is based on a modified version of Intel's chipset RAID. I would always recommend you use the H-series controllers, no matter the function of the server. Next best option would be to use Windows Disk Management to manage a mirrored setup - much better than the S-series controllers. Updating the "firmware" (reliability, performance, and function) of the controller is done through the BIOS updates. I would suggest making sure the system firmware (BIOS, iDRAC/LCC, etc.) is up to date.

lmpit
6 Indium

RE: T310 PERC S100 RAID-5 showing wrong capacity/size after HDD replacement

> WD5003ABYZ looks like a WD RE (newer version than the 4), which was NOT validated on OEM hardware at all by WD.
>
> Without testing by either Dell or WD, these become hit/miss on hardware of all types where the device manufacturer did not validate them on their systems (Intel, Synology, maybe some RAID controller mfg's, etc.).

You're right, and as I bought this new disk I knew there was a risk of it either not working at all or giving me issues, but the risk isn't catastrophic... yet.

> Front-line agents don't usually know where they come from, because they NEVER interact with dispatched hardware or parts in general. Because they come in bulk packaging from the various manufacturers, Dell cannot sell them as "new". In some cases, parts are pulled from overstocked systems, but no drive they sell or send as a replacement was ever in extended use in anyone else's system.
>
> Don't fall into the trap of boycotting Dell parts after they make you unhappy, either with untrained/unskilled support or too-high prices.

Sure, the price isn't appealing, but this server was my first experience with Dell, and I inherited it from whoever decided to buy it. It was not really about boycotting either: the main reason I didn't buy from Dell was not price but perceived unreliability, even though I admit my window of experience with Dell is small. From my perspective, this is a brand new server that had 1 of its 3 disks fail after 2 years, and then Dell's replacement also failed another 2 years later. This made me reluctant to buy Dell's as a first choice - forgetting the irony that the other 2 disks in the array are still alive and kicking.

> The S100 is an extremely low-end RAID "solution" and is based on a modified version of Intel's chipset RAID. I would always recommend you use the H-series controllers, no matter the function of the server. Next best option would be to use Windows Disk Management to manage a mirrored setup - much better than the S-series controllers. Updating the "firmware" (reliability, performance, and function) of the controller is done through the BIOS updates. I would suggest making sure the system firmware (BIOS, iDRAC/LCC, etc.) is up to date.

Oh, I've realized how low-end that controller is, mainly from other forum threads where people have a multitude of issues with it, ranging from performance to stability. Also, thanks for the tip about using Windows Disk Management, I may end up doing just that at a later time. By the way, do you know if there is any other software besides OMSA that can read the S100 array from within Windows?

Just for kicks, on Monday I'll boot into the RAID controller config and see if it tells me that I have a virtual disk of 8TB or if it's just a problem specific to OMSA, in which case I might ignore it for the time being.

lmpit
6 Indium

RE: T310 PERC S100 RAID-5 showing wrong capacity/size after HDD replacement

Curiously, I tried running the Dell System E-Support Tool (DSET) and then looked at the log it created. Surprisingly, it shows the correct virtual disk capacity. I'm not too sure what to make of it, but it is a bit reassuring nonetheless.

8 Krypton

RE: T310 PERC S100 RAID-5 showing wrong capacity/size after HDD replacement

> any other software beside OMSA that can read the S100 array from within Windows

"Probably" not. I would say no; there is a small chance that Intel's software can connect and read it, but because the S100 is a rebrand of Intel's controller logic, I suspect it will not be able to talk to it natively.

I think the best options are:

  1. Replace the disk with an RE4.
  2. Update the BIOS.

Although I think it is also safe to ignore it - the options above are for if you want to fix it.

(Firmware/driver updates should be kept up to date as a best practice, and given the age of the system, if updates have not been kept current, the firmware could be pretty old and there is a better than average chance that bringing it current will help.)
