
Unsolved



January 9th, 2017 17:00

PE T110 with SAS 6/iR - replaced DR1 of a mirrored set, status shows as DEGRADED for over 4 days...

Hi All,

I have a client with a Dell PE T110 server that has two 300 GB SAS 15K drives arranged as a mirror. DR1 was reporting Event ID 7 errors (bad blocks) to the guest OS, so I bought a brand new identical Seagate drive, downed the server, removed DR1, powered up the server, went into the SAS RAID utility and saw the status was DEGRADED. I powered off, then installed the new DR1. I powered on, went into the BIOS SAS RAID utility and told it to SYNCHRONIZE DRIVE, then exited the BIOS utility and rebooted. FOUR DAYS later, OMSA still reports the Virtual Disk, Disk 1 state as REBUILDING. Did I do something out of order or incorrectly? I assumed that once I selected SYNCHRONIZE DRIVE (in the BIOS SAS RAID utility), I could exit the utility and the sync/rebuild would run in the background?

Thanks for any help,

-Mike

26 Posts

January 10th, 2017 07:00

Hmmm, no replies yet.  Maybe I posted to an incorrect forum?

9 Legend


16.3K Posts

January 10th, 2017 09:00

There is a RAID forum that would probably have been a better choice, but usually the same people will see all the Server posts.

What OS? What does OMSA say as the status for the individual Physical Drives and the Virtual Drives? Which controller? I'm assuming you aren't using a controller with a log to be able to look and see what is going on - probably the SAS 6/iR.

Is the replacement drive a certified Dell drive?

26 Posts

January 10th, 2017 13:00

Sorry, I should have listed more details: running ESXi 4.1 as the host and Windows SBS 2008 R2 as the only guest OS.  Windows 2008 was reporting the bad blocks on DR1.

Just pondering: how would the guest OS detect these bad blocks on a mirror? Doesn't the hardware RAID take care of any errors, essentially keeping them hidden from the guest? Or does it pass along the drive error so the guest will be aware of it?

9 Legend


16.3K Posts

January 10th, 2017 14:00

With OMSA installed on the bare-metal OS, those types of errors can be communicated to the OS. Guests do not have direct access to the hardware, let alone the individual drives in a RAID 1. The SAS 6 also supports passthrough/non-RAID ... are you sure it is configured on the SAS 6 as an IM (RAID 1) and not passing the drives through to the guests?

26 Posts

January 10th, 2017 15:00

I did install the Dell ESXi OMSA agent piece on this ESXi host. It's definitely configured as a RAID 1 (mirror). There are only two identical 300GB SAS drives and 1 volume. During the last boot I did, I took a screenshot, which I pasted below. This is the first time I've ever attached a pic, so I don't know if I did it right. The upload was successful, but no image appears when I click Preview...

 

1 Attachment

26 Posts

January 10th, 2017 15:00

Oh, I see the pic is above the message...

The fact that it states it is resyncing during POST makes me think I did it correctly.  Probably a problem with one of the drives or a bug in the RAID utilities?

26 Posts

January 10th, 2017 16:00

In this new VM world we live in, is there a way to check/fix a disk like in the good old days of CHKDSK?

26 Posts

January 10th, 2017 18:00

The guest OS, SBS2008, only sees one drive - which is a Simple Basic NTFS volume (C:) with size 200GB (my ESXi datastore allocates 200GB of the physical 300GB mirror to this OS).  Only ESXi sees the 2 physical drives.  You might have been saying that, I wasn't sure...

I just ran CHKDSK C: /V on SBS and did not see any errors (screenshot attached).

So I believe my original hunch was correct, especially after your comment that the .VIB may have sent disk error info from ESXi back to the guest OS: physical drive 1 had bad blocks. I've replaced that drive, but it never synced.

So I guess my next step will be to go to the client's office after hours, boot into the SAS RAID utility and try to force another resync, then leave that utility open and watch to see if it ever increments past 0% or maybe displays a useful error message... Any other ideas more than welcome :)

1 Attachment

9 Legend


16.3K Posts

January 10th, 2017 18:00

CHKDSK has only ever been an OS file system utility. It can't repair a faulty disk, nor can it repair a corrupt RAID array.

It is possible that the OMSA VIB is sending it to the guest. If it weren't for the fact that it could see both drives, I would have suggested a CHKDSK, as it was much more probable that it was referring to its file system than the physical disk masked behind several layers of abstracted technology.

26 Posts

January 11th, 2017 18:00

Went to the customer site. Interestingly, I found a conflict between what the VMware vSphere Client reported vs OMSA as far as server health. vSphere showed the 2 SAS drives as "Optimal" (healthy), whereas OMSA (v8.4, the latest) reported the drive I had replaced as being Degraded (state = Rebuilding). So I downed the server and rebooted the ESXi host, and then OMSA reported things as normal/healthy. So it looks like a bug in OMSA, or a lack of compatibility between it and this T110 hardware. Good old Dell software :(
