Unsolved

September 1st, 2017 12:00

R920 - Single SSD reads as "Disk.Direct" when part of a RAID 10

Good day

I am having trouble figuring out why one SSD in an eight-drive RAID 10 reads as "Disk.Direct" while all the others, including those in another RAID set, read "Disk.Bay".

All drive bays are populated, 24 disks total. The other 16 are also in a RAID 10, but are regular HDDs. The controller is a PERC H730P (X4TTX).

I don't have access to the physical server, but these are the details I was given.

The "bad" disk readout (the disk itself tests good in diagnostics, shows 100% life remaining in management, and has no faults in iDRAC):

 2 2 4711 1 DCIM_PhysicalDiskView InstanceID Disk.Direct.22:RAID.Integrated.1-1 string DCIM_PhysicalDiskView DeviceDescription string Disk 22 on Integrated RAID Controller 1 Disk 22 on Integrated RAID Controller 1

The rest of the disks read out like this:

2 2 4711 1 DCIM_PhysicalDiskView InstanceID Disk.Bay.23:Enclosure.Internal.0-1:RAID.Integrated.1-1 string DCIM_PhysicalDiskView FQDD string Disk.Bay.23:Enclosure.Internal.0-1:RAID.Integrated.1-1 Disk.Bay.23:Enclosure.Internal.0-1:RAID.Integrated.1-1

Why would disk 22 be labeled as "Disk.Direct" and not "Disk.Bay"? Also, why would it read as "on Integrated RAID Controller" while the other disks all read "Enclosure.Internal..."?
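
For anyone comparing a full inventory dump, the difference between the two readouts can be checked mechanically. The following is a minimal Python sketch, assuming the identifiers always follow one of the two shapes quoted above; the regular expression and the function name `parse_disk_fqdd` are illustrative assumptions, not an official Dell grammar or API.

```python
import re

# Assumed pattern covering only the two identifier shapes quoted in this thread:
#   Disk.Bay.<n>:Enclosure.<...>:RAID.<...>
#   Disk.Direct.<n>:RAID.<...>
FQDD_RE = re.compile(
    r"^Disk\.(?P<kind>Bay|Direct)\.(?P<slot>\d+)"
    r"(?::(?P<enclosure>Enclosure\.[^:]+))?"
    r":(?P<controller>RAID\.[^:]+)$"
)


def parse_disk_fqdd(fqdd):
    """Split a disk identifier into its parts; return None if it does not match."""
    match = FQDD_RE.match(fqdd.strip())
    if match is None:
        return None
    return {
        "kind": match.group("kind"),            # "Bay" or "Direct"
        "slot": int(match.group("slot")),       # slot number as reported
        "enclosure": match.group("enclosure"),  # None when no enclosure segment
        "controller": match.group("controller"),
    }


def disks_without_enclosure(fqdds):
    """Return the slot numbers whose identifier has no enclosure segment."""
    parsed = (parse_disk_fqdd(f) for f in fqdds)
    return [p["slot"] for p in parsed if p is not None and p["enclosure"] is None]
```

Run against the two identifiers above, `disks_without_enclosure` would flag only slot 22, since its identifier goes straight from the disk to the controller with no enclosure segment in between.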

Other drives have been inserted into slot 22, but the same thing happens. The drive blinks amber, but only after boot-up; it doesn't light up with the rest of the drives during boot-up.

Maybe a bad backplane slot? I wanted to ask here first, as swapping out a backplane would be a whole production that we are, of course, trying to avoid unless it is needed.

Thanks for any help / insight

Moderator

6.2K Posts

September 1st, 2017 13:00

Hello

I would suggest reviewing the controller/TTY log. If no drives are working in that slot, then you need to be sure you are testing with known-good drives.

Thanks

5 Posts

September 1st, 2017 14:00

Thank you for replying, Daniel.

The other drives tested in the slot are known to be good and working, as well as the drive currently in the slot. I will check controller logs and see what happens. I didn't know there were logs for the controller itself. Thank you for this suggestion.

System logs show when a drive is removed from the slot, but no errors at all for any component. The drive in that slot reads normally in iDRAC, aside from reporting "...on Integrated RAID Controller 1", and even Dell diagnostics show no failures. But that drive, and any other drive put into that slot, behaves the same way: it blinks amber after boot and shows as "Disk.Direct" when the inventory is reported.

I will update when I go through the logs.

Thank you again.

5 Posts

September 1st, 2017 17:00

Just got the log file.

I'm not quite sure what I'm looking for, but I am noticing some things that seem out of place.

The drive bay numbers are off.

The log file is too large to attach, so here is a link to view it:

https://drive.google.com/file/d/0B85dldJnb-IXdDZsQ3dhOUJhc0U/view?usp=sharing 

Any insight on what I would be looking for related to this issue would be appreciated. 

Thank you

Moderator

6.2K Posts

September 1st, 2017 18:00

It is detecting that something is inserted into the slot, but it is unable to fully communicate with it. These are the relevant log messages and the drive they refer to:


09/01/17  8:59:12: C0:SES_MarkBadElement: Undetected device on enclPd=20, StsCode=5, elmtType=17, elmtIndex=16, bayNum=16
09/01/17  8:59:12: C0:EVT#06495-09/01/17  8:59:12: 184=Enclosure PD 20(c None/p1) sensor 22 bad

09/01/17 14:45:39: C0:SES_MarkBadElement: Undetected device on enclPd=20, StsCode=5, elmtType=17, elmtIndex=16, bayNum=16
09/01/17 14:45:39: C0:EVT#06642-09/01/17 14:45:39: 184=Enclosure PD 20(c None/p1) sensor 22 bad

T15: C0:16   f1400005 00020 00   6fc81aaf      1 0 0 0 ATA      KINGSTON SEDC400 32.I 0 0 500056b31234abd6 00   16  0020   0  NA   NA


I would be hesitant to replace the backplane. You have several non-certified drives in the system, including the drive in slot 22. Very odd behavior can occur when you use unsupported hardware. I would remove all of the non-certified drives and use a certified drive to test slot functionality. If that drive functions in all slots except 22, then it is a backplane issue.
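
Since the log is large, it may help to search it for these two message types rather than read it by hand. Below is a rough Python sketch that counts them per bay and per sensor. The regular expressions are written only against the lines quoted above, so treat them as assumptions about the log format; `summarize_bad_slots` is a made-up helper name. Note that the first message reports bayNum=16 while the second reports sensor 22. If the first value happens to be printed in hexadecimal (an assumption, not something confirmed here), 0x16 equals 22, which could explain why the bay numbers look off.

```python
import re
from collections import Counter

# Assumed patterns, derived only from the log lines quoted in this thread.
BAD_ELEMENT_RE = re.compile(
    r"SES_MarkBadElement: Undetected device on enclPd=(?P<encl>\w+), "
    r"StsCode=(?P<status>\w+), elmtType=(?P<etype>\w+), "
    r"elmtIndex=(?P<index>\w+), bayNum=(?P<bay>\w+)"
)
BAD_SENSOR_RE = re.compile(
    r"Enclosure PD (?P<encl>\w+)\(.*?\) sensor (?P<sensor>\d+) bad"
)


def summarize_bad_slots(lines):
    """Count 'undetected device' and 'sensor bad' messages.

    Values are kept as the raw strings from the log, because the number
    base used by the firmware is not confirmed.
    """
    bad_bays = Counter()
    bad_sensors = Counter()
    for line in lines:
        match = BAD_ELEMENT_RE.search(line)
        if match:
            bad_bays[match.group("bay")] += 1
            continue
        match = BAD_SENSOR_RE.search(line)
        if match:
            bad_sensors[match.group("sensor")] += 1
    return bad_bays, bad_sensors


def bay_as_hex(bay):
    """Interpret a bayNum string as hexadecimal (assumption, see above)."""
    return int(bay, 16)
```

Passing the lines of the log file to `summarize_bad_slots` (for example, `summarize_bad_slots(open(path, errors="replace"))`, where `path` is wherever the log was saved) gives a quick picture of whether any slot other than this one is being reported as bad.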

Thanks
