November 23rd, 2014 12:00

R730XD H730P - F/W is in Fault State

When booting up, I get the following message from the PERC Controller:

F/W is in Fault State - MFI Register State 0xF0010000

Adapter at Baseport is not responding
No Adapter


We use a bunch of SSDs.
Any ideas?

Thanks
Michel

Moderator

6.2K Posts

November 23rd, 2014 16:00

Hello Michel

It looks like the system BIOS is able to detect that an adapter is present, but it is not able to fully communicate with it. I suspect the firmware is locked up or receiving a lot of I/O that is causing delayed responses. The error indicates partial communication is occurring.
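To illustrate why the message points at a firmware fault rather than a missing card, the state word from the error can be decoded roughly as follows. This is only a sketch: the state codes are the ones defined in the open-source Linux megaraid_sas driver header, it is an assumption that the PERC boot ROM reports the same encoding, and `decode_mfi_state` is a made-up helper name.

```python
# State codes as defined in the Linux megaraid_sas driver header.
# Assumption: the PERC boot ROM uses the same encoding for its status word.
MFI_STATE_MASK = 0xF0000000
MFI_STATES = {
    0x00000000: "UNDEFINED",
    0x10000000: "BB_INIT",
    0x40000000: "FW_INIT",
    0x60000000: "WAIT_HANDSHAKE",
    0x70000000: "FW_INIT_2",
    0x80000000: "DEVICE_SCAN",
    0x90000000: "BOOT_MESSAGE_PENDING",
    0xA0000000: "FLUSH_CACHE",
    0xB0000000: "READY",
    0xC0000000: "OPERATIONAL",
    0xF0000000: "FAULT",
}


def decode_mfi_state(register):
    """Split a 32-bit MFI status word into (state name, remaining low bits)."""
    state = register & MFI_STATE_MASK
    detail = register & ~MFI_STATE_MASK & 0xFFFFFFFF
    return MFI_STATES.get(state, "UNKNOWN"), detail


name, detail = decode_mfi_state(0xF0010000)
print(name, hex(detail))  # prints: FAULT 0x10000
```

The top four bits carry the state, so 0xF0010000 decodes to FAULT. What the remaining bits mean is not covered here.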

Do you have errors on any of the drives? There could be a failed drive that is flooding the PERC with requests. I would suggest removing any failed drives. If there are no failed drives, then I would shut the server down, disconnect all drives, and power the system on with all drives disconnected. If the issue goes away, then the problem is with a drive, slot, or cable. If the issue persists, then the problem is with the PERC, slot (system board), cable, or backplane. It could also be almost any hardware attached to the system.

Reseating the PERC would be a good place to start. If that does not resolve the issue then you will need to start disconnecting components to rule out points of failure. If you have known good parts to swap then it will be a lot easier to narrow down the problem.
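The no-drive boot test described above splits the suspects into two groups. A minimal sketch of that decision, where `suspects` is a hypothetical helper name used only for illustration:

```python
def suspects(fault_with_no_drives):
    """Return the components still under suspicion after the no-drive boot test."""
    if fault_with_no_drives:
        # Fault persists with every drive disconnected: the drives are ruled out.
        return ["PERC", "slot (system board)", "cable", "backplane"]
    # Fault clears once the drives are disconnected: look at the drive path.
    return ["drive", "slot", "cable"]


print(suspects(fault_with_no_drives=False))
```

Swapping in known good parts then narrows down whichever list is returned.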

Thanks

November 23rd, 2014 21:00

Michel,

Could you please provide the following information for all the SSDs, which are part of the failing system?

  1. Model number
  2. Serial number
  3. Firmware revision

Thanks,

Deepu.

42 Posts

November 23rd, 2014 21:00

Hello Daniel

Thanks for the reply.
We have tried several of these options. We actually have two identical servers with identical problems.

We have tried:
- Running the Controller without drives -> works OK
- Running the Controller with some drives -> works OK
- Popping in too many drives -> does not work
- Removing other drives previously working (leaving the added "non-working" in) -> works OK
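Since the failures seem to follow the number of drives rather than any one drive, one way to pin down the threshold would be to add drives one at a time and note the first count at which the controller faults. A minimal sketch, assuming a hypothetical `boots_ok(n)` callback that reports whether the server came up cleanly with `n` drives installed (in practice that is a manual reboot and a look at the screen):

```python
from typing import Callable, Optional


def first_failing_count(boots_ok: Callable[[int], bool], max_drives: int) -> Optional[int]:
    """Return the smallest drive count that faults, or None if every count boots."""
    for n in range(1, max_drives + 1):
        if not boots_ok(n):
            return n
    return None


# Stand-in for real reboots: pretend the fault appears above 16 drives.
# The 16 is a placeholder for illustration, not a measured limit.
def simulated(n):
    return n <= 16


print(first_failing_count(simulated, 24))  # prints: 17
```

A linear walk is used instead of a binary search because it is not certain that every count above the first failure also fails.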

Also, everything is brand new.
When installing the drives, we had the server running. We popped in the drives one by one, and all got detected. We even created a RAID from them. It only started to behave "weird" after a reboot.

Thanks
Michel

42 Posts

November 23rd, 2014 22:00

Deepu

The SSDs are Samsung 850 Pro, all with the same firmware. They are not Dell-branded drives.
If the exact serial numbers and firmware still matter, I can get them for you (I would need to drive to the datacenter, as they cannot be read out currently due to the controller issue).

Thanks
Michel

November 24th, 2014 11:00

Hi,

The exact serial numbers might not be required, but the firmware version does matter. Since the drives are not Dell certified, let us not worry about that at this point. Could you please provide the information requested below?

  1. Is the issue seen with Dell certified SSDs?
  2. Complete system details (backplane connected), and which backplane the drives are added to if the system is dual-backplane capable
  3. Drive form factor (2.5” or 1.8”)
  4. Boot mode (BIOS or UEFI)
  5. Is the issue seen consistently, irrespective of reboot mechanism?
  6. Controller boot mode (HBA or RAID mode)

Thanks,

Deepu

42 Posts

November 24th, 2014 12:00

  1. Is the issue seen with Dell certified SSDs?
    1. We do not have Dell certified SSDs to test.
  2. Complete system details (backplane connected), and which backplane the drives are added to if the system is dual-backplane capable
    1. R730XD, 24x 2.5" backplane, with 2x read-disks, 2 CPUs, 384 GB RAM. All drives are connected to the backplane (I don't know if there are 2 backplanes). Using 12 drives seems to work; using 16 drives (slots 0-11 plus the last four, 20-23) works as well. Using more drives does not work. Different combinations yield different results, so it does not seem to be a single slot or disk. Also, we can reproduce this on 2 different R730XD systems (not using the same disks).
  3. Drive form factor (2.5” or 1.8”)
    1. 2.5"
  4. Boot mode (BIOS or UEFI)
    1. BIOS
  5. Is the issue seen consistently, irrespective of reboot mechanism?
    1. Yes
  6. Controller boot mode (HBA or RAID mode)
    1. RAID

42 Posts

November 25th, 2014 23:00

Is there anything we can do about this?
Should I try a support case with DELL?

I know that the drives are not certified, but I did not expect that the server would not even boot with them...

Moderator

6.2K Posts

November 26th, 2014 09:00

"Is there anything we can do about this?
Should I try a support case with DELL?"

I would not recommend calling support for issues with 3rd party equipment.

The last thing I would suggest is to take one drive and test it in each slot to make sure there is not a bad slot.

"I know that the drives are not certified, but I did not expect that the server would not even boot with them..."

When you install equipment that has not been validated to function with the system you should expect this type of behavior. We validate equipment so that you don't have to do it yourself. If you decide to do it yourself then functionality will be trial and error.

Thanks

42 Posts

November 26th, 2014 11:00

Thanks Daniel

This is disappointing :(
I think I'm going to try to plow in stock LSI controllers and see if they work. This way, I can also make sure there are no bad slots.

42 Posts

April 16th, 2015 07:00

Yes and No.

We had the same problem when popping in a 9361-8i. We were fighting it a little bit and found that a firmware from December 2014 works on them.

Unfortunately, we haven't got it to work with the H730P yet :(

4 Posts

April 16th, 2015 07:00

Hi Michel,


Did you manage to solve this issue?
I've got the same problem here :)

4 Posts

April 16th, 2015 08:00

Which exact firmware are you using? I'm gonna try to do the same...

42 Posts

April 20th, 2015 03:00

On the H730P we have Firmware Package Version: 25.2.2-0004

On the 9361-8i we have 24.7.0-0026


Be careful when updating the H730P when the 9361-8i is installed as well. The H730P update basically bricked the 9361-8i. So you might want to just remove the H730P from the system.

6 Posts

April 24th, 2015 14:00

Same problem here, with the same troubleshooting steps that you have performed. The H730P just does not work reliably when Samsung 850 PRO drives are used in large quantity. It makes me worried about the other 730XD servers we have that only have a few 850s installed (and are running production).

Also disheartening is the recommendation from Dell support that you install single drives to "find a bad slot" (sorry Daniel...). This is happening on multiple servers, and different slot combinations do not matter. It is almost like we are exceeding some maximum SSD threshold of the 730's BIOS.

Michelz2: what specific 850 do you have? We are trying the 1 TB model. I suspect not many people have done this due to cost, but as SSDs continue to drop in price, Dell will eventually have to address this issue.

42 Posts

April 24th, 2015 23:00

We are using the 1 TB model as well.
