Moderator

4.7K Posts

October 4th, 2022 13:00

Hello Dawnfang,

 

Is this an error that just started happening?

Were any changes done on the server when this started happening?

 

Check that system firmware is up to date: iDRAC, BIOS, PERC, Backplane

 

Try these steps and let me know the results, along with answers to the questions above:

 

Drain flea power: shut down, disconnect the power and network cables, then hold the power button in for 20 seconds with the cords removed. After the flea power drain, the system has to sit for 3 minutes with no power plugged in so the DRAC can reset (do the reseating steps below during this time).

 

Reseat all backplane cables

Reseat H330 mini controller and cable

 

 

Then plug in the NIC and power cables, but wait 2 minutes before powering on to give the DRAC time to initialize.

Check results

 

October 4th, 2022 14:00

The problem is probably with your riser config. Look over Dell's documentation for the R740xd and determine which riser config is required for the chassis config you are attempting to build (Dell EMC PowerEdge R740xd Installation and Service Manual | Dell US). I had the same error after changing my chassis config by removing the rear 4x2.5" module. Once I purchased and installed the correct risers for the config I was building, the error went away.

1 Rookie

3 Posts

October 6th, 2022 11:00

I believe you're right about the riser. The idea that the RAID controller somehow contributed to the issue seems to have been a red herring. We have a GPU plugged into one of our PCIe risers. I just checked the riser and it has x8 PCIe connectors, even though our GPU has an x16 connector on it. I'll try a riser with an x16 connector and let you know if that fixes it.

1 Rookie

10 Posts

October 7th, 2022 06:00

I am also having issues with the same thing. Could the problem be that the riser part numbers are incorrect for the configuration? I am running dual GPUs in slots 1 and 8 and dual NVMe bridge cards in slots 3 and 4. The riser part numbers, in order from 1 to 3, are GHGTP, J7W3K, and DTTHJ. (I have also tried MDDTD for riser 1.)

October 7th, 2022 08:00

@usedto your riser 2 part number is incorrect if you want your bridge cards in slots 3 and 4. J7W3K has an incorrect physical label of "Riser 1A/1D". It's only 1A. 1D is RN1V2. However, RN1V2 is designed to work only with the 24x2.5" NVMe backplane, and to my knowledge Dell sales won't sell you the other correct parts to get that config functioning. So until someone figures it out in the aftermarket, you're stuck with running Riser Config 1D, 2A, 3A, which means if you want 2 GPUs, you can only have 1 bridge card. Or if you want 2 bridge cards, you can only have 1 GPU. You could try Riser 1A, which would give you an extra x16 slot, and see if you can get all 4 cards in there at once. Probably not, but you could try all the different combinations of cards in slots and see if your errors go away.

 

@Dawnfang what Riser config are you running, and with what backplane?

1 Rookie

3 Posts

October 7th, 2022 09:00

We currently have a 1B+2C configuration with rear storage. We have an NVIDIA Tesla M10 plugged into slot 1. This is an x8 slot, but we want to plug it into an x16 slot. The only x16 slot on our 2C riser is low profile and half height, so the GPU won't fit there. Would we need to change our riser configuration to 1A+2A, and is there any chance that having the GPU in the x8 slot is causing this error?

October 8th, 2022 04:00

As long as you have rear storage you won't be able to run your M10. It's not possible to install any of the #1 risers that have x16 slots. The #1 risers with x16 slots (1A and 1D) are longer than 1B and plug into the slot that your mini PERC is currently using. You'll have to pull the rear storage, pull the mini PERC card along with its mounting bracket and cabling, and install either risers 1A+2A+3A or 1D+2A+3A. However, since the M10 is double width, go with 1A+2A+3A because that's what Dell 'supports' for double-width GPUs. You'll then also have to purchase, install, and cable a PERC H740P in slot 6 of riser 2A. Get a low-profile H740P because that's a low-profile slot. Your errors will hopefully go away. I haven't built what you have, so no promises, but at least you'll have your card in an x16 slot and a properly supported riser setup.
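If it helps to see the slot reasoning from this thread laid out, here is a minimal Python sketch of a physical-fit check. It is only an illustration, not a Dell tool: the `Slot`, `Card`, and `fit_problems` names are made up for this example, the slot widths come from the posts above, and the assumption that the riser 1A x16 slot is full height is mine, so verify it against the Installation and Service Manual.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Slot:
    name: str
    lanes: int         # physical connector width (8 means x8)
    low_profile: bool  # True if only low-profile cards fit


@dataclass(frozen=True)
class Card:
    name: str
    lanes: int         # connector width on the card
    low_profile: bool  # True if the card has a low-profile form factor


def fit_problems(card: Card, slot: Slot) -> List[str]:
    """Return the reasons a card does not physically fit a slot (empty if it fits)."""
    problems = []
    if card.lanes > slot.lanes:
        problems.append(
            f"{card.name} is x{card.lanes}, {slot.name} is only x{slot.lanes}"
        )
    if slot.low_profile and not card.low_profile:
        problems.append(f"{card.name} is full height, {slot.name} is low profile")
    return problems


# Values taken from the thread; the 1A entry is an assumption (see above).
m10 = Card("Tesla M10", lanes=16, low_profile=False)
slots = [
    Slot("riser 1B slot 1", lanes=8, low_profile=False),
    Slot("riser 2C x16 slot", lanes=16, low_profile=True),
    Slot("riser 1A x16 slot", lanes=16, low_profile=False),  # assumed full height
]

for slot in slots:
    issues = fit_problems(m10, slot)
    print(slot.name, "->", "fits" if not issues else "; ".join(issues))
```

This only compares connector width and card height. It does not model double-width clearance or the conflict between the longer #1 risers and rear storage described above, so treat a "fits" result as necessary but not sufficient.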
