Unsolved
8 Posts
0
7559
March 10th, 2019 23:00
NVIDIA T4 GPUs with R740 - System BIOS has halted
Hi,
I'm trying to add 4 x NVIDIA T4 GPUs to a brand new R740 with 2 x CPUs. The system accepts three cards, but as soon as I try to add the 4th card, the BIOS keeps crashing ("System BIOS has halted" message in iDRAC).
Existing card locations: PERC: #6; GPUs: #1, #4, #8 (working).
I tried adding the fourth card to slot #7 or #2, with no success.
The BIOS is the latest version, 1.6.13.
2 x 1600W PSUs are used, while the usage reported in iDRAC is below 400W with all three cards installed.
Appreciate any ideas.
Thanks,


Daniel My
12 Elder
•
6.2K Posts
0
March 11th, 2019 14:00
Hello
There are specific riser configurations for GPU support. You can find information on supported riser configurations and slot priority in the system manual.
http://www.dell.com/support/
You can also check the hardware to see if more information is listed. I would also check thermals in the system to see if anything is overheating. You should be using performance fans and have the GPU shroud installed.
Thanks
Beef23
8 Posts
0
March 12th, 2019 02:00
Hi,
I'm using riser config #6, which is the right one according to the manual.
No alerts are visible in iDRAC (even with all 4 cards) and everything is green; it's just stuck in a loop and cannot proceed.
savvy2
4 Apprentice
•
2.5K Posts
0
March 12th, 2019 07:00
What are you doing, a Bitcoin mining job?
With 4 GPU cards, why would any old R7xx server do that?
If mining, they sell motherboards just for that.
4x GPU was my only hint at the job or goal here. Not a file server at all, right?
Each card uses 70 watts.
This new GPU, which Nvidia designed for inference workloads in hyperscale data centers, leverages the same Turing microarchitecture as Nvidia's forthcoming GeForce RTX 20-series gaming graphics cards.
No inference load stated, nor the OS you run.
The card is not a video card; it is a compute engine alone, and very powerful, but Intel can beat it.
I don't see how any riser card can do 140 watts on one card alone.
Oops, I just read my manual (RTM) on the R7xx, and it cannot do that power level, not even close. Sorry.
IDK how to make the BIOS happy, ever.
Each card is like a 75 watt incandescent lamp; hold a hand near one, safely back and not touching, to feel that heat.
Now x4 that. Yes, lots and lots of heat, and huge currents flowing in the riser and to the motherboard.
It was never designed for that job.
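For readers who want the arithmetic behind the wattage figures in this thread, here is a minimal sketch. The 70 W per card, the 1600 W PSU rating, and the "below 400 W" iDRAC reading all come from the posts above; treating only one PSU as usable (1+1 redundancy) and adding every card at full load on top of the reported draw are assumptions made here to get a deliberately pessimistic number, not measured values.

```python
# Figures quoted in the thread; the redundancy and worst-case assumptions are ours.
T4_WATTS = 70          # per-card draw quoted in the thread
PSU_WATTS = 1600       # rating of one PSU (assumes 1+1 redundancy, so only one counts)
REPORTED_WATTS = 400   # upper bound iDRAC reported with three cards installed


def gpu_power(cards: int, watts_per_card: int = T4_WATTS) -> int:
    """Combined full-load draw of `cards` GPUs, in watts."""
    return cards * watts_per_card


def pessimistic_total(cards: int) -> int:
    """Reported system draw plus every card at full load.

    This double counts the idle draw of the cards already installed,
    so it overestimates on purpose.
    """
    return REPORTED_WATTS + gpu_power(cards)


if __name__ == "__main__":
    total = pessimistic_total(4)
    print(f"4 x T4 at full load: {gpu_power(4)} W")
    print(f"Pessimistic system total: {total} W")
    print(f"Headroom on one PSU: {PSU_WATTS - total} W")
```

This only covers the PSU budget. It says nothing about per-slot or per-riser limits, which are a separate question.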
Beef23
8 Posts
0
March 12th, 2019 08:00
Daniel My
12 Elder
•
6.2K Posts
0
March 12th, 2019 09:00
How did you come to that conclusion? Did you test the card in another slot, and did you get a card to function in the two slots where you are trying to add the 4th card?
Beef23
8 Posts
0
March 13th, 2019 00:00
Yes,
I cycled the cards, cycled the slots, and tried every possible combination of locations and cards.
The end result is the same: 3 cards work, 4 do not.
I believe it has something to do with the BIOS.
The server should support up to 6 x P4, which has similar characteristics to the T4, so 4 should not be a technical issue.
savvy2
4 Apprentice
•
2.5K Posts
0
March 13th, 2019 07:00
I have few facts on whether the T4 card has a ROM (firmware) extension in it (the data sheet is very weak),
but it shows 75 watts per card.
If it does, then the BIOS in this server has limits on that, allowing option ROM (OpROM) extensions 1 to 3? Or fewer?
The DRAC and RAID all have extensions and are part of the total count, I bet (max).
The risers are not rated for that current flow (watts), P = V x A.
BIOS limits and power limits. Nor is the card Dell certified, I bet.
Page 39 in my R7xx manual states, and I quote and highlight in red:
"
System support for 25 W maximum power for the first two cards and 15 W for the third and fourth cards (lower power support on third and fourth cards due to system thermal limitations)
Optional x16 riser to accommodate interface cards for external GPU boxes that supports a maximum power of 25W (use of riser reduces the number of PCI Express slots from four to three)"
This says even the most important thing fails: power. No power, no joy, in the world of electronics.
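The P = V x A relation mentioned above can be turned around to estimate the current a card pulls through its slot. This is a rough sketch only: it assumes the whole 75 W arrives on a single 12 V rail, which is a simplification (a real slot splits power across more than one rail), and the function name is made up for illustration.

```python
SLOT_VOLTS = 12.0  # assumption: all slot power is delivered at 12 V


def slot_current(watts: float, volts: float = SLOT_VOLTS) -> float:
    """Current in amps for a given power draw, from P = V x A."""
    if volts <= 0:
        raise ValueError("volts must be positive")
    return watts / volts


if __name__ == "__main__":
    per_card = slot_current(75)   # 75 W per card, as quoted from the data sheet
    print(f"One card: {per_card:.2f} A")
    print(f"Four cards combined: {4 * per_card:.2f} A")
```

Whether a given riser is rated for that current is something only the system manual or the vendor can confirm.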
Daniel My
12 Elder
•
6.2K Posts
0
March 13th, 2019 08:00
Yes, I show we have validated the system for P4, but I don't show we have validated it for T4. It is a validated configuration according to Nvidia.
It could be a compatibility issue or a configuration issue. Since the system was not ordered in this configuration the issue may be that a piece of hardware is missing or there is hardware in the system that is not supported with that many GPGPUs. I suggest reviewing all of the documentation to make sure the hardware requirements are being met.
Vallihar
2 Posts
0
June 21st, 2019 04:00
jmcculloch4
5 Posts
0
September 6th, 2019 09:00
Can you recommend a BIOS settings configuration for an R740xd with four or six T4 GPUs?