PowerEdge Hardware General

Last reply by 09-28-2022 Solved

R720 GPU installed, PSU blinking amber

As far as I could find, installing GPUs requires the riser, the GPU kit, and the whole nine yards.

I got the 1100W PSUs and updated the BIOS and iDRAC to the latest versions.

I installed the GPU and connected it with the riser power cable. When I apply power to the machine, the PSU blinks amber. If I disconnect the power cable from the GPU, even without removing the card, the system works just fine. With power connected to the GPU again, the machine won't even start. iDRAC is still accessible over the network, but that's about it: the server refuses to boot, and nothing about the issue shows up in any log.

When I go to the power configuration, the system is capped at 461 W for some reason. The PSU input wattage is over 1200 W and the PSU output is 1100 W. That should be more than enough for the GPUs to work properly.

Why is power capped at 461 when I have two 1100W PSUs? 

I reset iDRAC to default twice and it is still capped at 461 and the machine refuses to boot only when GPU power is connected.

Accepted Solutions

These are the actual cables out of my server. The long cable is the one that you probably have. The short Y-cable is the one that you need. The white plug on the long cable goes into the riser card. The Y-cable plugs into the Tesla K80. The R730 GPU cable meant for the K80 is essentially this configuration, without the mess of adapters or extra plugs.

[Images: Cable 1.jpg, Cable 2.jpg, Cable 3.jpg]

Replies (20)

Can you share a picture of the power cap you are referring to? You can also check the links below for details on GPU card installation on the R720 and make sure all prerequisites and connections are in order. Can you also let me know the model and part number of the GPU card you are installing in the server?

https://www.dell.com/support/manuals/en-us/poweredge-r720/720720xdom/gpu-card-installation-guideline...

https://www.dell.com/support/manuals/en-us/poweredge-r720/720720xdom/installing-a-gpu-card?guid=guid... 


Thanks,
DELL-Shine K
#IWork4Dell

1715

Hello Shine, 

Here is the power cap thing from iDRAC menu: 

[Image: ammarsalman94_0-1629250089050.png]

 

I have two 1100W PSUs and tried them both with and without redundant mode. The power cap policy is disabled, but if I do enable it, I cannot raise the cap above 461 W.

Regarding the set up: 

I have the riser expansion and shields all installed according to the guidelines. I have 2x E5-2640 0 processors with a TDP of 95 W (well below the 115 W in the guideline). The system has 64GB of RAM.

The power cables I used are 09H6FV, which are designed for this particular purpose. Both GPUs are Tesla K80s. I tried installing only one at a time and got the exact same issue.

I upgraded the BIOS, iDRAC and Lifecycle Controller. I installed the latest versions directly from the Dell driver support website (I did not step through each intermediate version on the way to the latest).

4GB IOMMU is enabled in BIOS. But I doubt this has any effect given I can't even boot the system up.

If I remove GPUs, I can boot and use everything fine. 
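
For a rough sanity check of the numbers above, here is a minimal Python sketch of the power budget. The 95 W CPU TDP, the 1100 W PSU output and the 461 W cap come from this thread; the 300 W per-card figure is an assumption based on NVIDIA's rated board power for the Tesla K80. RAM, drives, fans and the riser are left out, so real draw would be higher than this estimate.

```python
# Rough power-budget estimate using the figures quoted in this thread.
CPU_TDP_W = 95            # E5-2640 TDP, from the post above
CPU_COUNT = 2
GPU_BOARD_POWER_W = 300   # assumption: rated board power of one Tesla K80
GPU_COUNT = 2
PSU_OUTPUT_W = 1100       # output of a single PSU
IDRAC_CAP_W = 461         # maximum cap value iDRAC allows


def component_load(cpu_tdp, cpus, gpu_power, gpus):
    """Sum CPU and GPU rated power only (no RAM, drives or fans)."""
    return cpu_tdp * cpus + gpu_power * gpus


load = component_load(CPU_TDP_W, CPU_COUNT, GPU_BOARD_POWER_W, GPU_COUNT)
psu_headroom = PSU_OUTPUT_W - load
over_cap = load - IDRAC_CAP_W

print(f"Estimated CPU+GPU load: {load} W")
print(f"Headroom on one PSU:    {psu_headroom} W")
print(f"Amount above the cap:   {over_cap} W")
```

On these assumptions a single 1100 W PSU still has headroom for the CPUs and both cards, while a 461 W cap would not cover them if it were actually enforced.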

1690

The minimum and maximum power cap only come into play when you enable the power cap. The system can use full power as long as the power cap is disabled.

The Tesla K80 is not a supported graphics card on the R720. That does not mean it won't work. We have seen users run unsupported cards in this server without any issues, but we cannot guarantee that it will work, as it is not a validated card.

Can you check the POST code and the Lifecycle Log in iDRAC when you try to power on the server with the graphics card installed?


Thanks,
DELL-Shine K
#IWork4Dell

1688

Yes, I know the K80 is not on the official list, but as you mentioned, I have seen many examples where two were installed and worked.

There is no POST code (0x0) and the log on iDRAC does not have anything. The system does not power on at all. iDRAC is still accessible somehow, but it shows the power status as OFF and when I send the command to turn it on it's just ignored (same with pressing the power button). 

1684

Which slot is the GPU installed in? The link below has details on the slots where GPU cards are supported.

https://www.dell.com/support/manuals/en-us/poweredge-r720/720720xdom/expansion-card-installation-gui... 


Thanks,
DELL-Shine K
#IWork4Dell

1681

They're double width, so they can only fit in slots 4 and 6, which is where I installed them.

1666

I tried once more, and this time there was an entry in the Lifecycle Controller log. It said:

PSU0036

The error code was for both PSUs. 

I don't understand what is causing this. There is more than enough power, and the cables I got were brand new. This only happens when the GPUs are connected.

1667

Hi,

 

I'm Joey from enterprise social support, and I've been keeping an eye on your post about the GPU installation. It's good that we're on the same page that the K80 is not on the official supported list for the R720.

 

From the error codes you have provided, the problem seems to point to the GPU cable you are using, 9H6FV. I would suggest trying to obtain another one. I know it's new, but the cable might be faulty for some unknown reason.

 

Last year, one user had the same question and managed to get the card working in an R720. Let me tag them here to see if we can get an opinion on your issue.

 

Hi @OldSchoolAdm1n, would you be able to share your opinion on the K80 installation in the R720 from your post last year?


DELL-Joey C
Social Media and Communities Professional
Dell Technologies | Enterprise Support Services
#IWork4Dell

1655

I was wondering how two new cables could both be faulty at once. I will try another cable nonetheless.
