1 Rookie

1 Message

March 10th, 2024 03:03

R730 Graphics card?

"I own a Dell PowerEdge R730 2U device. Which graphics cards can I install in it?"

Moderator

3.5K Posts

January 22nd, 2025 14:07

Hi,

The "ctf error" you're experiencing, followed by a reboot or shutdown, strongly suggests a problem with the stability of the GPU or its power delivery under load.

Here are a few things you should check:

1. Power Supply:

  • Adequacy: The AMD FirePro S7150x2 can draw a significant amount of power. Ensure your PowerEdge R730xd's power supplies can handle the additional load. Check your server's power calculator and the card's specification to determine the total wattage needed. If you have redundant power supplies, confirm that both are working correctly.
  • Connections: Double-check that the dual power cable is securely connected to the GPU and the server's power distribution board. Also, make sure that the power cables are fully inserted into the GPU's power connector.
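
For the adequacy check, a rough budget calculation can help. The sketch below is only an illustration: the function name and the 400 W base load are hypothetical, and the 300 W GPU figure is simply the per-card ceiling quoted elsewhere in this thread, not a measured value for the S7150x2. It assumes that in a redundant setup one PSU's capacity is held in reserve.

```python
def psu_headroom_watts(psu_watts, psu_count, redundant, base_load_watts, gpu_watts):
    """Return spare wattage; a negative result means the budget is exceeded."""
    if psu_count < 1:
        raise ValueError("need at least one PSU")
    if redundant and psu_count < 2:
        raise ValueError("redundancy needs at least two PSUs")
    # With redundancy, one PSU is kept in reserve and cannot be counted.
    usable_psus = psu_count - 1 if redundant else psu_count
    usable_watts = psu_watts * usable_psus
    return usable_watts - (base_load_watts + sum(gpu_watts))


# Hypothetical figures for illustration; substitute values from the
# server's power calculator and the card's specification sheet.
headroom = psu_headroom_watts(
    psu_watts=750, psu_count=2, redundant=True,
    base_load_watts=400, gpu_watts=[300],
)
print(headroom)  # 750 - (400 + 300) = 50
```

A small or negative result suggests the configuration is worth checking against the official power calculator before looking elsewhere.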

2. Cooling:

  • Airflow: Make sure the server has adequate airflow to cool the GPUs. The R730xd is designed with specific airflow pathways; ensure there are no obstructions.
  • Ambient Temp: Verify that your server environment is not too hot. High ambient temperatures can exacerbate heat issues and cause components to fail.

3. BIOS/Firmware:

  • Latest Updates: Ensure your server's BIOS and iDRAC firmware are up to date. Sometimes, older firmware versions have bugs that can cause unexpected issues with PCIe cards.

4. PCIe Slot:

  • Slot Configuration: Verify that the PCIe slot configuration in the BIOS is set to the correct speed and lane width. Check the server's documentation to confirm that the slot is compatible with the GPU's PCIe version.
  • Slot Functionality: Although the slot appears to be working initially, there may be a hardware issue. Try testing the GPU in another slot if possible.
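
On a Linux host, one way to check the negotiated speed and lane width is to compare the `LnkCap` and `LnkSta` lines that `lspci -vv` prints for the card. The sketch below parses text in that shape; the sample string is illustrative rather than output captured from an R730xd, and the function names are hypothetical. Some GPUs lower the link speed at idle to save power, so a lower speed alone is not necessarily a fault, whereas a reduced width is more suspicious.

```python
import re

# Matches "LnkCap:" or "LnkSta:" lines only (not LnkCap2/LnkSta2/LnkCtl).
LINK_RE = re.compile(r"Lnk(Cap|Sta):.*?Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def parse_link(lspci_text):
    """Return {'Cap': (speed_gt_s, width), 'Sta': (speed_gt_s, width)}."""
    found = {}
    for line in lspci_text.splitlines():
        match = LINK_RE.search(line)
        if match:
            found[match.group(1)] = (float(match.group(2)), int(match.group(3)))
    return found


def link_is_degraded(lspci_text):
    """True if the negotiated speed or width is below the card's capability."""
    link = parse_link(lspci_text)
    if "Cap" not in link or "Sta" not in link:
        raise ValueError("LnkCap/LnkSta lines not found")
    cap, sta = link["Cap"], link["Sta"]
    return sta[0] < cap[0] or sta[1] < cap[1]


# Illustrative sample in the shape `lspci -vv` uses; not real captured output.
SAMPLE = """
    LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1
    LnkSta: Speed 2.5GT/s, Width x16
"""
print(parse_link(SAMPLE))
print(link_is_degraded(SAMPLE))
```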

5. Driver/Software:

  • GPU Drivers: Although the issue is crashing the whole system, verify that the proper drivers are installed in the VM.
  • Pass-Through Configuration: Recheck the KVM pass-through settings in the hypervisor. A specific setting may need to be enabled or disabled, and the VM should have adequate resources configured for the allocated GPU.
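
When rechecking pass-through on a Linux KVM host, it can be useful to see which IOMMU group each GPU function landed in, since devices that share a group generally have to be passed through together. This is a minimal sketch that assumes the standard sysfs layout under `/sys/kernel/iommu_groups`; the helper names are hypothetical, and the function returns an empty result if the IOMMU is disabled or the path does not exist.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map IOMMU group number -> sorted list of PCI addresses in that group."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU disabled, or not a Linux host
    for group_dir in base.iterdir():
        devices = group_dir / "devices"
        if group_dir.name.isdigit() and devices.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices.iterdir())
    return groups


def shared_groups(groups, addresses):
    """Return groups holding a wanted address alongside other devices."""
    wanted = set(addresses)
    return {
        number: devs
        for number, devs in groups.items()
        if wanted & set(devs) and set(devs) - wanted
    }


if __name__ == "__main__":
    for number, devs in sorted(iommu_groups().items()):
        print(number, devs)
```

Passing the PCI addresses of both GPUs to `shared_groups` shows whether either one shares a group with a device that is not being handed to the VM.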

6. Hardware Issues:

  • GPU Health: It's possible that one or both GPUs in the S7150x2 have a hardware problem. If you have another system, try testing the GPUs there if possible.
  • Power Cable/Connectors: The dual power cable could be faulty. Try another cable, if possible.

7. "ctf error":

  • Error Context: Usually, a "ctf error" is related to the Component Transfer Framework. It can be triggered by device driver issues, power, or hardware issues.
  • Error Logs: Review the system's event logs, iDRAC logs, and hypervisor logs for more detailed error messages.

I would suggest starting with the power and cooling checks, then moving on to firmware and driver updates. If those don't resolve it, you may have a hardware issue.

Moderator

5.3K Posts

March 11th, 2024 01:32

Hello, thanks for choosing Dell, and welcome to our community!

I’m afraid we do not support graphics cards in server models, only GPUs.

Respectfully,

3 Apprentice

482 Posts

March 12th, 2024 04:13

R730 GPUs supported at RTS

Nvidia      AMD        Intel Phi
K80         S9150      7120P
M60         S7150      3120P
M40         S7150x2
K40M        S/W9100
GRID K1
GRID K2

Note that the R730 can support either two 300 W, full-length, double-wide GPUs or four 150 W, single-width GPUs. The GPUs are installed in the PCIe x16 Gen3 slots available on Riser2 and the GPU-Optional Riser3. To install internal GPUs in the system, the GPU-Optional Riser3 must be present.
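
The two limits above can be written as a quick sanity check. This is only a sketch of the rule as stated in this post: the `Gpu` class and `fits_r730` function are hypothetical names, the card entries are placeholders, and treating any mix that includes a double-wide card as a double-wide configuration is a simplifying assumption. It does not check risers, power cables, or PSU capacity.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    name: str
    watts: int
    double_wide: bool


def fits_r730(gpus):
    """Check a GPU list against the count and wattage limits quoted above."""
    if not gpus:
        return True
    if any(g.double_wide for g in gpus):
        # Double-wide configuration: at most two cards, up to 300 W each.
        return len(gpus) <= 2 and all(g.watts <= 300 for g in gpus)
    # Single-width configuration: at most four cards, up to 150 W each.
    return len(gpus) <= 4 and all(g.watts <= 150 for g in gpus)


# Placeholder cards for illustration; wattages are not taken from spec sheets.
print(fits_r730([Gpu("card A", 300, True), Gpu("card B", 300, True)]))
print(fits_r730([Gpu("card C", 150, False)] * 5))
```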

1 Rookie

1 Message

October 11th, 2024 15:44

@DELL-Young E​ 

As a moderator, I expect your responses to be more in-depth. You shut down Laptec without providing any additional information.

You knew what he was asking about but chose to dismiss the request based on linguistic bias.

Because he didn’t use the proper “terminology,” he does not deserve a thoughtful answer?

Terrible moderator.

Thank you so much to @Praveen.Singh for providing excellent information, not just for Laptec but for anyone else looking for this kind of information.

If you continue this type of behavior, I foresee you having a strong and fruitful career, Praveen. Thank you.

1 Rookie

2 Posts

January 22nd, 2025 06:15

Regarding the supported GPU cards: I have a PowerEdge R730xd and have installed an AMD FirePro S7150x2 in the double-space riser, which shows in iDRAC as PCI slot 4, with a Dell dual power cable. When setting up pass-through, both GPUs show up and my KVM resource mapping confirms them, but after a short while they get a ctf error and the system reboots or shuts down. Any idea if I am missing something?

1 Rookie

2 Posts

January 22nd, 2025 19:40

@Dell-Martin S Thanks for the reply. I am leaning toward an out-of-box hardware issue, as I have validated just about everything you suggested. I have two 750 W PSUs, and I keep the internal temperature between 35 and 55 °C. There should be no load on the GPUs: I have not yet been able to complete the final setup because of the reboots, so the GPUs should be idle. KVM reports that the drivers are present, and I validated the AMD amdgpu drivers and Mesa tools myself, as well as the ROM file for the VM. I just wanted to be sure I hadn't missed anything. I hope this helps anyone else trying to troubleshoot a similar issue. Thanks again for the reply.
