I added a PCIe card to my r730xd. The fans went from 3,800 RPM to 17,000.
The Lifecycle log showed:
The suggested solution to this is:
If a lower fan speed is required, contact your service provider for the appropriate IPMI commands to reduce the default fan speed response for new PCIe cards.
I have been working with my local Dell service contact, but so far they have not been able to provide me with the IPMI sequence I need to stop the fans from screaming.
FWIW, my T410 has the twin of this card (a Mellanox 40/56Gb dual-port card) and it has not affected the fans in the least. Mellanox describes this as a low-power card. Whenever I remove it from the screamer, the heat sink is not even warm to the touch.
Anyone have any luck getting this info?
What is the card you had installed? The reason the fans max out is normally that an unsupported or unrecognized card has been installed. The card may be supported in the T410 but not in the R730XD. As for the IPMI command, it shouldn't actually be necessary: you can access the iDRAC, select Fans, and at the bottom of that page you can adjust the fans. This is not recommended, however, as adjusting the fans there takes thermal regulation out of the server's hands and places it in yours. Also, if the fans are throttled back too far, the system can overheat and possibly be damaged.
Let me know if this helps.
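As a side note, the live fan RPMs can be read over IPMI without logging into the web UI. A minimal sketch, assuming lanplus access with the default credentials; the host address is a placeholder, and the command is printed rather than executed here because it needs a live iDRAC:

```shell
IDRAC=192.168.0.120   # placeholder: substitute your iDRAC address
# "sdr type fan" lists every fan sensor with its current RPM reading.
# Printed instead of run, since it requires a reachable iDRAC:
echo "ipmitool -I lanplus -U root -P calvin -H $IDRAC sdr type fan"
```

This is a quick way to confirm whether the fans really are pinned at maximum before and after any changes.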
The card in question is a Mellanox MCX314A-BCBT.
What is meant by 'supported'? I have PCIe 3.0 slots. I plug cards in. The OS loads a driver. I have been writing PCI-based device drivers for decades, and this concept of Dell 'supporting' a card I insert into an industry-standard slot is puzzling.
As for the DRAC, that only allows me to set a minimum speed. Since the fans are running at maximum, that is not useful.
My current profile is 'minimum power'.
I have selected the default minimum fan speed: 0% PWM.
Here are the commands they reluctantly provided (exactly as supplied):
ipmitool.exe -I lanplus -U root -P calvin -H <iDRAC IP> raw 0x30 0x30 0x01 0x00
ipmitool.exe -I lanplus -U root -P calvin -H <iDRAC IP> raw 0x30 0x30 0x02 0xff 0x46
Once I determined them to be pretty much useless, they provided some other command which supposedly made it go back to the way it was.
ipmitool.exe -I lanplus -U root -P calvin -H <iDRAC IP> raw 0x30 0x30 0x01 0x01
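For what it's worth, here is how those raw bytes are commonly interpreted in community write-ups. Dell does not document NetFn 0x30, so treat the decode as an assumption rather than official behaviour:

```shell
# Community interpretation of the Dell-supplied raw commands (undocumented
# NetFn 0x30; this decode is an assumption, not official Dell documentation):
#
#   raw 0x30 0x30 0x01 0x00        -> take the fans out of automatic control
#   raw 0x30 0x30 0x02 0xff 0x46   -> set all fans (0xff) to a fixed duty cycle
#   raw 0x30 0x30 0x01 0x01        -> return the fans to automatic control
#
# The final byte of the speed command is the PWM percentage in hex:
printf '0x46 = %d%% PWM\n' 0x46   # prints "0x46 = 70% PWM"
```

If that reading is right, the supplied sequence doesn't tune the thermal response at all; it just pins every fan at 70%, which matches the complaint later in the thread that the codes simply set all the fans to a percentage.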
Thank you for the detailed answer! Not sure if my commands fix the fan RPMs - I've stuffed this 2U body with 16 × 8 TB HGST helium drives (4 are mid-body, above the RAM modules). They're going through background init of two 8-disk RAID6 volumes, and I'm a bit concerned now that drive #17 is already at 54 °C.
Drives 0 through 11 are at 33-37 °C. HGST's specs say the operational ambient temperature can be up to 60 °C, so the drive's internal temperature could be even higher.
I was also looking at SuperMicro but prefer Dell's stability. There's no 16-disk 2U chassis available from SuperMicro yet. A SuperMicro build wasn't much cheaper for a comparable config, but I didn't explore it well enough.
Thanks for the update. Dell support here (Australia) was profoundly inept. It took me several days to convince them that the error message 'ask dell for IPMI codes' meant 'ask Dell for IPMI codes' and that that was what I was doing, and that they should respond.
Then they supposedly had some IPMI codes to fix the problem, but refused to supply them unless I first allowed them to replace my motherboard. Although I knew there was nothing wrong with my motherboard, I foolishly allowed them to do so. The field tech did an otherwise super job of all the silly steps Dell wanted (boot without the cards, run diags, gather data, boot with cards, ..., then replace the motherboard and repeat the same pointless steps). However, he failed to reconnect the system fans after replacing the motherboard, so the initial boot, all the way up into system management, ran with no cooling. I am still somewhat *** off about this.
They then supplied the codes, but the codes simply set all the fans to a fixed percentage. There is no longer any 'intelligent thermal management': plug in a foreign card and the fans either scream at 110% forever or run forever at a fixed percentage. It was an utter waste of time, and it potentially damaged my system.
Dell used to be fantastic. This is most likely my last Dell server. I have just put together a SuperMicro-based one, and I can put whatever PCIe 3.0 cards into the PCIe 3.0 slots and the system works and the intelligent thermal management works. What a concept!
I would have much preferred to have a complete turn-key well designed Dell server with correct shrouding and well designed thermal management AND non-brain-dead firmware. As non-brain-dead firmware no longer seems to be an option, I will have to look elsewhere. Good bye Dell. It has been a great 15? 20? years or so. Too bad you have lost the plot.
I can relate.
When installing my new R730xd (a replacement for an R710) I was already thinking that Dell had put more work into the firmware and less into the hardware.
Things that were metal are now plastic (2.5" caddies, top lock) and it shows. Right behind the hard disks there's a 1 mm gap that just shouldn't be there. I can close it by moving the lid forward, but eventually it just slides back and starts to *** air through it. Tolerances aren't what they used to be.
As for the fans, same problem here! I added two Samsung PCIe SSDs to put in RAID0 for the software cache solution that now replaces CacheCade (they're even recommended by the software manufacturer!), and now I have the screaming fans as well.
If there's really no solution to this, I will have to sell the server and get something else. I need the fans to run as low as possible but scale up with ambient temperature; I'm not going to change the fan speed every summer and winter... The server is in my house, so the sound level matters to me.
Seriously Dell, get your act together. I've been a customer for 15+ years as well, and spent millions of dollars on your hardware, but I guess that doesn't mean as much as it used to. You're losing your edge.
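A closing note for anyone who finds this thread later: iDRAC8 firmware 2.40.40.40 and newer for the 13G servers (R630/R730/R730xd) reportedly added a setting for exactly this problem, the "third-party PCIe card default cooling response". The raw bytes below circulate widely in community posts but do not appear in official Dell documentation, so treat them as assumptions; the host and credentials are placeholders, and the commands are printed rather than executed here because they need a live iDRAC:

```shell
IDRAC=192.168.0.120   # placeholder: substitute your iDRAC address
IPMI="ipmitool -I lanplus -U root -P calvin -H $IDRAC"

# Printed instead of run, since each needs a reachable iDRAC.
# Query the current third-party PCIe fan-response setting:
echo "$IPMI raw 0x30 0xce 0x01 0x16 0x05 0x00 0x00 0x00"
# Disable the stepped-up fan response for third-party PCIe cards:
echo "$IPMI raw 0x30 0xce 0x00 0x16 0x05 0x00 0x00 0x00 0x05 0x00 0x01 0x00 0x00"
# Re-enable the default response:
echo "$IPMI raw 0x30 0xce 0x00 0x16 0x05 0x00 0x00 0x00 0x05 0x00 0x00 0x00 0x00"
```

Unlike the fixed-percentage commands quoted earlier in the thread, this reportedly leaves automatic thermal management in place and only removes the blanket speed-up applied when an unrecognized PCIe card is detected.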