I just got an R740 server for a customer and found an issue that shows up when the server loses power. The server has four Broadcom network ports: two SFP+ and two Gigabit Ethernet. I installed two Dell SFP+ 10Gb fiber transceiver modules in the SFP+ slots. Everything seemed to work normally at first. However, after I disconnected the server completely from power to reorganize the cables, the fiber modules appeared dead once power was restored. The server starts up normally, but the transceiver modules get no power and stay down even after the OS has fully booted. Simply rebooting the server doesn't fix it. The only fix is to unplug the fiber modules from their slots and reinsert them. As soon as a module is reinserted, it gets power and the interface links up.
I believe there is a firmware bug, either in the server motherboard or in the four-port network daughter card, that fails to apply power to the SFP+ slots after the server first powers up, until the slot detects that a module has been inserted.
This is going to be big trouble if the customer site loses power: the server will not be able to communicate with the network once power comes back.
Dell support just keeps steering me toward troubleshooting at the OS level, such as checking drivers and collecting a VMware support bundle. However, I know the problem is not in the OS, because the fiber transceiver has no power before the OS is loaded, and it is supposed to have power as soon as you hit the power button on the server. They even questioned the fiber cables because we didn't buy them from Dell, but hey, a fiber cable is not going to cause the fiber module to lose power.
Does anyone have the same server and see the same problem?
Having either the same problem or a close relative, new R740 with the same BCOM fiber/copper onboard NICs.
When I first boot up the fiber ports are lit up but not working: the iDRAC shows them as "up" but with the "Switch Connection ID" and "Switch Port Connection ID" as "Not Available."
If I unplug and replug the fiber into the transceiver the connection recovers and the switch information appears in the iDRAC.
While the ports are down, the switch shows no connection. The OS (Ubuntu 18 LTS, but I've also tried xcp-ng 8.0 and CentOS 7) prints errors to the console saying the interface(s) are down.
Firmware is updated to the latest (although it's a crapshoot even getting the Lifecycle Controller to connect for updates). RDMA and DCBX are disabled.
Anybody else fighting this? Anybody have any luck?
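For anyone who wants to confirm the "interface down" state from the Linux side after a cold boot, here is a minimal sketch. It assumes the standard sysfs attributes under /sys/class/net/<iface>/ (operstate, carrier, speed); the function names and the interface names you pass in are only placeholders. It reports what the OS sees and does not fix anything.

```python
from pathlib import Path


def read_attr(iface_dir, name):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return (Path(iface_dir) / name).read_text().strip()
    except OSError:
        # Some attributes (carrier, speed) cannot be read while the
        # interface is administratively down or has no link.
        return None


def link_report(iface, sysfs_root="/sys/class/net"):
    """Collect the link-related attributes the kernel exposes for one interface."""
    iface_dir = Path(sysfs_root) / iface
    return {
        "iface": iface,
        "operstate": read_attr(iface_dir, "operstate"),
        "carrier": read_attr(iface_dir, "carrier"),
        "speed_mbps": read_attr(iface_dir, "speed"),
    }


def is_link_down(report):
    """Treat anything other than carrier=1 and operstate=up as a down link."""
    return report["carrier"] != "1" or report["operstate"] != "up"


if __name__ == "__main__":
    import sys

    # Usage (hypothetical interface names): python3 linkcheck.py eno1 eno2
    for name in sys.argv[1:]:
        report = link_report(name)
        print(report, "DOWN" if is_link_down(report) else "up")
```

Running it right after a cold boot and again after reseating the fiber gives a before/after record to hand to support.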
I was having the same issue with a pair of R7525s, one with a dual-port and one with a quad-port Intel X710, connected to a pair of Juniper EX4550s. The fix for us was to boot with the DACs unplugged, go into BIOS Setup on the NICs, and disable LLDP. Once that was turned off, we plugged the DACs back in, booted, and everything worked just fine. Definitely sounds like a firmware issue to me. We'll be telling Dell about this as well.
Thanks for the feedback. I will log the issue in our system.
In one specific incident, this turned out to be related to the auto-negotiation settings and was apparently resolved by hard-setting the "Operational link speed" to 10G.
Boot to F2 > Device Settings > Broadcom NIC > Device/NIC Configuration > Operational Link Speed (or Link Speed). Change it from AutoNegotiate (the default) to 10G, then Back > Finish > Save settings > reboot.
This is just a rough walkthrough and may vary depending on the features the NIC offers.
If you try this, please quote this response and share your results along with the NIC specification/model.
Edit: The Broadcom NICs on 14th-generation PowerEdge machines usually work with "IEEE 802.3by", so the switch has to support it natively.
Check the auto-negotiation protocol settings and confirm that the switch is set to use the same protocol. Finding a common auto-negotiation protocol between the switch and the network card will ensure successful auto-negotiation. Manually setting the link speed is always an option, though, if changing the protocol would break auto-negotiation for other devices that are already connected.
In my case, the network stops working after a reboot. The OS thinks the link is established and is able to "send" packets but cannot receive them, while the switch shows the link as not connected (LED off).
It comes back when I unplug and replug the fibers (not the module).
Turning off auto-negotiation solved the problem on R740 + Broadcom 57414.
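The "sends but cannot receive" state described above can be spotted on Linux by sampling the interface packet counters twice and comparing them. This is only a rough sketch, assuming the standard sysfs counters under /sys/class/net/<iface>/statistics/ (rx_packets, tx_packets); the function names, the 10-second interval, and the interface names are illustrative choices, not anything from Dell or Broadcom.

```python
import time
from pathlib import Path


def read_counters(iface, sysfs_root="/sys/class/net"):
    """Read the kernel's received/transmitted packet counters for one interface."""
    stats = Path(sysfs_root) / iface / "statistics"
    return {
        "rx": int((stats / "rx_packets").read_text()),
        "tx": int((stats / "tx_packets").read_text()),
    }


def looks_one_way(before, after):
    """True if packets went out between the two samples but none came in."""
    sent = after["tx"] - before["tx"]
    received = after["rx"] - before["rx"]
    return sent > 0 and received == 0


if __name__ == "__main__":
    import sys

    # Usage (hypothetical interface name): python3 oneway.py eno1
    for iface in sys.argv[1:]:
        first = read_counters(iface)
        time.sleep(10)  # arbitrary sampling interval
        second = read_counters(iface)
        verdict = "transmit-only" if looks_one_way(first, second) else "ok or idle"
        print(iface, verdict)
```

A "transmit-only" result on a port that the switch reports as disconnected matches the symptom in this post; an idle interface that neither sends nor receives will not be flagged.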