
Last reply by 05-03-2021 Unsolved
2 Bronze
12898

R740 Fiber Module issue

Just got an R740 server for a customer and found an issue when the server loses power. The server has four Broadcom network ports: two SFP+ and two Gigabit Ethernet. I installed two Dell SFP+ 10Gb fiber transceiver modules in the SFP+ slots. Everything seemed to work normally at first. However, when I disconnected the server completely from the power source to reorganize the cables, the fiber modules appeared dead once power was restored. The server starts up normally, but the fiber transceiver modules get no power and stay down even after the OS is fully booted. Simply rebooting the server doesn't fix it. The only way to fix the problem is to unplug the fiber modules from the slots and insert them again. As soon as a module is re-inserted, it gets power and the interface links up.

I believe there is a firmware bug in either the server motherboard or the four-port network daughter card: it does not apply power to the SFP+ slots after the server first powers up until the slot detects that a module has been inserted.

This is going to be big trouble if the customer site loses power. The server will not be able to communicate with the network after power comes back.

Dell support is just trying to steer me toward troubleshooting at the OS level, such as the drivers and getting the VMware support bundle. However, I know the problem is not in the OS, because the fiber transceiver doesn't get power before the OS is loaded, and it is supposed to have power as soon as you hit the power button on the server. They even questioned the fiber cables because we didn't buy them from Dell, but "hey", a fiber cable is not going to cause the fiber module to lose power.

Does anyone have the identical server and see the same problem?

 

Replies (11)
2 Bronze
10774

Hi,

we're having the exact same issue. Did you manage to get a resolution for this?

2 Bronze
10630

We kind of have the same issue; we purchased the same R740 server. However, in our case the port dies when you plug the SFP in. Like everyone, I'm sure, we got the latest driver version from the Dell website and from the vendor's website and updated the driver. Even after updating the driver, the port still dies. We took a look at Device Manager and noticed that we get an error on the port when we plug in the SFP. If we disable the port, remove the SFP, and then enable the port, it shows up as normal again. We have tried multiple times with the Dell-specific SFP+ transceiver and other compatible transceivers, still with the same issue.
10506

Hey guys.

Quick question: are you by any chance enabling RDMA and DCBX on those Broadcom interfaces?

Regards,

Giovani

10295

Having either the same problem or a close relative: new R740 with the same Broadcom fiber/copper onboard NICs.

When I first boot up the fiber ports are lit up but not working: the iDRAC shows them as "up" but with the "Switch Connection ID" and "Switch Port Connection ID" as "Not Available."

If I unplug and replug the fiber into the transceiver the connection recovers and the switch information appears in the iDRAC.

While the ports are down, the switch shows no connection. The OS (Ubuntu 18 LTS, but I've also tried xcp-ng 8.0 and CentOS 7) prints errors to the console saying that the interface(s) are down.

Firmware is updated to the latest (although it's a crapshoot even getting the Lifecycle Controller to connect for updates). RDMA and DCBX are disabled.

Anybody else fighting this? Anybody had any luck?
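
For anyone on Linux who wants to record what the OS itself believes about the port before and after reseating the fiber, here is a minimal sketch that reads the standard sysfs attributes `operstate`, `carrier` and `speed` under `/sys/class/net/<iface>`. The function name `read_link_state` and the interface name `eno1` are made up for illustration; they are not from this thread. It only reports the kernel's view, so it does not prove whether the transceiver has power.

```python
from pathlib import Path


def read_link_state(iface, sysfs_root="/sys/class/net"):
    """Return what the Linux kernel reports for one interface via sysfs."""
    base = Path(sysfs_root) / iface

    def read(name):
        try:
            return (base / name).read_text().strip()
        except OSError:
            # Missing interface, or an attribute the kernel refuses to read
            # (carrier and speed can fail while the interface is down).
            return None

    carrier = read("carrier")
    try:
        speed = int(read("speed"))
    except (TypeError, ValueError):
        speed = None

    return {
        "operstate": read("operstate"),
        "carrier": None if carrier is None else carrier == "1",
        # The kernel reports -1 when the speed is unknown.
        "speed_mbps": speed if speed is not None and speed > 0 else None,
    }
```

Calling `read_link_state("eno1")` right after a cold boot and again after unplugging and replugging the fiber gives two snapshots that can be pasted into a support case alongside the iDRAC view.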

5163

I was having the same issue with a pair of R7525s, one with a dual-port and one with a quad-port Intel X710, connected to a pair of Juniper EX4550s. The fix for us was to boot with the DACs unplugged, go into BIOS Setup on the NICs, and disable LLDP. Once that was turned off, we plugged the DACs back in, booted, and everything worked just fine. Definitely sounds like a firmware issue to me. We'll be telling Dell about this as well.

5135

Hello Kwulf,

Thanks for the feedback. I will log the issue in our system.

Thanks

Marco


Marco B.
Social Media and Communities Professional
Dell Technologies | Enterprise Support Services
#Iwork4Dell

2 Bronze
10246

Hi 

In one specific incident, this turned out to be related to the auto-negotiate settings and was apparently resolved by hard-setting the "Operational link speed" to 10G.

Boot to F2 > Device Settings > Broadcom NIC > Device/NIC Configuration > Operational Link Speed (or Link Speed). Change it from AutoNegotiate (the default) to 10G, then Back > Finish > save settings > reboot.

This is just a rough walkthrough and might vary depending on the features offered on the NIC.

If you try this, please quote this response and share your results along with the NIC specification/model.

Edit: The Broadcom NIC on a 14th Generation PowerEdge machine usually works with "IEEE 802.3by", so the switch has to have native support for this.

Check the auto-negotiate protocol settings and confirm that the switch is set to use the same protocol. Finding a common auto-negotiate protocol between the switch and the network card will ensure successful auto-negotiation. Manually setting the link speed is always an option, though, if changing the protocol would break auto-negotiation for any other already connected device.
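
To confirm from a Linux host that the forced setting actually took effect after the reboot, one option is to look at the `Speed`, `Auto-negotiation` and `Link detected` lines that `ethtool <iface>` prints. The sketch below parses that text; the helper names and the sample output (including the interface name `eno1`) are illustrative assumptions, not output captured from any server in this thread.

```python
import re


def parse_ethtool(text):
    """Pick speed, auto-negotiation and link status out of `ethtool <iface>` text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:  # skip continuation lines that carry no "key: value" pair
            fields[key.strip()] = value.strip()

    speed = re.match(r"(\d+)Mb/s", fields.get("Speed", ""))
    return {
        "speed_mbps": int(speed.group(1)) if speed else None,
        "autoneg": fields.get("Auto-negotiation"),
        "link": fields.get("Link detected"),
    }


def is_forced_10g(info):
    """True when the port reports 10G with auto-negotiation switched off."""
    return info["speed_mbps"] == 10000 and info["autoneg"] == "off"


# Illustrative sample only; real output has more lines (supported modes etc.).
SAMPLE = """\
Settings for eno1:
	Speed: 10000Mb/s
	Duplex: Full
	Auto-negotiation: off
	Link detected: yes
"""

if __name__ == "__main__":
    print(is_forced_10g(parse_ethtool(SAMPLE)))
```

On a live system the text would come from running `ethtool eno1` (for example through `subprocess.run`). The OS-level way to force the speed would be something like `ethtool -s eno1 speed 10000 duplex full autoneg off`, but the replies here changed the setting in the NIC's device settings, and nothing in this thread confirms that the OS-level change behaves the same way.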

Not applicable
10176

This resolved our issue on the R640. We had been fighting this until we switched away from auto-negotiate.

6605

In my case, the network stops working after a reboot. The OS thinks the link is established and is able to "send" packets but cannot receive any, while the switch shows the link as not connected (LED off).

It comes back when I unplug and replug the fibers (not the module).

Turning off auto-negotiation solved the problem on R740 + Broadcom 57414.
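
If you want to catch the "sends but never receives" state on Linux without walking over to the switch, a rough sketch is to sample the kernel's per-interface packet counters twice and compare them. The counters live in `/sys/class/net/<iface>/statistics/`; the function names and the interface name `eno1` are hypothetical, and a quiet network can also produce zero received packets, so treat a hit as a hint rather than proof.

```python
from pathlib import Path


def read_counters(iface, sysfs_root="/sys/class/net"):
    """Read the kernel's rx/tx packet counters for one interface."""
    stats = Path(sysfs_root) / iface / "statistics"
    return {
        name: int((stats / name).read_text())
        for name in ("rx_packets", "tx_packets")
    }


def looks_one_way(before, after):
    """True when packets went out but none came in between two samples."""
    sent = after["tx_packets"] - before["tx_packets"]
    received = after["rx_packets"] - before["rx_packets"]
    return sent > 0 and received == 0
```

Take one `read_counters("eno1")` sample, wait a few seconds while pinging the gateway, take a second sample, and pass both to `looks_one_way`.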
