Kaustav Bose
1 Copper

Problems with upgrading to Broadcom NIC firmware v7.4.8

Has anybody had issues upgrading to the latest Broadcom NIC firmware, v7.4.8?

I have a Dell M620 blade server with Broadcom BCM57810 LOMs. The current firmware is 7.2.20. Apparently the combination of this NIC (a 10Gbps CNA), RHEL 6.2 64-bit and Force10 MXL blade switches has an issue where the NIC ports keep flapping (going up and down very rapidly). Although I have set up bonding across the two LOMs (and I am using NPAR) in an active-standby configuration (mode=1 in Linux), the flapping is destabilizing certain processes. At least, I have seen the Oracle SIP application server and Oracle 11gR2 RAC misbehave and then crash, reboot, or just shut down.
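One way to put a number on the flapping is the per-slave "Link Failure Count" that the Linux bonding driver reports in /proc/net/bonding/<bond>. The following is a minimal sketch, assuming Python 3.6 or later is available on the host (the Python that ships with RHEL 6.2 is older) and that the bond is named bond0. The sample text and its counts are illustrative only, not taken from the affected servers.

```python
from pathlib import Path


def link_failure_counts(bonding_text):
    """Return {slave_name: link_failure_count} from /proc/net/bonding/<bond> text."""
    counts = {}
    slave = None
    for raw in bonding_text.splitlines():
        line = raw.strip()
        if line.startswith("Slave Interface:"):
            # Start of a per-slave section; remember which slave we are in.
            slave = line.split(":", 1)[1].strip()
        elif line.startswith("Link Failure Count:") and slave is not None:
            counts[slave] = int(line.split(":", 1)[1].strip())
    return counts


# Illustrative sample only: the counts below are made up.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: em1_1
MII Status: up

Slave Interface: em1_1
MII Status: up
Link Failure Count: 12

Slave Interface: em2_1
MII Status: up
Link Failure Count: 0
"""

if __name__ == "__main__":
    # "bond0" is an assumed bond name; fall back to the sample off-host.
    path = Path("/proc/net/bonding/bond0")
    text = path.read_text() if path.exists() else SAMPLE
    for name, count in link_failure_counts(text).items():
        print(f"{name}: {count} link failures")
```

Running this before and after a suspected flapping episode and comparing the counts shows which LOM is dropping link.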

Long story short, the solution provided by Dell is to upgrade the NIC firmware to v7.4.8. OK, sounds reasonable. I am still searching for the firmware release notes.

The firmware upgrade is easy: I just download the .bin file and execute it like a shell script. At the end, it reboots the server. But after the reboot, the interfaces do not come back up.

The workaround I discovered, which seems to work, is to power down the server and reseat it in the M1000e chassis. I think this forces the NIC adapters to re-initialize so that the new firmware version takes effect, but that is just my guess.
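To check whether the new firmware has actually taken effect on a port, the firmware-version field printed by `ethtool -i <interface>` can be read before and after the reseat. This is a minimal sketch, assuming Python 3.7 or later and that ethtool is installed. The interface name and the sample values are placeholders, and the exact layout of the firmware-version string depends on the driver.

```python
import subprocess


def parse_ethtool_info(output):
    """Parse the 'key: value' lines printed by `ethtool -i` into a dict."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, so values like PCI addresses survive.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def firmware_version(interface):
    """Return the firmware-version string reported for one interface."""
    result = subprocess.run(
        ["ethtool", "-i", interface],
        capture_output=True, text=True, check=True,
    )
    return parse_ethtool_info(result.stdout).get("firmware-version")


# Placeholder sample: values are illustrative, not captured from an M620.
SAMPLE = """\
driver: bnx2x
version: 0.0.0
firmware-version: bc 7.4.8
bus-info: 0000:01:00.0
"""

if __name__ == "__main__":
    print(parse_ethtool_info(SAMPLE)["firmware-version"])
```

On the blade itself, calling `firmware_version("em1_1")` (an interface name taken from this thread) would query the live port instead of the sample.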

Has anybody else seen this, and does anyone know why it is happening? Dell hasn't got back to me. I have the latest CMC (4.31) and iDRAC (1.37.35) versions.

3 Replies
Moderator

Re: Problems with upgrading to Broadcom NIC firmware v7.4.8

Hello kbose

I agree with your assumption that the NIC did not have enough time to fully initialize prior to boot after the firmware update. If that were the case, then another restart should have resolved the issue. If you attempted restarting again prior to reseating the blade, then that is not likely to be the case. If a reseat was required, then something was hung in cache or memory that was stopping the NIC from initializing. There may have been processes pending prior to the firmware update that were stalling the initialization. The reseat you performed is the equivalent of draining flea power, so it would have cleared temporary memory throughout the system.

Regarding your original problem, if it is not resolved by the firmware update, then I would recommend verifying that the port configuration on the switch is compatible with the NIC configuration on the server. If you have the NICs bonded/teamed within the operating system, then the ports should not be lagged (configured as a link aggregation group, or LAG) on the switch. If the ports are lagged on the switch, it will cause what you are describing, since the two transmission methods will conflict.
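The server side of that check can be read from the "Bonding Mode:" line in /proc/net/bonding/<bond>. This is a minimal sketch, assuming Python 3 and a bond named bond0. It only confirms that the bond is in active-backup mode (mode=1), which is the case where, per the advice above, the switch ports should not be lagged. The switch configuration still has to be checked on the switch itself.

```python
def bonding_mode(bonding_text):
    """Return the text after 'Bonding Mode:' or None if the line is absent."""
    for line in bonding_text.splitlines():
        if line.startswith("Bonding Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def is_active_backup(mode):
    """True when the reported mode is active-backup (mode=1)."""
    return mode is not None and "active-backup" in mode


# Illustrative header of /proc/net/bonding/bond0 for a mode=1 bond.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: em1_1
MII Status: up
"""

if __name__ == "__main__":
    mode = bonding_mode(SAMPLE)
    if is_active_backup(mode):
        print("active-backup bond: switch ports should not be in a LAG")
    else:
        print("bond mode is:", mode)
```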

Thanks

Daniel Mysinger
Dell EMC, Enterprise Engineer


Kaustav Bose
1 Copper

Re: Problems with upgrading to Broadcom NIC firmware v7.4.8

Hi Daniel,

Thanks for replying. I have tried rebooting multiple times, even a cold reboot from the iDRAC, with no success. Reseating the blade was the last option. This behavior is very consistent (at least with three blades in the same chassis).

I believe that in order to use NPAR with the 4 partitions carrying traffic for 4 different VLANs, we need to have the switch port designated as a VLAN trunk. On the OS, if the server is going to handle traffic for multiple VLANs, we have to use tagged interfaces. For example, I am using tagged interfaces like bond0.700 and bond1.800, where 700 is my application VLAN and 800 is my management VLAN. bond0 is em1_1 and em2_1, and bond1 is em1_2 and em2_2. The configuration works fine. If we do not designate the switch port as a trunk, then we cannot use tagged interfaces on the server; conversely, if we use tagged interfaces on the server/OS, we need to configure the switch port as a trunk. If both ends don't follow this rule, there would be no communication at all. The original problem definitely seems to be hardware related, and specifically server NIC related, because I have verified that the Force10 MXL blade switch did not have any errors.
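For readers unfamiliar with the naming, a tagged interface such as bond0.700 is the parent device plus the VLAN ID. The following is a minimal sketch, assuming Python 3, that splits such a name and renders the kind of ifcfg file RHEL 6 uses for a VLAN on top of a bond. The helper names are hypothetical, and the rendered file is deliberately bare (no IP settings), so treat it as a starting point rather than the configuration actually used here.

```python
def split_vlan_name(name):
    """Split 'bond0.700' into ('bond0', 700), validating the VLAN ID."""
    parent, sep, vid = name.rpartition(".")
    if not sep or not parent or not vid.isdigit():
        raise ValueError("not a tagged interface name: %r" % name)
    vlan_id = int(vid)
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID out of range: %d" % vlan_id)
    return parent, vlan_id


def render_vlan_ifcfg(name):
    """Render a minimal ifcfg-<name> body for a tagged interface."""
    split_vlan_name(name)  # raises ValueError on a malformed name
    lines = [
        "DEVICE=%s" % name,
        "ONBOOT=yes",
        "BOOTPROTO=none",
        "VLAN=yes",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Interface names taken from the post: application and management VLANs.
    for tagged in ("bond0.700", "bond1.800"):
        parent, vlan_id = split_vlan_name(tagged)
        print("%s -> parent %s, VLAN %d" % (tagged, parent, vlan_id))
        print(render_vlan_ifcfg(tagged))
```

The matching switch port still has to be a trunk that carries VLANs 700 and 800, as described above.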

Moderator

Re: Problems with upgrading to Broadcom NIC firmware v7.4.8

For example, I am using tagged interfaces like bond0.700 and bond1.800, where 700 is my application VLAN and 800 is my management VLAN. bond0 is em1_1 and em2_1, and bond1 is em1_2 and em2_2. The configuration works fine.

Yeah, that is fine. The switchport mode trunk just allows all VLAN traffic to pass across that interface. I was referring more to link aggregation, but it doesn't sound like you have the interfaces lagged together.

The issue I was referencing is in how link aggregation groups transmit data versus how NIC teams/bonds transmit data. If you don't have the interfaces lagged then it is not an issue.

Thanks

Daniel Mysinger
Dell EMC, Enterprise Engineer

