July 11th, 2019 02:00

S4048-ON S4048-ON:1 %KERN-2-INT: ismt0: smbus reset done

Hi, I have a strange problem.

When I ping my Dell S4048-ON switch (OS 9.14(2.0)), latency is normally under 1 ms, but sometimes it jumps to 800-1000 ms. When this happens, the log shows:

S4048-ON:1 %KERN-2-INT: ismt0: smbus reset done

This log entry repeats every 1-2 minutes.

What does this mean?

Moderator

July 11th, 2019 10:00

Hi,

SMBus is the System Management Bus, so something low-level is being reset. I would try reflashing the firmware, then fully power off the switch, boot it up again, and see whether anything changes.

July 11th, 2019 23:00

Hi

By "reflash the firmware", do you mean formatting and reinstalling FTOS (OS 9) using ONIE, or something else?

 

Moderator

July 12th, 2019 06:00

Yes, that is what I mean.

July 31st, 2020 10:00

Sorry for the late reply:

The message means that the I2C / SMBUS reset logic was activated because one of the I2C communication channels was blocked.

On the S4048-ON this is nearly always caused by an SFP/QSFP that is too 'chatty' on the I2C channel and blocks communication on the entire shared bus. Other components in the switch can then no longer 'talk' to the switch management system (the OS running on the CPU). In the worst case the CPU gets no data from the sub-components, concludes that the rest of the system is 'unresponsive', and may trigger a 'hardware watchdog reboot'.

When the OS notices that nothing is coming in on the I2C channel (or actively sees that one component is keeping the channel busy), it tries to reset I2C communication with an smbus reset. The message above is the notification that the reset logic was started and completed.

If you see this message often, it is in nearly all cases a bad optic. It might work perfectly for data traffic, but reading out its fabrication data (the content of the EEPROM in the SFP) and operational data (e.g. temperature, optical TX and RX power levels) goes wrong, and this bad communication locks up the smbus.

Check the optics in your switch with: show inventory media

Look for any SFP/SFP+/QSFP that is not Dell qualified. If possible, replace those with Dell-qualified versions of the same optic.

Also look closely at any 1GBase-T SFPs: some older (Dell-branded) 1GBase-T SFPs are known to cause problems on Dell switches, including the S4048-ON. To find out whether they are the cause, remove them for 5-10 minutes and see whether the smbus reset messages still appear.
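A minimal sketch of that before/after comparison, assuming you have captured the switch log as plain text for each period (for example from a syslog server). The function name and the sample captures are made up for illustration; only the message text itself comes from this thread.

```python
import re

# Message text as quoted in this thread; the hostname/unit prefix may differ.
RESET_PATTERN = re.compile(r"%KERN-2-INT: ismt0: smbus reset done")


def count_smbus_resets(log_text: str) -> int:
    """Count the lines in log_text that contain the smbus reset message."""
    return sum(1 for line in log_text.splitlines() if RESET_PATTERN.search(line))


# Hypothetical captures: one taken with the suspect SFPs inserted,
# one taken during the 5-10 minutes with them removed.
with_sfps = """\
S4048-ON:1 %KERN-2-INT: ismt0: smbus reset done
S4048-ON:1 %KERN-2-INT: ismt0: smbus reset done
"""
without_sfps = ""

print("with SFPs:   ", count_smbus_resets(with_sfps))
print("without SFPs:", count_smbus_resets(without_sfps))
```

If the count drops to zero while the SFPs are out and climbs again once they are back, those SFPs are the likely cause.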

If you find a Dell-branded 1GBase-T SFP that causes these messages, consider opening a case with Dell. Get the part number and/or PPID before calling: both are printed on the side of the SFP itself (the part number is 6 digits starting with a 0).

Also make sure you run 9.14.2.5 or later (with the correct CPLD for that release). That code further refines the SMBus reset, because some optics could lock up the smbus so hard that the earlier resets did not always clear the bus.

At this moment (31 JUL 20) I would go for DNOS 9.14.2.7 with system CPLD 15.2 (master 12; slave 5).
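As a rough illustration of the "9.14.2.5 or later" check, here is a small sketch that compares release strings. It assumes every component of the version is numeric and that both notations seen in this thread, 9.14(2.0) and 9.14.2.5, describe the same four fields; the helper names are made up and this is not a Dell tool.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '9.14(2.0)' or '9.14.2.5' into a tuple of ints such as (9, 14, 2, 0)."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


# Minimum release mentioned above for the refined SMBus reset.
MINIMUM = parse_version("9.14.2.5")


def needs_upgrade(running: str) -> bool:
    """Return True if the running release is older than 9.14.2.5."""
    return parse_version(running) < MINIMUM


print(needs_upgrade("9.14(2.0)"))  # release from the original post
print(needs_upgrade("9.14.2.7"))   # release suggested above
```

Under those assumptions, the release from the original post counts as older than 9.14.2.5, while 9.14.2.7 does not.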

 

Hope this answers your questions.

July 31st, 2020 10:00

For THIS specific case I don't think you should re-image the flash / DNOS, unless you are running a version prior to 9.14.2.5 or your CPLD is outdated for that version (system: 15.2; master: 12; slave: 5).

As explained in my other reply, this is most likely caused by an SFP that hangs the I2C/SMBus, and the NOS resets the bus to avoid further problems.
