Moderator


9.5K Posts

August 26th, 2019 09:00

Hi,

If everything is the same, it should establish the heartbeat and sync up fine without a reboot. Page 1085 of the guide has some failure information: https://downloads.dell.com/manuals/all-products/esuprt_ser_stor_net/esuprt_networking/esuprt_net_fxd_prt_swtchs/force10-s4048-on_setup-guide8_en-us.pdf

12 Posts

August 30th, 2019 06:00

Thanks for the reply.

I've finally got a serial console physically connected to the switch. Below is the prompt, which shows that it has somehow gone into "debugger" mode:

db{1}>

Running a dmesg shows:

WARNING: 3 errors while detecting hardware; check system log.
boot device: wd0
root on md0a dumps on wd0l
dump_misc_init: max_paddr = 0x7f800000
WARNING: clock lost 5578 days
WARNING: using filesystem time
WARNING: CHECK AND RESET THE DATE!
NMI ... going to debugger

The dmesg output makes it seem that the switch might have hit the networking clock signal bug in the Atom CPU component. I'm not sure, though, since I only see the phrase "clock lost..." and am connecting it to "clock signal bug".

"Once the component has failed, the system CPU will stop functioning but traffic may continue to flow. Once encountered it is likely that the unit will not boot, and will not be recoverable. Typically the system or card will stop functioning and will hang or reboot continuously. The issue may not be observed until a reboot or power cycle occurs." <ADMIN NOTE: Broken link has been removed from this post by Dell>
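As a rough sanity check on the "clock lost 5578 days" line, the sketch below works out which date the hardware clock would have been reporting. It assumes the message means the hardware clock was 5578 days behind the filesystem time, and it assumes the filesystem time was close to the date of this post; the dmesg output itself gives neither, and the names `ASSUMED_FILESYSTEM_DATE` and `implied_rtc_date` are made up for illustration.

```python
from datetime import date, timedelta

# Assumption: the filesystem time was close to the date of this post.
# The dmesg output does not show the actual timestamp.
ASSUMED_FILESYSTEM_DATE = date(2019, 8, 30)

# Taken from "WARNING: clock lost 5578 days" in the dmesg output.
CLOCK_LOST_DAYS = 5578


def implied_rtc_date(filesystem_date: date, lost_days: int) -> date:
    """Return the date the hardware clock would have reported,
    assuming it was `lost_days` behind the filesystem time."""
    return filesystem_date - timedelta(days=lost_days)


rtc_date = implied_rtc_date(ASSUMED_FILESYSTEM_DATE, CLOCK_LOST_DAYS)
print(rtc_date.isoformat())  # 2004-05-22 under the assumptions above
```

A gap of roughly 15 years looks more like a hardware clock that lost its setting than like ordinary drift, but on its own that neither confirms nor rules out the CPU bug.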

Can the warnings above happen without the CPU bug? (Nobody I've talked to has experienced the bug or heard of anyone who has.)

Thanks

Moderator


9.5K Posts

August 30th, 2019 07:00

It could be that bug. Can you private message me the service tag?

12 Posts

August 30th, 2019 11:00

Hi Josh

Done. I sent you the log as well. It would be nice to know whether it is that bug.

12 Posts

September 1st, 2019 10:00

An update.

First I ran the "reboot" command at the debugger prompt; this froze the switch instantly.

After that, I did a power cycle (waiting a few minutes), but the switch didn't show anything on the serial console, and all port lights went off. The switch is dead, and I believe the cause is the networking clock signal bug in the Atom CPU, since all signs of the bug are present.

I had a spare switch loaded with the same configuration (configured for a VLT setup). I powered off the failed switch and the replacement switch, moved the cables over to the replacement switch, and powered it on. Everything went back into a good state. Uptime on the other switch, which has been working the whole time, is: Up Time : 3 yr, 0 wk, 4 day, 5 hr, 29 min

So I expect this switch will fail at some point too. Luckily VLT works, so there was no interruption.

 
