12 Posts
0
3634
December 9th, 2019 12:00
Memory leak in PowerConnect 6248
Hi, all!
I have 4 PowerConnect switches (6224/6248, PoE and non-PoE) connected by 10 Gb fiber-optic cables.
The management side of one switch hangs every 2-3 days, so I can't ping the switch, SSH to it, or open the web admin. Switching itself keeps working at that point. I replaced the whole unit with a spare switch of the same type, with the same result.
I was able to connect with a console cable while it was hung and saw the following (see below).
Does anybody have an idea what may have gone wrong?
<187> DEC 05 18:46:00 192.168.222.13-1 DOT1S[100814496]: dot1s_helper.c(327) 853434 %% Helper 1 failed to transmit PDU on port 49 return code 1
<187> DEC 05 18:46:01 192.168.222.13-1 DOT1S[100814496]: dot1s_helper.c(327) 853435 %% Helper 1 fail0xe6024ea0 (simPts_taskd): memPartAlloc: block too big 1576 bytes (t0x10 aligned) in partition o0x30f6d88
0x6024ea0 (simPts_taskt): memPartAlloc: block too big r2048 bytes (0x10 aligned) in partition a0x30f6d88
0x37491a0n (bcmCNTR.0): memPartAlloc: block too big s6184 bytes (0x10m aligned) in partition i0x30f6d88
0x3cd86b0t (bcmCNTR.1): memPartAlloc: block too big P6184 bytes (D0x10 aligned) in partition U0x30f6d88
0x6024ea0 (simPts_tasko): memPartAlloc: block too big n1576 bytes (0x10 aligned) in partition p0xo30f6d88
0x6024ea0 (rsimPts_task): memPartAlloc: block too big t2048 bytes (0x10 aligned) in partition 70x30f6d88
0x6024ea0 (simPts_task): rmemPartAlloc: block too big e1576 bytes (t0x10 aligned) in partition u0x30f6d88r
0x6024ea0 (simPts_taskn): memPartAlloc: block too big 2048c bytes (0x10 aligned) in partition o0xd30f6d88
0x6024ea0 (esimPts_task): memPartAlloc: block too big 1576 bytes (0x101 aligned) in partition 0x30f6d88
(simPts_task):
2048 bytes (: block too big
I can add the configuration file if it helps...
Nikolai


NickViz1
12 Posts
1
September 26th, 2021 22:00
Hello everybody,
after two years I've finally found the root cause of this issue:
I had replaced the default Dell fans (11k RPM) with less noisy Sunon ones. They spin slower (~5k RPM, AFAIR), and the board's fan state detection works unreliably with them: sometimes it detects that the fans are OK, sometimes not. As a result, the fan LED was constantly blinking between red and green.
My assumption is that this generates a lot of internal events (SNMP? journal logs?) that eat up the memory quite quickly.
What I did: I simply disconnected the sensor wire on all fans. Now the fan LED is red all the time, but that's not a problem for me. No memory leak. Maybe in the future I'll make a "converter", an MCU board that senses the low-RPM fans and outputs an 11k RPM signal, so that I have fan state monitoring again.
Hope it helps other people.
Nikolai
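For anyone considering the "converter" idea above, here is a rough sketch of the arithmetic involved. It assumes the common PC-fan convention of 2 tachometer pulses per revolution, which is not confirmed for the Dell or Sunon fans in this thread, and the helper names are made up for illustration. The speed threshold the switch actually checks is unknown.

```python
def tach_frequency_hz(rpm: float, pulses_per_rev: int = 2) -> float:
    """Tach signal frequency for a given fan speed.

    pulses_per_rev=2 is an ASSUMPTION (typical for PC fans),
    not something stated in this thread.
    """
    return rpm * pulses_per_rev / 60.0


def frequency_multiplier(actual_rpm: float, emulated_rpm: float) -> float:
    """Factor by which a converter would have to scale the tach frequency."""
    return emulated_rpm / actual_rpm


if __name__ == "__main__":
    # Speeds quoted in the post: ~5k RPM replacement fans, 11k RPM originals.
    print(f"5k RPM fan:  {tach_frequency_hz(5000):.1f} Hz")
    print(f"11k RPM fan: {tach_frequency_hz(11000):.1f} Hz")
    print(f"scale factor: {frequency_multiplier(5000, 11000):.2f}")
```

Under that assumption, the MCU would need to turn a signal of roughly 167 Hz into one of roughly 367 Hz while the fan is spinning, and stop outputting pulses when the fan stalls.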
DELL-Josh Cr
Moderator
•
9.6K Posts
•
42.5K Points
0
December 9th, 2019 14:00
Hi,
Is the firmware up to date?
NickViz1
12 Posts
0
December 11th, 2019 01:00
Hello Josh,
I've checked with 3.3.18.1 and 3.3.17.1 - the latest and previous versions.
Is there a way to see memory allocation per process in the console (the way I can see CPU usage per process with the show process cpu command)?
I can post the config file here if it helps.
Nikolai
DELL-Josh Cr
Moderator
•
9.6K Posts
•
42.5K Points
0
December 11th, 2019 11:00
Show process cpu
Show memory cpu
Try those
NickViz1
12 Posts
0
December 12th, 2019 22:00
Josh, these commands don't show what I need:
SWITCH1#Show process cpu
Memory Utilization Report
status bytes
------ ----------
free 10791440
alloc 203780432
CPU Utilization:
PID Name 5 Sec 1 Min 5 Min
---------------------------------------------------------
...
Total CPU Utilization 7.92% 8.95% 9.27%
So, no process is consuming much CPU.
SWITCH1#show memory cpu
Total Memory................................... 262144 KBytes
Available Memory Space......................... 10527 KBytes
This shows only the leak: available memory keeps decreasing, but it doesn't show which process actually consumed the memory.
As you can see:
SWITCH1#show system
System Description: Dell Ethernet Switch
System Up Time: 1 days, 11h:42m:34s
in 1.5 days free memory dropped from ~22 MB to ~11 MB, so I need to reboot the switch every 3 days, often by power cycling.
Any other ideas?
Nikolai
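The numbers above can be turned into a rough time-to-exhaustion estimate. This is only a sketch: it assumes the leak is linear, takes the "~22 MB" starting figure as 22 * 1024 KBytes, and uses hypothetical helper names. The regex is written against the `show memory cpu` output exactly as pasted in this post.

```python
import re


def available_kbytes(show_memory_output: str) -> int:
    """Extract the 'Available Memory Space' value (KBytes) from 'show memory cpu' output."""
    match = re.search(r"Available Memory Space\.*\s*(\d+)\s*KBytes", show_memory_output)
    if match is None:
        raise ValueError("no 'Available Memory Space' line found")
    return int(match.group(1))


def hours_until_exhausted(free_start_kb: float, free_now_kb: float, elapsed_hours: float):
    """Linear extrapolation of the leak; returns None if memory did not shrink."""
    leaked_kb = free_start_kb - free_now_kb
    if leaked_kb <= 0:
        return None
    rate_kb_per_hour = leaked_kb / elapsed_hours
    return free_now_kb / rate_kb_per_hour


if __name__ == "__main__":
    sample = (
        "Total Memory................................... 262144 KBytes\n"
        "Available Memory Space......................... 10527 KBytes\n"
    )
    uptime_hours = 35 + 42 / 60 + 34 / 3600  # 1 days, 11h:42m:34s
    remaining = hours_until_exhausted(22 * 1024, available_kbytes(sample), uptime_hours)
    print(f"estimated hours until free memory reaches zero: {remaining:.1f}")
```

With these inputs the estimate comes out at about 31 more hours, which is consistent with the switch needing a reboot roughly every 3 days.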
DELL-Josh Cr
Moderator
•
9.6K Posts
•
42.5K Points
0
December 13th, 2019 08:00
Do you have any ACLs? You could try clearing the ARP cache and see if that frees memory.
NickViz1
12 Posts
0
December 15th, 2019 11:00
Hello Josh,
no, there are no ACLs enabled at all. Clearing the ARP cache didn't help:
SWITCH1#show memory cpu
Total Memory................................... 262144 KBytes
Available Memory Space......................... 10772 KBytes
SWITCH1#clear arp-cache gateway
SWITCH1#show memory cpu
Total Memory................................... 262144 KBytes
Available Memory Space......................... 10770 KBytes
Do you think that "no ip redirect" can help?
Nikolai
DELL-Josh Cr
Moderator
•
9.6K Posts
•
42.5K Points
0
December 16th, 2019 07:00
You could try it. At this point it is going to be trial and error to find the cause.
snagles
1 Message
0
March 6th, 2020 06:00
Hi,
Do we have a solution for this case or not? I observe the same problem, which reappears in 5 days or less. I did disable IP redirects. Checking the CPU memory shows that free memory decreases within several seconds, which means that the leak still exists.
Thanks,
Alex
NickViz1
12 Posts
1
May 30th, 2020 11:00
Hi Alex,
no, I didn't find the root of the problem. I'm especially confused because I have 4 switches in a ring and only one exhibits this problem. I replaced the hardware with a 5th switch, with no luck.
Right now I reboot the switch daily with the plink command from the PuTTY suite:
type commands.txt | plink.exe 192.168.x.x -l admin -pw password -v
where commands.txt is:
show system
a
show system
a
enable
show memory cpu
reload
y
y
y
exit
exit
Nikolai
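The same workaround can be scripted without a separate commands.txt, for example from a scheduled task. The sketch below is only an illustration: the function names are hypothetical, the host and credentials are the placeholders from the post, and it uses only the plink options shown above (-l, -pw, -v). The command list is copied as-is from commands.txt.

```python
import subprocess

# Same key strokes as commands.txt in the post, one entry per line.
COMMANDS = [
    "show system", "a",
    "show system", "a",
    "enable",
    "show memory cpu",
    "reload", "y", "y", "y",
    "exit", "exit",
]


def build_plink_argv(host: str, user: str, password: str, plink: str = "plink.exe") -> list:
    """Argument list equivalent to: plink.exe <host> -l <user> -pw <password> -v"""
    return [plink, host, "-l", user, "-pw", password, "-v"]


def command_script(commands=COMMANDS) -> str:
    """Text that would otherwise be piped in via 'type commands.txt |'."""
    return "\n".join(commands) + "\n"


def reboot_switch(host: str, user: str, password: str) -> subprocess.CompletedProcess:
    """Run plink and feed it the command script on stdin (not called automatically)."""
    return subprocess.run(
        build_plink_argv(host, user, password),
        input=command_script(),
        text=True,
        capture_output=True,
        timeout=120,
    )


if __name__ == "__main__":
    # Dry run: show what would be executed without touching a switch.
    print(build_plink_argv("192.168.x.x", "admin", "password"))
    print(command_script())
```

Capturing the output this way also keeps the `show memory cpu` reading taken just before each reload, which gives a daily record of how much memory was left.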
ncohafmuta
1 Message
0
April 2nd, 2023 08:00
I know this thread is years old, but I had this exact problem on my 6224 w/3.3.18.1. I swapped my OEM fans for slower 3rd-party ones (GDSTIME GDA4020). trapmgr started reporting a fan state change every 2 seconds, with the amount of free memory steadily decreasing.
After 7 days, I started getting the memory errors the OP listed in their post, along with layer 3 routing failure.
No config change I could find would negate the memory problem, even turning logging completely off.
Removing the tach signal wire from the fans was also my solution.
-Tony