Start a Conversation

Unsolved

January 17th, 2014 14:00

ipMapForwardingTask very high CPU

Hello all,

We have a stack of 7 Dell PowerConnect 6224 units running firmware version 2.2.0.3.

For a couple of days we have had very high CPU usage on our switch:

Task                    Utilization
----------------------- -----------
osapiTimer                    1.20%
bcmL2X.0                      0.65%
bcmCNTR.0                     0.15%
bcmTX                         0.05%
bcmLINK.0                     0.35%
bcmRX                         2.10%
bcmATP-TX                     0.65%
bcmATP-RX                     0.20%
MAC Send Task                 0.15%
MAC Age Task                  0.70%
dtlTask                       0.15%
hapiRxTask                    0.15%
SNMPTask                     35.85%
radius_timer_task             0.05%
unitMgrTask                   1.05%
dot3ad_timer_task             0.20%
dot3ad_lac_task               0.05%
spmTask                       0.15%
ipMapForwardingTask          55.05%
OSPF Protocol                 0.10%
BXS Req                       0.05%
OSPF Receive                  0.15%
Kernel/Interrupt/Idle         0.80%

Total                        100.00%

The ipMapForwardingTask process is consuming most of the CPU, and we don't understand why.

This platform has been in production for several years without any issues.

Do you know if there are any tools to identify the packets that are driving this process?
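For reference, the per-task table above is the kind of output the 62xx CLI gives with show process cpu. A minimal sketch of gathering it, plus a look at the port counters to spot a port taking unusual traffic (1/g1 is purely an example port, and command names can differ slightly between firmware releases):

console# show process cpu
console# show interfaces counters
console# show interfaces counters ethernet 1/g1

Running show process cpu a few times in a row shows whether the ipMapForwardingTask load is steady or bursty.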

Please reply if you have any ideas about our issue.

Thank you very much

Regards,

T

21 Posts

January 23rd, 2014 07:00

I set the MTU to the maximum value on port 39... but roughly 1,000 "Packets too Long" counts are still being added per second on that port alone.

I checked the switch that is connected to port 39, but I could not find any indication of which edge port is responsible for these too-long packets...
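For anyone trying to reproduce this, a rough sketch of the commands involved on the 62xx series; the unit/port designator 1/g39 and the 9216-byte frame size are taken from the posts above as an example, and the exact command and counter names for the "Packets too Long" statistic differ between firmware releases:

console# configure
console(config)# interface ethernet 1/g39
console(config-if-1/g39)# mtu 9216
console(config-if-1/g39)# exit
console(config)# exit
console# show interfaces counters ethernet 1/g39

Watching that counter output over a few seconds should confirm the roughly 1,000 packets per second rate reported above.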

21 Posts

January 23rd, 2014 23:00

# no ip redirect

is not the reason for this issue. The number of too-long packets keeps increasing whether or not this option is set.
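For readers trying the same thing: as noted later in the thread, the command is applied per routed VLAN interface. A minimal sketch, with VLAN 10 used purely as an example:

console# configure
console(config)# interface vlan 10
console(config-if-vlan10)# no ip redirects
console(config-if-vlan10)# exit
console(config)# exit

The same lines would be repeated for each routed VLAN interface on the switch.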


Yes, we are using iSCSI in our environment, but not directly on that main routing switch. One iSCSI array (PV 3620i) is connected to a PowerConnect 5548, and another PV 3620i to another PowerConnect 6248 switch.


But behind the problematic uplink port g39 there is no iSCSI at all.

21 Posts

January 24th, 2014 01:00

Pinging the default VLAN 1 interface address still shows some delays and timeouts:

Reply from 10.2.X.X: bytes=32 time=1ms TTL=64
Reply from 10.2.X.X: bytes=32 time=1ms TTL=64
Reply from 10.2.X.X: bytes=32 time=1ms TTL=64
Reply from 10.2.X.X: bytes=32 time=1ms TTL=64
Reply from 10.2.X.X: bytes=32 time=1ms TTL=64
Reply from 10.2.X.X: bytes=32 time=1ms TTL=64
Reply from 10.2.X.X: bytes=32 time=156ms TTL=64
Request timed out.
Reply from 10.2.X.X: bytes=32 time=175ms TTL=64
Reply from 10.2.X.X: bytes=32 time=182ms TTL=64
Reply from 10.2.X.X: bytes=32 time=212ms TTL=64
Reply from 10.2.X.X: bytes=32 time=210ms TTL=64
Reply from 10.2.X.X: bytes=32 time=178ms TTL=64
Request timed out.
Reply from 10.2.X.X: bytes=32 time=174ms TTL=64
Reply from 10.2.X.X: bytes=32 time=234ms TTL=64
Reply from 10.2.X.X: bytes=32 time=210ms TTL=64
Reply from 10.2.X.X: bytes=32 time=133ms TTL=64
Reply from 10.2.X.X: bytes=32 time=163ms TTL=64
Request timed out.
Reply from 10.2.X.X: bytes=32 time=1ms TTL=64
Reply from 10.2.X.X: bytes=32 time=1ms TTL=64
Reply from 10.2.X.X: bytes=32 time=1ms TTL=64
Reply from 10.2.X.X: bytes=32 time=1ms TTL=64
Reply from 10.2.X.X: bytes=32 time=1ms TTL=64
Reply from 10.2.X.X: bytes=32 time=1ms TTL=64

But the overall situation is much, much better, particularly the ipMapForwardingTask!

Also, there has been no strange log output since the entries I posted yesterday... and no high topology-change counts that would indicate an STP problem...

2 Posts

January 24th, 2014 01:00

Thank you for your reply

We have set "no ip redirects" but our issue is not fixed.

Task                    Utilization
----------------------- -----------
LOG                           0.05%
osapiTimer                    1.20%
bcmL2X.0                      0.85%
bcmCNTR.0                     0.25%
bcmLINK.0                     0.60%
bcmRX                         1.10%
bcmNHOP                       0.05%
bcmATP-TX                     0.10%
bcmATP-RX                     0.10%
MAC Send Task                 0.55%
dtlTask                       0.25%
hapiRxTask                    0.05%
RMONTask                      0.10%
unitMgrTask                   0.20%
dot3ad_timer_task             0.15%
ipMapForwardingTask          52.20%
BXS Req                       0.10%
OSPF Receive                  0.10%
Kernel/Interrupt/Idle        42.00%

Total                        100.00%

and

PING 172.16.199.253 (172.16.199.253) 56(84) bytes of data.
64 bytes from 172.16.199.253: icmp_seq=1 ttl=61 time=2.04 ms
64 bytes from 172.16.199.253: icmp_seq=2 ttl=61 time=2.21 ms
64 bytes from 172.16.199.253: icmp_seq=3 ttl=61 time=2.60 ms
64 bytes from 172.16.199.253: icmp_seq=4 ttl=61 time=2.27 ms
64 bytes from 172.16.199.253: icmp_seq=5 ttl=61 time=2.17 ms
64 bytes from 172.16.199.253: icmp_seq=6 ttl=61 time=499 ms
64 bytes from 172.16.199.253: icmp_seq=7 ttl=61 time=41.9 ms
64 bytes from 172.16.199.253: icmp_seq=8 ttl=61 time=48.4 ms
64 bytes from 172.16.199.253: icmp_seq=9 ttl=61 time=62.6 ms
64 bytes from 172.16.199.253: icmp_seq=10 ttl=61 time=155 ms
64 bytes from 172.16.199.253: icmp_seq=11 ttl=61 time=36.8 ms
64 bytes from 172.16.199.253: icmp_seq=12 ttl=61 time=42.4 ms
64 bytes from 172.16.199.253: icmp_seq=13 ttl=61 time=54.1 ms
64 bytes from 172.16.199.253: icmp_seq=14 ttl=61 time=315 ms

Thank you for your help

Regards,

5 Practitioner • 274.2K Posts

January 24th, 2014 07:00

KNKA, I would think that even with the iSCSI devices attached to other switches in the network, there would still be iSCSI traffic passing through this 6200 switch. But when monitoring port 39, you say you do not see any iSCSI traffic? Are you able to see which traffic is coming in oversized?

 

On the 6200, a more accurate ping test is to ping a device in VLAN 1 rather than the VLAN 1 IP address itself. If you ping a device in VLAN 1, what kind of results do you see?

 

Thierry, were you able to get the firmware updated? I think this may help in this case. Once that is done, can you post your running config?
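For anyone following the firmware suggestion, a rough sketch of the usual 62xx TFTP update procedure; the server address and image filename below are placeholders, the image1/image2 choice depends on which slot show bootvar reports as inactive, and the exact steps for a 7-unit stack should be taken from the release notes of the target version:

console# copy tftp://192.168.1.50/PC6200_firmware.stk image
console# show bootvar
console# boot system image2
console# copy running-config startup-config
console# reload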

21 Posts

January 27th, 2014 04:00

Hello Daniel,

Port 39 is an uplink to another switch that has only workstations and telephones patched in... no servers, no iSCSI devices. We are not able to see which endpoint is causing these oversized packets...

If I ping a device in VLAN 1, everything is consistently fine (< 1 ms); there are no problems. Only the IP addresses of the various routed VLAN interfaces show those problems sometimes.

Maybe everything is fine (pinging a device in VLAN 1 shows no problems), but even if there is no real problem with the PowerConnect switch, it would be interesting to know what causes the delays and timeouts on those VLAN interface IP addresses.

5 Practitioner • 274.2K Posts

January 27th, 2014 10:00

I looked up some basic info on Nagios; what I found shows you can monitor:

Packet loss, round trip average

SNMP status information

Bandwidth / traffic rate

I did not see anything about packet capture. Is Nagios able to do that? If not, I would proceed with configuring port monitoring on that port and use Wireshark to see exactly what oversized packets are going across it.

 

If Nagios does, what packets does it show?
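If it helps, a minimal sketch of the port-monitoring setup described above, assuming the suspect traffic arrives on 1/g39 and the Wireshark machine is patched to a spare port 1/g48; the monitor session keywords can vary slightly between firmware releases:

console# configure
console(config)# monitor session 1 source interface 1/g39 rx
console(config)# monitor session 1 destination interface 1/g48
console(config)# monitor session 1 mode
console(config)# exit
console# show monitor session 1

Wireshark on the machine behind 1/g48 can then filter on frame length (e.g. frame.len > 1518) to isolate the oversized packets.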

6 Posts

February 11th, 2014 12:00

I have the same problem on multiple switches running at layer 3. Exactly the same symptoms: the routed interfaces are not accessible, but all users on the routed subnets work fine. This only affects management of the devices running at layer 3. For us the problem seems to be related to multinetted VLAN interfaces, as we do not see this issue on switches without multinetted VLAN interfaces. I've also worked extensively with Dell engineering over the last year, and we are still experiencing this problem. Not sure why this issue is so difficult for Dell to resolve!
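For readers unfamiliar with the term, a multinetted VLAN interface is simply a routed VLAN interface that carries more than one IP subnet via secondary addresses. A minimal sketch of what that looks like on these switches, with made-up addresses:

console(config)# interface vlan 20
console(config-if-vlan20)# ip address 10.20.0.1 255.255.255.0
console(config-if-vlan20)# ip address 10.20.1.1 255.255.255.0 secondary
console(config-if-vlan20)# exit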

5 Practitioner • 274.2K Posts

February 11th, 2014 13:00

Skehoe, have you tried running #no ip redirect? Are you at the latest firmware?

6 Posts

February 11th, 2014 13:00

Daniel, yes, we have the 'no ip redirect' command configured on all VLAN interfaces, and we are running version 5.1.2.3 on all of our PowerConnect gear.

21 Posts

February 12th, 2014 00:00

We still get these problems from time to time...

The routing switch is on the latest firmware, and ip redirect is disabled...

I saw that one uplink to another PC6248 switch has many "Received Pause Frames" on XG/1...

I disabled flow control (on our four PC6248 core switches) as mentioned in this article (a sketch of the change follows below the quoted text):

http://monolight.cc/2011/08/flow-control-flaw-in-broadcom-bcm5709-nics-and-bcm56xxx-switches/

"There is a design flaw in Broadcom's 'bnx2' NetXtreme II BCM5709 PCI Express NICs. These NICs are extremely popular; Dell and HP use them throughout their PowerEdge and ProLiant standalone and blade server ranges.

The flaw is in the flow control (802.3x) implementation and results in a switch-wide or network-wide loss of connectivity. As is common in major failures, there is more than one underlying cause."

"If you can't disable flow control on all switches, at least disable it on your core switches. If you use it in the core, you're Doing It Wrong™."

"Do not use BCM56314 and BCM56820-based OEM switches (e.g. Dell PowerConnect 6248, M8024, 8024F). Get your switches from a respectable network hardware vendor."

 

Unfortunately, this did not solve the issues!
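For completeness, a sketch of the flow-control change referred to above; on the 62xx series flow control is, to my knowledge, a global setting rather than per port:

console# configure
console(config)# no flowcontrol
console(config)# exit
console# copy running-config startup-config

After that, the "Received Pause Frames" counter on the XG uplink should stop climbing once the neighbouring switch has flow control disabled as well.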

 

 
