6 Posts

August 27th, 2009 13:00

Very poor throughput on PowerConnect 6248

Hi - I have a cluster of 104 Dell PowerEdge nodes connected by 1 Gbit Ethernet through 4 PowerConnect 6248 switches.  The network throughput has been very poor, resulting in a huge bottleneck in internode communication, to the point that our most used code, VASP, doesn't speed up at all when running on more than one node, even when using only nodes connected to the same switch.  VASP was compiled with Intel Fortran and Open MPI.

I tested the TCP layer between two nodes using NetPIPE and saw that the throughput leveled off at 850 Mb/s at packet sizes of about 100 Kb.  At packet sizes above 1 Mb, the throughput oscillates wildly between 850 Mb/s and values as low as 200 Mb/s.  This test was run with no other network traffic.  When I ran simultaneous NetPIPE tests, the throughput vs. packet size curve had the same shape, but the oscillations started at packet sizes of about 10 kB, where the throughput was 400 Mb/s.

It seems like there is an underlying problem with the TCP layer, so I need to fix it before I even worry about MPI.  Is this typical behavior for this switch, or is there some problem with the configuration?  Users on other forums have said that our switch is not high performance and that this is what I can expect from it.  Thanks
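For anyone who wants to reproduce the shape of this kind of measurement without NetPIPE, here is a rough Python sketch of a throughput-vs-message-size test. It is not NetPIPE and not what was run above; the function name `measure_throughput` and the message sizes are made up for illustration. As written it runs sender and receiver in one process over loopback, so it only demonstrates the method. To test a switch, the receiving half would have to run on a second node.

```python
import socket
import threading
import time


def measure_throughput(msg_size, repeats, host="127.0.0.1"):
    """Send `repeats` messages of `msg_size` bytes over TCP; return Mb/s."""
    total = msg_size * repeats
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))          # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def sink():
        # Receive everything, then send a one-byte ack so the sender
        # times full delivery rather than just filling its send buffer.
        conn, _ = server.accept()
        with conn:
            received = 0
            while received < total:
                chunk = conn.recv(65536)
                if not chunk:
                    break
                received += len(chunk)
            conn.sendall(b"k")

    receiver = threading.Thread(target=sink)
    receiver.start()

    payload = b"x" * msg_size
    with socket.create_connection((host, port)) as client:
        start = time.perf_counter()
        for _ in range(repeats):
            client.sendall(payload)
        client.recv(1)              # wait for the receiver's ack
        elapsed = time.perf_counter() - start

    receiver.join()
    server.close()
    return total * 8 / elapsed / 1e6   # megabits per second


if __name__ == "__main__":
    # Sweep message sizes, as a throughput-vs-size curve would.
    for size in (1_000, 10_000, 100_000, 1_000_000):
        print(f"{size:>9} bytes: {measure_throughput(size, 20):10.1f} Mb/s")
```

Plotting the printed rates against message size gives the kind of curve described above; drops or pauses on the path would show up as dips at the larger sizes.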

909 Posts

August 27th, 2009 13:00

Please post the output of "show running-config"

6 Posts

September 2nd, 2009 16:00

Sorry for the delay, but I had trouble connecting to the switch through the console port (bad cable!!!).  Here is the output of "show running-config":

!Current Configuration:
!System Description "PowerConnect 6248, 2.2.0.3, VxWorks5.5.1"
!System Software Version 2.2.0.3
!
configure
stack
member 1 2
exit
ip address 192.168.2.1 255.255.255.0
username "admin" password e8a829e0cd19e42316270290c9147018 level 15 encrypted
exit

909 Posts

September 3rd, 2009 08:00

Most likely the switch is dropping packets because of periodic oversubscription of the ports.  Dropped TCP packets have a huge impact on application performance and on TCP performance test results.

I suggest enabling flow control across your network, meaning on all switches and NICs, and seeing whether the performance improves.  The NICs most likely have flow control enabled by default, but this version of the PowerConnect 62xx firmware has flow control disabled by default.  Here are the commands to enable flow control on the 62xx:

console> enable
console# configure
console(config)# flowcontrol
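On the NIC side, Linux nodes report their pause-frame (flow control) settings through `ethtool -a <interface>`. The Python sketch below parses that output so the setting can be checked across many nodes. The helper names `parse_pause_params` and `pause_params` and the interface name `eth0` are assumptions for illustration, and the sample text is only an example of the usual output layout, not output from this cluster.

```python
import subprocess


def parse_pause_params(text):
    """Turn `ethtool -a` output into a dict such as {'RX': True, 'TX': True}."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Keep only "name: on/off" lines; the header line has no value.
        if sep and value in ("on", "off"):
            params[key.strip()] = (value == "on")
    return params


def pause_params(interface):
    """Run `ethtool -a <interface>` (Linux only, may need root) and parse it."""
    out = subprocess.run(["ethtool", "-a", interface],
                         capture_output=True, text=True, check=True).stdout
    return parse_pause_params(out)


if __name__ == "__main__":
    # Illustrative sample text; on a real node call pause_params("eth0").
    sample = (
        "Pause parameters for eth0:\n"
        "Autonegotiate:  on\n"
        "RX:             on\n"
        "TX:             off\n"
    )
    print(parse_pause_params(sample))
```

If RX or TX shows as off on a node, `ethtool -A <interface> rx on tx on` is the usual way to turn pause frames on, subject to driver support.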

Please let us know the results of your tests with flow control enabled.

6 Posts

September 3rd, 2009 14:00

That worked!!!  I'm now getting TCP throughput of 900 Mb/s between nodes with 10 pairs of nodes communicating simultaneously.  Thanks!  Why doesn't flow control come enabled by default?
