March 23rd, 2011 12:00

Need help on building out cluster network. 10gigE?

I should preface this by saying I'm not a full-fledged sysadmin, more of a self-taught guy who has fallen into the role of sysadmin for a group at my university that provides computer services to the college of science (we're not the IT dept). So I'm in charge of a 50-node cluster, a mix of Sun and Dell blades, and a Lustre file system. We'd like to make it bigger, but I feel the network is our bottleneck. Right now it's three gigabit switches: one root switch with two child switches connected via a 4-port LAG (no connection between the child switches). The file system is plugged into the root switch, and then there are nodes in every available port. I think it's a mess, but we're working with what we've got.

So my idea to make things better would be to have a 10gigE root switch, go 10gigE on the file system and plug that into the root switch, then run 10gigE connections out to the child switches. The nodes would then plug only into the child switches, at one gigabit. The PowerConnect 8024 looks like a good root switch, with PowerConnect 6248s as top-of-rack switches for the nodes. My issue is I get lost in the terminology. What kind of modules would I need to hook up the 6248 to the 8024 via 10gigE? Should it be copper or fiber?

Also, is going 10gigE a good idea? Or would I be better off saving that money and sticking with gigabit, but building out the layout I described above?
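
One rough way to compare the two uplink options is the oversubscription ratio of a child switch: the bandwidth the nodes could demand at once, divided by the uplink capacity. The sketch below is only an illustration. The figure of 24 nodes per child switch is an assumption (the real split of the 50 nodes is not stated), and it treats the 4-port LAG as a full 4 Gb/s, which is optimistic.

```python
def oversubscription(node_count, node_gbps, uplink_gbps):
    """Worst-case demand from the nodes divided by uplink capacity."""
    return (node_count * node_gbps) / uplink_gbps


# Assumption for illustration only: 24 nodes on one child switch.
NODES_PER_CHILD = 24

# Gigabit layout: 1 Gb/s nodes behind a 4 x 1 Gb/s LAG (best case).
gigabit_ratio = oversubscription(NODES_PER_CHILD, 1, 4 * 1)

# Proposed layout: 1 Gb/s nodes behind a single 10 Gb/s uplink.
tengig_ratio = oversubscription(NODES_PER_CHILD, 1, 10)

print(f"4x1G LAG uplink: {gigabit_ratio:.1f}:1")
print(f"10G uplink:      {tengig_ratio:.1f}:1")
```

A lower ratio means less contention when many nodes hit the file system at the same time. Whether that contention actually occurs is something only traffic measurements on the real uplinks can show.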


(FYI, I got the idea for this layout from reading about 'fat tree' networks. Though I hear the idea is mainly for InfiniBand-type stuff, that's way too expensive. We're not looking to be the next supercomputing center.)

March 24th, 2011 03:00

Hi,

I should say I've never worked with clusters, so what I'm going to say is just about the network.

First of all, do you have any evidence that network throughput between the switches is an issue? If so, yes, it's always better to get more speed than more ports, because a 4-port LAG can't give you a full 4 gigabits. Depending on the traffic distribution mechanism, you can get 70-80% of the theoretical maximum. So 10GigE is a good fit for you in this case. Moreover, some expect that 10GigE can win the war against both InfiniBand and Fibre Channel.
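
To illustrate why a LAG falls short of the sum of its links, here is a minimal sketch of hash-based distribution. It assumes each flow is pinned to one member link by hashing its source and destination. The CRC32 hash and the host names are stand-ins chosen for the example, not what any particular switch does.

```python
import zlib
from collections import Counter

LINK_GBPS = 1.0   # capacity of one LAG member link
LAG_MEMBERS = 4   # number of links in the LAG


def member_for(src, dst, members=LAG_MEMBERS):
    """Pick a member link for a flow (stand-in for the switch's hash)."""
    key = f"{src}->{dst}".encode()
    return zlib.crc32(key) % members


def lag_throughput(flows, members=LAG_MEMBERS, link_gbps=LINK_GBPS):
    """Return the Gb/s the LAG can carry for (src, dst, demand_gbps) flows.

    All traffic of one flow shares a single member link, so each link
    carries at most link_gbps no matter how idle the other links are.
    """
    load = Counter()
    for src, dst, demand in flows:
        load[member_for(src, dst, members)] += demand
    return sum(min(link_gbps, load[m]) for m in range(members))


# One node pulling hard from one file server: capped at a single link.
single = lag_throughput([("node01", "fileserver", 4.0)])
print(f"single flow: {single} Gb/s of {LINK_GBPS * LAG_MEMBERS} Gb/s")

# Several hypothetical nodes at 1 Gb/s each; flows may collide on a link.
many = lag_throughput([(f"node{i:02d}", "fileserver", 1.0) for i in range(8)])
print(f"eight flows: {many} Gb/s of {LINK_GBPS * LAG_MEMBERS} Gb/s")
```

A single large flow never exceeds one member link, and with several flows the total depends on how evenly the hash happens to spread them. A single 10GigE uplink does not have this per-flow cap.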

About modules: it all depends on the length. You can use either copper (twinax) or fiber (10GBase-SR, with multimode optics). The maximum length of twinax is 10m, but Dell sells only 7m cables; SR range varies and can be up to 300m.

With twinax you get a cheap, complete solution: two SFP+ modules and the cable in one piece, and it uses less power. Optical modules are more expensive, and you need to buy the cable separately.

Hope this helps

 
