6 Operator

 • 

2.1K Posts

November 23rd, 2007 07:00

FCC Agent performance

I have an ECC 5.2 sp5 environment with two FCC agents deployed (at multiple sites) and I'm managing about 40 FC switches and directors across Canada. These switches are all McData and are managed by two DS-M Connect instances at our two primary data centers.

The problem we are having (and have had for some time) is the performance of ECC when managing the FC fabrics. More often than not, operations time out before they can complete successfully. Everything from forcing an import of the zoning databases to pushing a new zone will time out after an hour or more of trying.

EMC has suggested that our solution might come from deploying more instances of DS-M Connect, but even deploying them on ESX incurs costs for operating system licenses and backup client licenses. I don't really want to go down this road without any verification that it will help.

Does anyone have a similar environment (or have had in the past)? Does anyone have any direct experience that would support this suggestion? EMC is saying that we should have no more than 8-10 switches managed by a single instance of DS-M Connect, which means I would likely need to deploy a total of 5 instances of DS-M Connect. This is going to cost us in the range of $6k. I don't mind justifying the money if there is reason to believe it will work, but it is a lot to ask for to find out.
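As a quick sanity check on the numbers above, here is a minimal sketch of the arithmetic, assuming the 8-10 switches per instance guideline quoted from EMC and the roughly 40 switches mentioned earlier. The function name is made up for illustration and has nothing to do with ECC or DS-M Connect themselves.

```python
import math


def instances_needed(switch_count: int, max_per_instance: int) -> int:
    """Smallest number of management instances so none exceeds the per-instance limit."""
    if switch_count < 0 or max_per_instance <= 0:
        raise ValueError("switch_count must be >= 0 and max_per_instance must be > 0")
    return math.ceil(switch_count / max_per_instance)


if __name__ == "__main__":
    switches = 40  # approximate count from the post
    for limit in (8, 10):  # both ends of the quoted guideline
        print(f"limit {limit} per instance: {instances_needed(switches, limit)} instances")
```

At the conservative end of the guideline (8 per instance) this gives 5 instances, which matches the figure in the post; at 10 per instance it would be 4.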

2 Intern

 • 

385 Posts

November 26th, 2007 05:00

This would be hard to answer without some more detail.

1) What sort of latency do you have between the sites with the switches and the main data centers? I think this would have the biggest impact on this setup.

2) What models/types of switches are you running?

3) Are you running WLA against these? Have you tried disabling it? There are some known issues with certain firmware levels of McData gear and WLA causing significant timeout issues (I have experienced these myself).

4) I'm assuming these remote switches are each their own fabric, so that you don't have fabric traffic/timing issues?

Sorry for the basic questions, just trying to understand the environment.

I only had to use DS-M Connect for about 8 switches at one site, so I cannot speak to its scalability. We had full-fledged Connectrix Manager servers at both of our primary sites to manage another 20 switches/directors.

6 Operator

 • 

2.1K Posts

November 26th, 2007 13:00

Well, starting at the top...

1) Very low latency for the most part. There is (effectively) no latency between the FCC agents and the DSM Connect hosts as they sit on the same LAN subnets. There is some minor latency (~40ms response time to a PING) as a worst case between DSM Connect hosts and some of the remote fabrics.

2) All of the switches are McData 4500, 4700, or 3232.

3) Yes to running WLA. I have tried turning the frequency down (well below EMC recommendations) and turning it off entirely. No effect though.

4) We only have one very small fabric that crosses sites for DR mirroring. All other fabrics are isolated to a single physical site each with the longest ISL link running about 30 meters.

And don't worry about the questions. You have no way of knowing whether I've thought about some of this stuff or not. I'd rather answer questions here in hopes of getting a solution than pretend I'm offended at being asked basic questions.

2 Intern

 • 

385 Posts

December 4th, 2007 05:00

What about the firmware version on the switches? I know there was an issue with early releases of CX 9.x that caused some serious performance problems.

Otherwise nothing looks obviously wrong with your configuration, unless there really is a built-in bottleneck, like a low thread count or some such, designed into the slimmed-down DSM Connect.

The only other 2 things I could suggest:

Install the MS PowerTools so you can monitor the DSM Connect / ECC processes while the update is happening. This would at least show where the bottleneck is, but it is obviously not a simple thing to interpret.

Check the FCC agent logs. I've seen cases with other agents (and this agent in particular) where things seem to hang, but if you have debugging turned up a bit and check the ECC log, you'll see a very specific error where it gives up trying to do something yet for whatever reason does not time out the operation. I had this very recently with a Symmetrix agent trying to use an SE 64-bit installation: the operations hung, but the ECC log clearly said the process was aborting.

59 Posts

December 12th, 2007 10:00

Check your Data Collection Policies. If you have things like WLA Revolving, WLA Daily, WLA Analyst, or custom DCP policies set up, that will slow things down a lot. If you must have these, then increase the scheduled interval on them.

Install more FCC agents at the sites that have multiple switches. This will reduce the amount of traffic being sent across the wire.

59 Posts

December 12th, 2007 10:00

Just saw your reply about WLA. I would say install more FCC agents. You may have to stop and restart them in order for the primary at each site to pick up correctly. Rather than each switch sending data across the wire, it will send it to the local FCC agent, which will then send it across. This will probably only be helpful if you have three or more switches at a site.
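To see which sites would clear the rule of thumb above (roughly three or more switches per site), a rough sketch like the following could be run against a switch inventory. The switch names, site names, and the threshold default are all hypothetical placeholders, not anything exported from ECC.

```python
from collections import Counter


def sites_worth_local_agent(switch_sites: dict[str, str], min_switches: int = 3) -> list[str]:
    """Return the sites that host at least `min_switches` switches, sorted by name."""
    counts = Counter(switch_sites.values())
    return sorted(site for site, n in counts.items() if n >= min_switches)


if __name__ == "__main__":
    # Hypothetical inventory: switch name -> site name.
    inventory = {
        "sw-a1": "site-a",
        "sw-a2": "site-a",
        "sw-a3": "site-a",
        "sw-b1": "site-b",
        "sw-c1": "site-c",
        "sw-c2": "site-c",
    }
    print(sites_worth_local_agent(inventory))
```

With this made-up inventory only site-a reaches the threshold, so it would be the only candidate for an extra local FCC agent.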

6 Operator

 • 

5.7K Posts

December 14th, 2007 00:00

I found out that the FCC agent that picks up a switch doesn't have to be the one local to it. I am seeing FCC agents from hundreds of miles away picking up switches, and the other way around as well. How can you dedicate FCC agents to a certain set of switches to prevent data going the "long way"? I haven't found anything to steer this the way I'd like.

59 Posts

December 17th, 2007 13:00

I think you can do it from DCP, though you may have to create more than one policy, one for each site. I think it was the discovery scan DCP or something like that. When I get back to the office I will take a look. Check the templates that are available in DCP, then create a DCP for each site and make sure you include only the switches you want at each site.

I know I did similar things with the Symm and CX agents at remote sites, but I have not tried it with the FCC agent. If it is possible to do, then it would be done via DCP.

DCP = Data Collection Policy

6 Operator

 • 

5.7K Posts

December 19th, 2007 00:00

OK, I let the FCC agents pick the switches themselves. I did not choose which agents should manage which switches. That's of course another possibility.

6 Operator

 • 

2.1K Posts

January 17th, 2008 11:00

So, it appears that the notes I found on DSM Connect being the bottleneck (with too many switches managed) were correct. I finally broke down and spread the load over 5 ESX VM hosts and now things seem to be running quite well.

6 Operator

 • 

5.7K Posts

January 18th, 2008 03:00

That's good to know! :)

6 Operator

 • 

2.1K Posts

January 18th, 2008 07:00

Yes, but it is a pain too. It cost us an extra several thousand dollars (for OS licenses for the VMs) to get this working the way it should. Maybe we'll just have to pull out all the McData switches and replace them with Brocades now (not that they aren't the same thing now or anything :-) ).
