VNX File Level Replication

June 7th, 2013 09:00

I have been using replication between two VNX 5300 SANs for just over a year.  I cannot seem to narrow down a best practice for changing the DM interconnect schedule/bandwidth.  I have my schedule set up to run at 50Mbps between 8am and 5pm, Monday through Friday, and then use 99Mbps (I have a 100Mbps link) at all other times and days.  But from monitoring the network port at the DR site, it does not appear to burst up to the faster speed at night like I want it to.  Is there a certain way to make these changes so they take effect?  It seems like I had this working at one point, but I am not sure what I did differently when it was working...  One idea I have is to stop all of my current replication entries, pause the DM interconnect, change my bandwidth times and limits, then unpause the DM interconnect and restart all of my replication entries... help!

July 14th, 2013 23:00

Please consider moving this question as-is (no need to recreate it) to the proper forum for maximum visibility.  Questions written to a user's own "Discussions" space don't get the same amount of attention and can go unanswered for a long time.

You can do so by selecting "Move" under ACTIONS along the upper-right.  Then search for and select: "VNX Support Forum" which would be the most relevant for this question.

Seeing that over a month has elapsed since you posted, maybe you have already found the answer (or possibly opened a ticket with support)?  If so, once the question is relocated, consider sharing the resolution and marking it as "Answered" so that it can help others with the same question.  If not, once relocated, it will have more visibility to the community of dedicated customers, partners, and EMC employees who are eager to assist.

You should not have to stop the sessions and pause the interconnect in the manner you propose for the updated schedule to take effect.  A few things come to mind:

1) Is the timezone of the local data movers set correctly?

Sorry, just looking to rule this out; maybe the schedule is taking effect sooner or later than expected?

2) Can you provide the "bandwidth schedule" output of the following command:

nas_cel -interconnect -info <interconnect name>

a) As a reminder, the value you set in the interconnect settings should be in Kb/s (bits)

Again, simply ruling it out, but if you supplied the value in kilobytes instead of kilobits when configuring the schedule, you would have set a throttle 1/8 of what you originally intended.  Also, generally speaking, networking uses a multiplier of 1000 (instead of 1024) when converting between metric prefixes, so what I'll be looking for in the output above is:

50 Mbps = 50000 Kbps

100 Mbps = 100000 Kbps

b) Are your rules ordered from most specific to least specific (like firewall rules)?

What I'll be looking for is maybe the following, based on what you described (see the command sketch after item 3 for one way to apply such a schedule):

MoTuWeThFr08:00-17:00/50000,/99000

3) Is it possible you aren't even pushing that much bandwidth from the source?

a) At any time between the hours of 8am - 5pm (Mon - Fri), are you even seeing 50Mbps being sent, such that in effect there is nothing to throttle?

b) Outside the hours of 8am - 5pm (Mon - Fri), does it consistently level out or peak at more-or-less 50Mbps, which would suggest it is not recognizing the "at all other times" rule that "/99000" represents?
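If the ordering or values do turn out to be off, my recollection (please verify against the nas_cel man page; <interconnect_name> below is just a placeholder for your actual interconnect) is that the schedule can be updated in place and then re-checked, along these lines:

nas_cel -interconnect -modify <interconnect_name> -bandwidth MoTuWeThFr08:00-17:00/50000,/99000

nas_cel -interconnect -info <interconnect_name>

The new limits should then apply to subsequent transfers without stopping the replication sessions or pausing the interconnect.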

July 18th, 2013 06:00

I have not resolved this yet, nor have I opened a ticket.  Thank you for the advice on relocating this post.  That explains why it has gone unanswered for SO LONG...

When I am in Unisphere, the date and time are the same as on my PC.  When I go to CS properties, it says Current Timezone = America/Chicago, which is correct.  We are in the Kansas City area.

I ran the command and here is what I got:

[nasadmin@VNXA-CS0 ~]$ nas_cel -interconnect -info repl1
id                                 = 20003
name                               = repl1
source_server                      = VNXA-DM1
source_interfaces                  = 172.16.0.160
destination_system                 = VNXB-CS0
destination_server                 = server_2
destination_interfaces             = 172.16.0.180
bandwidth schedule                 = uses available bandwidth
crc enabled                        = yes
number of configured replications  = 10
number of replications in transfer = 1
status                             = The interconnect is OK.

I see where the bandwidth schedule says "uses available bandwidth."  I was afraid the schedule I had set up was limiting replication, so I removed it to see what would happen.  I did have a schedule at one point that was set to 50000Kbps during working hours, and I assume that outside of that time it would have used all available bandwidth.

I understand your question about times and bandwidth.  I was thinking of decreasing my time out of sync and seeing how my traffic changes.  Right now it is at 4 hours.  We want to guarantee a 4 hour RPO and 4 hour RTO.  We have a 100Mbps link via a local metro Ethernet provider.
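If I do go that route, I believe (please correct me if I have the syntax wrong; <session_name> is just a placeholder for each replication session) I would lower it per session and then confirm it, something like:

nas_replicate -modify <session_name> -max_time_out_of_sync 120

nas_replicate -info <session_name>

with the value in minutes, if I am reading the docs right, so 120 would take me from the current 4 hours down to 2.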

Also, we have RecoverPoint replicating over this link too.  Since the two replication systems do not know about each other, there is no common bandwidth control option... so I have tried to limit the replication sets in RecoverPoint so that they only add up to 60Mbps if they are all replicating at once...

Also, we have VNX Monitoring and Reporting running to collect stats.  Here is a screen scrape from this morning of the replication stats (see attachment).

My biggest concern is the transfer rates.  I would think they would burst higher at times, but maybe it is just keeping up.  What do you think about me decreasing my time out of sync?  Shouldn't that generate more traffic since it will not be waiting as long to replicate?

Thanks for your help.

1 Attachment

July 18th, 2013 08:00

Also, if I monitor the data mover network ports on each SAN (primary and secondary), that should show me the traffic output from the primary and the traffic input on the secondary SAN, right?  I want to make sure I am monitoring the correct ports... replication will use the data mover ports at each end, right?  I have 4 ports in an LACP setup on each data mover at each site.  We use Cacti to monitor our network switches, so I am trying to group those ports together and see what the bandwidth usage is on them...
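To make sure I am graphing the right switch ports, my plan is to run something like the following on the data mover at each site (substituting the correct mover name; the output will obviously differ per system) to see which device or LACP trunk actually holds the replication interface IP, and then match that trunk's member ports to my Cacti graphs:

server_ifconfig server_2 -all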

September 6th, 2013 14:00

For anyone who might want to know, I did resolve this by setting the port speed on the data mover at the DR site to 100Mbps, and then BOOM!  Speeds increased and life has been good.  I guess the idea is that since the port speed went from 100Mbps to 1000Mbps when we changed out the network switch, AND the actual link speed between sites is limited to 100Mbps, the data mover did not like this and kept trying to adjust its network speed (I think this is basically what EMC support told me...).  Anyway, making the port speed match my uplink speed fixed the issue...
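In case it helps anyone else, I believe this is roughly how the port speed is checked and then forced from the Control Station (going from memory, so verify with support first; the device name cge0 is just an example, and I think the option string is comma-separated, but check the server_sysconfig man page):

server_sysconfig server_2 -pci cge0

server_sysconfig server_2 -pci cge0 -option "speed=100,duplex=full"

The first command shows the current speed/duplex for the port, and the second forces it to 100/full so it matches the 100Mbps metro link.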
