39 Posts

May 15th, 2010 07:00

Sync is typically limited to 100km. Async is dependent on the amount of bandwidth between the sites.
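To see where distance guidelines like "100km for sync" come from, here is a rough sketch of the minimum round-trip propagation delay over fiber. It assumes light travels at roughly 200,000 km/s in fiber (about two-thirds of c); real links add switching and equipment delay on top, so treat these numbers as a lower bound.

```python
def fiber_rtt_ms(distance_km: float) -> float:
    """Return the minimum round-trip time in ms for a fiber run.

    Assumes ~200,000 km/s propagation speed in fiber (a common
    rule of thumb); actual RTT will be higher due to equipment.
    """
    speed_km_per_ms = 200.0  # ~200,000 km/s expressed as km per millisecond
    return 2 * distance_km / speed_km_per_ms

for km in (100, 370, 1000):
    print(f"{km:5d} km -> at least {fiber_rtt_ms(km):.1f} ms RTT")
```

At 100km the fiber itself contributes only about 1ms of round-trip delay, which is why that distance is a comfortable rule of thumb for synchronous replication.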

Thank You,

Denny Cherry

Manager of Information Systems

Awareness Technologies

dcherry@awarenesstech.com

310-881-3050 x410 (Office)

213-97-DENNY (Cell)

4 Operator

 • 

5.7K Posts

May 17th, 2010 06:00

Synchronous replication is dependent on response times and latency, so if you have a high-quality line between the two sites, 100km can be reached easily. In fact, I used to work with two DMX3s replicating over 370km with SRDF/S! Latency was 6ms, and I was told that was just about the maximum we could handle.

I must add that we used Cisco MDS 9216i's over FCIP, but still: 370km (that's roughly 230 miles)!

37 Posts

May 21st, 2010 11:00

A couple of thoughts:

Actually, it all depends on your application. Theoretically, if your application doesn't care, you could go as far as you wanted (as long as you stay below the SCSI and FC timeout values, which are customizable). If you don't know your application's tolerance, or don't dare to test it, I'd say stay below 10 ms end to end for synchronous.

The crucial factor for synchronous replication is the latency of your network. Network engineers never seem to understand that and will always try to impress you with a big pipe instead. I ran into this once when setting up SRDF/A between Washington, DC and Hagerstown, MD: the fibre loop went through Chicago. In other words, it is not the distance between your sites but the length of "the network cable" between your sites that you should watch out for.
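The "cable length, not map distance" point can be made concrete with a quick calculation. The route lengths below are hypothetical illustrations, not the actual figures for the DC-to-Hagerstown link.

```python
FIBER_KM_PER_MS = 200.0  # rule-of-thumb light speed in fiber: ~200 km per ms

def rtt_ms(route_km: float) -> float:
    """Minimum round-trip time in ms for a fiber route of the given length."""
    return 2 * route_km / FIBER_KM_PER_MS

straight_line_km = 110   # illustrative straight-line distance between sites
actual_route_km = 1900   # illustrative fiber loop through a distant city

print(f"straight line: {rtt_ms(straight_line_km):.1f} ms minimum RTT")
print(f"fiber route:   {rtt_ms(actual_route_km):.1f} ms minimum RTT")
```

A route that detours through another region can multiply the physical propagation delay many times over, even though the two sites look close on a map.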

106 Posts

May 21st, 2010 12:00

I agree with Uwe-Ud0Mk. The overall latency is very important. As an example, you can use the simple ping command from any Windows desktop. If you do the following, you can get a pretty good idea.

Open a command prompt and type the commands below:

Ping the local gateway address:

C:\>ping 192.168.1.1

Pinging 192.168.1.1 with 32 bytes of data:

Reply from 192.168.1.1: bytes=32 time=1ms TTL=64
Reply from 192.168.1.1: bytes=32 time<1ms TTL=64
Reply from 192.168.1.1: bytes=32 time<1ms TTL=64
Reply from 192.168.1.1: bytes=32 time=1ms TTL=64

Ping statistics for 192.168.1.1:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 1ms, Average = 0ms

Then ping the remote server's address:

C:\>ping 10.241.223.127

Pinging 10.241.223.127 with 32 bytes of data:

Reply from 10.241.223.127: bytes=32 time=108ms TTL=114
Reply from 10.241.223.127: bytes=32 time=107ms TTL=114
Reply from 10.241.223.127: bytes=32 time=110ms TTL=114
Reply from 10.241.223.127: bytes=32 time=106ms TTL=114

Ping statistics for 10.241.223.127:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 106ms, Maximum = 110ms, Average = 107ms

Now ping using a more typical packet size:

C:\>ping -l 1500 10.241.223.127

Pinging 10.241.223.127 with 1500 bytes of data:

Reply from 10.241.223.127: bytes=1500 time=110ms TTL=114
Reply from 10.241.223.127: bytes=1500 time=111ms TTL=114
Reply from 10.241.223.127: bytes=1500 time=113ms TTL=114
Reply from 10.241.223.127: bytes=1500 time=111ms TTL=114

Ping statistics for 10.241.223.127:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 110ms, Maximum = 113ms, Average = 111ms

Now ping using a 32000 byte packet:

C:\>ping -l 32000 10.241.223.127

Pinging 10.241.223.127 with 32000 bytes of data:

Reply from 10.241.223.127: bytes=32000 time=141ms TTL=114
Reply from 10.241.223.127: bytes=32000 time=140ms TTL=114
Reply from 10.241.223.127: bytes=32000 time=144ms TTL=114
Reply from 10.241.223.127: bytes=32000 time=145ms TTL=114

Ping statistics for 10.241.223.127:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 140ms, Maximum = 145ms, Average = 142ms

Now ping using a large packet size:

C:\>ping -l 64000 10.241.223.127

Pinging 10.241.223.127 with 64000 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 10.241.223.127:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),

Notice that as the packet size goes up, the time=xxx value normally climbs quickly until you begin to get timeouts.  For an asynchronous connection you should try to maintain latency of less than 100ms.  For a synchronous connection you should shoot for no more than the 5-10ms range.

Why?  Because each synchronous packet must be acknowledged by the destination node before the sender will send the next packet.  If the sender has to wait 10ms for each packet to be acknowledged, you will see very slow application response on the sender side.  If it goes much higher than 10ms, the sender may start timing out and need to resend the packets.

An asynchronous process does not have to wait for a response from the destination node before sending the next packet.  It therefore does not impact performance on the sender side, but the destination will lag behind the sender, and data loss can occur if the communications channel is lost and the user must rely on the data on the destination machine.  The higher the latency, the farther behind the destination node will be.

In a perfect world the network would have zero latency and the data would stream at the speed the customer is paying for.  A T1 connection is rated at 1.544 megabits per second.  If the provider cannot deliver less than 10ms end-to-end latency, a synchronous connection may not work at all.  The sad thing is that even an OC-12 (very fast) connection that cannot provide 10ms or less end-to-end latency may also time out and fail.
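The point that a big pipe cannot rescue a high-latency link can be sketched numerically. With synchronous replication, each write must be acknowledged before the next one is sent, so per-stream throughput is capped by the round-trip time regardless of the link's rated bandwidth. The 64KB write size below is just an illustrative assumption.

```python
def sync_throughput_mbit(write_kb: float, rtt_ms: float) -> float:
    """Max Mbit/s for a single stream that sends one write per round trip.

    Models the simplest synchronous case: one outstanding write,
    acknowledged before the next is issued.
    """
    bits_per_write = write_kb * 1024 * 8
    writes_per_sec = 1000.0 / rtt_ms
    return bits_per_write * writes_per_sec / 1e6

for rtt in (1, 10, 100):
    mbps = sync_throughput_mbit(64, rtt)
    print(f"RTT {rtt:3d} ms -> {mbps:8.2f} Mbit/s per stream")
```

At 10ms RTT a single synchronous stream of 64KB writes tops out around 52 Mbit/s, so even an OC-12 rated at 622 Mbit/s sits mostly idle; latency, not bandwidth, is the ceiling.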

The most critical statistic is the latency, not the rated (zero-latency) speed, as Uwe-Ud0Mk mentioned.  Make sure you test the latency and speed before paying for a link and trying to use any kind of replication methodology.  No matter which replication method you choose, it will perform poorly if there is high latency.

4 Operator

 • 

5.7K Posts

May 25th, 2010 06:00

Uwe,

I thought both storage arrays can only operate within certain latency limits? So I actually thought that a latency of no more than about 10ms is the limit for both arrays? And of course the hosts need to be able to deal with that latency, but AFAIK that wouldn't be a problem (with 10ms, that is).

If the arrays break up the replicas time and time again because of bad latency, there's no host involved just yet, and the initial question was about replication, not about host connectivity!

37 Posts

May 26th, 2010 11:00

I didn't say I would ever do synchronous replication across the Atlantic, but I wanted to make the point that the 10 ms latency is not a requirement to make SRDF work, but to make your application work. Most applications have been written to work with a storage subsystem that has an average latency below 15 ms, and they do not go down when it briefly spikes to, let's say, 100 ms. But when you replicate synchronously over long distances, your disk latency will consistently be higher, yet SRDF will function anyway up to a certain link timeout value that I don't remember.

EMC engineering will say that they want the network latency below 10 or 15 ms, but they don't say this to make SRDF work; they say it to ensure that the replication will not adversely affect the applications that access the replicated storage.
