78 Posts

May 26th, 2010 05:00

RepliStor 6.4 - Exchange Server

I am replicating my servers to image collectors. While testing my Exchange server, I see that the sync of the Exchange database (*.edb) is constantly starting over.

This happens because I am getting the following:

Blocking site IMAGE-COL-2 at (RcvIamAlive). Binding . Reason: Site IMAGE-COL-2 comm failed due to Communications error. Windows-text: 10054 - An existing connection was forcibly closed by the remote host.
Windows-text: 10054 - An existing connection was forcibly closed by the remote host.

Description
The site is now blocked for the reason specified in the Windows error
text. This is usually a communications failure of some sort. The value in
parentheses (RcvIamAlive above) represents what the send process was doing at
the time of the failure. The Binding field is the protocol in use at the time
of the failure (it is blank in the message above). The Reason field is the
reason the site is blocked if not due to a Windows error code.

I see this often on other servers, but those servers always recover and finish their sync. However, the Exchange store is very large (like most Exchange databases), and when I watch the detail on the sync, every time I see the above message, the sync starts over. Is it actually starting over from 0 bytes, or is it starting over from where it left off? If it is starting over from 0, this sync will never end. Time remaining is displayed as 914 hours to complete.

I am not sure what could be causing the above message. This is the only server running RepliStor right now, across a private 10MB point-to-point VPN.

2 Intern

106 Posts

May 26th, 2010 06:00

As you have mentioned, the default behavior of RepliStor is to continue replication from the place it left off. This would be true with the Exchange environment as well. Unless you use the Copy on Close setting, RS queues the changes that come in while the Target Site is blocked.

However, if you are getting frequent Site blocking, I recommend you contact EMC Support. There are some heartbeat timeout settings that may help reduce or eliminate this blocking behavior. It also seems that the link may have very high latency if it is taking a very long time to send data. While 914 hours as an estimated time to complete may not be entirely accurate, it would mean waiting about 38 days (914 / 24) before the sync completes. That is, in my opinion, not acceptable from an RPO standpoint. If that is even close to accurate, I don’t think this overall solution is going to provide the DR capability that you are looking for. I would recommend you check how long a file or set of files of known size takes to get into sync and then compare that with your DR goals.

What I am saying is that this entire approach may give an unrealistic expectation for a DR solution. Specifically, an email system whose data is 38 days old is probably not meeting your expectations. In fact, I suggest that even if the sync finishes, RS may not be able to keep up with the daily rate of change. Under perfect conditions 10Mb may seem adequate, but once you take latency and shared bandwidth into consideration, it may not be.

I think a few tests would validate whether your environment is capable of meeting your RPO. If it is not capable of meeting the RPO and RTO then you may need to consider other options.
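For a rough sanity check on those numbers, here is a minimal Python sketch. It converts the estimated hours to days and computes the best-case transfer time at full line rate. The 100 GB database size and the function names are assumptions for illustration only (the actual database size is not given in this thread), and the link is assumed to be 10 megabits per second.

```python
def hours_to_days(hours):
    """Convert an estimated duration in hours to days."""
    return hours / 24.0


def best_case_hours(size_bytes, link_bits_per_sec, efficiency=1.0):
    """Hours to move size_bytes over a link, ignoring latency and retries.

    efficiency is the fraction of the line rate actually achieved
    (1.0 = perfect conditions, which a shared VPN will not reach).
    """
    seconds = (size_bytes * 8) / (link_bits_per_sec * efficiency)
    return seconds / 3600.0


days = hours_to_days(914)                          # estimate shown in the sync detail
ideal = best_case_hours(100 * 10**9, 10 * 10**6)   # hypothetical 100 GB database

print(f"914 hours is about {days:.1f} days")
print(f"100 GB at 10 Mb/s, best case: {ideal:.1f} hours")
```

If the real database is anywhere near that assumed size, an estimate of 914 hours would suggest the effective throughput is far below the line rate, which is worth confirming with a timed test.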

Based on this Q/A thread, it seems that you may want to take time to run some validation tests before going too much further.

Just my 2 cents worth…JS

2 Intern

106 Posts

May 26th, 2010 09:00

Copy on Close is used in special cases where the file cannot be copied while it is being actively used. In the case of a database file, that would prevent RS from sending the file until it was closed on the Source node. I doubt you are using that in the Options of the Specification.

I was just saying that if you set up a specification to replicate, say, 100 MB of data and time that, then you can get an idea of how long it would take to replicate an entire Exchange database.
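The timing test described above can be turned into an estimate with a few lines of Python. This is only a sketch: the 120 second test time, the 50 GB database size, and the 5 GB daily change are made-up placeholders to be replaced with your own measurements, and the function names are mine, not part of RepliStor.

```python
def throughput_bytes_per_sec(test_bytes, test_seconds):
    """Effective throughput observed while syncing a test specification."""
    return test_bytes / test_seconds


def estimated_sync_hours(full_bytes, bytes_per_sec):
    """Extrapolate the full sync time from the measured throughput."""
    return full_bytes / bytes_per_sec / 3600.0


def can_keep_up(daily_change_bytes, bytes_per_sec):
    """True if one day of link capacity covers one day of changed data."""
    return bytes_per_sec * 86400 >= daily_change_bytes


# Hypothetical measurement: a 100 MB specification synced in 120 seconds.
rate = throughput_bytes_per_sec(100 * 10**6, 120)

print(f"Full sync of 50 GB: {estimated_sync_hours(50 * 10**9, rate):.1f} hours")
print(f"Keeps up with 5 GB/day of change: {can_keep_up(5 * 10**9, rate)}")
```

A linear extrapolation like this ignores site blocking and competing traffic, so treat the result as an optimistic lower bound on the sync time.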

Also, I need to correct something I mentioned in passing before. RS does pick up where it left off while mirroring a specification that is already in-sync. However, if you are sync’ing a file (such as the Exchange database), then RS will start over, because the file is not yet in-sync when the site becomes blocked.

Once a spec is in-sync then the file operations are queued and resent if the site is blocked and unblocked.
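To see why that difference matters for a very large file, here is a toy Python model. It is not how RS is implemented internally; it only compares "restart from zero" with "resume" given a list of uninterrupted uptime windows between site blocks. All names and numbers are hypothetical, and downtime between windows is ignored.

```python
def hours_to_finish(needed_hours, uptime_windows, resume):
    """Transfer hours spent before a sync completes, or None if it never does.

    needed_hours:   uninterrupted transfer time the file requires
    uptime_windows: hours of connectivity between successive site blocks
    resume:         True = continue where it left off, False = start over
    """
    done = 0.0      # hours of useful progress carried into the next window
    elapsed = 0.0   # total connected hours consumed so far
    for window in uptime_windows:
        remaining = needed_hours - done
        if window >= remaining:
            return elapsed + remaining
        elapsed += window
        done = done + window if resume else 0.0
    return None


# A file needing 10 hours, with the site blocking every 4 hours.
print(hours_to_finish(10, [4, 4, 4], resume=True))
print(hours_to_finish(10, [4, 4, 4], resume=False))
```

In this model the restart case only finishes if a single uptime window is at least as long as the whole transfer, which is why frequent blocking during the initial sync of a large .edb file is a problem.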

From your earlier comments, it sounds like RS may not be able to keep up with Mirroring alone, even if the specs can be sync’ed. That is why I suggested you do some testing.

78 Posts

May 26th, 2010 09:00

Copy on Close setting?

Also, can you elaborate on what you mean by validation tests? Are these specific tests within RepliStor that can be run, or are you referring to just running a test sync?

Thanks  
