August 14th, 2012 06:00

Ashish3004,

Without knowing how many Target connections each Source node has allocated to the Target (Gateway) node, it seems to me that you have reached the practical limit on the number of Sources sending to the Gateway server. As a general rule, 10 to 15 Source servers replicating to a single RS Gateway server has been found to be a good range for a many-to-one arrangement. That is a guideline, not a hard rule: there is no coded limit on how many Source nodes can replicate to a single Target. In other words, there is a practical limit, not a set limit. Below is a description of what happens. (The rest is just detail about how RS works in this environment; feel free to skip it unless you are interested.)

Each Source node can have multiple connections configured so RS can work on more than one file at a time. As an example, let's say you are using 5 Target connections on each of the 40 Source nodes. This means the RS Target (Gateway) server must deal with data coming from 5*40=200 queues. There is a performance data point on the Target node called Target Queue Count. For each queue, the Target has to identify the Source of every update and determine where the data should be stored. This is pretty compute intensive, and it also means the OS disk queue on the Target is seeking and writing all over the physical drives. The Storage Processor of the Celerra box in turn gets busy very fast. Next, RepliStor must decrypt and decompress the incoming data before pushing it to the Target node's storage buffers. From this point on, RS does not have full control over the next steps.
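To make the arithmetic above concrete, here is a minimal Python sketch. The function names and the 15-server constant are just illustrative (they are not RS settings or counters); the constant is the upper end of the 10 to 15 rule of thumb mentioned earlier.

```python
# Illustrative only: these names are hypothetical, not RepliStor settings.
RULE_OF_THUMB_MAX_SOURCES = 15  # upper end of the 10-15 guideline above


def target_queue_count(source_nodes: int, connections_per_source: int) -> int:
    """Number of incoming queues the Target (Gateway) has to service."""
    return source_nodes * connections_per_source


def exceeds_rule_of_thumb(source_nodes: int) -> bool:
    """True when the Source count is past the practical many-to-one range."""
    return source_nodes > RULE_OF_THUMB_MAX_SOURCES


if __name__ == "__main__":
    # The example from the text: 40 Sources with 5 Target connections each.
    print(target_queue_count(40, 5))
    print(exceeds_rule_of_thumb(40))
```

Reducing either the number of Sources per Gateway or the connections per Source brings the queue count down proportionally.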

Once the data has been processed and placed in the local RS cache, the final trip to the Celerra switches to the CIFS protocol. With a CIFS (UNC path) destination, the transfer changes from asynchronous to synchronous. That change means that each packet or block of data has to be sent to the Celerra, and the Celerra then acknowledges that it got the data by sending a packet back to the RS server. The RS Target server now controls the data transfer through the Windows file system, which in turn uses SMB (Server Message Block) transfers. After each block, the Celerra sends back either an ACK or a NAK depending on whether the transfer was successful, and only then does the RS server move on to the next block. While this process is very chatty over the network, it has been refined over the years and does ensure accurate data transfer.
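The cost of that per-block acknowledgement can be sketched with a toy model. This is a deliberate simplification, not how SMB is actually implemented: it assumes exactly one block in flight at a time, and the function names, block size, bandwidth and round-trip figures are all made-up values for illustration.

```python
import math


def synchronous_transfer_seconds(total_bytes: int, block_bytes: int,
                                 bandwidth_bytes_per_s: float,
                                 rtt_s: float) -> float:
    """Toy model: every block waits one round trip for its acknowledgement."""
    blocks = math.ceil(total_bytes / block_bytes)
    return total_bytes / bandwidth_bytes_per_s + blocks * rtt_s


def pipelined_transfer_seconds(total_bytes: int,
                               bandwidth_bytes_per_s: float,
                               rtt_s: float) -> float:
    """Toy model: blocks are streamed and only the last one waits."""
    return total_bytes / bandwidth_bytes_per_s + rtt_s


if __name__ == "__main__":
    total = 100 * 65536      # 100 blocks of 64 KiB (hypothetical)
    bandwidth = 65_536_000   # bytes per second (hypothetical)
    rtt = 0.02               # 20 ms round trip (hypothetical)
    print(synchronous_transfer_seconds(total, 65536, bandwidth, rtt))
    print(pipelined_transfer_seconds(total, bandwidth, rtt))
```

In this model the wait time grows with the number of blocks times the round-trip latency, which is why latency between the RS server and the Celerra matters so much on this leg.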

The Celerra is just a black box as far as RS is concerned at this point, and depending on the connection speed and latency between the RS server and the Celerra, this leg can be slower than a transfer from one RS server to another. That is one reason it is highly recommended to have an RS server at each endpoint of remotely connected servers. To be clearer, having RS replicate directly to a remote Celerra is much slower and very prone to problems. It can be done but is not recommended. For very small amounts of data that do not change rapidly it can be a reasonable alternative, but it does not scale to large amounts of data that change rapidly.

Once the data leaves the RS cache, RS no longer controls the file and cannot impose Target File Protection (exclusive locking) on it. The data may therefore differ between the two nodes. It is the responsibility of the user to keep the files on the Celerra from being accessed.

RS can be run on a VMware client node and in some cases can run faster than on a physical machine. However, EMC does not have any recommendations regarding that question. Finally, EMC has declared the end of service life for RS: it no longer sells new RS licenses and will no longer provide support after June 30, 2013. EMC recommends you find a replacement product by that date, but does not recommend a specific replacement to customers. The closest EMC product available is RecoverPoint, which is an appliance that connects via the network. I have no experience with it, so I recommend you look at it along with solutions from other vendors.

Sorry for the extensive explanation, but I felt it was important for you to have a bit more detail to better understand why RS may not perform well past a certain point.

JS

August 15th, 2012 07:00

Thanks for the detailed answer.

August 16th, 2012 04:00

Could you please mark the question as answered or do you need more information?
