"Some online resources have referred to ensuring the compression type is the same on both source and destination. I have not been able to find this information on the DDs."
This information can be seen on the enterprise manager, Data Management>filesystem>configuration.
i am not sure what kind of data you replicate but when i replicate Oracle RMAN backups, i get around 150MB/s from DD880 to DD890 (10G interfaces) on layer 2 network.
The network switch issue has been addressed. In fact, we just bypassed it and ran the cables between the network ports of the two DDs (auto sensing ports).
While there has been a slight increase in replication performance, the average peak is only 25MiB/s. This is still not what I was expecting to see.
The Estimated Completion Times under the Replication Summery in System Manager now states about 14 days.
Before attempting to troubleshoot, I would like to get answers to my original questions;
What kind of replication rate should I expect to see under an ideal configuration?
Is this the best way to transfer this much CIFS directory data between two Data Domain?
Thank you dynamox. That's the kind of throughput that I would expect to see.
The data being replicated is the resulting backup files of our exchange serves that are backed up via EMC SourceOne. Many of the files are the email attachments, while the others are container files of the email text. Those files range from KB to about a GB.
I can understand how different data types can be processed at different speeds, but I would be surprised that this data type is the reason for this kind of slow performance. The CPU does not appear to be working very hard, averaging between 4 and 8%. In fact, none of the monitored resources on either the source or destination DD appear to be very busy.
Might you have any suggestions on what else to look for as a replication or network configuration mistake, or a better ways to move this data?
as a data point, could you create a directory a drop a couple of 2G ISO files (different flavors of Windows or Linux) and try to create directory replication just of that folder. Curious to see what kind of throughput you would get ?
Are you sure that replication traffic is going through the interface you think it's going ?
Hi, If your target DD is new and has no other data on it, then a migration or collection replication will be faster - less chatter - it just gets sent.
system show stats view net interval 5
That will give you specific interface throughput figures.
Going direct connect with 2 cables will probably make no difference, I suspect only 1 port will still be used.
If you are or were using LACP aggregation, check the above for throughput - if the 2 cables are going between 2 different switches then you may find only one path is used unless you use Vlag or VPc between the switches.
Your initial configuration was "failover" right? In that case only one interface is ever "active" anyway, so probably nothing changed anyway from your changes.
If your deduplication ratio is low, it will slow down the speed because it all has to come from disk, especially if the target has never seen this data or type of data before, though this may start to "pick up" as it starts to seed more and more data.
This would make the actual "NIC" speed a bit of a red herring.
There is not really a default answer to how fast it can be or should be - sadly "it depends".
You may find all traffic was/is going over the "management" interface anyway and maybe it's still going via your switch and thus your Veth0 interface is not being used at all.
Put in a route to force it but you'll need to go back on your switch to use LACP.
I've made masses of assumptions here, but hopefully you can pick some stuff out of this to help you.
bryan_washburn
42 Posts
1
December 15th, 2014 10:00
Thank you, rprnairj.
Both DDs appear to be set to LZ compression.
rprnairj
1 Rookie
•
45 Posts
0
December 15th, 2014 10:00
Just wanted to share,
"Some online resources have referred to ensuring the compression type is the same on both source and destination. I have not been able to find this information on the DDs."
This information can be seen in the Enterprise Manager, under Data Management > File System > Configuration.
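If I remember right, the CLI equivalent is:
filesys option show
which should list the local compression type along with the other filesystem options (worth double-checking on your DD OS version).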
rprnairj
1 Rookie
•
45 Posts
0
December 15th, 2014 11:00
Also, since you mentioned that you have redirected the replication traffic, do you see traffic moving on that specific interface?
rprnairj
1 Rookie
•
45 Posts
0
December 15th, 2014 11:00
What is your stream utilization on both the source and destination Data Domains?
You can check this using # sys sh perf (short for system show performance)
rprnairj
1 Rookie
•
45 Posts
0
December 15th, 2014 12:00
It would be under the rd/wr/r+/w+ columns.
bryan_washburn
42 Posts
0
December 15th, 2014 12:00
Yes, traffic is passing through each of the three aggregated ports at both ends.
However, a coworker suggested switching from round robin to LACP.
I will give that a shot shortly.
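For anyone following along, my understanding is the change on the DD side looks something like this (veth0 and the eth0x names stand in for our actual interfaces, and the exact net aggregate syntax may differ by DD OS version):
net aggregate show
net aggregate modify veth0 mode lacp hash xor-L3L4
The first command confirms the current aggregate configuration; the second switches the bonding mode to LACP. The switch ports have to be configured for LACP as well.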
bryan_washburn
42 Posts
0
December 15th, 2014 12:00
Sorry, I am not sure which columns of information you are looking for.
I have collected the output of the last few hours from each DD to a log.
Trying to post it.
1 Attachment
DD Sys Perf.log
bryan_washburn
42 Posts
0
December 15th, 2014 13:00
No difference after switching the aggregation protocol.
However, I just learned there may be a problem with the switch that I am using.
I will let you know.
bryan_washburn
42 Posts
0
December 15th, 2014 13:00
For the last entries of each:
Source shows 0/ 1/ 0/ 0
Dest shows 0/ 0/ 0/ 0
A sys perf output file is posted. You may want to look at that.
dynamox
9 Legend
•
20.4K Posts
0
December 16th, 2014 11:00
i am not sure what kind of data you replicate but when i replicate Oracle RMAN backups, i get around 150MB/s from DD880 to DD890 (10G interfaces) on layer 2 network.
bryan_washburn
42 Posts
0
December 16th, 2014 11:00
The network switch issue has been addressed. In fact, we just bypassed it and ran cables directly between the network ports of the two DDs (auto-sensing ports).
While there has been a slight increase in replication performance, the average peak is only 25 MiB/s. This is still not what I was expecting to see.
The Estimated Completion Time under the Replication Summary in System Manager now shows about 14 days.
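(For what it's worth, if that estimate just extrapolates the current rate, 25 MiB/s × 86,400 s/day × 14 days works out to roughly 29 TiB still to transfer, assuming the rate holds steady.)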
Before attempting to troubleshoot, I would like to get answers to my original questions:
What kind of replication rate should I expect to see under an ideal configuration?
Is this the best way to transfer this much CIFS directory data between two Data Domains?
Thank you.
bryan_washburn
42 Posts
0
December 16th, 2014 13:00
Thank you, dynamox. That's the kind of throughput that I would expect to see.
The data being replicated is the resulting backup files of our Exchange servers, which are backed up via EMC SourceOne. Many of the files are the email attachments, while the others are container files of the email text. The files range from a few KB to about 1 GB.
I can understand how different data types can be processed at different speeds, but I would be surprised that this data type is the reason for this kind of slow performance. The CPU does not appear to be working very hard, averaging between 4 and 8%. In fact, none of the monitored resources on either the source or destination DD appear to be very busy.
Might you have any suggestions on what else to look for as a replication or network configuration mistake, or a better way to move this data?
dynamox
9 Legend
•
20.4K Posts
0
December 16th, 2014 14:00
Bryan,
as a data point, could you create a directory, drop in a couple of 2 GB ISO files (different flavors of Windows or Linux), and set up directory replication of just that folder? Curious to see what kind of throughput you would get.
Are you sure that replication traffic is going through the interface you think it is?
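If it helps, the CLI steps for that test would be roughly as follows (the hostnames and the isotest path are placeholders, adjust for your environment):
replication add source dir://dd-source.example.com/backup/isotest destination dir://dd-target.example.com/backup/isotest
replication initialize dir://dd-target.example.com/backup/isotest
replication show performance all interval 5
The add is run on both systems, the initialize on the source, and the last command will show throughput for just that context.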
bryan_washburn
42 Posts
0
December 16th, 2014 14:00
Yep, I am sure the replication traffic is going over the private network. That one I have checked a few times.
I will set up some other misc. types of test data and run that through tonight. I’ll let you know tomorrow.
jbrooksuk
208 Posts
1
December 17th, 2014 01:00
Hi,
If your target DD is new and has no other data on it, then a migration or collection replication will be faster - less chatter - it just gets sent.
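For reference, a collection context is defined with col:// specs, something along these lines (hostnames are placeholders):
replication add source col://dd-source.example.com destination col://dd-target.example.com
Note that collection replication requires the destination filesystem to be empty, so it really only fits while the target is still new.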
system show stats view net interval 5
That will give you specific interface throughput figures.
Going direct connect with 2 cables will probably make no difference, I suspect only 1 port will still be used.
If you are or were using LACP aggregation, check the above for throughput. If the 2 cables are going between 2 different switches, you may find only one path is used unless you use vLAG or vPC between the switches.
Your initial configuration was "failover", right? In that case only one interface is ever "active", so your changes probably made no difference.
If your deduplication ratio is low, it will slow down the speed because everything has to come from disk, especially if the target has never seen this data or type of data before. This may start to "pick up" as more and more data is seeded.
This would make the actual "NIC" speed a bit of a red herring.
There is not really a default answer to how fast it can be or should be - sadly "it depends".
You may find all traffic was/is going over the "management" interface anyway, and maybe it's still going via your switch, so your Veth0 interface is not being used at all.
Put in a route to force it, but you'll need to go back to your switch to use LACP.
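On the DD side, the static route would be something like this (the subnet, netmask and interface are placeholders, and the exact route-spec syntax can vary by DD OS version):
net route add -net 192.168.2.0 netmask 255.255.255.0 dev veth0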
I've made masses of assumptions here, but hopefully you can pick some stuff out of this to help you.
Regards,
Jonathan