2 Intern
•
356 Posts
0
2629
March 13th, 2017 09:00
Isilon - Verify Data Copied from Source to Destination
Community,
According to this blog, the preferred tool for copying data to a new location is rsync.
NFS data copy tool - rsync and what other open source tools ??
But once the copy is complete, what tool do people use to verify the integrity of the copied data? Basically, I want to make sure that everything at the source has been copied to the destination, one for one.
Thank you,
crklosterman
450 Posts
0
March 13th, 2017 09:00
chjatwork ,
You're certainly right: rsync can move data, but it can't give you a chain-of-custody-level report with, for instance, MD5 hashes for every file from source to target (including permissions). Is this a one-time move/migration or an ongoing synchronization? In either case my company's products can fit the need. Reach out if you want to chat about it offline sometime.
~Chris Klosterman
Principal SE, Datadobi
www.datadobi.com
chris.klosterman@datadobi.com
chjatwork
2 Intern
•
356 Posts
0
March 13th, 2017 12:00
Chris K.,
As much as I would love to purchase your tool, I will most likely be restricted to a home-brewed solution. I just wanted to find out what methods other admins are using to solve this issue.
Thank you,
crklosterman
450 Posts
1
March 13th, 2017 12:00
Cross your fingers, find a four-leaf clover, hope and pray, etc. That's about it, only 1/2 joking.
I've seen BeyondCompare used from time to time, but it's still commercial software you have to pay for.
Rsync has a --checksum option documented in its man page, but be prepared for, let's say, less-than-stellar performance. Like emcopy's --cm md5 option, it'll hash every single file on every single pass. That's why it matters whether this is a one-time thing or ongoing.
~Chris
Eric_W1
33 Posts
0
March 13th, 2017 12:00
On the isilon ssh command line you could create a hash tree with something like `find /path/ -type f -exec md5 {} \; > sums.txt` but you would need a method to compare that to the source.
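One possible method for the comparison step, as a minimal sketch only: if both the source and the destination trees can be mounted on the same host, a short Python script can hash every regular file under each root and compare the results by relative path. The function names (`file_md5`, `build_manifest`, `compare_trees`) are made up for illustration and are not part of any Isilon tooling. Note that this checks file contents only, not permissions or ownership.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of one file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file's path (relative to root) to its MD5 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Mirror `find -type f`: skip symlinks and special files.
            if os.path.isfile(full_path) and not os.path.islink(full_path):
                rel_path = os.path.relpath(full_path, root)
                manifest[rel_path] = file_md5(full_path)
    return manifest


def compare_trees(source_root, dest_root):
    """Return (missing, extra, changed) lists of relative paths, each sorted."""
    source = build_manifest(source_root)
    dest = build_manifest(dest_root)
    missing = sorted(set(source) - set(dest))   # at source, not at destination
    extra = sorted(set(dest) - set(source))     # at destination only
    changed = sorted(
        path for path in set(source) & set(dest) if source[path] != dest[path]
    )
    return missing, extra, changed
```

If all three returned lists are empty, every file at the source has a byte-identical counterpart at the destination. Like rsync --checksum, this reads every file in full on both sides, so expect it to be slow on large trees.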
sluetze
2 Intern
•
300 Posts
1
March 14th, 2017 07:00
If you want to migrate from one cluster to another, I would use SyncIQ. Then you:
- have the integrity-checks integrated
- don't need external servers for configuration
chjatwork
2 Intern
•
356 Posts
0
March 14th, 2017 07:00
Chris K.,
This would be for any time we need to move data from one cluster to another, possibly to create space on a cluster that may be low on space. So not very often, but it happens from time to time as users run into quota issues.
chjatwork
2 Intern
•
356 Posts
0
March 15th, 2017 04:00
Eric,
So once I get the sums report from the source and from the destination, I would simply run a diff against the two files?
chjatwork
2 Intern
•
356 Posts
0
March 15th, 2017 04:00
Sluetze,
Good point, but if we archive to any other brand of storage, i.e. something other than ECS or Isilon, I want to make sure I know what the industry's best practices are.
Eric_W1
33 Posts
0
March 15th, 2017 05:00
In a simple case, yes, but in reality you may need to massage and sort the output so that the two platforms produce identical output. It's not a hard problem, but it could end up being non-trivial if the platform is not Linux/BSD.
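As a rough illustration of that massaging, here is a sketch that assumes one side produced BSD-style `md5` lines (`MD5 (path) = digest`) and the other GNU-style `md5sum` lines (`digest  path`). It parses either format, strips a per-side path prefix so the paths line up, and reports differences in sorted order. The function names and the example paths are hypothetical, and file names containing newlines or backslashes (which `md5sum` escapes) are not handled.

```python
import re

# BSD `md5` output:    MD5 (/path/to/file) = <32 hex digits>
BSD_LINE = re.compile(r"^MD5 \((?P<path>.*)\) = (?P<digest>[0-9a-f]{32})$")
# GNU `md5sum` output: <32 hex digits>, a space, ' ' or '*', then the path
GNU_LINE = re.compile(r"^(?P<digest>[0-9a-f]{32}) [ *](?P<path>.*)$")


def parse_sums(text, strip_prefix=""):
    """Parse BSD or GNU checksum output into {relative_path: digest}."""
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        match = BSD_LINE.match(line) or GNU_LINE.match(line)
        if match is None:
            # Fail loudly: silently skipping lines would hide missing files.
            raise ValueError(f"unrecognized checksum line: {line!r}")
        path = match.group("path")
        if strip_prefix and path.startswith(strip_prefix):
            path = path[len(strip_prefix):]
        sums[path] = match.group("digest")
    return sums


def diff_sums(source, dest):
    """Return a sorted list of human-readable differences between two maps."""
    problems = []
    for path in sorted(set(source) | set(dest)):
        if path not in dest:
            problems.append(f"missing on destination: {path}")
        elif path not in source:
            problems.append(f"extra on destination: {path}")
        elif source[path] != dest[path]:
            problems.append(f"checksum mismatch: {path}")
    return problems
```

For example, with a source report taken under a hypothetical `/ifs/data/` and a destination report taken under a hypothetical `/mnt/new/`, something like `diff_sums(parse_sums(src_text, "/ifs/data/"), parse_sums(dst_text, "/mnt/new/"))` should return an empty list when the two sides match.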
There is another way to transfer data in bulk, but I would not recommend it to the non-savvy. If you have a means of using NDMP, it's just a pax/tar stream with some additional records. YMMV, offer not valid in all 50 states, and you must test, but the stream does contain all the file data and it can easily stream at 700 MB/s if your receiver can handle it.