David_Bigham
6 Indium

Using Isilon as a Hadoop distCP target

I have a customer who uses the Hadoop utility distCP (distributed copy) to move data from one Hadoop cluster to another. That got me thinking: if Isilon can serve as a distCP target, we may have uncovered another Isilon and Hadoop use case.

So the question is: can Isilon be an effective Hadoop distCP target?

Accepted Solutions
David_Bigham
6 Indium

Re: Using Isilon as a Hadoop distCP target

Yes, Isilon can be an effective Hadoop distCP target! 
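For anyone who wants to try this, here is a rough sketch of how such a copy could be launched. It only assembles the standard `hadoop distcp <source> <target>` command line; the host names, port, and paths are hypothetical placeholders, and it assumes the Isilon cluster is reachable through an `hdfs://` URI like any other HDFS target. This is not the exact command used in the test.

```python
import shlex


def build_distcp_command(source_uri, target_uri, max_maps=None, update=False):
    """Assemble a `hadoop distcp` command line as a list of arguments."""
    cmd = ["hadoop", "distcp"]
    if update:
        # -update copies only files that are missing or differ on the target.
        cmd.append("-update")
    if max_maps is not None:
        # -m caps the number of simultaneous map tasks doing the copy.
        cmd += ["-m", str(max_maps)]
    cmd += [source_uri, target_uri]
    return cmd


# Hypothetical endpoints: replace with your own NameNode and Isilon addresses.
cmd = build_distcp_command(
    "hdfs://source-nn.example.com:8020/data/logs",
    "hdfs://isilon.example.com:8020/backup/logs",
    max_maps=20,
    update=True,
)

# Print the command for review; pass `cmd` to subprocess.run(cmd, check=True)
# on a host with the Hadoop client installed to actually start the copy.
print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command as a list keeps it safe to hand to `subprocess.run` without shell quoting problems.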

Thanks to Andy Pernsteiner for running the tests to confirm this. Andy reports that, depending on file size, you should see decent write throughput (around 350 MB/s per node). He mentioned that his bottleneck could have been the direct-attached storage his Hadoop cluster was using, or something else, but at least we know that it works and that it should perform relatively well.
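To turn that per-node figure into a rough planning number, a back-of-the-envelope estimate might look like the sketch below. It assumes that "per node" refers to Isilon nodes, that throughput scales linearly with node count, and that 1 GB = 1000 MB. None of these were confirmed by the test, so treat the result as an optimistic bound rather than a prediction. The function name and the example sizes are made up for illustration.

```python
def estimate_transfer_seconds(dataset_gb, isilon_nodes, mb_per_sec_per_node=350.0):
    """Estimate copy time, assuming throughput scales linearly with node count."""
    if isilon_nodes <= 0 or mb_per_sec_per_node <= 0:
        raise ValueError("node count and per-node throughput must be positive")
    # Aggregate write rate across all nodes (assumption: no other bottleneck).
    aggregate_mb_per_sec = isilon_nodes * mb_per_sec_per_node
    # Decimal units: 1 GB = 1000 MB.
    return dataset_gb * 1000.0 / aggregate_mb_per_sec


# Hypothetical example: 10 TB of data written to a 4-node cluster.
seconds = estimate_transfer_seconds(dataset_gb=10_000, isilon_nodes=4)
print(f"{seconds / 3600:.1f} hours")
```

In practice the source cluster's disks or the network may cap throughput well below this, as Andy's own caveat about his bottleneck suggests.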

See Andy's notes on his distCP test here:

http://one.emc.com/clearspace/blogs/andypern/2013/06/07/random-hadoop-tidbits
