April 5th, 2018 10:00

Dedupe Best Practice

I have a question about deduplication.  The best practice guide says to limit dedupe jobs to no more than 10 individual paths.  Suppose I have a folder structure with terabytes of mixed data, a good portion of which is media files that I wouldn't expect to dedupe efficiently (so I might not want to waste CPU cycles on them).  Would it still make sense to dedupe at the root of that folder structure?

In other words, should I run an assessment on the root of that folder and take the total "dedupe-ability" as a whole, regardless of the content mix?  Our Isilon cluster is in no way taxed, so I'm not really concerned with performance, but I don't like the idea of trying to dedupe a significant amount of data that won't dedupe efficiently.  Also, this content doesn't really exist yet; I'd be turning on dedupe early in the roll-out to the departments.  I'd probably only have a better idea of the real total "dedupe-ability" in 12-18 months, but I know that the first 45% of the space is going towards media archives and the remaining 55% towards standard departmental mixed-use shares.

If, as I suspect, the answer is to just dedupe at the root and monitor efficiency, what is your rule of thumb for the savings percentage that makes it worth spending some cycles on?  10%?  5%?  Assuming performance stays acceptable, is pretty much any savings worthwhile?
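To put rough numbers on the question, here is a minimal sketch of the blended-savings arithmetic for the 45% media / 55% mixed-use split mentioned above. The per-category savings rates (2% for media, 20% for mixed shares) are purely hypothetical placeholders, not measurements, and `blended_savings` is just an illustrative helper name, not anything from OneFS.

```python
def blended_savings(shares):
    """Return overall savings as a fraction of total capacity.

    shares: list of (fraction_of_capacity, savings_rate) pairs,
    where the capacity fractions must sum to 1.0.
    """
    total_fraction = sum(fraction for fraction, _ in shares)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("capacity fractions must sum to 1.0")
    return sum(fraction * rate for fraction, rate in shares)


# Capacity split taken from the post.
MEDIA_FRACTION = 0.45
MIXED_FRACTION = 0.55

# ASSUMPTION: hypothetical savings rates, chosen only for illustration.
media_rate = 0.02
mixed_rate = 0.20

# Dedupe at the root: both categories are scanned and both contribute.
root_savings = blended_savings(
    [(MEDIA_FRACTION, media_rate), (MIXED_FRACTION, mixed_rate)]
)

# Dedupe only the mixed-use paths: media is skipped and contributes nothing.
mixed_only_savings = blended_savings(
    [(MEDIA_FRACTION, 0.0), (MIXED_FRACTION, mixed_rate)]
)

print(f"root:       {root_savings:.1%} of total capacity")
print(f"mixed only: {mixed_only_savings:.1%} of total capacity")
print(f"gained by also scanning media: {root_savings - mixed_only_savings:.1%}")
```

Under these made-up rates, the media archive adds very little to the total, so the decision mostly comes down to whether scanning it costs anything you care about. Plugging in real figures from an assessment run would make the comparison meaningful.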

Any input welcome, thanks.

1 Rookie • 62 Posts

April 6th, 2018 09:00

Good point about the license fee, but that's a sunk cost now.  To be fair, EMC was very upfront about that point, but from a budgetary standpoint it was a case of get it now or you might never get it.

Thanks for the input.

1.2K Posts

April 6th, 2018 09:00

(that was quick -- moderation issue solved?)

1.2K Posts

April 6th, 2018 09:00

Usually the pain point with low dedupe savings isn't the performance, but the wasted SmartDedupe license fee...

The SmartDedupe job runs on a "low impact" schedule, so organize the folders as smartly as the application workflow allows, and go for it.

-- Peter

1 Rookie • 62 Posts

April 6th, 2018 10:00

Yep!  Squeaky wheel gets the grease I guess.
