April 19th, 2018 11:00

Dedupe Stats

I'm confused about what is reported by the dedupe stats:

Deduplicated Data:  8.65 TB from 23.7 TB configured for deduplication

Savings from Deduplication:  15.1 TB

corresponds (mostly) to:

8.65TB = 23.7TB - 15.1TB

Deduplicated Data = (cluster.dedupe.estimated.deduplicated.bytes - cluster.dedupe.estimated.saved.bytes)

I know the stats probably look at blocks, but the byte stats are close enough for my reporting and were the first thing I tried.

Where does the 23.7TB number come from when I actually have close to 90TB of data being scanned for deduplication?  I realize not all of that data will be deduped, but calculating a percentage from these numbers (8.65TB / 23.7TB) gives a 36% deduplication rate, which doesn't seem right. Based on my 90TB, that should be closer to 10% (8.65TB / 90TB).
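
For reference, here is the arithmetic behind those percentages as a quick Python sketch. The variable names and the `pct` helper are made up for illustration and are not actual stat keys. The figures are the ones quoted in the report above, and 90 TB is only an approximate size for the scanned directory.

```python
# Figures copied from the dedupe report quoted in this post (all in TB).
configured_tb = 23.7     # "configured for deduplication"
deduplicated_tb = 8.65   # "Deduplicated Data"
saved_tb = 15.1          # "Savings from Deduplication"
scanned_tb = 90.0        # assumption: approximate size of the scanned directory


def pct(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Configured minus savings lands close to the reported "Deduplicated Data".
residual_tb = round(configured_tb - saved_tb, 2)            # 8.6 vs. 8.65 reported

# The same 8.65 TB measured against the two candidate denominators.
rate_vs_configured = pct(deduplicated_tb, configured_tb)    # 36.5
rate_vs_scanned = pct(deduplicated_tb, scanned_tb)          # 9.6

# Savings measured against the same two denominators, for comparison.
savings_vs_configured = pct(saved_tb, configured_tb)        # 63.7
savings_vs_scanned = pct(saved_tb, scanned_tb)              # 16.8

print(residual_tb, rate_vs_configured, rate_vs_scanned)
print(savings_vs_configured, savings_vs_scanned)
```

The only thing that changes between the 36% and 10% figures is the denominator, which is why the source of the 23.7TB value matters.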

Am I using the wrong stats to calculate dedupe rate and total savings?  If I'm pointing to a directory with 90TB of data, shouldn't "configured for deduplication" be 90TB, even if not all of it dedupes?
