Data Domain compression / de-dupe expectations
We just got a new DD160 and are starting to integrate it into our backup infrastructure. We have a very typical setup - about 40 virtualized windows servers, mostly file servers, SQL servers, exchange, print server, domain controllers, etc. We use Storage Craft Shadow Copy for our front end backup client. I configured the backup jobs to disable compression on the client side and pointed them to the DD160. We are over 50% utilization on capacity and still copying our data over, but only seeing about 2.3x compression / de-dupe. Is this normal? We were led to expect at least 10x space savings. I would have thought multiple backups of the same server OS would give better de-dupe results than this, and the data / user files would compress much more than 2x. What are we missing?
Thanks,
ch
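The intuition in the question — that repeated fulls of near-identical data should de-dupe down to roughly one copy plus deltas — can be sketched with fixed-size chunk fingerprinting. This is a simplification (Data Domain segments the stream into variable-length chunks, and all names below are illustrative), but the bookkeeping is the same idea:

```python
import hashlib

CHUNK = 4096  # fixed-size chunks for illustration; DD uses variable-length segments

def chunks(data):
    """Split a byte stream into fixed-size chunks."""
    return [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]

def chunk_hashes(data):
    """Fingerprint every chunk; identical chunks hash identically."""
    return {hashlib.sha256(c).hexdigest() for c in chunks(data)}

# Two hypothetical "full backups" of the same 1 MiB server image;
# the second full contains one small in-place edit.
base = (b"OS file contents " * 65536)[:1 << 20]
backup1 = base
backup2 = base[:8192] + b"patched!" + base[8192 + 8:]

store = chunk_hashes(backup1)            # chunks kept after the first full
new = chunk_hashes(backup2) - store      # chunks the second full actually adds
print(f"second full adds {len(new)} of {len(chunks(backup2))} chunks")
```

Because the store only keeps chunks it has never seen, the second full costs almost nothing, which is why ratios climb steeply once repeated fulls of mostly-unchanged data start landing on the box — provided the client hands over a de-dupe-friendly stream.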
ble1
2 Intern • 14.3K Posts
April 11th, 2014 09:00
Which compression do you use on the DD? Do you back up fulls, incrementals, or...?
chappel02
5 Posts
April 11th, 2014 09:00
I'm doing full, uncompressed backups (I didn't copy over any archived backups, so I had nothing to do a differential/incremental with), and am using the default 'lz' compression on the DD160.
ble1
2 Intern • 14.3K Posts
April 11th, 2014 10:00
Yes, it is ongoing, but I would expect more from DBs and the OS (which have many static files repeated across different boxes).
ble1
2 Intern • 14.3K Posts
April 11th, 2014 10:00
... and my filesys options are:
sysadmin@dd160# filesys option show
Option                            Value
-------------------------------   --------
Local compression type            gzfast
Marker-type                       auto
Report-replica-as-writable        disabled
Current global compression type   9
Staging reserve                   disabled
-------------------------------   --------
ble1
2 Intern • 14.3K Posts
April 11th, 2014 10:00
Perhaps Patrick can dig something up after you send him autosupport.
ble1
2 Intern • 14.3K Posts
April 11th, 2014 10:00
One thing that might influence the de-dupe ratio is, of course, the encoding of the stream, but I doubt that's the case here. I can only assume the data copied so far was unique for some reason; if you say the data didn't change, then the stream format must be one that is not very de-dupe friendly. I happen to have a DD160 and my backups are pretty much the same as you describe, but I get obviously better numbers:
sysadmin@dd160# filesys show compression
From: 2014-04-04 16:00 To: 2014-04-11 16:00

                 Pre-Comp   Post-Comp   Global-Comp   Local-Comp    Total-Comp
                    (GiB)       (GiB)        Factor       Factor        Factor
                                                                 (Reduction %)
---------------  --------   ---------   -----------   ----------   ------------
Currently Used:   14241.1      1208.0             -            -   11.8x (91.5)
Written:*
  Last 7 days      4972.3       380.4          9.2x         1.4x   13.1x (92.3)
  Last 24 hrs       683.7        50.7          9.2x         1.5x   13.5x (92.6)
---------------  --------   ---------   -----------   ----------   ------------
* Does not include the effects of pre-comp file deletes/truncates
  since the last cleaning on 2014/04/06 05:13:07.
Key:
  Pre-Comp           = Data written before compression
  Post-Comp          = Storage used after compression
  Global-Comp Factor = Pre-Comp / (Size after de-dupe)
  Local-Comp Factor  = (Size after de-dupe) / Post-Comp
  Total-Comp Factor  = Pre-Comp / Post-Comp
  Reduction %        = ((Pre-Comp - Post-Comp) / Pre-Comp) * 100
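As a sanity check, the factors in a report like this can be recomputed from its own Pre-Comp/Post-Comp numbers using the key above (the figures below are taken from the "Last 7 days" row):

```python
# Recompute the report's factors from its own numbers
# (figures taken from the "Last 7 days" row above).
pre_comp = 4972.3       # GiB written by clients, before any reduction
post_comp = 380.4       # GiB physically stored after de-dupe + local compression
global_factor = 9.2     # de-dupe factor reported by the system

size_after_dedupe = pre_comp / global_factor          # unique data, pre-compression
local_factor = size_after_dedupe / post_comp          # lz/gzfast does the rest
total_factor = pre_comp / post_comp
reduction_pct = (pre_comp - post_comp) / pre_comp * 100

print(f"local {local_factor:.1f}x, total {total_factor:.1f}x, "
      f"reduction {reduction_pct:.1f}%")
```

Note how de-dupe (9.2x) dominates the total while local compression contributes only ~1.4x; in the 2.3x case from the original post, both factors are small, which points at the incoming stream rather than the box.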
PatrickBetts
116 Posts
April 11th, 2014 10:00
Hi chappel02,
I am a Data Domain TSE. Could you email me the serial number of the affected system so that I can look at the latest AutoSupports? My email address is patrick.betts@emc.com. I'll take a look and see if I can figure out what's going on.
Best Regards,
Patrick
chappel02
5 Posts
April 11th, 2014 10:00
Here are my options:
sysadmin@DDBackup01# filesys option show
Option                            Value
-------------------------------   --------
Local compression type            lz
Marker-type                       auto
app-optimized-compression         none
Report-replica-as-writable        disabled
Current global compression type   9
Staging reserve                   disabled
-------------------------------   --------
chappel02
5 Posts
April 11th, 2014 10:00
As I said, these are our initial copies of the data, so there are no repeated full backups to de-dupe against, but I would think typical end-user data, databases, and email would compress and de-dupe better than 2x, as would multiple installs of the same OS files.
I assume your DD160 is showing the results from ongoing weekly full backups?
for comparison:
sysadmin@DDBackup01# filesys sh compression
From: 2014-04-04 12:00 To: 2014-04-11 12:00

                 Pre-Comp   Post-Comp   Global-Comp   Local-Comp    Total-Comp
                    (GiB)       (GiB)        Factor       Factor        Factor
                                                                 (Reduction %)
---------------  --------   ---------   -----------   ----------   ------------
Currently Used:    5336.3      2484.7             -            -    2.1x (53.4)
Written:*
  Last 7 days      7773.0      4002.4          1.6x         1.2x    1.9x (48.5)
  Last 24 hrs      4804.7      2150.2          1.8x         1.2x    2.2x (55.2)
---------------  --------   ---------   -----------   ----------   ------------
* Does not include the effects of pre-comp file deletes/truncates
  since the last cleaning on 2014/04/10 08:35:53.
Key:
  Pre-Comp           = Data written before compression
  Post-Comp          = Storage used after compression
  Global-Comp Factor = Pre-Comp / (Size after de-dupe)
  Local-Comp Factor  = (Size after de-dupe) / Post-Comp
  Total-Comp Factor  = Pre-Comp / Post-Comp
  Reduction %        = ((Pre-Comp - Post-Comp) / Pre-Comp) * 100
PatrickBetts
116 Posts
April 11th, 2014 11:00
Hrvoje,
I'm trying to. Initially there was an issue with sending me attachments, but we got it working. I'm going over the data now. If chappel02 allows, I'll post my findings (or chappel02 can).
Best Regards,
Patrick
dynamox
2 Intern • 20.4K Posts
April 11th, 2014 12:00
Are compression and encryption disabled in your backup application?
Anonymous
5 Practitioner • 274.2K Posts
May 19th, 2014 09:00
gzfast compression seems a very odd setting. I would highly recommend setting this to lz (auto).
How often are you running full backups? Higher dedupe rates will be achieved on subsequent incrementals; a 3-10x ratio on the initial backup should still be expected, however.
The other recommendations are fair: ensure no multiplexing, encryption, or compression is being done to the data before it's received by the Data Domain system.
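The reason client-side compression (or encryption) hurts so much can be sketched quickly: compressing two near-identical streams separately makes their byte-level representations diverge after the first difference, so a chunk-level de-dupe engine finds almost nothing shared. A rough stdlib illustration with fixed-size chunks (Data Domain uses variable-length segments, but the effect is the same):

```python
import hashlib
import zlib

CHUNK = 4096

def chunk_hashes(data):
    """Fingerprint fixed-size chunks of a byte stream."""
    return {hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)}

# Two near-identical backup streams: one record changed, same length.
stream_a = b"".join(b"record %06d payload\n" % i for i in range(30000))
stream_b = stream_a.replace(b"record 000005", b"RECORD 000005")

# De-dupe the raw streams: only the chunk holding the edit differs.
raw_a, raw_b = chunk_hashes(stream_a), chunk_hashes(stream_b)
raw_shared, raw_total = len(raw_a & raw_b), len(raw_a)

# Compress each stream first (as an unsupported client might):
# the compressed outputs diverge, so chunks stop lining up.
comp_a = chunk_hashes(zlib.compress(stream_a))
comp_b = chunk_hashes(zlib.compress(stream_b))
comp_shared, comp_total = len(comp_a & comp_b), len(comp_a)

print(f"raw: {raw_shared}/{raw_total} shared, "
      f"pre-compressed: {comp_shared}/{comp_total} shared")
```

Raw streams share all but the one edited chunk; the pre-compressed streams share a much smaller fraction, which is essentially what a 2x-instead-of-10x ratio looks like from the appliance's side.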
ble1
2 Intern • 14.3K Posts
May 19th, 2014 10:00
I'm running gzfast on all my DD boxes, and there is nothing odd about it.
chappel02
5 Posts
August 19th, 2014 08:00
Sorry for dropping this. For the sake of closure, the final answer was that Storage Craft is not a supported backup client, so although the DD160 shows up as a valid target and can be written to, it does NOT compress or de-dupe the file output of the Storage Craft clients. We switched to vDP-A for our virtualized clients and got the expected levels of compression and de-dupe (more or less), but had to retain our old pre-DD160 backup targets and Storage Craft for our physical systems.
Thanks for all your suggestions and tips.
ch