Data Domain compression / de-dupe expectations

April 11th, 2014 09:00

We just got a new DD160 and are starting to integrate it into our backup infrastructure. We have a very typical setup: about 40 virtualized Windows servers, mostly file servers, SQL servers, Exchange, a print server, domain controllers, etc. We use Storage Craft Shadow Copy as our front-end backup client. I configured the backup jobs to disable compression on the client side and pointed them at the DD160. We are over 50% capacity utilization and still copying our data over, but we are only seeing about 2.3x compression / de-dupe. Is this normal? We were led to expect at least 10x space savings. I would have thought multiple backups of the same server OS would give better de-dupe results than this, and that the data / user files would compress by much more than 2x. What are we missing?

Thanks,

ch

14.3K Posts

April 11th, 2014 09:00

Which compression do you use on the DD? Do you back up fulls or incrementals or...?

5 Posts

April 11th, 2014 09:00

I'm doing full, uncompressed backups (I didn't copy over any archived backups, so I had nothing to do a differential/incremental with), and am using the default 'lz' compression on the DD160.

14.3K Posts

April 11th, 2014 10:00

Yes, it is ongoing, but I would expect more from DBs and the OS (which have many things that are static and repeated across different boxes).

14.3K Posts

April 11th, 2014 10:00

... and my filesys options are:

sysadmin@dd160# filesys option show
Option                            Value
-------------------------------   --------
Local compression type            gzfast
Marker-type                       auto
Report-replica-as-writable        disabled
Current global compression type   9
Staging reserve                   disabled
-------------------------------   --------

14.3K Posts

April 11th, 2014 10:00

Perhaps Patrick can dig something up after you send him an autosupport.

14.3K Posts

April 11th, 2014 10:00

One thing which might influence the de-dupe ratio is, of course, the encoding of the stream, but I doubt that is the case here. I can only assume the data copied so far was unique for some reason; if you say the data didn't change, then the stream format must be one that is not very de-dupe friendly. I happen to have a DD160 and my backup is pretty much the same as you describe, but I get noticeably better numbers:

sysadmin@dd160# filesys show compression

From: 2014-04-04 16:00 To: 2014-04-11 16:00

                  Pre-Comp   Post-Comp   Global-Comp   Local-Comp      Total-Comp
                     (GiB)       (GiB)        Factor       Factor          Factor
                                                                    (Reduction %)
---------------   --------   ---------   -----------   ----------   -------------
Currently Used:    14241.1      1208.0             -            -    11.8x (91.5)
Written:*
  Last 7 days       4972.3       380.4          9.2x         1.4x    13.1x (92.3)
  Last 24 hrs        683.7        50.7          9.2x         1.5x    13.5x (92.6)
---------------   --------   ---------   -----------   ----------   -------------
* Does not include the effects of pre-comp file deletes/truncates
  since the last cleaning on 2014/04/06 05:13:07.
Key:
  Pre-Comp = Data written before compression
  Post-Comp = Storage used after compression
  Global-Comp Factor = Pre-Comp / (Size after de-dupe)
  Local-Comp Factor = (Size after de-dupe) / Post-Comp
  Total-Comp Factor = Pre-Comp / Post-Comp
  Reduction % = ((Pre-Comp - Post-Comp) / Pre-Comp) * 100
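If you want to double-check the math, the Key boils down to a few lines of arithmetic. Here is a minimal sketch in plain Python (not a Data Domain tool; the values are copied from the "Last 7 days" row above, and the post-de-dupe size is derived from the reported global factor since the CLI does not print it directly):

# Sanity-check of the compression factors using the Key definitions above.
pre_comp = 4972.3           # GiB written before compression (Pre-Comp)
post_comp = 380.4           # GiB stored after de-dupe + local compression (Post-Comp)
global_factor = 9.2         # reported Global-Comp Factor

total_factor = pre_comp / post_comp                       # ~13.1x
reduction_pct = (pre_comp - post_comp) / pre_comp * 100   # ~92.3 %
after_dedupe = pre_comp / global_factor                   # ~540 GiB (not shown by the CLI)
local_factor = after_dedupe / post_comp                   # ~1.4x

print(f"Total {total_factor:.1f}x, Local {local_factor:.1f}x, Reduction {reduction_pct:.1f}%")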

116 Posts

April 11th, 2014 10:00

Hi chappel02,

I am a Data Domain TSE.  Could you email the serial number of the affected system so that I can look at the latest AutoSupports?  My email address is patrick.betts@emc.com.  I'll take a look and see if I can figure out what's going on.

Best Regards,

Patrick

5 Posts

April 11th, 2014 10:00

Here are my options:

sysadmin@DDBackup01# filesys option show
Option                            Value
-------------------------------   --------
Local compression type            lz
Marker-type                       auto
app-optimized-compression         none
Report-replica-as-writable        disabled
Current global compression type   9
Staging reserve                   disabled
-------------------------------   --------

5 Posts

April 11th, 2014 10:00

As I said, these are our initial copies of the data, so there are no multiple full backups to de-dupe against yet, but I would think typical end-user data, databases, and email would compress and de-dupe better than 2x, as would multiple installs of the same OS files.

I assume your DD160 is showing the results from ongoing weekly full backups?

For comparison:

sysadmin@DDBackup01# filesys sh compression

From: 2014-04-04 12:00 To: 2014-04-11 12:00

                  Pre-Comp   Post-Comp   Global-Comp   Local-Comp      Total-Comp
                     (GiB)       (GiB)        Factor       Factor          Factor
                                                                    (Reduction %)
---------------   --------   ---------   -----------   ----------   -------------
Currently Used:     5336.3      2484.7             -            -     2.1x (53.4)
Written:*
  Last 7 days       7773.0      4002.4          1.6x         1.2x     1.9x (48.5)
  Last 24 hrs       4804.7      2150.2          1.8x         1.2x     2.2x (55.2)
---------------   --------   ---------   -----------   ----------   -------------
* Does not include the effects of pre-comp file deletes/truncates
  since the last cleaning on 2014/04/10 08:35:53.
Key:
  Pre-Comp = Data written before compression
  Post-Comp = Storage used after compression
  Global-Comp Factor = Pre-Comp / (Size after de-dupe)
  Local-Comp Factor = (Size after de-dupe) / Post-Comp
  Total-Comp Factor = Pre-Comp / Post-Comp
  Reduction % = ((Pre-Comp - Post-Comp) / Pre-Comp) * 100
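As a quick cross-check against the Key formulas: 7773.0 / 4002.4 ≈ 1.9x total compression and (7773.0 − 4002.4) / 7773.0 × 100 ≈ 48.5% reduction for the last 7 days, so the report is internally consistent; the low ratio appears to come from the incoming data stream rather than from how the factors are computed.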

116 Posts

April 11th, 2014 11:00

Hrvoje,

I'm trying to.  Initially there was an issue with sending me attachments but we got it working.  I'm going over the data now.  If chappel02 allows, I'll post my findings (or chappel02 can).

Best Regards,

Patrick

20.4K Posts

April 11th, 2014 12:00

Are compression and encryption disabled in your backup application?

274.2K Posts

May 19th, 2014 09:00

gzfast compression seems a very odd setting. I would highly recommend setting this to lz (auto).

How often are you running full backups? Higher de-dupe rates will be achieved on subsequent incrementals. A 3-10x factor on the initial backup should be expected, however.

The other recommendations are fair: ensure no multiplexing, encryption, or compression is being applied to the data before it is received by the Data Domain system.
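To illustrate why repeated backups lift the ratio over time, here is a rough back-of-the-envelope model in plain Python (purely illustrative; the backup size, change rate, and local compression values below are assumptions, not measurements from this system):

# Rough illustration: each new full adds only the changed fraction as unique
# data, while the logical (pre-comp) amount written keeps growing.
full_size_gib = 5000.0   # size of one full backup (hypothetical)
change_rate = 0.05       # fraction of data that is new between fulls (assumed)
local_comp = 1.5         # local (lz/gzfast) compression on unique data (assumed)

for n_fulls in (1, 2, 4, 8, 12):
    logical = full_size_gib * n_fulls                            # Pre-Comp written
    unique = full_size_gib * (1 + change_rate * (n_fulls - 1))   # data left after de-dupe
    physical = unique / local_comp                               # Post-Comp stored
    print(f"{n_fulls:2d} fulls -> {logical / physical:4.1f}x total compression")

With those assumed numbers the factor climbs from about 1.5x on the first full to roughly 11-12x after a dozen fulls, which is where the "at least 10x" expectation comes from.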

14.3K Posts

May 19th, 2014 10:00

I'm running gzfast on all my DD boxes, and there is nothing odd about it.

5 Posts

August 19th, 2014 08:00

Sorry for dropping this. For the sake of closure, the final answer was that Storage Craft is not a supported backup client, so although the DD160 shows up as a valid target and can be written to, it DOES NOT compress OR de-dupe the file output of the Storage Craft clients. We switched to vDP-A for our virtualized clients and got the expected levels of compression and de-dupe (more or less), and we were forced to retain our old pre-DD160 backup targets and Storage Craft for our physical systems.

Thanks for all your suggestions and tips.

ch
