March 5th, 2012 07:00

Deduplication ratio on Avamar

Hello,

I am fairly new to Avamar backup technology. I started with a new client who has Avamar 5.0 deployed but does not do image-level backups. I am trying to write a report explaining why it would be better to do so, and for that I need the present dedup ratio.

Is there a way to get a report from Avamar that gives the dedup ratio? If not, is there a way to calculate it?

Any information is appreciated.

Thanks,

Sri

2K Posts

March 20th, 2012 13:00

The de-dupe ratio is calculated as the total amount of data protected vs. the total amount of data stored.

If you have a DPN Summary report that covers all the backups back to the install time of the system, you can calculate the dedup ratio by summing the TotalBytes column, summing the ModSent column and then dividing the TotalBytes sum by the ModSent sum.

On my test system, I came out with a de-dupe ratio of roughly 70:1.
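The calculation described above can be sketched in a few lines of Python. This is only an illustration, assuming the DPN Summary report has been exported to CSV with `TotalBytes` and `ModSent` columns holding plain integer byte counts; the `Host` column and the sample figures are made up for the example.

```python
import csv
import io


def dedupe_ratio(rows):
    """Return sum(TotalBytes) / sum(ModSent) over an iterable of dict rows."""
    total_bytes = 0
    mod_sent = 0
    for row in rows:
        total_bytes += int(row["TotalBytes"])
        mod_sent += int(row["ModSent"])
    if mod_sent == 0:
        raise ValueError("ModSent sums to zero; the ratio is undefined")
    return total_bytes / mod_sent


# Made-up sample standing in for an exported DPN Summary report.
# The "Host" column name is an assumption for illustration only.
sample = io.StringIO(
    "Host,TotalBytes,ModSent\n"
    "client-a,1000000,500000\n"
    "client-a,1000000,10000\n"
    "client-b,2000000,30000\n"
)
ratio = dedupe_ratio(csv.DictReader(sample))
print(f"{ratio:.1f}:1")  # prints "7.4:1"
```

To run it against a real export, replace `sample` with an open file handle. The result is only meaningful if the report reaches back to the first backup of every client.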

173 Posts

March 5th, 2012 07:00

Compare the TotalBytes column to the ModSent column; that will give you the dedupe ratio.

Regards

Lukas

223 Posts

March 14th, 2012 04:00

Hi Lucky85,

Is this information available for the whole Avamar system?

In the reports I can only see these entries for each individual client.

Of course I can export this and calculate the ratio myself, but it would be easier to see it in the reports in one step.

173 Posts

March 14th, 2012 05:00

No, as far as I know you have to calculate it. Use the DPN Summary report from the beginning and import it into Excel.

Regards

Lukas

115 Posts

March 20th, 2012 12:00

Hi Lucky85,

Thanks for your reply.

I calculated the dedup ratio using the columns you mentioned, and I am seeing numbers between 500 and 1000 for filecluster servers. I understand that the number will be high, but upwards of 500?

Does this look right, or am I doing something wrong?

2K Posts

March 20th, 2012 13:00

You can also look at the PcntCommon column. This column is the percentage of data for the backup that is already on the grid (higher is better). I believe the column is rounded to whole numbers so you might see a fair number of 100s if the change rate for the client is low enough compared to its backup size.

173 Posts

March 21st, 2012 01:00

That seems wrong; a typical dedupe ratio is 20-50x. Check whether you are also counting the initial backups.

Regards

Lukas

2K Posts

April 17th, 2012 06:00

rmartin@ipm.es wrote:

Hello,

I have done the same calculation (summing the TotalBytes column, summing the ModSent column and then dividing the TotalBytes sum by the ModSent sum), and my result is 108:1. Is that possible?

Thanks

108:1 seems too high to me.

Are you certain that the DPN Summary covers the entire history of the grid? If the initial backup for each client is not included in the report, the calculated dedupe ratio will be too high.

April 17th, 2012 06:00

Hello,

I have done the same calculation (summing the TotalBytes column, summing the ModSent column and then dividing the TotalBytes sum by the ModSent sum), and my result is 108:1. Is that possible?

Thanks

April 17th, 2012 07:00

Hi,

"Are you certain that the DPN Summary covers the entire history of the grid?" Yes.

I will perform calculations again. Is there another way to calculate this?

Thanks

2K Posts

April 17th, 2012 10:00

If the system has been in production for a long time, the activity records for the initial backups may have been pruned from the database, in which case they won't appear in the DPN Summary report. In the default configuration, entries older than 1 year will be pruned to keep the database at a reasonable size. This would be good to check.

If the initial backups for any client were taken more than a year ago, you will have to do some fiddling with the report, or the calculated de-dupe ratio will be much higher than the real ratio. For any client older than a year, duplicating the oldest entry and copying the TotalBytes value into the ModSent column for that row will serve as an estimate of the initial size of the backup. This won't be perfect, because it doesn't account for data that was de-duplicated because it appears on multiple clients, but it should be close enough for most purposes.
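The adjustment described above might look like the following sketch. It assumes the report rows are already loaded as dictionaries; the `Host` and `StartTime` field names are hypothetical, the sample figures are invented, and the list of clients whose initial backup has been pruned has to be supplied by hand.

```python
def add_estimated_initials(rows, clients_missing_initial):
    """Return rows plus one estimated initial backup per listed client.

    The estimate is a copy of the client's oldest row with ModSent set
    to TotalBytes. "Host" and "StartTime" are hypothetical field names.
    """
    rows = list(rows)
    oldest = {}
    for row in rows:
        host = row["Host"]
        if host not in clients_missing_initial:
            continue
        # ISO-formatted date strings sort chronologically.
        if host not in oldest or row["StartTime"] < oldest[host]["StartTime"]:
            oldest[host] = row
    # dict(...) copies each row, so the original entries stay untouched.
    estimates = [dict(r, ModSent=r["TotalBytes"]) for r in oldest.values()]
    return rows + estimates


def ratio(rows):
    """Sum of TotalBytes divided by sum of ModSent."""
    return sum(r["TotalBytes"] for r in rows) / sum(r["ModSent"] for r in rows)


# Invented sample: old-client's initial backup is missing from the report.
rows = [
    {"Host": "old-client", "StartTime": "2011-05-01", "TotalBytes": 1000, "ModSent": 10},
    {"Host": "old-client", "StartTime": "2011-05-02", "TotalBytes": 1000, "ModSent": 10},
    {"Host": "new-client", "StartTime": "2012-01-01", "TotalBytes": 500, "ModSent": 500},
    {"Host": "new-client", "StartTime": "2012-01-02", "TotalBytes": 500, "ModSent": 5},
]
adjusted = add_estimated_initials(rows, {"old-client"})
print(f"unadjusted {ratio(rows):.2f}:1, adjusted {ratio(adjusted):.2f}:1")
```

With this sample the unadjusted figure is noticeably higher than the adjusted one, which is the inflation effect described in the post.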

43 Posts

April 18th, 2012 08:00

Hi ianderson,

I have run the calculation on the DPN Summary from day one until today and we are getting a value of 82.7%. Could this be real?

Since day one, all backups add up to ~853 TB in TotalBytes, and we have a single node with 7.8 TB.

Can this report be real?

Thanks and regards,

2K Posts

April 18th, 2012 09:00

One thing I did forget to mention is that the DPN Summary report will include backups that have expired from the system. These will need to be excluded from the calculations.

43 Posts

April 18th, 2012 10:00

Okay, but how can I find out which backups have already expired, as there is no such column in this table?

I guess creating a customized view of the tables could solve this, but it would need some scripting effort.

Maybe somebody has already done something similar?

2K Posts

April 18th, 2012 11:00

You can retrieve this information from the v_activities view in the Administrator Server's database (including the expiry dates and times). The System Administration guide explains how you can set up an ODBC connection to the database. From there you can use third party tools (spreadsheets, database packages, reporting packages, etc.) to query the views directly.

I'm not aware of any script available that does this "out of the box" but if you develop a script, please feel free to share it on the support forum.
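As a starting point, the filtering step could be sketched as below. This is not a tested Avamar script: it assumes the activity rows have already been fetched (for example over the ODBC connection) into dictionaries, and the `expires` field name is hypothetical, so check the actual column names in v_activities. Treating a missing expiry as "never expires" is also an assumption.

```python
from datetime import datetime


def unexpired(rows, now):
    """Keep rows whose backup has not expired as of `now`.

    "expires" is a hypothetical field name; None is assumed to mean
    that the backup has no expiry date.
    """
    kept = []
    for row in rows:
        expiry = row["expires"]
        if expiry is None or expiry > now:
            kept.append(row)
    return kept


# Invented sample rows standing in for query results.
rows = [
    {"TotalBytes": 1000, "ModSent": 1000, "expires": datetime(2012, 1, 1)},
    {"TotalBytes": 1000, "ModSent": 20, "expires": datetime(2012, 6, 1)},
    {"TotalBytes": 1000, "ModSent": 30, "expires": None},
]
live = unexpired(rows, now=datetime(2012, 4, 18))
ratio = sum(r["TotalBytes"] for r in live) / sum(r["ModSent"] for r in live)
print(f"{len(live)} unexpired backups, ratio {ratio:.1f}:1")
```

Only the rows that survive the filter are then fed into the TotalBytes / ModSent calculation.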
