April 22nd, 2015 16:00

Tracking Data Domain usage with Avamar and DPA?

This is one of those weird ones. Is it a DPA question? Avamar? Data Domain? Really, it's sort of all of them. However, as I feel DPA should be able to help answer this sort of question, I'm going to start here and move it as needed.

The Objective:

Recent datacentre moves have shifted some Avamar workloads. All Avamar workloads write to one of our DDs (nothing is on the grid other than metadata). The directive has come down from on high that we need to move between one and three Data Domain ES30 shelves from one location to another to help this production move. In itself, this is not a problem. We will have a loaner DD that we will replicate to and from as needed in order to remove the shelves and rebuild the DD.

The Question:

EMC asked us to provide how much data is being taken up on the Data Domain by certain clients, groups, or domains: anything other than just the overall DD utilization number. Why? This would allow us to determine what we can move off the existing DD and how many shelves we can ultimately move, not to mention properly size the loaner DD coming in. Even if we could only determine how much data is current, how much is in RETIRED, and how much has been replicated to this DD, that would help.

The Difficulty:

There doesn't seem to be a good way to do this. EMC SEs have sort of punted the question back to us: "Tell us how much is there, as specifically as you can, and we'll help size appropriately."

DPA reports and Avamar reports (e.g. DPN summary) do not seem to be able to tell me directly how much data is on the DD at any granular level. I can get Avamar data sizes and deduplication levels by client from DPA, but that doesn't translate well into knowing what's on the DD, and I can't even get that sort of data for RETIRED or REPLICATED sources. I've got great reports on total filesystem capacity and trending on the DD thanks to DPA, but that doesn't answer anything specific.

This is something I've struggled with before. NetWorker is better able to answer these sorts of questions than Avamar, depending on how NetWorker/Boost is set up. However, Data Domains in general, and Avamar with DD in particular, really don't seem to answer this question well. I figure something must know: if I delete a client's data in Avamar, the DD knows how much to clean during the next cleaning cycle. But what keeps track of that?

The Assumptions:

Avamar 7.0.2

Data Domain OS 5.4, single DD Boost storage unit from Avamar. No other data sources for the DD.

DPA 6.1SP2

So, how would you approach this? (Apparently my response of "Raucous laughter and heavy drinking" isn't acceptable.) How do you know what, specifically, is taking up x amount of space on your DD?


141 Posts

May 14th, 2015 02:00

Hi Brendan,

In DPA 6.2 a new tool was added called the Data Domain Processor Tool (a separate download available via support.emc.com). This tool is installed on a separate server/VM and has three scripts to run.

The first connects to a DD and writes the info about all the files on it to an output file. The second script takes this file as input, parses through it, connects back to the DPA server and maps each file to the backup client that generated it, then groups the file sizes together per client and sends the info back into the DPA server. The third script gathers additional data about the size and count of the files on the DD, grouped by age ranges, and also sends it back into the DPA server.

This then allows you to run some reports in DPA over your DD which tell you how much space (pre and post logical sizes) is being used by each client. The restriction here is that DPA has to know about the job that generated the file on the DD (otherwise it can't map the file).

HTH

David
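To make the second step a bit more concrete, here is a minimal Python sketch of the per-client grouping idea. It is not the Data Domain Processor Tool's code, and it does not talk to a DD or to DPA. The record layout `(path, pre_bytes, post_bytes)`, the `file_to_client` lookup, and the example paths and client names are all hypothetical, made up purely for illustration. Files with no known job fall into a single "unknown" bucket, which is the restriction described above.

```python
from collections import defaultdict

# Label for files that no known backup job can be mapped to (assumption).
UNKNOWN = "unknown"


def aggregate_by_client(files, file_to_client):
    """Sum file sizes per backup client.

    files          -- iterable of (path, pre_bytes, post_bytes) tuples
                      (hypothetical layout, not the real tool's format)
    file_to_client -- dict mapping a file path to the client that wrote it
    """
    totals = defaultdict(lambda: {"pre": 0, "post": 0, "files": 0})
    for path, pre_bytes, post_bytes in files:
        # Anything the lookup does not know about is grouped as "unknown".
        client = file_to_client.get(path, UNKNOWN)
        entry = totals[client]
        entry["pre"] += pre_bytes
        entry["post"] += post_bytes
        entry["files"] += 1
    return dict(totals)


if __name__ == "__main__":
    # Made-up listing and mapping, for illustration only.
    listing = [
        ("/example/unit/file-a", 100, 10),
        ("/example/unit/file-b", 200, 20),
        ("/example/unit/file-c", 50, 5),
    ]
    mapping = {
        "/example/unit/file-a": "client1",
        "/example/unit/file-b": "client1",
    }
    for client, entry in sorted(aggregate_by_client(listing, mapping).items()):
        print(client, entry)
```

The point of the sketch is only that the per-client numbers are as good as the file-to-job mapping: whatever cannot be mapped ends up lumped together under one label.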

141 Posts

May 14th, 2015 02:00

1.PNG.png

141 Posts

May 14th, 2015 02:00

2.PNG.png

141 Posts

May 14th, 2015 02:00

3.PNG.png

5 Practitioner • 274.2K Posts

May 14th, 2015 02:00

Brendan

Get yourself a copy of DPA 6.2 SP1. It has a new tool called the Data Domain Processing Tool that allows you to determine which of the many client types is chewing up your DD space.

Raucous laughter should stop.

Heavy Drinking can start

15 Posts

May 14th, 2015 14:00

This looks like interesting stuff! I'll check into it as soon as I'm able to.

15 Posts

May 15th, 2015 10:00

Of course, with my usual timing, the 6.2 SP1 binaries are currently unavailable from EMC until further notice, to boot...

21 Posts

May 29th, 2015 07:00

Where is this tool located? I just upgraded to SP1 and cannot find it.

21 Posts

May 29th, 2015 08:00

I looked in the install directory and did not see this. I'm currently on version number 6.2.1.96071, build number 327.

This is the latest version, correct?

5 Practitioner • 274.2K Posts

May 29th, 2015 08:00

The tool can be found under the

In a folder called 'dataprocessor'

There are three components to the tool:

dd_scanner - which collates raw data

file_age - which processes data for file age

client_aggregation - which processes raw data per client
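For anyone wondering what "processes data for file age" might look like in practice, here is a rough Python sketch of grouping file counts and sizes into age ranges. It is an illustration only, not what `file_age` actually does internally. The bucket boundaries, the `(mtime, size_bytes)` record layout, and the function names are assumptions chosen for the example.

```python
from datetime import datetime, timedelta

# Hypothetical age ranges: (upper limit in days, label). Not taken from DPA.
AGE_BUCKETS = [(7, "0-7 days"), (30, "8-30 days"), (90, "31-90 days")]
OLDEST = "over 90 days"


def bucket_for(age_days):
    """Return the label of the first range whose upper limit covers age_days."""
    for limit, label in AGE_BUCKETS:
        if age_days <= limit:
            return label
    return OLDEST


def summarize_by_age(files, now):
    """Count files and sum their sizes per age range.

    files -- iterable of (mtime, size_bytes), with mtime as a datetime
    now   -- reference datetime used to compute each file's age
    """
    labels = [label for _, label in AGE_BUCKETS] + [OLDEST]
    # Start every range at zero so empty ranges still appear in the summary.
    summary = {label: {"count": 0, "bytes": 0} for label in labels}
    for mtime, size_bytes in files:
        label = bucket_for((now - mtime).days)
        summary[label]["count"] += 1
        summary[label]["bytes"] += size_bytes
    return summary


if __name__ == "__main__":
    now = datetime(2015, 6, 1)
    # Made-up file records, for illustration only.
    sample = [
        (now - timedelta(days=3), 100),
        (now - timedelta(days=20), 200),
        (now - timedelta(days=200), 400),
    ]
    for label, entry in summarize_by_age(sample, now).items():
        print(label, entry)
```

A summary like this is mostly useful for spotting how much of the DD is held by old files, which matters when deciding what could be expired or moved before pulling shelves.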

66 Posts

May 29th, 2015 14:00

You can download the Data Domain Processor tool from support.emc.com/products/829. Under the downloads section look for a section called "PRODUCT TOOL" and that will have the DD Processor tool for Windows and Linux.

15 Posts

June 2nd, 2015 16:00

Well, I've got the tool downloaded and installed. I ran the scripts to get the data from the DD and export it to the DPA environment. But when I try to add the Data Domain Analysis request to the given DD, as the documentation says, it just sits in a disabled state and doesn't seem to be able to run. I'll have to poke at this some more.

141 Posts

June 3rd, 2015 01:00

Hi Brendan,

The Data Domain Analysis request is just a dummy request to enable the reports to show up on the menu. So it just needs to be assigned and will be set to Disabled, as it doesn't actually do anything.

So once you have run the scripts and you have the request assigned, you should be able to run the reports.

Thanks

David

1 Message

March 16th, 2016 19:00

Hi David,

When I run the report in question, I don't get a list of clients; I only get one "unknown" client.

Does this report only work against agent-based backups, and not image-level backups, which are located under the vSphere server?

Regards,

Shannen

April 17th, 2017 07:00

The Data Domain reports come up empty for the report names whose graphs and outputs you have posted here. Any idea or clue?
