November 28th, 2012 12:00

Rman backups Datadomain

We recently installed a Data Domain for use with Oracle RMAN backups, and I was wondering if someone might answer a question for me concerning deduplication on the device. I performed an RMAN full backup of a database using 4 RMAN channels and filesperset=1. When the backup completed, I noted that approximately 268.12 GB of data had been written to the directory on the Data Domain, at least judging by the file sizes seen from the O/S side as the Oracle user. I then turned around and ran the full backup again. I should point out that this database is just a test and I'm the only user. When I looked at the amount of data written to the device, it indicated that an additional 268.09 GB had been written as a result of the second backup. What I'm wondering is: shouldn't the second backup take much less space, given deduplication? I must admit that I'm rather new to this technology, and it might be that the actual amount of space used on the device is not reflected correctly when viewed from the O/S side.

Thanks,

53 Posts

November 28th, 2012 13:00

Hi Dmflinn,


I am assuming that you are backing up to the DD appliance with it mounted as an NFS share or possibly as a VTL; either way, the answer is the same.


Data Domain uses what is known as target deduplication, meaning the dedupe is done on the appliance. Every time RMAN runs a backup, it therefore sends the complete data set to the appliance, which dedupes it inline and stores only the unique data.


When looking from the OS side you will see pretty much the same amount of data backed up every time. However, if you look from the Data Domain side you will see both the total amount of data written (the cumulative figure of the OS backups) and the amount of data actually stored, which will reflect the reduced number you are looking for.
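To make the difference between the two figures concrete, here is a minimal Python sketch of target-side inline dedupe. It is only an illustration and not how the Data Domain is implemented: the `DedupeStore` class, the fixed 4 KB chunk size, and the `make_backup` helper are all assumptions made up for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen only for this sketch


class DedupeStore:
    """Toy target-side store: receives every byte, keeps only unique chunks."""

    def __init__(self):
        self.chunks = {}        # fingerprint -> chunk bytes actually stored
        self.logical_bytes = 0  # cumulative size as seen from the backup host

    def write(self, data: bytes) -> None:
        # The full data set always arrives at the target.
        self.logical_bytes += len(data)
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            # Inline dedupe: a chunk is stored only if its fingerprint is new.
            self.chunks.setdefault(fp, chunk)

    @property
    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())


def make_backup(n_chunks: int) -> bytes:
    """Build a fake backup stream in which every chunk is distinct."""
    return b"".join(i.to_bytes(8, "big") * (CHUNK_SIZE // 8) for i in range(n_chunks))


store = DedupeStore()
backup = make_backup(100)

store.write(backup)  # first full backup
after_first = (store.logical_bytes, store.stored_bytes)

store.write(backup)  # second full backup of unchanged data
after_second = (store.logical_bytes, store.stored_bytes)

print("after first backup  (logical, stored):", after_first)
print("after second backup (logical, stored):", after_second)
```

In this toy model the logical figure doubles after the second run while the stored figure stays the same, which mirrors what the file sizes on the OS side show versus what the appliance reports.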

These papers will help explain this; go straight to the results section for a quick example.

Data Warehouse

http://powerlink.emc.com/km/live1/en_US/Offering_Technical/White_Paper/h8028-backup-recovery-oracle-wp.pdf

OLTP over Fibre Channel

http://powerlink.emc.com/km/live1/en_US/Offering_Basics/White_Paper/h6835-backup-recovery-oracle-clariion-data-domain-ra.pdf

OLTP over NFS

http://powerlink.emc.com/km/live1/en_US/Offering_Basics/White_Paper/h7087-backup-recovery-oracle-clariion-dd-networker-ra.pdf

For reference, EMC Avamar uses what's known as source deduplication and, yes, you've guessed it: the dedupe is done locally and only the unique data is sent across the transport to the appliance.

DD Boost is also an option. It uses distributed deduplication, which is a combination of both: the heavy lifting is still done on the DD appliance, but the storage node is aware of the unique data on the appliance and sends only the required data to the appliance for further dedupe.
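As a rough illustration of the distributed idea, the Python sketch below has the sending side ask the appliance which chunk fingerprints it is missing and then transfer only those chunks. This is not the actual DD Boost protocol; the `Appliance` class, `distributed_backup` function, and the fixed chunk size are hypothetical names and values invented for the example, and the small cost of sending the fingerprints themselves is ignored.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen only for this sketch


def fingerprint(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


def split(data: bytes) -> list:
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


class Appliance:
    """Toy appliance that knows which chunks it already holds."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def missing(self, fingerprints) -> set:
        # Tell the sender which fingerprints are not stored yet.
        return {fp for fp in fingerprints if fp not in self.chunks}

    def store(self, chunks_by_fp: dict) -> None:
        self.chunks.update(chunks_by_fp)


def distributed_backup(data: bytes, appliance: Appliance) -> int:
    """Send only the chunks the appliance lacks; return chunk bytes transferred."""
    by_fp = {fingerprint(c): c for c in split(data)}
    needed = appliance.missing(by_fp.keys())
    payload = {fp: by_fp[fp] for fp in needed}
    appliance.store(payload)
    return sum(len(c) for c in payload.values())


appliance = Appliance()
data = b"".join(i.to_bytes(8, "big") * (CHUNK_SIZE // 8) for i in range(50))

first_sent = distributed_backup(data, appliance)   # everything is new
second_sent = distributed_backup(data, appliance)  # nothing is new

print("chunk bytes sent on first backup: ", first_sent)
print("chunk bytes sent on second backup:", second_sent)
```

Compared with plain target dedupe, the saving here is on the wire as well as on disk: an unchanged second backup transfers no chunk data at all in this model.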

Either way the GUI or CLI will show you what you need.

Hope this helps, Dave O’…

1 Rookie • 20.4K Posts

November 28th, 2012 13:00

I assume you are not using DD Boost? If not, you still need to send all of that RMAN data to the DD, but it gets deduped inline as it arrives. If you ssh to the DD and run this command against the directory where you sent your data, it should give a rough estimate of the deduplication:

filesys show compression /backup/

9 Posts

November 29th, 2012 13:00

Dynamox and Dave O: Thank you both for the answers. You confirmed what I was wondering; it was simply a matter of my not being familiar with the device. Now all I have to do is get someone to agree to give me a login to it. Thank you again.
