We have set up a VDM with a CIFS server. Currently we do not have the option of using NDMP (due next year) to perform backups, only LAN-based backups (with HP Data Protector). The backup software accesses the configured shares on the CIFS server when backing up data.
We want to use the deduplication feature of the VNX. I realize that all files will be duplicated when backed up and of course duplicated when a restore is performed and that is OK for now. However, will the deduplication feature of the VNX work? The Data_Deduplication.pdf states that "Deduplication functionality operates on whole files and is applicable to files that are static or nearly static with a last-access time greater than 15 days (default value)" and "During deduplication, each deduplication-enabled file system on the Data Mover is scanned for files that match specific criteria, such as last-access time and a modification time older than a certain date."
I'm guessing that the backup will update the 'Last Accessed Time' attribute and thus the dedup engine will never run on those files. If so, are there any options that we can configure to get the dedup engine to work in this environment, given the LAN-based backup scheme?
We have 2 x VNX5500 with 2 Data Movers each.
We are planning to use CIFS and VNX Replicator between the 2 systems.
Flare block code: 5.32.5.011 and Nas code: 184.108.40.206
It sounds like you're talking about filesystem-level deduplication. This is enabled either globally or on each filesystem individually. When deduplication is enabled, the VNX will scan the filesystem on a regular basis and attempt to deduplicate the files it finds. If there are files that cannot be deduplicated, the VNX will attempt to compress them to conserve space.
When your backup runs, the VNX will pass the space-saving version of the file (compressed or deduped) to the backup software if the file falls below the "-backup_data_threshold <percent>" setting, which is 90% by default. So, by default, any file reduced to below 90% of its original size is handed over to the backup software in its reduced form. Now, here's the big caveat: I'm not certain you will get this behavior with LAN-based backups. If HPDP does not start an NDMP session for the backup, but instead reads the share over the network as "just files", you will probably read only the full files over the network. You might try making a local replica of a filesystem, enabling deduplication, then checking the files in the backup set.
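As I read that threshold, the decision boils down to a simple percentage comparison. Here is a rough Python sketch of it. This is only an illustration, not VNX code: the function name and arguments are made up, and whether a LAN-based backup ever goes down this path is exactly the open question.

```python
def sent_space_reduced(logical_size, reduced_size, threshold_percent=90):
    """Hypothetical model of the -backup_data_threshold check.

    Returns True if the deduplicated/compressed form is smaller than
    threshold_percent of the file's logical (full) size.
    """
    if logical_size <= 0:
        return False  # empty file: nothing to gain from the reduced form
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return reduced_size * 100 < logical_size * threshold_percent


print(sent_space_reduced(1000, 600))  # True: reduced form is 60% of the original
print(sent_space_reduced(1000, 950))  # False: 95% is above the 90% threshold
```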
Let us know if that helps!
Thanks for your reply. Our main concern is whether dedup will skip files that are backed up by DP, since the last access time will be updated when DP runs a file backup of the CIFS share.
If your backup software is modifying the "last access time", then you need to remove that criterion from the dedup policy scan and rely on different criteria.
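To show why that helps, here is a minimal Python sketch of the selection logic as the documentation quoted earlier in the thread describes it. It is not the VNX implementation: the function name, the parameter names, the 15-day modification-time value, and the convention that 0 switches a check off are all assumptions for the sake of the example.

```python
from datetime import datetime, timedelta


def is_dedup_candidate(last_access, last_modified, now,
                       access_days=15, modification_days=15):
    """Return True if a file is old enough to be picked up by the scan.

    Hypothetical model: a threshold of 0 disables that criterion.
    """
    if access_days and now - last_access <= timedelta(days=access_days):
        return False  # accessed too recently
    if modification_days and now - last_modified <= timedelta(days=modification_days):
        return False  # modified too recently
    return True


now = datetime(2013, 1, 31)  # arbitrary reference date for the example
# A static file written 60 days ago but read by last night's backup run.
touched_by_backup = dict(last_access=now - timedelta(days=1),
                         last_modified=now - timedelta(days=60))

print(is_dedup_candidate(now=now, **touched_by_backup))                 # False
print(is_dedup_candidate(now=now, access_days=0, **touched_by_backup))  # True
```

With the access-time check in place, the nightly backup keeps resetting the clock and the file never qualifies; with it switched off, only the modification time decides.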
Why don't you back up a snapshot instead of the real production FS? Then the last access update would not matter, and your backup would be consistent.