2 Intern
•
259 Posts
0
1979
July 21st, 2009 10:00
data dedupe question
I understand that if I dedupe a filesystem, the deduped data gets written to savvol. If I then delete the deduped filesystem without re-hydrating it, will the deduped data that exists in savvol get deleted as well?
Thanks.
Jim


gbarretoxx1
2 Intern
•
366 Posts
0
July 22nd, 2009 05:00
Actually, the deduplication engine does not use the SavVol.
It uses a hidden area within the filesystem to copy the candidates and deduplicate/compress them.
What the manual states is:
IF you have checkpoints for this specific filesystem, the deduplication process will copy the changed blocks to the SavVol as if they were user writes.
From "Using Celerra Data Deduplication", in the "Planning application integration" section:
Point-in-time views of the file system
The deduplication process releases space in the production file system immediately. However, it may cause blocks to be copied to the SnapSure save volume (SavVol) in the process. Deduplicating data associated with a file involves copying the data within the file system so it can be compressed as well as single instanced. Since SnapSure checkpoints copy changed blocks to the SavVol on first write, the blocks that are deduplicated may need to be copied to the SavVol in order to preserve a previous checkpoint point-in-time view of the file system.
These blocks are freed when the corresponding checkpoint gets deleted or refreshed and are then available for re-use by other checkpoints. How many blocks will need to be copied to the SavVol during the deduplication process is a function of how full the file system is, the rate of change in it, and so on, and therefore is difficult to predict. By default the system is configured to abort deduplication operations on a file system before it causes the SavVol to extend. This avoids the SavVol expanding due to deduplication activity. If the deduplication process is aborted in this way, an alert is generated that explains what happened. The Celerra administrator can choose to extend the SavVol or simply let the deduplication process execute again on its next scheduled run.
So, to answer your question: if you want to delete a deduped filesystem that has checkpoints, you need to delete its checkpoints first, so there won't be a SavVol for this filesystem when you delete it.
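The copy-on-first-write behaviour described in the manual excerpt can be illustrated with a toy model. This is only a sketch under simplifying assumptions (one checkpoint, blocks held in a dictionary); the class and method names are hypothetical and have nothing to do with the actual Celerra implementation.

```python
class ToyFileSystem:
    """Toy copy-on-first-write model with a single checkpoint (illustration only)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)   # block number -> current data
        self.savvol = {}             # block number -> data preserved for the checkpoint
        self.checkpoint_active = False

    def create_checkpoint(self):
        self.checkpoint_active = True

    def write(self, block_no, data):
        # On the first change to an existing block after the checkpoint,
        # the old contents are copied to the save volume.
        if (self.checkpoint_active
                and block_no in self.blocks
                and block_no not in self.savvol):
            self.savvol[block_no] = self.blocks[block_no]
        self.blocks[block_no] = data

    def delete_checkpoint(self):
        # Deleting the checkpoint frees the preserved blocks.
        self.savvol.clear()
        self.checkpoint_active = False


fs = ToyFileSystem({0: "aaaa", 1: "bbbb", 2: "cccc"})
fs.create_checkpoint()
fs.write(0, "a*4")   # stands in for dedupe rewriting a block
fs.write(0, "a4")    # second change to the same block: nothing new is preserved
fs.write(1, "b*4")
preserved_before = len(fs.savvol)
fs.delete_checkpoint()
preserved_after = len(fs.savvol)
print(preserved_before, preserved_after)
```

In this model, a block rewritten by deduplication is treated exactly like a user write, which is why the save volume can grow while deduplication runs and shrinks again once the checkpoint is gone.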
Gustavo Barreto.
Peter_EMC
674 Posts
1
July 21st, 2009 22:00
jimkunysz
2 Intern
•
259 Posts
0
July 22nd, 2009 06:00
If you used Avamar or Datadomain to dedupe the backup process, what would happen?
Rainer_EMC
6 Operator
•
8.6K Posts
1
July 23rd, 2009 02:00
Peter_EMC
674 Posts
0
July 23rd, 2009 03:00
dynamox
11 Legend
•
20.4K Posts
•
87.4K Points
0
July 23rd, 2009 05:00
can you please elaborate how dedupe changes backup process with VBB option ?
Thank you
Peter_EMC
674 Posts
0
July 23rd, 2009 06:00
As a result, files are written to the backup medium (tape) in non-deduplicated form.
VBB:
Celerra deduplication-enabled file systems can be backed up using Celerra Volume Based
Backup (VBB) and restored in full by using the FDR method. However, a single file
restore or a file-by-file restore of deduplicated files from VBB backups is not supported
and will be rejected by the Celerra.
As a result, the file system is written to the backup medium (tape) in deduplicated form.
dynamox
11 Legend
•
20.4K Posts
•
87.4K Points
0
July 23rd, 2009 08:00
reduplicate the files during the backup process, will
be slowed when backing up deduplicated files. This
will be particularly noticeable for small files.
As a result, files are written to the backup
medium (tape) in non-deduplicated form.
so every time i run NDMP backup, the whole file system gets re-duplicated and dedupe process has to start from scratch ?
gbarretoxx1
2 Intern
•
366 Posts
0
July 23rd, 2009 11:00
No. The files are reduplicated on the tape.
Gustavo Barreto.
dynamox
11 Legend
•
20.4K Posts
•
87.4K Points
0
July 23rd, 2009 13:00
can you please explain what you mean by "files are reduplicated on the tape" ?
Rainer_EMC
6 Operator
•
8.6K Posts
0
July 24th, 2009 00:00
so an NDMP PAX backup of a deduplicated file system takes as much space on tape as it would without deduplication
that is planned to change in the future - but you need to talk to your local EMC technical contact for roadmap or beta information
dynamox
11 Legend
•
20.4K Posts
•
87.4K Points
0
July 24th, 2009 04:00
"de-single-instanced" in memory before they get put
on tape by NDMP PAX
in "memory", backup performance will probably suffer but as long as the whole file system does not get uncompressed/un-deduped it's a fair trade off i guess.
jimkunysz
2 Intern
•
259 Posts
0
July 28th, 2009 05:00
I'm familiar with the Centera and it placing a 'clock' symbol in the corner for those files aged to the Centera but I don't see anything similar to that with the Celerra dedupe feature.
jimkunysz
2 Intern
•
259 Posts
0
July 28th, 2009 05:00
Peter_EMC
674 Posts
0
July 28th, 2009 05:00
When a file is archived to the Centera, Explorer recognizes the stub and marks the file with this clock icon.
But the Celerra deduplication is transparent to the client, so Explorer is not able to recognize it.
The only way I know is to compare the properties of the file, e.g. the "size" against the "size on disk".
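For a Unix client that mounts the file system, the same size comparison could be scripted roughly as below. This is a sketch, not a supported method: it assumes the server reports the reduced allocation through `st_blocks`, which this thread does not confirm, and sparse files would be flagged as well. The function names and the 0.9 threshold are made up for illustration, and `st_blocks` is not available on Windows.

```python
import os
import sys


def allocated_bytes(path):
    # Python documents st_blocks as the number of 512-byte blocks allocated.
    return os.stat(path).st_blocks * 512


def space_ratio(logical_size, allocated_size):
    """Allocated size as a fraction of the logical size (1.0 for empty files)."""
    if logical_size == 0:
        return 1.0
    return allocated_size / logical_size


def looks_space_reduced(path, threshold=0.9):
    # threshold is an arbitrary illustrative cut-off, not a Celerra value
    logical = os.stat(path).st_size
    return space_ratio(logical, allocated_bytes(path)) < threshold


if __name__ == "__main__":
    for p in sys.argv[1:]:
        print(p, looks_space_reduced(p))
```

A ratio well below 1.0 only hints that the file occupies less space than its logical size; it does not prove the file was deduplicated.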