Data Domain Software : Discrepancy between usage reported by "filesys show space (df)" and sfs_dump totals

Summary: Some customers add up the "post_lc_size" (post-compression size) of all files reported by "sfs_dump" and expect the total to match the space utilization reported by "filesys show space" (df). When these two values do not match, they may interpret the difference as "missing or unaccounted space".


Instructions

The purpose of "sfs_dump" (or "filesys show compression") is to give a sense of the compression achieved for a file at the time of ingest. It is not meant to reflect the overall space consumed.

The sum of the post-compression sizes of all files reported by "sfs_dump" matches the total space utilization only if no deletes have ever been performed on the filesystem. This is never the case on a production DDR, so using "sfs_dump" to account for current space utilization is a fallacy. We illustrate this with an example below.
 
 
ILLUSTRATION:
 
This can be easily seen with a simple three-step example:
 
  1.  I back up a 10 G file (file1). This is a brand-new system, so there is nothing to de-duplicate against, and the post-compression size is therefore also around 10 G.

# sfs_dump /data/col1/test/
/data/col1/test/file1: mtime: 1503023531690895000 fileid: 18 size: 10736790004 type: 9 seg_bytes: 10772785048 seg_count: 1282587 redun_seg_count: 0 (0%) pre_lc_size: 10772785048 post_lc_size: 10942680452 (102%) mode: 02000100644

 
As seen above, the post-compression size (post_lc_size) is about 10 G.


  2.  Next, I back up another 10 G file (file2) that is very similar or identical to file1.

# sfs_dump /data/col1/test
/data/col1/test/file1: mtime: 1503023531690895000 fileid: 18 size: 10736790004 type: 9 seg_bytes: 10772785048 seg_count: 1282587 redun_seg_count: 0 (0%) pre_lc_size: 10772785048 post_lc_size: 10942680452 (102%) mode: 02000100644
/data/col1/test/file2: mtime: 1503023531690895000 fileid: 19 size: 10736790004 type: 9 seg_bytes: 10772785048 seg_count: 1282587 redun_seg_count: 1282587 (100%) pre_lc_size: 0 post_lc_size: 0 (0%) mode: 02000100644

 
As seen above, file2 took no additional space (post_lc_size: 0) because it de-duplicated completely against file1.
 

  3.  Now, I delete file1 and run "sfs_dump" on the mtree again.

# sfs_dump /data/col1/test
/data/col1/test/file2: mtime: 1503023531690895000 fileid: 19 size: 10736790004 type: 9 seg_bytes: 10772785048 seg_count: 1282587 redun_seg_count: 1282587 (100%) pre_lc_size: 0 post_lc_size: 0 (0%) mode: 02000100644

 
 
Based on the above, the customer may expect that no space is consumed on the filesystem: the only file in the mtree takes up 0 bytes post-compressed, so the sum over all files reported by "sfs_dump" is 0 bytes. But this is a fallacy. The space utilization on the system is actually 10 G, and it is accurately reflected by "df".
 
This is because file2 references all the segments originally written by file1, so the deletion of file1 freed nothing. "sfs_dump" no longer shows file1 because it has been deleted.

# df
Active Tier:
Resource             Size GiB   Used GiB   Avail GiB   Use%   Cleanable GiB*
----------------     --------   --------   ---------   ----   --------------
/data: pre-comp             -       10.0           -      -                -
/data: post-comp     129479.5       10.7    129468.8     0%             10.2
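The accounting above can be sketched with a toy reference-counting model. This is a hypothetical illustration only, not DD OS code; the class and method names are invented for this example. It shows why summing "post_lc_size" over the files that still exist diverges from the space "df" reports once a shared file is deleted.

```python
# Toy model of de-duplicated storage (hypothetical; not DD OS internals).
# Segments are stored once and reference-counted; a file's post_lc_size
# records only the NEW bytes it added at ingest time.

class DedupeStore:
    def __init__(self):
        self.refs = {}     # segment id -> reference count
        self.size = {}     # segment id -> bytes stored for that segment
        self.files = {}    # filename -> (segment ids, post_lc_size at ingest)

    def ingest(self, name, segments):
        new_bytes = 0
        ids = []
        for sid, sz in segments:
            if sid not in self.refs:   # only unique segments consume space
                self.refs[sid] = 0
                self.size[sid] = sz
                new_bytes += sz
            self.refs[sid] += 1
            ids.append(sid)
        self.files[name] = (ids, new_bytes)

    def delete(self, name):
        ids, _ = self.files.pop(name)
        for sid in ids:
            self.refs[sid] -= 1
            if self.refs[sid] == 0:    # freed only when no file references it
                del self.refs[sid]
                del self.size[sid]

    def df_used(self):                 # what "df" reflects: live segment bytes
        return sum(self.size.values())

    def sfs_dump_total(self):          # sum of post_lc_size over LIVE files
        return sum(post_lc for _, post_lc in self.files.values())

store = DedupeStore()
segs = [(i, 1) for i in range(10)]     # ten 1 G segments (toy numbers)
store.ingest("file1", segs)            # post_lc_size ~ 10 (all new)
store.ingest("file2", segs)            # fully de-duplicates: post_lc_size 0
store.delete("file1")                  # frees nothing: file2 still refs all segs
print(store.sfs_dump_total())          # 0  <- summation over sfs_dump output
print(store.df_used())                 # 10 <- what df reports
```

The gap (10 G here) is exactly the space held by segments whose original writer has been deleted but which are still referenced by surviving files.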

 

Products

DD OS 5.5, DD OS 5.6, DD OS 6.0, DD OS 6.1, DD OS Previous Versions
Article Properties
Article Number: 000022756
Article Type: How To
Last Modified: 08 Oct 2024
Version:  3