ble1
January 31st, 2017 05:00
DD uses a variable block size (I believe I have seen 4-12 KB chunks mentioned somewhere). Bear in mind that this does not translate to file size, since chunking is applied at block level to the backup stream.
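To make the idea of variable-size chunking on a stream concrete, here is a minimal sketch. It is not the actual DD algorithm, which is not described in this thread. Only the 4-12 KB bounds come from the post above; the toy hash, the `MASK` value and the `split_segments` name are assumptions made purely for illustration.

```python
from typing import List

MIN_SEG = 4 * 1024    # lower bound quoted in the thread
MAX_SEG = 12 * 1024   # upper bound quoted in the thread
MASK = 0x0FFF         # hypothetical: a cut fires when the low 12 hash bits are zero


def split_segments(stream: bytes) -> List[bytes]:
    """Cut a byte stream into variable-size chunks of MIN_SEG..MAX_SEG bytes.

    Illustrative content-defined chunking only, not the DD implementation.
    """
    segments = []
    start = 0
    h = 0
    for i, byte in enumerate(stream):
        # Toy hash: each step shifts older bytes one bit to the left, so the
        # low 12 bits tested below depend only on the last 12 bytes seen.
        h = ((h << 1) + byte) & 0xFFFFFFFF
        length = i - start + 1
        # Cut when the content says so (after the minimum size has been
        # reached) or when the maximum size forces a cut.
        if length >= MAX_SEG or (length >= MIN_SEG and (h & MASK) == 0):
            segments.append(stream[start:i + 1])
            start = i + 1
            h = 0
    if start < len(stream):
        # The trailing chunk may be shorter than MIN_SEG.
        segments.append(stream[start:])
    return segments
```

Because the cut points depend on the bytes of the stream rather than on where one file ends and the next begins, chunk boundaries here have no fixed relationship to file sizes, which is the point made above.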
James_Ford
February 7th, 2017 09:00
So Hrvoje is correct: when a file is sent to a DDR it is split into a series of 4-12 KB chunks (known as segments), and the segment is the granularity at which de-duplication takes place. If you send a file smaller than 4 KB (i.e. smaller than the minimum supported segment size) then:
- The file will be padded to 4 KB so that it reaches the minimum supported segment size
- The file will not initially be de-duplicated and will be written directly to disk (in a compressed format)
- During the next clean, the DDR will check whether the file can in fact be de-duplicated and, if so, will go ahead and do this
Note that there are various issues with writing large numbers of small files to a DDR (e.g. excessive use of disk space, poor clean performance on certain versions and so on). DDRs are really designed to hold small numbers of large files so, if possible, I would avoid writing large numbers of very small files to the system.
Thanks, James
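As a rough illustration of why padding small files can cost disk space, the sketch below compares logical size with size after padding to the 4 KB minimum described above. The function names are hypothetical, and the figures are pre-compression: the thread says the data is written in a compressed format, so the real on-disk overhead will differ and is not modelled here.

```python
MIN_SEG = 4 * 1024  # minimum segment size quoted in the thread


def padded_size(file_size: int) -> int:
    """Size in bytes of a file once padded up to the minimum segment size."""
    return max(file_size, MIN_SEG)


def padding_overhead(file_sizes) -> float:
    """Ratio of padded bytes to logical bytes for a collection of files."""
    logical = sum(file_sizes)
    if logical == 0:
        raise ValueError("need at least one non-empty file")
    padded = sum(padded_size(size) for size in file_sizes)
    return padded / logical
```

For example, `padding_overhead([512] * 1000)` returns `8.0`, because each 512-byte file is padded to 4096 bytes, while `padding_overhead([512 * 1024 * 1024])` returns `1.0` for a single 512 MB file.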