NetWorker: Optimizing space recovery operations for Data Domain

Summary: This article provides some useful tunables and steps for reducing load and helping to free up space on Data Domain devices in a NetWorker datazone.

Symptoms

  • NetWorker savesets marked Expired but not removed
  • Space recovery messages appear in logs more than once per day
  • Data Domain speed and load impacts
  • General server performance impacts

Cause

  • Volumes eligible for space recovery are being read (by staging, cloning, or recovery operations) when the Expiration action runs
  • Space recovery runs by default after every staging operation on any given volume
  • Space recovery checks each file in the volume's directory structure
  • Server operations and responsiveness may slow down during the space recovery phase

Resolution

NetWorker's space recovery phase runs once a day as one of the final phases of the Expiration action in the Server backup workflow. The server first assesses, expires, and deletes the saveset records that are safe to remove according to their configuration; space recovery then deletes the corresponding saveset file objects within each volume.

Several factors can adversely affect Data Domain or NetWorker server responsiveness. Enable any of the options below that suit the requirements of the datazone in question.

Before testing with the debug keyfiles below, disable the daily Server Protection > Server backup > Expiration action for one or more days. This stops all recover space and media database calculations, and confirms whether the performance issues encountered are related to space recovery or expiration activities.

If disabling Expiration confirms that the issue is related to daily maintenance, the following behaviors can be adjusted for troubleshooting by creating an empty file named after the flag (without an extension) in the debug subdirectory of the main nsr directory on the NetWorker server or storage node. None of these flag files requires a restart; each takes effect for recover space jobs launched while the file is present.

Linux Location: /nsr/debug
Windows Location: C:\Program Files\EMC NetWorker\nsr\debug (or corresponding nsr installation path)
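
As a minimal sketch of the file-creation step on Linux, the helpers below create and remove an empty flag file. The function names and the NSR_DEBUG_DIR override variable are assumptions made for this example, not NetWorker features; the directory defaults to the Linux location given above and is expected to exist already.

```shell
#!/bin/sh
# Sketch only. NSR_DEBUG_DIR is a hypothetical override for this example;
# it defaults to the Linux debug location listed in this article.
NSR_DEBUG_DIR="${NSR_DEBUG_DIR:-/nsr/debug}"

# enable_flag NAME: create an empty file called NAME (no extension).
enable_flag() {
    touch "$NSR_DEBUG_DIR/$1"
}

# disable_flag NAME: remove the flag file so default behavior resumes.
disable_flag() {
    rm -f "$NSR_DEBUG_DIR/$1"
}
```

For example, `enable_flag skip_disk_usage` would create /nsr/debug/skip_disk_usage, and `disable_flag skip_disk_usage` would remove it once testing is finished.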
 

NOTE: Not all of these tunables are available in NetWorker versions earlier than 19.8.0.4.

The file names and their functions are detailed below:

skip_recover_space_for_stage

  • Storage nodes. This flag causes NetWorker to skip the recover space phase of a staging operation (cloning followed by source deletion). It is recommended if your environment uses staging, particularly staging repeatedly from the same source volumes, because it prevents multiple recover space operations from being spawned for the same volumes. When this flag is in place, the recover space operation is deferred entirely, and the files are deleted when the daily Expiration action runs or when the nsrim command is run manually.

recover_space_anytime

  • Server only. This allows recover space to expire and remove savesets on volumes that are actively being read, which is deferred by default. Under the default behavior, volumes with long-running clone jobs can have expiry and space recovery deferred repeatedly whenever the Expiration action, nsrim, or a staging job (see the previous flag) runs. This in turn can lead to large space recovery backlogs, gradual free space depletion, and a larger space recovery job when it is finally allowed to run.

skip_disk_usage

  • Storage nodes. By default, as part of space recovery and disk volume file system checking, individual files are recursively checked and counted to produce a precise aggregate of data for the volume. While some may consider this precision essential, skipping this step relies instead on NetWorker's media database records for the file and byte totals, which can usually be expected to be accurate enough for most uses. On a heavily loaded Data Domain, especially one where many recover space operations run repeatedly for the same volumes, the full count can be considered a needless expense and safely disabled.

skip_consistency_check_in_recover_space

  • Storage nodes. During space recovery for a volume, the volume filesystem is checked file by file for consistency with the media database; this can also introduce latency. Adding this keyfile to a node prevents that node from deleting saveset files that have no corresponding record in the media database, and from marking media database records as 'suspect' where no file is found. Because this prevents the normal cleaning operations, use it only to help qualify latency related to recover space operations, and do not leave it in place longer term.
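
Because some of these keyfiles should not stay in place indefinitely, it can help to check which ones are currently present on a node. The following is a small sketch under the same assumptions as the Linux location above; the report_flags function name and the NSR_DEBUG_DIR override variable are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Sketch only. NSR_DEBUG_DIR is a hypothetical override for this example;
# it defaults to the Linux debug location listed in this article.
NSR_DEBUG_DIR="${NSR_DEBUG_DIR:-/nsr/debug}"

# report_flags: print one "name: present" or "name: absent" line for each
# of the four flag files described in this article.
report_flags() {
    for flag in skip_recover_space_for_stage \
                recover_space_anytime \
                skip_disk_usage \
                skip_consistency_check_in_recover_space; do
        if [ -e "$NSR_DEBUG_DIR/$flag" ]; then
            echo "$flag: present"
        else
            echo "$flag: absent"
        fi
    done
}
```

Running `report_flags` on each server or storage node gives a quick view of which troubleshooting flags are still active there.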

More verbose logging is now enabled by default, which causes entire saveset paths to be written to the data_audit logs on the NetWorker server. Where there is already heavy load, or many or large space recovery jobs, this can contribute to unresponsiveness, particularly from Storage Nodes, which return the information remotely to the NetWorker server. To disable this, raise the logging threshold for these logs on the NetWorker server:

# nsradmin
nsradmin> show name; auditlog severity
nsradmin> print type: nsr auditlog

To restrict this change to the data audit log only, refine the query to that specific instance by including its name. Skip this step to apply the setting to every auditlog instance:

nsradmin> print type: nsr auditlog; name: servername_data_audit.raw

Change the threshold of one or both to 'Error' to stop logging the individual deletes. Deletions are still logged in the server's daemon.raw.

nsradmin> update auditlog severity: Error
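
The same commands could also be collected in a file and passed to nsradmin non-interactively. The sketch below only writes that command file; the function name and file path are hypothetical, and it assumes that your nsradmin version accepts an input file through its -i option, which should be verified against the nsradmin man page before use.

```shell
#!/bin/sh
# Sketch only: write the nsradmin commands from this article to a file.
# write_auditlog_commands RESOURCE_NAME OUT_FILE
#   RESOURCE_NAME: the auditlog instance, e.g. servername_data_audit.raw
#   OUT_FILE:      where to write the command file (hypothetical path)
write_auditlog_commands() {
    resource_name="$1"
    out_file="$2"
    {
        printf 'show name; auditlog severity\n'
        printf 'print type: nsr auditlog; name: %s\n' "$resource_name"
        printf 'update auditlog severity: Error\n'
    } > "$out_file"
}

# Assumed usage (not run here):
#   write_auditlog_commands servername_data_audit.raw /tmp/auditlog.cmd
#   nsradmin -i /tmp/auditlog.cmd
```

Review the generated file before running it, and replace servername_data_audit.raw with the instance name shown by the print command on your server.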

Affected Products

NetWorker

Article Properties

Article Number: 000225835
Article Type: Solution
Last Modified: 26 Nov 2025
Version: 2