Avamar: NDMP backup fails with Fatal signal 11 segmentation fault from volume memory use
Summary: An Avamar Network Data Management Protocol (NDMP) backup can abort with "Fatal signal 11" (segmentation fault) when a very large volume (for example, 5 TB and 3 million files) is backed up with multiple streams, consuming approximately 15 GB of Random Access Memory (RAM)/SWAP and exceeding memory limits. Resolve by splitting the volume, reducing concurrent backups or file count, backing up a lower directory level, or increasing parallel streams. ...
Symptoms
Backup Failure Indications
The following symptoms are observed when an Avamar NDMP backup encounters a segmentation fault (signal 11):
- Backup job aborts with a fatal error message similar to:
2017-10-13 19:42:00 avtar FATAL <5889>: Fatal signal 11 in pid 31103
- Log entries show unusually large datasets being processed, for example:
avtar Info <8688>: Status 2017-10-13 19:32:37, 3,050,352 files, 2,419,299 directories, 5,119 GB (3,050,352 files, 1.913 GB, 41.42% new) 15049MB 60% CPU (1 open files)
- High memory consumption is reported: approximately 15 GB of RAM/SWAP for a single backup process (15049MB in the log entry above).
- Numerous NDMP streams are active (up to 8 per client), each potentially using 2 GB or more of memory.
- Multiple large backups may run concurrently, increasing the overall system load.
- The affected volume contains millions of files and directories (for example, 3 million files in 2.4 million directories covering 5.1 TB of data).
- Even when only a small amount of data has changed (for example, 1.9 GB), the backup process attempts to send every file from the NAS for processing.
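The file count, directory count, and memory figure can be pulled out of avtar Status lines to spot oversized backups quickly. The following is a minimal sketch that assumes the log lines follow the exact layout of the sample entry above; the function name parse_status_line is hypothetical and is not part of Avamar.

```shell
# Hypothetical helper: reads avtar log text on stdin and prints
# "<files> <directories> <memory_mb>" for each Status line.
# Assumes the Status line layout shown in the sample log entry.
parse_status_line() {
  sed -n -E 's/.*Status [0-9-]+ [0-9:]+, ([0-9,]+) files, ([0-9,]+) directories,.* ([0-9]+)MB .*/\1 \2 \3/p' \
    | tr -d ','   # strip thousands separators so the numbers can be compared
}
```

For the sample entry, piping the line into parse_status_line yields the three values 3050352, 2419299, and 15049, which matches the roughly 15 GB of memory described in the symptoms.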
Cause
Underlying factors that triggered the fatal signal 11.
Signal 11 (segmentation fault) is generated when a process accesses memory that is not allocated to it. The following conditions directly caused this event during the Avamar NDMP backup:
- Multiple large NDMP backups were running concurrently.
- One backup processed 3,050,352 files and 2,419,299 directories, totaling 5.1 TB of data, while only 1.9 GB of that data had changed.
- Each NDMP stream can consume ≥ 2 GB of memory. The client was permitted up to 8 streams, and several clients were active simultaneously, leading to a high aggregate memory demand.
- The backup process used approximately 15 GB of RAM/SWAP before the crash.
- Avamar limits the number of streams per client but does NOT enforce a global limit on the total number of streams. This allows the combined memory usage to exceed available resources.
These memory‑intensive conditions caused the avtar process to encounter a segmentation fault, recorded in the log as:
2017-10-13 19:42:00 avtar FATAL <5889>: Fatal signal 11 in pid 31103
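The aggregate demand described above can be approximated with simple arithmetic: clients multiplied by streams per client multiplied by memory per stream. The sketch below is a rough lower-bound estimate only, using the 2 GB per stream figure from this article; the function name is hypothetical, and actual per-stream usage may be much higher.

```shell
# Hypothetical helper: lower-bound memory estimate in GB.
# Arguments: <clients> <streams_per_client> <gb_per_stream> (integers).
estimate_stream_memory_gb() {
  echo $(( $1 * $2 * $3 ))
}
```

For one client with 8 streams at 2 GB each, estimate_stream_memory_gb 1 8 2 prints 16, which is already above the roughly 15 GB observed before the crash. Because there is no global stream limit, each additional concurrent client adds to that total.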
Resolution
Fixing Avamar NDMP Backup Failures Caused by Signal 11 (Segmentation Fault)
Step 1 - Assess Current Backup Load
Use the Avamar Administrator console or CLI to identify volumes that generate large NDMP backups.
List active NDMP jobs and their resource usage:
$ avtar -listjobs -type ndmp
Step 2 - Reduce Simultaneous Volume Backups
- Limit the number of volumes backed up simultaneously to avoid excessive RAM/SWAP consumption.
- In the Avamar Administrator, edit the backup schedule and deselect overlapping windows.
Step 3 - Split Large Volumes into Smaller Sub‑Volumes
- Identify volumes with more than 3 million files or more than 5 TB of data (as in the example).
- Create logical subvolumes one level lower in the directory tree.
- Configure each subvolume as a separate NDMP client in Avamar.
- Example: Create a new NDMP client for a sub‑directory
$ avtar -addclient -name subvol1 -path /data/level2/subvol1
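To decide where to split, it can help to see how files are distributed across the first-level subdirectories. The sketch below assumes the volume is mounted (for example over NFS) on a host where find can be run, which this article does not state; the function name is hypothetical. On a volume with millions of files the scan itself can take a long time.

```shell
# Hypothetical helper: prints "<file_count> <subdirectory>" for each
# first-level subdirectory of the given root, largest count first.
# Assumes the NAS volume is reachable as a local mount point.
count_files_per_subdir() {
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue          # skip when there are no subdirectories
    count=$(find "$dir" -type f | wc -l | tr -d ' ')
    printf '%s %s\n' "$count" "${dir%/}"
  done | sort -rn
}
```

Subdirectories that dominate the count are the natural candidates for separate backups one level lower in the tree.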
Step 4 - Adjust NDMP Stream Settings
- Increase the maximum number of NDMP streams per client if the NAS supports it.
- In the NAS NDMP configuration, raise the stream limit from the default 4 to 8 where possible.
Step 5 - Limit Files per Stream
- When creating backup policies, set a lower "files per stream" threshold to keep each stream’s memory footprint below 2 GB.
- Use the Avamar Administrator → Policies → Advanced Settings to adjust this value.
Step 6 - Monitor Memory Usage During Backups
- Watch RAM and swap consumption on the Avamar server while the backup runs.
- Ensure usage stays well below the total available memory (for example, less than 12 GB for a 15 GB job).
- Example: Take a one-time snapshot of avtar memory usage
$ top -b -n 1 | grep avtar
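When several avtar processes run at once, a single total is easier to track than individual top rows. The sketch below sums resident set sizes; it assumes a Linux host with procps ps, and both function names are hypothetical. It counts resident memory only and does not include swapped-out pages.

```shell
# Hypothetical helper: reads RSS values in KiB (one per line) on stdin
# and prints the total in whole MiB.
sum_rss_mb() {
  awk '{ total += $1 } END { printf "%d\n", total / 1024 }'
}

# Hypothetical wrapper: total resident memory of all avtar processes.
# Assumes procps ps, where "-o rss=" prints RSS in KiB without a header.
avtar_rss_mb() {
  ps -C avtar -o rss= | sum_rss_mb
}
```

Running avtar_rss_mb periodically, for example with watch, gives a running view of total avtar memory during the backup window.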
Step 7 - Validate the Fix
- Run the previously failed backup again.
- Confirm that the log no longer contains the Fatal signal 11 message.
- Verify that the backup completes successfully and that the reported data size matches expectations.
- Check the latest backup log for errors
$ tail -n 50 /var/log/avtar/backup.log
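The check for the fatal message can also be scripted so it does not depend on the message being in the last 50 lines. This is a minimal sketch; the function name is hypothetical, and the log path is passed as an argument because the location may differ between installations.

```shell
# Hypothetical helper: exits 0 if the given log file contains the
# "Fatal signal 11" message, non-zero otherwise.
log_has_fatal_signal() {
  grep -qF 'Fatal signal 11' -- "$1"
}
```

A typical use would be: if log_has_fatal_signal /var/log/avtar/backup.log; then echo "segmentation fault still present"; fi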