PowerScale: NDMP Performance Troubleshooting

Summary: When investigating Network Data Management Protocol (NDMP) performance issues on a PowerScale cluster, check CPU utilization, disk I/O, and the network infrastructure for possible causes.


Instructions

Several NDMP performance enhancements were made in newer OneFS 9.x releases. Verify the cluster's OneFS version and installed RUPs to ensure that the latest improvements are applied.

Assess NDMP performance by analyzing three key system resources:

  • CPU Utilization
  • Disk I/O
  • Network infrastructure

CPU Performance Analysis

For each node that is reported to be running slowly, check the isi_hw_status and top outputs.

  1. Identify Virtual Cores

From isi_hw_status, calculate virtual cores:

Virtual Cores = CPUs × Cores per CPU × 2 (if Hyperthreading is enabled)

Example:

PROC: Single-proc, Dual-HT-core → 1 × 2 × 2 = 4 virtual cores
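
The calculation above can be sketched in Python. This is a minimal illustration, not part of any OneFS tooling: the function name and its parameters are hypothetical, and the input values are assumed to be read manually from the isi_hw_status output.

```python
def virtual_cores(cpus: int, cores_per_cpu: int, hyperthreading: bool) -> int:
    """Return the number of virtual cores for one node.

    Hypothetical helper: the inputs are taken by hand from isi_hw_status.
    """
    # Each physical core counts twice when Hyperthreading is enabled.
    threads_per_core = 2 if hyperthreading else 1
    return cpus * cores_per_cpu * threads_per_core


# Single-proc, Dual-HT-core node from the example above
print(virtual_cores(cpus=1, cores_per_cpu=2, hyperthreading=True))  # prints 4
```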

  2. Check Load Averages

From the top output, review the 1-, 5-, and 15-minute load averages:

load averages: 4.71, 3.48, 3.09

If the load average exceeds the number of virtual cores, CPU load might be a contributing factor to NDMP performance issues. The recommendation is to reduce the number of active processes or redistribute the load to less heavily used nodes.
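
The comparison described above could be automated along these lines. This is a sketch under stated assumptions: the function names are hypothetical, the input is a line copied from the top output, and the check flags a node when any of the three reported averages exceeds the virtual core count (the article does not say which average to prefer).

```python
import re
from typing import List


def parse_load_averages(line: str) -> List[float]:
    """Extract the three load averages from a 'load averages: a, b, c' line."""
    match = re.search(
        r"load averages?:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", line
    )
    if match is None:
        raise ValueError(f"no load averages found in: {line!r}")
    return [float(value) for value in match.groups()]


def cpu_overloaded(line: str, virtual_cores: int) -> bool:
    """Return True when any load average exceeds the virtual core count.

    Assumption: 'any of the three' is an illustrative policy choice.
    """
    return any(load > virtual_cores for load in parse_load_averages(line))


# Load averages from the example above, on a node with 4 virtual cores
print(cpu_overloaded("load averages: 4.71, 3.48, 3.09", virtual_cores=4))  # prints True
```

Here the first average (4.71) exceeds 4 virtual cores, so the node is flagged even though the two longer averages are below the limit.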

Disk Performance Analysis

Steps:

  1. Review Drive Statistics

For each node that is reported to be running slowly, check the isi statistics drive output and examine the Queued column. A value:

  • > 1.0 indicates queuing
  • > 1.5 suggests significant performance degradation

Example:

Queued: 2.3 → High I/O wait on the spindle

  2. Check Storage Utilization

Ensure that disk usage is below 90%. High utilization can exacerbate performance issues.

Example:

Used: 63.2%  <-- Within acceptable range

  3. Recommendations

If queuing is high, reduce I/O load, redistribute backups, or scale resources.
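
The disk thresholds in this section can be expressed as a short Python sketch. The function names and returned labels are hypothetical; the thresholds (1.0 and 1.5 for queuing, 90% for utilization) come from the steps above, and the inputs are assumed to be copied by hand from the drive statistics and storage usage outputs.

```python
def classify_queue(queued: float) -> str:
    """Classify a drive's queue value using the thresholds above."""
    if queued > 1.5:
        return "significant performance degradation"
    if queued > 1.0:
        return "queuing"
    return "ok"


def utilization_ok(used_percent: float, limit_percent: float = 90.0) -> bool:
    """Return True when disk usage is below the 90% guideline."""
    return used_percent < limit_percent


# Values from the examples above
print(classify_queue(2.3))   # prints significant performance degradation
print(utilization_ok(63.2))  # prints True
```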

Network Performance Analysis (Three-Way NDMP Only)

Steps:

  1. Identify NDMP Connections

In the netstat output, locate the NDMP CONTROL connection (port 10000) and identify the corresponding DATA connection (typically listed above it).

Example:

tcp4  0  384563 172.19.220.31.23261  172.19.200.22.55621  ESTABLISHED  ← DATA
tcp4  0       0 172.17.2.91.10000    172.19.200.22.55424  ESTABLISHED  ← CONTROL

  2. Analyze Send-Q

A high and stable Send-Q (for example, a six-digit value) indicates that data is being sent but not acknowledged, suggesting a bottleneck.

  3. Check Backup Server

On the backup server, inspect the Recv-Q. A high value implies that the Data Management Application (DMA) is overwhelmed.

  4. Recommendations

If the DMA is the bottleneck, engage the DMA support team for further assistance.
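
The netstat checks above can be sketched in Python as follows. This is an illustration only: all names are hypothetical, the parser assumes the column layout shown in the example (protocol, Recv-Q, Send-Q, local address, remote address, state, with the port appended to the address after a final dot), the 100000 threshold is one reading of "six-digit value", and "stable" is interpreted as every sample staying at or above that threshold.

```python
from typing import List, NamedTuple, Optional


class Connection(NamedTuple):
    recv_q: int
    send_q: int
    local: str
    remote: str
    state: str


def parse_netstat_line(line: str) -> Optional[Connection]:
    """Parse one TCP line of netstat output; return None for other lines."""
    fields = line.split()
    if len(fields) < 6 or not fields[0].startswith("tcp"):
        return None
    return Connection(int(fields[1]), int(fields[2]),
                      fields[3], fields[4], fields[5])


def is_control(conn: Connection, port: int = 10000) -> bool:
    """Return True when the local port is the NDMP control port."""
    return conn.local.rsplit(".", 1)[-1] == str(port)


def send_q_backlogged(samples: List[int], threshold: int = 100000) -> bool:
    """Return True when every Send-Q sample is at or above the threshold.

    Assumption: threshold and 'stable' interpretation are illustrative.
    """
    return bool(samples) and all(value >= threshold for value in samples)


output = """\
tcp4  0  384563 172.19.220.31.23261  172.19.200.22.55621  ESTABLISHED
tcp4  0       0 172.17.2.91.10000    172.19.200.22.55424  ESTABLISHED
"""
conns = [c for c in map(parse_netstat_line, output.splitlines()) if c]
data = [c for c in conns if not is_control(c)]
print(data[0].send_q)  # prints 384563

# Hypothetical repeated readings of the same DATA connection
print(send_q_backlogged([384563, 384563, 384563]))  # prints True
```

A single reading is not enough to call the Send-Q stable, so the last helper expects values collected from several netstat runs taken some time apart.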

Affected Products

Isilon, PowerScale OneFS

Article Properties

Article Number: 000187297
Article Type: How To
Last Modified: 20 Aug 2025
Version: 6