PowerProtect Data Manager: In the PPDM UI, the status of the search cluster shows that a particular search node is in a failed state

Summary: The search node becomes unresponsive, and indexing jobs remain queued because they cannot run on a failed node. This can happen on a search node running release 19.16 or earlier. ...

Symptoms

On the search node that is in a failed state, check the messages log in /var/log. It contains an entry similar to:

2024-07-08T10:00:12.049322-04:00 search_node_name kernel: [518834.025665][    C1] watchdog: BUG: soft lockup - CPU#1 stuck for 235970s! [nfsd:2692]
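To confirm the symptom from a shell, the messages log can be searched for the soft lockup signature. The following is a minimal sketch, not a Dell-supplied tool: the function names find_soft_lockups and count_soft_lockups are hypothetical, and the default path /var/log/messages is assumed from the log location described above.

```shell
#!/bin/sh
# Hypothetical helper: print soft lockup entries that mention nfsd.
# The log path defaults to /var/log/messages (assumed from the article).
find_soft_lockups() {
    log_file="${1:-/var/log/messages}"
    grep -E 'watchdog: BUG: soft lockup - CPU#[0-9]+ stuck for [0-9]+s! \[nfsd:[0-9]+\]' "$log_file"
}

# Hypothetical helper: count the matching entries (prints 0 if there are none).
count_soft_lockups() {
    find_soft_lockups "$1" | wc -l | tr -d ' '
}
```

Calling find_soft_lockups with no argument searches the default log; passing a path searches a copy of the log instead.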

Affected versions: 19.16 and earlier

Investigated by Dell Engineering in PPDMESC-6808

 

Cause

The NFS daemon (nfsd) on the search cluster hits an OS-level soft lockup. For more details about soft lockups, see:
https://www.suse.com/support/kb/doc/?id=000018705

 

Resolution

Workaround:
Log in to the search node on which nfsd was unresponsive.

NOTE: If you need the credentials for the search node, run the following command on the PPDM appliance as the root user:
source /opt/emc/vmdirect/unit/vmdirect.env  &&  /opt/emc/vmdirect/bin/infranodemgmt get -secret 

This returns the admin and root credentials for the search nodes. Open an SSH session to the affected search node as the admin user and run the following command:

echo 20 > /proc/sys/kernel/watchdog_thresh

This command sets the watchdog threshold to 20 seconds. The change takes effect immediately but does not persist across a restart of the server. To make it persistent, run the following commands:

echo "kernel.watchdog_thresh=20" > /etc/sysctl.d/99-watchdog_thresh.conf
sysctl -p  /etc/sysctl.d/99-watchdog_thresh.conf
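After applying both changes, it can be useful to confirm that the running value and the persisted setting agree. The sketch below is an illustration, not part of the documented procedure: the helper names are hypothetical, and the default paths are the ones used in the commands above. For reference, the Linux kernel documents the soft lockup threshold as twice watchdog_thresh, so a value of 20 corresponds to a 40-second threshold.

```shell
#!/bin/sh
# Hypothetical helper: read the running watchdog threshold.
# The optional path argument exists only so the function can be tried on a test file.
running_thresh() {
    cat "${1:-/proc/sys/kernel/watchdog_thresh}"
}

# Hypothetical helper: read the persisted value from the sysctl drop-in file.
persisted_thresh() {
    conf="${1:-/etc/sysctl.d/99-watchdog_thresh.conf}"
    sed -n 's/^[[:space:]]*kernel\.watchdog_thresh[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1
}

# The kernel reports a soft lockup after 2 * watchdog_thresh seconds.
soft_lockup_seconds() {
    echo $(( 2 * $1 ))
}
```

One way to use these helpers is to compare the output of running_thresh and persisted_thresh; if the two values differ, the persisted file has not been loaded or the runtime value was changed afterward.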

Permanent Fix: Upgrade to PowerProtect Data Manager 19.16 P2, or to 19.17 or later.

 

Products

PowerProtect Data Manager

Article Properties
Article Number: 000228169
Article Type: Solution
Last Modified: 07 Jul 2025
Version:  2