PowerFlex 4.X: MDM reconnection delay due to DNS timeout

Summary: In certain environments, a restarted MDM may take approximately 20 seconds to rejoin the cluster. The delay occurs during ActiveMQ CMS initialization, which is part of the MDM startup sequence. If DNS name resolution is slow or misconfigured, the CMS initialization blocks until the DNS lookup times out. ...


Symptoms

  • When the MDM daemon starts, the MDM events show it reconnecting to the cluster about 20 seconds after the restart was triggered:

2025/12/12 20:53:06.835 MDM_CLI_CONF_COMMAND_RECEIVED INFO Command SWITCH_MDM_OWNERSHIP received, User: 'admin'.
2025/12/12 20:53:27.065 REMOTE_SYSLOG_MODULE_INITIALIZED INFO Initialized the remote syslog module
2025/12/12 20:53:27.065 MDM_MANAGER_START INFO MDM started with the role of Manager
2025/12/12 20:53:27.133 MDM_CLUSTER_CONNECTED INFO The MDM, server-014 (ID 000000000000000e), connected after 2000ms

 

  • In the MDM trace logs, a ~20-second gap can be observed during CMS initialization:
2025/12/12 20:53:07.022040 LOW:000000000000:mosTrcLayer_Create:00210: ---------- Process started. Version private PowerFlex R4_5.4000.111_Release, CodeBase , Feb 20 2025. PID 9138 ----------
2025/12/12 20:53:07.061777 LOW:7fe11c4fadb0:amqEventMgr_Init:00037: Starting AMQ CMS initialization <<<=======
2025/12/12 20:53:27.064872 LOW:7fe11c4fadb0:amqProducer_Create:00323: Creating events handler
2025/12/12 20:53:27.065210 LOW:7fe11c4fadb0:amqEventMgr_Init:00048: Finished AMQ CMS initialization <<<=======
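
The gap between the Starting and Finished trace lines can be measured rather than read by eye. The following is a minimal shell sketch, assuming the trace line layout shown above (time of day in the second whitespace-separated field) and that both lines fall on the same calendar day. The function name cms_init_gap is illustrative and is not part of PowerFlex.

```shell
#!/bin/sh
# Sketch: report the time spent in AMQ CMS initialization, based on the
# "Starting"/"Finished" trace lines shown in this article.
# Assumption: both lines are logged on the same day (no midnight rollover).

cms_init_gap() {
    # Reads MDM trace lines on stdin and prints the gap in seconds.
    awk '
        # Convert an HH:MM:SS.ffffff time of day to seconds.
        function to_seconds(ts,    p) {
            split(ts, p, ":")
            return p[1] * 3600 + p[2] * 60 + p[3]
        }
        /Starting AMQ CMS initialization/ {
            start = to_seconds($2)
            have_start = 1
        }
        /Finished AMQ CMS initialization/ && have_start {
            printf "%.3f\n", to_seconds($2) - start
            exit
        }
    '
}

# Example (placeholder path, substitute the actual MDM trace file):
#   cms_init_gap < <mdm_trace_file>
```

For the trace excerpt above, this prints 20.003, which matches the roughly 20-second stall. A value well under one second would suggest that CMS initialization is not the source of the delay.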

Impact

The MDM daemon startup time is extended, which increases the time required for the MDM to reconnect to the cluster. This delay can be observed during operations that restart an MDM, such as:

  • MDM ownership switchover
  • MDM service restart
  • Node reboot
  • Other events that trigger an MDM restart

As a result, the MDM_CLUSTER_CONNECTED event appears approximately 20 seconds after the restart, rather than immediately.

Cause

When an MDM daemon starts and attempts to rejoin the cluster, it performs several internal initialization steps before becoming operational. One of these steps is the initialization of the event and notification framework (ActiveMQ CMS), which is used for system alerts and monitoring.

Currently, this initialization is:

  • Synchronous
  • Required before cluster membership is established

Therefore, the MDM cannot join the cluster until CMS initialization completes. If DNS name resolution on the node is slow or misconfigured, CMS initialization may wait for DNS queries to time out, resulting in an approximately 20-second startup delay.

Resolution

Verify that DNS configuration and hostname resolution are functioning correctly on the MDM nodes.
The following commands can be used to test hostname resolution performance:

  1. Verify DNS latency:

    Note: Perform these checks on a Secondary MDM or non-primary node. Avoid running them on the Primary MDM.
    time hostname -f
    time getent hosts $(hostname -f)

    The real time reported for each command should be less than 1 second.

     If the commands take ~20 seconds or time out, this indicates a DNS resolution issue.

     

  2. Verify DNS configuration:
    Check the configured DNS servers in /etc/resolv.conf:

    • Verify that valid nameserver entries exist.

    • Verify that the configured DNS servers are reachable.

    Correct the DNS configuration or hostname resolution settings on the affected node and re-run the tests.

    Note: Perform these checks on a Secondary MDM or non-primary node. Avoid running them on the Primary MDM.

     

  3. Verify DNS server reachability

    Test connectivity to the configured DNS servers:

    ping <dns_server_ip>

 

  4. Re-run the DNS latency test from Step 1.

  5. After DNS resolution is functioning correctly, restart the MDM daemon and check that the MDM rejoins the cluster within a couple of seconds.

    Note: Perform the MDM daemon restart only on a Secondary or Standby MDM. Do not restart the daemon on the Primary MDM. You can check the cluster status with 'scli --query_cluster'.
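
The latency and configuration checks from Steps 1 and 2 can be wrapped in a few helper functions so that they are easy to repeat before and after a DNS change. This is a sketch only: the function names are illustrative, the 1000 ms threshold simply restates the 1-second guideline from Step 1, and elapsed_ms assumes GNU date (for the %N nanosecond format), which is an assumption about the node's userland.

```shell
#!/bin/sh
# Sketch of the Step 1 and Step 2 checks. All function names are illustrative.

# Print the nameserver addresses from a resolv.conf-style file
# (defaults to /etc/resolv.conf when no argument is given).
list_nameservers() {
    awk '$1 == "nameserver" && $2 != "" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Run a command, discard its output, and print the elapsed wall-clock
# time in milliseconds. Assumes GNU date, which supports %N.
elapsed_ms() {
    start_ns=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end_ns=$(date +%s%N)
    echo $(( (end_ns - start_ns) / 1000000 ))
}

# Classify a duration in milliseconds against the 1-second guideline.
classify_lookup_ms() {
    if [ "$1" -lt 1000 ]; then
        echo "OK"
    else
        echo "SLOW"
    fi
}

# Example, to be run on a Secondary MDM or non-primary node only:
#   ms=$(elapsed_ms getent hosts "$(hostname -f)")
#   echo "lookup took ${ms} ms: $(classify_lookup_ms "$ms")"
#   list_nameservers
```

A result of SLOW together with an empty or unexpected nameserver list points back to Step 2. A result of SLOW with plausible nameserver entries points to the reachability check in Step 3.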

 

Impacted Versions

PowerFlex 4.x

Fixed In Version

N/A - Environmental/configuration issue

Affected Products

PowerFlex rack, ScaleIO
Article Properties
Article Number: 000438735
Article Type: Solution
Last Modified: 12 Mar 2026
Version: 4