PowerFlex 4.X: MDM reconnection delay due to DNS timeout

Summary: In certain environments, an MDM restart may take approximately 20 seconds before rejoining the cluster. The delay occurs during ActiveMQ CMS initialization, which is part of the MDM startup sequence. If DNS name resolution is slow or misconfigured, the CMS initialization blocks until the DNS lookup times out. ...

Symptoms

When the MDM daemon starts, the following appears in the logs:

  • In the MDM events, the other MDM starts and connects about 20 seconds after the SWITCH_MDM_OWNERSHIP command is received:

2025/12/12 20:53:06.835 MDM_CLI_CONF_COMMAND_RECEIVED INFO Command SWITCH_MDM_OWNERSHIP received, User: 'admin'.
2025/12/12 20:53:27.065 REMOTE_SYSLOG_MODULE_INITIALIZED INFO Initialized the remote syslog module
2025/12/12 20:53:27.065 MDM_MANAGER_START INFO MDM started with the role of Manager
2025/12/12 20:53:27.133 MDM_CLUSTER_CONNECTED INFO The MDM, server-014 (ID 000000000000000e), connected after 2000ms

 

  • In the MDM trace logs, a ~20-second gap can be observed during CMS initialization:
2025/12/12 20:53:07.022040 LOW:000000000000:mosTrcLayer_Create:00210: ---------- Process started. Version private PowerFlex R4_5.4000.111_Release, CodeBase , Feb 20 2025. PID 9138 ----------
2025/12/12 20:53:07.061777 LOW:7fe11c4fadb0:amqEventMgr_Init:00037: Starting AMQ CMS initialization <<<=======
2025/12/12 20:53:27.064872 LOW:7fe11c4fadb0:amqProducer_Create:00323: Creating events handler
2025/12/12 20:53:27.065210 LOW:7fe11c4fadb0:amqEventMgr_Init:00048: Finished AMQ CMS initialization <<<=======
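
The gap can also be measured directly instead of being read off the timestamps. The following is a minimal sketch, assuming the trace log uses the timestamp layout shown above (date, then HH:MM:SS.ffffff as the second field). The function name cms_init_gap is hypothetical and not part of PowerFlex.

```shell
#!/usr/bin/env bash
# Print the seconds elapsed between the "Starting" and "Finished"
# AMQ CMS initialization lines of an MDM trace log (file argument or stdin).
cms_init_gap() {
  awk '
    # Convert the HH:MM:SS.ffffff field into seconds since midnight.
    function secs(t,  p) { split(t, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }
    /Starting AMQ CMS initialization/ { start = secs($2) }
    /Finished AMQ CMS initialization/ && start != "" {
      gap = secs($2) - start
      if (gap < 0) gap += 86400   # startup crossed midnight
      printf "%.3f\n", gap
      start = ""
    }
  ' "$@"
}
```

Usage: cms_init_gap <mdm_trace_log>. For the excerpt above it prints 20.003; a healthy startup would be expected to print a value well under one second.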

Impact

The MDM daemon startup time is extended, which increases the time required for the MDM to reconnect to the cluster. This delay can be observed during operations that restart an MDM, such as:

  • MDM ownership switchover
  • MDM service restart
  • Node reboot
  • Other events that trigger an MDM restart

As a result, the MDM_CLUSTER_CONNECTED event appears approximately 20 seconds after the restart, rather than immediately.

Cause

When an MDM daemon starts and attempts to rejoin the cluster, it performs several internal initialization steps before becoming operational. One of these steps is the initialization of the event and notification framework (ActiveMQ CMS), which is used for system alerts and monitoring.

Currently, this initialization is:

  • Synchronous
  • Required before cluster membership is established

Therefore, the MDM cannot join the cluster until CMS initialization completes. If DNS name resolution on the node is slow or misconfigured, CMS initialization may wait for DNS queries to time out, resulting in an approximately 20-second startup delay.

Resolution

Verify that DNS configuration and hostname resolution are functioning correctly on the MDM nodes.
The following commands can be used to test hostname resolution performance:

  1. Verify DNS latency:

    Note: Perform these checks on a Secondary MDM or non-primary node. Avoid running them on the Primary MDM.
    time hostname -f
    time getent hosts $(hostname -f)

    The real time should be less than 1 second.

    If the commands take ~20 seconds or time out, this indicates a DNS resolution issue.

     

  2. Verify DNS configuration:
    Check the configured DNS servers in /etc/resolv.conf:

    • Verify that valid nameserver entries exist.

    • Verify that the configured DNS servers are reachable.

    Correct the DNS configuration or hostname resolution settings on the affected node and re-run the tests.

    Note: Perform these checks on a Secondary MDM or non-primary node. Avoid running them on the Primary MDM.

     

  3. Verify DNS server reachability:

    Test connectivity to the configured DNS servers:

    ping <dns_server_ip>

 

  4. Re-run the DNS latency test from Step 1.

  5. After DNS resolution is functioning correctly, you can confirm the fix by restarting the MDM daemon and checking that the MDM rejoins the cluster within a couple of seconds.

    Note: Perform the MDM daemon restart only on a Secondary or Standby MDM. Do not restart the daemon on the Primary MDM. You can check the cluster status with 'scli --query_cluster'.
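
The checks in Steps 1 to 3 can be combined into one script. The following is a minimal sketch, not a Dell-provided tool: all function names are hypothetical, and it assumes a Linux node with GNU date (for the %N nanosecond format) and iputils ping (for the -W timeout option). As with the manual checks, run it on a Secondary MDM or non-primary node.

```shell
#!/usr/bin/env bash
# Run a command and print how long it took, in whole milliseconds.
elapsed_ms() {
  local start end
  start=$(date +%s%N)          # nanoseconds since the epoch (GNU date)
  "$@" >/dev/null 2>&1
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

# Classify a latency against the 1-second guideline from Step 1.
classify_ms() {
  if [ "$1" -lt 1000 ]; then echo "ok"; else echo "slow"; fi
}

# Print the nameserver addresses listed in a resolv.conf-style file.
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Steps 1 to 3 in one pass: time the lookups, then ping each nameserver.
dns_health_check() {
  local fqdn ms ns
  ms=$(elapsed_ms hostname -f)
  echo "hostname -f: ${ms} ms ($(classify_ms "$ms"))"
  fqdn=$(hostname -f)
  ms=$(elapsed_ms getent hosts "$fqdn")
  echo "getent hosts $fqdn: ${ms} ms ($(classify_ms "$ms"))"
  for ns in $(list_nameservers); do
    if ping -c 1 -W 2 "$ns" >/dev/null 2>&1; then
      echo "nameserver $ns: reachable"
    else
      echo "nameserver $ns: NOT reachable"
    fi
  done
}
```

After sourcing the file, run dns_health_check. Any line marked slow, or any nameserver reported as not reachable, points to the DNS settings to correct before re-testing. A successful ping only shows that the server answers ICMP; it does not prove that the DNS service itself is responding.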

 

Impacted Versions

PowerFlex 4.x

Fixed In Version

N/A - Environmental/configuration issue

Affected Products

PowerFlex rack, ScaleIO
Article Properties
Article Number: 000438735
Article Type: Solution
Last Modified: 12 Mar 2026
Version: 4