
PowerFlex: Frequent PowerFlex MDM Disconnects

Summary: PowerFlex components such as the MDM disconnect and reconnect rapidly and frequently. The MDM events show a lost-connection warning followed almost immediately by a connected event, with the reconnection taking 80-190 ms.


Article Content


Instructions

Symptoms

The MDM event log shows the MDM component frequently disconnecting and reconnecting:
 

2023-xx-xx 00:00:21.316 MDM_CLUSTER_LOST_CONNECTION WARNING        The MDM, <MDM_Name> (ID <MDM_ID>), has lost connection to the cluster.
2023-xx-xx 00:00:21.419 MDM_CLUSTER_CONNECTED     INFO             The MDM, <MDM_Name> (ID <MDM_ID>), connected after 100ms
2023-xx-xx 00:00:23.480 MDM_CLUSTER_LOST_CONNECTION WARNING        The MDM, <MDM_Name> (ID <MDM_ID>), has lost connection to the cluster.
2023-xx-xx 00:00:23.584 MDM_CLUSTER_CONNECTED     INFO             The MDM, <MDM_Name> (ID <MDM_ID>), connected after 110ms
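
To gauge how often the MDM is flapping, events like those above can be tallied with a small shell helper. This is a minimal sketch, not a Dell-provided tool: the function name mdm_flap_summary is hypothetical, and it assumes the events have been exported to a text file in the format shown above, with the reconnect time as the last field of each MDM_CLUSTER_CONNECTED line.

```shell
# Hypothetical helper: summarize MDM connection flaps from event text on stdin.
# Assumes the event line format shown in this article.
mdm_flap_summary() {
  awk '
    /MDM_CLUSTER_LOST_CONNECTION/ { lost++ }
    /MDM_CLUSTER_CONNECTED/ {
      # The reconnect time is the last field, for example "100ms".
      ms = $NF
      sub(/ms$/, "", ms)
      ms += 0
      if (n == 0 || ms < min) min = ms
      if (ms > max) max = ms
      n++
    }
    END {
      printf "lost=%d reconnected=%d min_ms=%d max_ms=%d\n", lost, n, min, max
    }
  '
}
```

It could be run as `mdm_flap_summary < events.txt`, where events.txt is a hypothetical export of the MDM event log. A large lost count with reconnect times in the 80-190 ms range matches the symptom described here.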

sar output from the disconnecting MDM server shows a high rate of TCP retransmissions (the retrans/s column):
 

sar -n ETCP 1 -t -f sar.0
                 atmptf/s  estres/s retrans/s  isegerr/s orsts/s
00:00:27 AM      0.00      0.00     62.00      0.00      0.00
00:00:28 AM      0.00      0.00     88.12      0.00      0.00
00:00:29 AM      0.00      3.00    100.00      0.00      0.00
00:00:30 AM      0.00      0.00     71.29      0.00      0.00
00:00:31 AM      0.00      0.00     71.00      0.00      0.00
...
00:01:02 AM      0.00      0.00     48.51      0.00      0.00
00:01:03 AM      0.00      0.00     15.00      0.00      0.00
00:01:04 AM      0.00      0.00    207.00      0.00      0.00
00:01:05 AM      0.00      0.00     36.00      0.00      0.00
00:01:06 AM      0.00      0.99    105.94      0.00      0.00
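
When the sar file covers a long period, it can help to list only the seconds where retransmissions exceed a threshold. The sketch below is illustrative only: the function name high_retrans and the default threshold of 50 are assumptions, and it relies on retrans/s being the third column from the end, as in the output above.

```shell
# Hypothetical helper: print the timestamp and retrans/s value for every
# sample whose retrans/s exceeds the threshold (first argument, default 50).
# Reads "sar -n ETCP" output on stdin.
high_retrans() {
  awk -v t="${1:-50}" '
    # Skip headers, the "..." filler, and the Average line; retrans/s is the
    # third field from the end in the column layout shown above.
    NF >= 5 && $1 != "Average:" && $(NF-2) ~ /^[0-9]+(\.[0-9]+)?$/ && $(NF-2) + 0 > t + 0 {
      print $1, $(NF-2)
    }
  '
}
```

For example, `sar -n ETCP 1 -t -f sar.0 | high_retrans 100` would print only the samples above 100 retransmissions per second. The AM/PM marker, if present, is not included in the printed timestamp.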

 

Impact

Brief MDM Cluster degraded events

Performance degradation
 

Root Cause

The MDM server was patched, and the Linux kernel was upgraded from 3.x to 5.x. This kernel upgrade changes the default values of many OS parameters. In this case, several TCP parameters changed, including "net.ipv4.tcp_fack", which was disabled; that change appears to have caused the high TCP retransmissions.

The SDS RPM provides a configuration file called emc.conf in the /opt/emc/scaleio/sds/cfg/ directory. This file includes many recommended OS parameters from Dell EMC.

 

In a PowerFlex Rack / Appliance environment, PowerFlex Manager automatically copies the emc.conf file from "/opt/emc/scaleio/sds/cfg" into each server's sysctl.conf file and applies it. This happens only during the initial node deployment, so it is possible that sysctl.conf was not updated properly. If the sysctl.conf file does not exist or does not contain the correct values, a kernel upgrade to 5.x can change some important parameters.

Workaround

In a PowerFlex Rack / Appliance environment, if sysctl.conf does not include all the parameters in emc.conf, it is recommended to copy the contents of emc.conf into each server's /etc/sysctl.conf file. To apply the changes, either reboot the server or run the command "sysctl -p", which loads the settings from /etc/sysctl.conf. Follow proper maintenance best practices when making these changes.
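
Before copying anything, it may be useful to see which emc.conf parameters are absent from sysctl.conf or set differently. The following is a rough sketch under stated assumptions: the function name sysctl_conf_diff is hypothetical, and it assumes emc.conf uses the same "key = value" line format as sysctl.conf. It compares the two files only; it does not read the running kernel values or any files under /etc/sysctl.d/.

```shell
# Hypothetical helper: list settings from a reference file that are missing
# from, or set differently in, another sysctl-style file.
# $1 = reference file (for example emc.conf)
# $2 = file to check (for example /etc/sysctl.conf)
sysctl_conf_diff() {
  awk -F= '
    # Skip blank lines, comment lines, and anything that is not an assignment.
    /^[ \t]*([#;]|$)/ || !/=/ { next }
    {
      # Split on the first "=" and trim surrounding whitespace.
      key = $1
      val = substr($0, index($0, "=") + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      gsub(/[ \t]+/, " ", val)
    }
    # First file on the awk command line: remember what is configured now.
    FILENAME == ARGV[1] { have[key] = val; next }
    # Second file: report reference settings that are absent or different.
    !(key in have)   { print "missing: " key " = " val; next }
    have[key] != val { print "differs: " key " = " val " (currently " have[key] ")" }
  ' "$2" "$1"
}
```

A typical invocation would be `sysctl_conf_diff /opt/emc/scaleio/sds/cfg/emc.conf /etc/sysctl.conf`. No output means every emc.conf setting is already present in sysctl.conf with the same value.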

In a Software Only environment, Dell EMC recommends applying these Linux parameters to each server, but the final decision rests with the business. Consult the OS vendor for best practices or with any questions.
 

Impacted Versions

All PowerFlex versions

Article Properties


Affected Product

PowerFlex rack, PowerFlex Appliance, PowerFlex custom node, PowerFlex appliance connectivity, PowerFlex rack connectivity

Last Published Date

16 Oct 2023

Version

4

Article Type

How To