PowerFlex: SDS decoupled Network Test

Summary: This article describes how to troubleshoot an SDS_DECOUPLED error reported in the MDM event log.

Symptoms

An SDS disconnect shows one or more of the following symptoms:

  • Storage rebuild or rebalance activity
  • Reduced spare capacity
  • Increased disk usage on the remaining SDS systems
  • Possible volume corruption
  • Long device I/Os or other device issues

Look for the following errors in the MDM event logs:

7256  2017-07-13 14:20:12.410 SDS_DECOUPLED             ERROR            SDS: SDS_10.241.xxx.xxx (id: 3711776c00000003) decoupled.
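When the event log is long, it can help to pull out only the decoupled events. The following is a minimal sketch, assuming the events have been saved as text in the same column layout as the sample line above. The function name list_decoupled_sds and the file name events.txt are hypothetical and are not part of PowerFlex.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a PowerFlex tool): reads MDM event text on stdin
# and prints one line per SDS_DECOUPLED event as
#   <date> <time> <sds-name> <sds-id>
# Assumes the whitespace-separated layout of the sample event shown above.
list_decoupled_sds() {
  awk '$4 == "SDS_DECOUPLED" {
         name = $7              # for example SDS_10.241.xxx.xxx
         id = $9                # for example 3711776c00000003)
         sub(/\)$/, "", id)     # drop the trailing parenthesis
         print $2, $3, name, id
       }'
}

# Example usage (events.txt is an assumed file name):
#   list_decoupled_sds < events.txt
```

Sorting the result by SDS name shows whether a single SDS or several are being decoupled, which may help narrow the search to one host or to a shared network path.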

 

Cause

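The keep-alive timeout for communication between the SDS and the MDM is 5 seconds. An SDS that does not communicate with the MDM within that time is reported as decoupled. As with an MDM disconnect, check for network latency and for problems with the processes involved.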

Resolution

If you run scli --query_all_sds, the output may show disconnect messages.

Verify the connection between the MDM and the SDS nodes:

For connections where the SDS reports that it cannot communicate, run telnet to port 7072 between the hosts. If the connection fails, confirm that the SDS process is running on the target server. If it is running, check the network and the hosts for firewalls or anything else that could be blocking the traffic.

Then start the SDS network test:

scli --start_sds_network_test

Retrieve the results with:

scli --query_sds_network_test --sds_ip 10.241.xxx.xxx
SDS with IP 10.241.xxx.xxx (port 7072) returned information on 4 SDSs
    SDS 3711505b00000000 10.xxx.xxx.xxx bandwidth 183.8 MB (188254 KB) per-second
    SDS 3711505c00000001 10.xxx.xxx.xxx bandwidth 182.9 MB (187245 KB) per-second
    SDS 3711505d00000002 10.xxx.xxx.xxx bandwidth 182.2 MB (186579 KB) per-second
    SDS 3711776d00000004 10.xxx.xxx.xxx bandwidth 174.7 MB (178937 KB) per-second
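The article does not state what bandwidth to expect, so one way to read these results is to compare each SDS against a threshold that suits your own network. The following is a minimal sketch, assuming the output layout shown above. The function name flag_slow_sds and the threshold value are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a PowerFlex tool): reads the output of
# `scli --query_sds_network_test` on stdin and prints every SDS whose
# MB-per-second figure is below the threshold passed as the first argument.
# The threshold is an assumption; choose one that matches your network.
flag_slow_sds() {
  awk -v min="$1" '
    # Result lines look like:
    #   SDS <id> <ip> bandwidth <MB> MB (<KB> KB) per-second
    $1 == "SDS" && $4 == "bandwidth" && ($5 + 0) < (min + 0) {
      print $2, $3, $5, "MB/s"
    }'
}

# Example usage (180 is an illustrative threshold only):
#   scli --query_sds_network_test --sds_ip 10.241.xxx.xxx | flag_slow_sds 180
```

An SDS that is consistently slower than its peers may point to a problem on that host's network path rather than on the SDS that ran the test.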

To check whether connectivity is the problem, open a telnet session to port 7072 on the SDS from each SDC. If the connection is rejected, verify the network connectivity. A successful connection looks like this:

[root@scaleio-1 ~]# telnet 10.xxx.xxx.xxx 7072
Trying 10.xxx.xxx.xxx...
Connected to 10.xxx.xxx.xxx.
Escape character is '^]'.

Here is an example of a failed connection:

[root@scaleio-1 ~]# telnet 10.xxx.xxx.xxx 7072
Trying 10.xxx.xxx.xxx...
telnet: connect to address 10.xxx.xxx.xxx: No route to host
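Where telnet is not installed, or where many hosts need checking, the same TCP test can be scripted. The following is a minimal sketch, assuming a Linux host with bash (built with /dev/tcp support) and the coreutils timeout command. The function name check_sds_port, the 3-second limit, and the example addresses are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a PowerFlex tool): tries to open a TCP connection
# to an SDS port (7072 by default) and reports whether it succeeded.
# Uses the bash /dev/tcp redirection; the 3-second limit is an assumption.
check_sds_port() {
  local host="$1" port="${2:-7072}"
  # In the child shell, $0 is the host and $1 is the port.
  if timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "$host:$port reachable"
  else
    echo "$host:$port NOT reachable"
    return 1
  fi
}

# Example usage with placeholder addresses (replace with your SDS IPs):
#   for ip in 192.0.2.11 192.0.2.12; do check_sds_port "$ip"; done
```

A "NOT reachable" result covers refused connections, missing routes, and timeouts alike, so follow it up with the telnet test above to see the specific error.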

 

Affected Products

VxFlex Product Family

Article Properties

Article Number: 000064157
Article Type: Solution
Last Modified: 04 Sept 2025
Version: 4