PowerFlex Unable to add SDS
Summary: The user is unable to add the previously removed SDS.
Symptoms
Scenario
An SDS is removed from the configuration while it is offline or not communicating with the MDM.
Symptoms
The following error is returned when trying to add the SDS:
StorageVM568:~ # scli --add_sds --sds_ip 172.16.88.32 --protection_domain_name domain1 --device_name /dev/sdc,/dev/sdd,/dev/sde,/dev/sdf,/dev/sdg,/dev/sdh,/dev/sdi,/dev/sdj --sds_name sds569 --storage_pool_name Pool1
Error: MDM failed command. Status: SDS already attached to this MDM
An SDS query shows that the SDS you are trying to add is not listed:
StorageVM568:~ # scli --query_all_sds
Query-all-SDS returned 7 SDS nodes.
Protection Domain: Name: domain1 ID: e18c21df00000000
SDS ID: 3ec0d0df00000000 Name: sds568 IP: 172.16.88.31 Port: 7072
SDS ID: 3ec0d14400000003 Name: sds570 IP: 172.16.88.33 Port: 7072
SDS ID: 3ec0d14500000004 Name: sds571 IP: 172.16.88.34 Port: 7072
SDS ID: 3ec0d14600000005 Name: sds572 IP: 172.16.88.35 Port: 7072
SDS ID: 3ec0d14700000006 Name: sds573 IP: 172.16.88.36 Port: 7072
SDS ID: 3ec0d14800000007 Name: sds574 IP: 172.16.88.37 Port: 7072
SDS ID: 3ec0d14900000008 Name: sds575 IP: 172.16.88.38 Port: 7072
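On a system with many SDS nodes it can be easier to check for the IP mechanically than to read the listing. The following is a minimal sketch, assuming the query output prints each SDS address as "IP: <address>" as in the listing above. The function name sds_ip_listed is hypothetical and is not part of scli.

```shell
# Hypothetical helper: reads `scli --query_all_sds` output on stdin and
# succeeds only if the given IP appears as an SDS address ("IP: <address>").
sds_ip_listed() {
    # Escape the dots so that "." matches literally, not any character.
    ip_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    # Require a space or end of line after the address so that
    # 172.16.88.3 does not match 172.16.88.31.
    grep -qE "IP: ${ip_re}( |\$)"
}

# Example use on the MDM host:
#   scli --query_all_sds | sds_ip_listed 172.16.88.32 || echo "SDS not listed"
```

If the helper reports that the IP is not listed while --add_sds still fails with "SDS already attached to this MDM", the situation matches this article.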
In the MDM trace logs you should see the following message:
15/01 06:51:48.056568 mosEventLog_PostInternal:00260: New event added. Message: "Command ADD_SDS Failed. Error code: SDS already attached to this MDM". Additional info: "(172.16.88.32:7072). ID: 3ec0d1a90000000a" Severity: Warning
15/01 06:51:48.056603 tgtMgr_RemoveTgtSync:06404: ID: 3ec0d1a90000000a. TGT passed test. Waiting for all combs to be released
15/01 06:51:48.056613 tgtMgr_RemoveTgtSync:06440: ID: 3ec0d1a90000000a. All combs released syncing with other processes
15/01 06:51:48.056616 tgtMgr_RemoveTgtSync:06444: ID: 3ec0d1a90000000a. TGT quiesced. Send Cleanup and removing from using components
15/01 06:51:48.056635 mdmTgtMsg_SendSyncTgtCleanUp:01773: TgtId: 3ec0d1a90000000a
15/01 06:51:48.057341 mdmTgtMsg_SendSyncTgtCleanUp:01786: TgtId: 3ec0d1a90000000a cleanup failed with RC: WRONG_TGT_ID
15/01 06:51:48.057354 netCon_Disconnect:00720: Client con 3ec0d1a90000000a state changed CONNECTED => DISCONNECTING
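The line that identifies this issue in the trace is the cleanup failure with RC: WRONG_TGT_ID. As a rough sketch, the filter below pulls the affected Tgt IDs out of trace text supplied on stdin. The function name wrong_tgt_ids is hypothetical, and the location of the MDM trace file is not given in this article, so the path in the usage comment is a placeholder.

```shell
# Hypothetical helper: reads MDM trace text on stdin and prints each
# distinct Tgt ID whose cleanup failed with RC: WRONG_TGT_ID.
wrong_tgt_ids() {
    sed -n 's/.*TgtId: \([0-9a-f]\{1,\}\) cleanup failed with RC: WRONG_TGT_ID.*/\1/p' |
        sort -u
}

# Example use (replace the placeholder with the actual trace file):
#   wrong_tgt_ids < /path/to/mdm/trace/file
```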
Impact
Unable to add SDS to the configuration.
Cause
When an SDS is removed from the configuration while it is not communicating with the MDM, the SDS keeps its old Tgt ID. When the SDS comes back online, it still considers itself connected to the MDM, because the MDM ID and its own Tgt ID are still stored in the following file: /opt/scaleio/ecs/sds/cfg/rep_tgt.txt.
When a user tries to add this SDS back to the configuration, the operation fails because the SDS is reported as already attached to the MDM, although it is not.
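To confirm that the SDS still holds leftover configuration, the file named above can be inspected on the SDS host. This is only a read-only sketch: the function name show_rep_tgt is hypothetical, and the layout of the file contents is not described in this article, so the helper just prints the file as-is.

```shell
# Hypothetical helper: prints the SDS configuration file that holds the
# MDM ID and Tgt ID. Takes an optional path; defaults to the path given
# in this article. It only reads the file and never modifies it.
show_rep_tgt() {
    cfg=${1:-/opt/scaleio/ecs/sds/cfg/rep_tgt.txt}
    if [ -r "$cfg" ]; then
        ls -l "$cfg"   # the modification time hints at when the IDs were written
        cat "$cfg"
    else
        echo "No readable file at $cfg" >&2
        return 1
    fi
}
```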
Resolution
Run the --add_sds command again with the --force_clean option appended.
--force_clean
Clean a previous SDS configuration. Use this if the SDS was previously part of a PowerFlex system.
Example:
StorageVM568:~ # scli --add_sds --sds_ip 172.16.88.32 --protection_domain_name domain1 --device_name /dev/sdc,/dev/sdd,/dev/sde,/dev/sdf,/dev/sdg,/dev/sdh,/dev/sdi,/dev/sdj --sds_name sds569 --storage_pool_name Pool1 --force_clean
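Where SDS nodes are added from a script, the retry can be limited to the specific failure described in this article. The wrapper below is a sketch under stated assumptions, not a supported tool: add_sds_with_fallback and the SCLI variable are hypothetical names, and the wrapper matches on the error text shown above rather than on an exit code. It retries with --force_clean only when that exact message appears.

```shell
# SCLI is a hypothetical override that allows dry runs with a stub;
# by default the real scli binary is used.
SCLI=${SCLI:-scli}

# Hypothetical wrapper: pass the usual --add_sds arguments without
# --force_clean. The option is appended only if the first attempt fails
# with "SDS already attached to this MDM".
add_sds_with_fallback() {
    rc=0
    out=$("$SCLI" --add_sds "$@" 2>&1) || rc=$?
    case $out in
        *"SDS already attached to this MDM"*)
            echo "Leftover SDS configuration detected; retrying with --force_clean" >&2
            "$SCLI" --add_sds "$@" --force_clean
            ;;
        *)
            # Success or an unrelated error: pass the output and status through.
            printf '%s\n' "$out"
            return "$rc"
            ;;
    esac
}

# Example use:
#   add_sds_with_fallback --sds_ip 172.16.88.32 --protection_domain_name domain1 \
#       --device_name /dev/sdc,/dev/sdd --sds_name sds569 --storage_pool_name Pool1
```

Because --force_clean cleans a previous SDS configuration, the wrapper deliberately leaves every other error untouched so that it can be reviewed manually.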