Dell Unity: How to perform a management services (ECOM) failover (Dell Correctable)
Summary: This article describes the ways to perform a manual failover of management services on Unity arrays.
Instructions
Usage:
Failing over Management Services (MGMT/ECOM) can be used for advanced troubleshooting, such as log gathering, testing the management network configuration, and so on.
Prerequisites:
Both Unity Storage Processors (SPs) must be attached to the network.
Refer to the Unity XT installation guide:
Dell Unity™ All Flash and Unity Hybrid Unity 480/F, Unity 680/F, Unity 880/F Installation and Service Guide
Sections: "Installation Procedures" - "Attach the storage processors to the network"
- The SPA and SPB network management ports must be connected on the same subnet.
- In general, both SPs should have mirrored configurations to provide High Availability (HA), failover, and so on.
Methods to perform Management (MGMT) failover
Method #1: Using service command
- Verify which SP is holding ECOM (SPA in the example below):
service@(none) spa:~# pgrep ECOM
21933
- Run the command to fail over MGMT:
root@(none) spa:/# svc_restart_service failover MGMT
Found ECOM on spa, will failover to spb
Waiting for ECOM to stop.........................
Waiting for ECOM to failover..................... << expected connection drop, as the primary SP fails over to the secondary
SSH Session will disconnect here............... << reconnect to session.
- Verify which SP is holding ECOM (it should now be SPB):
service@(none) spa:~# ssh peer
service@(none) spb:~# pgrep ECOM
21933
If the failover does not complete, see the KB article "Dell Unity: Command "svc_restart_service failover MGMT" not working correctly (Dell Correctable)".
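The verification steps in Method #1 rely on the exit status and output of pgrep. As a minimal sketch, assuming a POSIX shell is available on the SP, the check can be wrapped in a small helper. The function name ecom_status, the PGREP_CMD override, and the messages are illustrative assumptions and are not part of any Dell tooling.

```shell
#!/bin/sh
# Sketch only: the function name, PGREP_CMD, and the messages are
# illustrative assumptions, not Dell-provided tooling.

# Report whether an ECOM process is running on the SP where this runs.
# pgrep prints matching PIDs and exits 0, or prints nothing and exits 1.
# PGREP_CMD may be overridden (for example in tests); it defaults to pgrep.
ecom_status() {
    if pids=$(${PGREP_CMD:-pgrep} ECOM) && [ -n "$pids" ]; then
        # Unquoted expansion joins multiple PIDs onto one line.
        echo "ECOM is running on this SP (PID: $(echo $pids))"
        return 0
    fi
    echo "ECOM is not running on this SP; check the peer SP"
    return 1
}
```

Calling ecom_status on one SP, and again on the other SP after ssh peer, before and after the failover shows whether ECOM has moved.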
Method #2: Physically on the array
- Verify which SP is primary (step 1 in Method #1 above).
- Pull out the network cable from the Primary SP's management port.
- Wait about 15 minutes for the management failover to complete.
- Connect the network cable back to the SP's management port.
To monitor the array from either UI or CLI:
- Ping the primary SP and wait for ping loss.
- A ping response returns after approximately 5 minutes. This indicates that the management services are restarting on the peer SP.
- Wait another 5 minutes after the first successful ping response. After a failover, the management services need this additional time to start completely.
- Log in to Unisphere and confirm the health status of the Unity system, or run the command:
uemcli -no /sys/general healthcheck -detail
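The monitoring steps above can be sketched as a small polling script run from a host that can reach the array. This is a hedged illustration, not a Dell-provided tool: the function names, the PING_CMD override, the retry counts, and the default Linux-style ping options (-c 1 -W 2) are all assumptions, and the 300-second settle time simply mirrors the 5-minute wait described above.

```shell
#!/bin/sh
# Sketch only: function names, PING_CMD, retry counts, and intervals are
# illustrative assumptions, not Dell-provided tooling.

# Send one echo request; succeed if a reply arrives.
# PING_CMD may be overridden (for example in tests); the default assumes
# Linux-style ping options (one packet, 2-second reply timeout).
probe() {
    ${PING_CMD:-ping -c 1 -W 2} "$1" >/dev/null 2>&1
}

# wait_for_state HOST STATE MAX_TRIES [INTERVAL_SECONDS]
# Poll until HOST is "up" or "down"; return 1 if MAX_TRIES is exhausted.
wait_for_state() {
    host=$1 want=$2 tries=$3 interval=${4:-5}
    while [ "$tries" -gt 0 ]; do
        if probe "$host"; then state=up; else state=down; fi
        [ "$state" = "$want" ] && return 0
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}

# monitor_failover HOST [SETTLE_SECONDS] [POLL_SECONDS]
# Wait for ping loss, then for the ping response, then for the settle time.
monitor_failover() {
    target=$1 settle=${2:-300} poll=${3:-5}
    wait_for_state "$target" down 120 "$poll" || {
        echo "no ping loss observed"; return 1; }
    echo "ping loss detected"
    wait_for_state "$target" up 240 "$poll" || {
        echo "no ping response after failover"; return 1; }
    echo "ping response restored; waiting ${settle}s for management services"
    sleep "$settle"
    echo "ready for health check"
}
```

For example, monitor_failover 192.0.2.10 (a placeholder address) returns once the settle time has elapsed, after which the health check can be run in Unisphere or with the uemcli command shown above.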
Additional Information
For additional assistance with the above steps, contact Dell Technical Support or your Authorized Service Representative, and quote this Dell Knowledge Base article.