PowerFlex 3.X: MDM Certificate has Expired
Summary: Scli commands fail with "SSL Error: certificate has expired."
Symptoms
Every scli command fails with the following error:
SSL Error: certificate has expired.
This prevents operations such as generate_mdm_csr, replace_mdm_security_files, set_management_client_communication (to enable or disable secure communication), and so on.
Other management tools, such as the Gateway (GW) and the user interface (UI), may also be unable to manage the cluster.
Impact:
Inability to manage the ScaleIO, VxFlex OS, or PowerFlex environment.
Cause
The management clients (command-line interface (CLI), Gateway (GW), and user interface (UI)) communicate with the Primary Meta Data Manager (MDM) over a secure connection, and they validate the MDM certificate before that connection is established. Once the certificate has expired, validation fails and the clients cannot connect to the Primary MDM.
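To confirm that expiry is the cause, the certificate's validity dates can be read directly on an MDM node. The following is a minimal sketch, not part of the official procedure. It assumes that `openssl` is installed on the node and that the file at the path used in the Resolution section holds a PEM-encoded X.509 certificate. The function name `cert_status` is hypothetical.

```shell
#!/bin/sh
# Sketch: report whether the MDM management certificate has expired.
# Assumptions: openssl is installed, and the file is a PEM-encoded X.509 certificate.

CERT_DEFAULT=/opt/emc/scaleio/mdm/cfg/mdm_management_certificate.pem

# cert_status [PATH] prints one of: missing, unreadable, expired, valid
cert_status() {
    cert="${1:-$CERT_DEFAULT}"
    if [ ! -f "$cert" ]; then
        echo "missing"
        return 2
    fi
    # -enddate fails if the file cannot be parsed as a certificate
    if ! openssl x509 -noout -enddate -in "$cert" >/dev/null 2>&1; then
        echo "unreadable"
        return 3
    fi
    # -checkend 0 exits 0 only if the certificate has not expired yet
    if openssl x509 -noout -checkend 0 -in "$cert" >/dev/null 2>&1; then
        echo "valid"
        return 0
    fi
    echo "expired"
    return 1
}
```

Calling `cert_status` with no argument checks the default path. Running `openssl x509 -noout -enddate -in <path>` on its own prints the `notAfter` date.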
Resolution
To address the issue after it has occurred:
- Identify the Primary MDM and Secondary MDMs:
- On all MDM nodes, run the following command:
netstat -anp | grep ':6611.*LISTEN'
If the command returns an entry, the node is the Primary MDM; otherwise, it is a Secondary MDM.
- Identify all MDMs (Primary and Secondary) that have expired certificates:
- On all MDM nodes, run the following command:
scli --query_cluster --tech
- On each MDM with an expired certificate, perform the following steps. Start with the Secondary MDMs and finish with the Primary MDM.
- Remove the certificate file:
/opt/emc/scaleio/mdm/cfg/mdm_management_certificate.pem
- Restart the MDM service:
pkill mdm
This generates a new self-signed certificate. The management clients prompt for it when they next connect, and it can then be added to their truststores.
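The role check and the certificate removal above can be sketched as a small shell helper. This is an illustrative sketch rather than a supported tool. The function names `mdm_role` and `regenerate_cert` and the `APPLY` switch are hypothetical, and `rm -f` is assumed as the way to remove the file. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the resolution steps. Dry-run by default: commands are printed, not executed.

CERT=/opt/emc/scaleio/mdm/cfg/mdm_management_certificate.pem

# mdm_role: reads `netstat -anp` output on stdin and prints "primary" if a
# socket is listening on port 6611, otherwise "secondary".
mdm_role() {
    if grep -q ':6611.*LISTEN'; then
        echo "primary"
    else
        echo "secondary"
    fi
}

# regenerate_cert: removes the certificate file and kills the mdm process,
# mirroring the two commands in the article.
# APPLY=1 (hypothetical switch) executes them; any other value only prints them.
regenerate_cert() {
    for cmd in "rm -f $CERT" "pkill mdm"; do
        if [ "${APPLY:-0}" = "1" ]; then
            $cmd
        else
            echo "would run: $cmd"
        fi
    done
}
```

On each node, `netstat -anp | mdm_role` reports the role. Run `regenerate_cert` on the Secondary MDMs first and on the Primary MDM last, and set `APPLY=1` only after reviewing the printed commands.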
Impacted Versions:
PowerFlex versions 2.x and 3.x