PowerScale : OneFS : SED external key manager ERROR after certificate renewal
Summary: When the certificate registered with an external key manager is allowed to expire and is then renewed, the nodes can no longer connect and show "ERROR" in the status column. ...
Symptoms
This applies only to SED nodes that are connected to an external key manager per: https://infohub.delltechnologies.com/en-US/l/dell-powerscale-onefs-security-considerations/external-key-manager-10/
In addition, the nodes must have been connected and working as intended until the certificate was allowed to expire and was then renewed, with no other changes to the configuration.
To verify which SSL certificate the key manager is using:
isi keymanager kmip servers list
Note the ID from the output, then run:
isi keymanager kmip servers view <ID>
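Once you know which certificate file is in use, its expiry can be checked directly. The following is a minimal sketch, assuming the certificate is a PEM-encoded file you can read and that the openssl command is available on the node. The function name check_cert_expiry and the path in the usage comment are hypothetical placeholders, not part of OneFS.

```shell
#!/bin/sh
# Hypothetical helper: report whether a PEM certificate has expired.
# Usage (placeholder path): check_cert_expiry /path/to/client-cert.pem
check_cert_expiry() {
    cert="$1"
    if [ ! -r "$cert" ]; then
        echo "cannot read certificate: $cert" >&2
        return 2
    fi
    # Print the notAfter date for reference.
    openssl x509 -noout -enddate -in "$cert" || return 2
    # -checkend 0 exits non-zero if the certificate has already expired.
    if openssl x509 -noout -checkend 0 -in "$cert" >/dev/null; then
        echo "certificate is still valid"
        return 0
    fi
    echo "certificate has expired"
    return 1
}
```

The return code distinguishes a valid certificate (0), an expired one (1), and a file that could not be read or parsed (2).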
From the CLI on the node, run the following:
/usr/bin/isi_km_diag status
In the output, look for the line that states: keystore domain [SEDS] status = [9008, [ISI_KM_SERVER_UNREACHABLE]]
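To pick that line out of longer output, the status text can be piped through a small filter. This is a sketch only: the helper names seds_status_line and seds_unreachable are hypothetical, and the match strings assume the line format quoted in this article.

```shell
#!/bin/sh
# Hypothetical helpers: read `isi_km_diag status` output on stdin.

# Print only the [SEDS] keystore domain status line(s).
seds_status_line() {
    grep -F 'keystore domain [SEDS] status'
}

# Exit 0 if the [SEDS] status line reports a server-unreachable error.
seds_unreachable() {
    seds_status_line | grep -q 'SERVER_UNREACHABLE'
}

# Example use on a node (not executed here):
#   /usr/bin/isi_km_diag status | seds_status_line
```

Matching on SERVER_UNREACHABLE rather than the numeric code keeps the check independent of the exact error number reported.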
In your key manager error log, you may see failures where the PowerScale nodes are attempting to connect with the expired certificate.
From the CLI on the nodes, you find messages similar to the following in /var/log/isi_km_d.log:
2024-02-15T17:12:06.599757+00:00 <3.3> isilon-5(id5) isi_km_d[2395]: [0x801da2000]key_mgr: Failed to connect to KMIP server!
2024-02-15T17:12:06.599812+00:00 <3.3> isilon-5(id5) isi_km_d[2395]: [0x801da2000]key_mgr: [SEDS] Failed to initialize provider! [9007, ISI_KM_KMIP_SERVER_UNREACHABLE]
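The log can be searched for these messages rather than read by hand. The sketch below assumes the message text matches the two example lines above. The function name kmip_failures is a hypothetical helper, and the log path defaults to the one named in this article.

```shell
#!/bin/sh
# Hypothetical helper: list KMIP connection failures recorded by isi_km_d.
# Usage: kmip_failures [logfile]   (defaults to /var/log/isi_km_d.log)
kmip_failures() {
    log="${1:-/var/log/isi_km_d.log}"
    # Match the two failure messages shown in the example log lines.
    grep -E 'Failed to connect to KMIP server|Failed to initialize provider' "$log"
}

# Example use on a node (not executed here):
#   kmip_failures | tail -n 20
```

Comparing the timestamps of the matching lines with the certificate expiry date can help confirm that the failures began when the certificate expired.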
To verify the issue is on all nodes in the cluster:
isi keymanager sed status
Cause
Resolution
Do not reboot nodes or restart the isi_km_d service.