PowerScale: OneFS: SED External key manager ERROR after certificate renewal
Summary: When a certificate registered with an external key manager is allowed to expire and is then renewed, the nodes can no longer connect to the key manager and show "ERROR" in the status column. ...
Symptoms
This applies only to Self-Encrypting Drive (SED) nodes that are connected to an external key manager per: https://infohub.delltechnologies.com/en-US/l/dell-powerscale-onefs-security-considerations/external-key-manager-10/
The nodes must also have been connected and working as intended until the certificate was allowed to expire and was then renewed with no other configuration changes.
To verify which SSL certificate the key manager is using:
isi keymanager kmip servers list
Take the ID from the output and view the details for that server:
isi keymanager kmip servers view <ID>
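Once the certificate in use is known, its validity dates can be checked without changing anything on the cluster. The following is a minimal sketch, not a documented OneFS procedure. It assumes the certificate is available as a PEM file and that the openssl command-line tool is present on the node; the helper name cert_expired and the path placeholder are hypothetical.

```shell
#!/bin/sh
# Hypothetical read-only helper: report whether a PEM certificate has expired.
# Prints the certificate's notAfter date.
# Returns 0 if expired, 1 if still valid, 2 if the file cannot be parsed.
cert_expired() {
    cert_path=$1
    # Confirm the file parses as an X.509 certificate before testing dates.
    openssl x509 -noout -in "$cert_path" >/dev/null 2>&1 || return 2
    # Show the expiry date, for example "notAfter=...".
    openssl x509 -enddate -noout -in "$cert_path"
    # -checkend 0 exits 0 only when the certificate has not yet expired.
    if openssl x509 -checkend 0 -noout -in "$cert_path" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}
```

Example use, with the path replaced by the certificate location for your configuration: cert_expired <path-to-client-cert> && echo "certificate has expired"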
From the CLI on the node, run the following:
/usr/bin/isi_km_diag status
In the output, look for the line that states: keystore domain [SEDS] status = [9008, [ISI_KM_SERVER_UNREACHABLE]]
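To pull only that line out of longer output, a small filter can help. This is a sketch that assumes the status line has exactly the format quoted above; the rest of the isi_km_diag output format is not described here, and the function name seds_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: read diagnostic output on stdin and print
# "<code> <name>" for the SEDS keystore domain status line.
# Assumes the line looks like:
#   keystore domain [SEDS] status = [9008, [ISI_KM_SERVER_UNREACHABLE]]
seds_status() {
    sed -n 's/.*keystore domain \[SEDS\] status = \[\([0-9]*\), \[\([A-Z_]*\)\]\].*/\1 \2/p'
}
```

Example use: /usr/bin/isi_km_diag status | seds_status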
In the key manager error log, you may see failures where the PowerScale nodes are attempting to connect with the expired certificate.
From the CLI on the nodes, messages similar to the following appear in /var/log/isi_km_d.log:
2024-02-15T17:12:06.599757+00:00 <3.3> isilon-5(id5) isi_km_d[2395]: [0x801da2000]key_mgr: Failed to connect to KMIP server!
2024-02-15T17:12:06.599812+00:00 <3.3> isilon-5(id5) isi_km_d[2395]: [0x801da2000]key_mgr: [SEDS] Failed to initialize provider! [9007, ISI_KM_KMIP_SERVER_UNREACHABLE]
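A quick way to confirm that these messages are present is to count the matching lines in the log. The sketch below only reads the file. It assumes the message text matches the two examples above, and the function name kmip_failure_count is hypothetical.

```shell
#!/bin/sh
# Hypothetical read-only helper: count KMIP connection failure lines in a log.
# The patterns are taken from the two example messages shown above.
kmip_failure_count() {
    log_path=$1
    grep -c -E 'Failed to connect to KMIP server|Failed to initialize provider' "$log_path"
}
```

Example use: kmip_failure_count /var/log/isi_km_d.log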
Run the following to verify the issue is present on all nodes in the cluster:
isi keymanager sed status
Cause
The PowerScale node is attempting to connect to the KMIP server using the old, expired certificate instead of the renewed one.
Resolution
Contact support as soon as possible to open a Service Request (SR) and reference this KB article.
Do not reboot nodes or restart the isi_km_d service.