PowerFlex Management Platform: keycloak-0 logs HTTP probe failed with statuscode: 503
Summary: This article explains an issue where the keycloak-0 pod reports a health check failure due to database connectivity problems caused by an incorrect DNS configuration. The issue impacts the authentication services managed by Keycloak.
Symptoms
Scenario
One of the two Keycloak pods (here keycloak-0) experiences connectivity issues with the database, while keycloak-1 remains functional.
Event logs show repeated readiness probe failures for the affected pod.
# kubectl get pods -n powerflex | egrep keycloak
keycloak-0 1/1 Running 0 22d
keycloak-1 1/1 Running 0 22d
# kubectl describe pod keycloak-0 -n powerflex
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 12m (x58 over 17h) keycloak-0 Readiness probe failed: HTTP probe failed with statuscode: 503
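The probe failure can also be reproduced manually by querying the readiness endpoint from inside the affected pod (a hedged check: the endpoint path and port vary by Keycloak version, and curl may not be included in minimal images):
# kubectl exec keycloak-0 -n powerflex -- curl -s http://localhost:8080/health/ready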
The Keycloak pod logs indicate a failure to acquire JDBC connections due to an acquisition timeout:
# kubectl logs keycloak-0 -n powerflex
..
2024-11-27 07:01:41,593 INFO [org.infinispan.CLUSTER] (non-blocking-thread--p2-t126) [Context=actionTokens] ISPN100010: Finished rebalance with members [keycloak-0-17437, keycloak-1-41022], topology id 7
2024-11-27 07:31:03,379 WARN [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (Timer-0) SQL Error: 0, SQLState: null
2024-11-27 07:31:03,379 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (Timer-0) Acquisition timeout while waiting for new connection
2024-11-27 07:31:03,384 ERROR [org.keycloak.services.scheduled.ScheduledTaskRunner] (Timer-0) Failed to run scheduled task ClearExpiredEvents: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection
at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:154)
at java.base/java.util.TimerThread.run(Timer.java:506)
Caused by: org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection <---------
..
Caused by: java.sql.SQLException: Acquisition timeout while waiting for new connection <---------
..
Caused by: java.util.concurrent.TimeoutException <---------
..
2024-11-27 09:31:03,476 INFO [io.smallrye.health] (executor-thread-15) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"Keycloak database connections health check","status":"DOWN","data":{"Failing since":"2024-11-27 07:31:03,477"}}]}
2024-11-27 09:56:03,477 INFO [io.smallrye.health] (executor-thread-15) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"Keycloak database connections health check","status":"DOWN","data":{"Failing since":"2024-11-27 07:31:03,477"}}]}
Impact
Authentication requests handled by keycloak-0 fail, causing intermittent or complete authentication failures for the PowerFlex Management Platform. The Keycloak health check continuously reports a DOWN status, impacting high availability.
Cause
The issue is caused by an incorrect DNS configuration.
The JDBC connection that Keycloak uses to reach the database relies on resolving the database hostname or endpoint.
Any misconfiguration or failure in hostname resolution can cause timeouts when Keycloak attempts to establish a connection.
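As a quick check, resolution of the database endpoint can be tested from a temporary pod inside the cluster. This is a sketch: <db-hostname> is a placeholder for the database hostname or endpoint taken from the Keycloak datasource configuration.
kubectl run dns-test --rm -it --restart=Never --image=busybox -- nslookup <db-hostname>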
Resolution
1) Fix the DNS configuration as per the operating system documentation
a) If Red Hat Enterprise Linux or CentOS v7.x or v8.x,
i) Edit /etc/resolv.conf to set the correct DNS server on each management VM (MVM). An example is shown below.
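For example, /etc/resolv.conf should list the DNS servers that serve the environment (the addresses below are placeholders):
nameserver 10.0.0.10
nameserver 10.0.0.11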
ii) Delete the coredns pods (rke2-coredns-rke2-coredns-xxxxxxxxxx-xxxxx) to propagate the changes to those pods:
for x in `kubectl get pods -n kube-system | grep -i rke2-coredns-rke2-coredns | awk '{print $1}' | grep -iv auto`; do kubectl delete pods -n kube-system $x; done
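Alternatively, where the CoreDNS deployment uses the default RKE2 name, the same effect can be achieved with a single rollout restart (this leaves the autoscaler deployment untouched):
kubectl rollout restart deployment rke2-coredns-rke2-coredns -n kube-system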
iii) Verify DNS changes are now reflected in the coredns pods (there are 2 coredns pods responsible for DNS):
for x in `kubectl get pods -n kube-system | grep -i rke2-coredns-rke2-coredns | awk '{print $1}' | grep -iv auto`; do echo $x; kubectl exec -it $x -n kube-system -- cat /etc/resolv.conf; echo " "; done
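Sample output (the pod name suffix and addresses are illustrative):
rke2-coredns-rke2-coredns-xxxxxxxxxx-xxxxx
nameserver 10.0.0.10
nameserver 10.0.0.11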
b) For SLES v15.x and above, engage support to follow internal article https://www.dell.com/support/kbdoc/en-us/000227354
2) Restart the Keycloak pods
kubectl rollout restart statefulset keycloak -n powerflex
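To confirm that the restart completed and both pods are ready again:
kubectl rollout status statefulset keycloak -n powerflex
kubectl get pods -n powerflex | egrep keycloak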
3) Monitor the Keycloak logs for any additional database connectivity issues
kubectl logs keycloak-0 -n powerflex [-f]
kubectl logs keycloak-1 -n powerflex [-f]
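To focus on database-related messages while monitoring, the log output can be filtered, for example:
kubectl logs keycloak-0 -n powerflex -f | egrep -i "jdbc|acquisition timeout|health"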