PowerFlex Management Platform: keycloak-0 logs HTTP probe failed with statuscode: 503

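Summary: This article describes an issue in which the keycloak-0 pod reports health check failures. The failures are caused by a database connection problem that stems from an incorrect DNS configuration. The issue affects the authentication services managed by Keycloak.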

Symptoms

Scenario

One of the two Keycloak pods (here, keycloak-0) experiences database connection problems, while keycloak-1 remains functional.

The event log shows repeated readiness probe failures.

# kubectl get pods -n powerflex | egrep keycloak
keycloak-0                                                1/1     Running     0               22d
keycloak-1                                                1/1     Running     0               22d

# kubectl get events -n powerflex | egrep keycloak
Events:
  Type     Reason     Age                 From     Message
  ----     ------     ----                ----     -------
  Warning  Unhealthy  12m (x58 over 17h)  keycloak-0  Readiness probe failed: HTTP probe failed with statuscode: 503

The keycloak pod log indicates that a JDBC connection could not be acquired because of an acquisition timeout:

# kubectl logs keycloak-0 -n powerflex
..
2024-11-27 07:01:41,593 INFO  [org.infinispan.CLUSTER] (non-blocking-thread--p2-t126) [Context=actionTokens] ISPN100010: Finished rebalance with members [keycloak-0-17437, keycloak-1-41022], topology id 7
2024-11-27 07:31:03,379 WARN  [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (Timer-0) SQL Error: 0, SQLState: null
2024-11-27 07:31:03,379 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (Timer-0) Acquisition timeout while waiting for new connection
2024-11-27 07:31:03,384 ERROR [org.keycloak.services.scheduled.ScheduledTaskRunner] (Timer-0) Failed to run scheduled task ClearExpiredEvents: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection
        at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:154)
        at java.base/java.util.TimerThread.run(Timer.java:506)
Caused by: org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection  <---------
..
Caused by: java.sql.SQLException: Acquisition timeout while waiting for new connection  <--------- 
.. 
Caused by: java.util.concurrent.TimeoutException  <--------- 
..
2024-11-27 09:31:03,476 INFO  [io.smallrye.health] (executor-thread-15) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"Keycloak database connections health check","status":"DOWN","data":{"Failing since":"2024-11-27 07:31:03,477"}}]}
2024-11-27 09:56:03,477 INFO  [io.smallrye.health] (executor-thread-15) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"Keycloak database connections health check","status":"DOWN","data":{"Failing since":"2024-11-27 07:31:03,477"}}]}
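
To pick these signatures out of a long log, a small filter along the following lines may help. This is only a sketch: the function name keycloak_db_errors is hypothetical, and the patterns are simply the messages quoted in the excerpt above, so other failure modes would not be matched.

```shell
# Hypothetical helper: keep only the log lines that match the database
# connection failure signatures quoted in the excerpt above.
keycloak_db_errors() {
    grep -E 'Unable to acquire JDBC Connection|Acquisition timeout while waiting for new connection|Keycloak database connections health check'
}

# Usage (assumes kubectl access to the powerflex namespace):
#   kubectl logs keycloak-0 -n powerflex | keycloak_db_errors
```

The filter reads standard input, so it can also be applied to a saved copy of the log.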

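Impact

Authentication requests handled by keycloak-0 fail, causing intermittent or complete authentication failures in the PowerFlex Management Platform.
The Keycloak health check continuously reports a DOWN status, which affects high availability.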

Cause

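This issue is caused by an incorrect DNS configuration.

The JDBC connection that Keycloak uses to reach the database depends on resolving the database hostname or endpoint.

Any misconfiguration or failure in hostname resolution can cause a timeout while the connection is being established.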
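
As a rough illustration of the dependency described above, the sketch below checks whether a name resolves through the system resolver on the host where it is run. The helper name check_resolves is hypothetical, and this article does not state the database hostname, so the name in the usage comment is a placeholder. The check exercises only the local resolver (the path configured by /etc/resolv.conf) and does not prove that resolution works inside the cluster.

```shell
# Hypothetical helper: report whether a hostname resolves through the
# system resolver on this host. Returns non-zero when it does not.
check_resolves() {
    if getent hosts "$1" >/dev/null 2>&1; then
        echo "resolves: $1"
    else
        echo "does not resolve: $1"
        return 1
    fi
}

# Usage (db.example.internal is a placeholder, not a name from this article):
#   check_resolves db.example.internal
```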

Resolution

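1) Fix the DNS configuration according to the operating system documentation

a) For RedHat or CentOS v7.x or v8.x:

i) Edit /etc/resolv.conf on each MgmtVM (MVM) and set the correct DNS servers

ii) Delete the coredns pods (rke2-coredns-rke2-coredns-xxxxxxxxxx-xxxxx) to propagate the change to those pods: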

for x in `kubectl get pods -n kube-system | grep -i rke2-coredns-rke2-coredns | awk '{print $1}' | grep -iv auto`; do kubectl delete pods -n kube-system $x; done

iii) Verify that the DNS change is now reflected in the coredns pods (two coredns pods handle DNS):

for x in `kubectl get pods -n kube-system | grep -i rke2-coredns-rke2-coredns | awk '{print $1}' | grep -iv auto`; do echo $x; kubectl exec -it $x -n kube-system -- cat /etc/resolv.conf; echo " "; done

b) For SLES v15.x and later, contact Support to follow the internal article https://www.dell.com/support/kbdoc/en-us/000227354

2) Restart the keycloak pods

kubectl rollout restart statefulset keycloak -n powerflex

3) Monitor the keycloak logs for any further database connection issues

kubectl logs keycloak-0 -n powerflex [-f]
kubectl logs keycloak-1 -n powerflex [-f]
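
When verifying the DNS change in step 1, comparing only the nameserver entries can be easier than reading full resolv.conf output. The following is a minimal sketch, assuming each file uses standard "nameserver <address>" lines. The helper name list_nameservers is hypothetical, and <coredns-pod> in the usage comment is a placeholder for one of the pod names returned by kubectl get pods -n kube-system.

```shell
# Hypothetical helper: print the addresses from the "nameserver" lines of a
# resolv.conf-style file read on standard input, one address per line.
# Commented-out entries are skipped because their first field is not
# exactly "nameserver".
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }'
}

# Usage on an MVM (assumes kubectl access to the kube-system namespace):
#   list_nameservers < /etc/resolv.conf
#   kubectl exec <coredns-pod> -n kube-system -- cat /etc/resolv.conf | list_nameservers
```

If the two lists differ after the coredns pods are re-created, the change has likely not propagated to that pod.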

Products

PowerFlex rack, PowerFlex Appliance, PowerFlex custom node, ScaleIO, PowerFlex appliance connectivity

Article Number: 000261288

Article Type: Solution

Last Modified: 19 Dec 2024

Version: 1
