PowerFlex 4.x Upgrade Fails with OutOfMemoryError and Java Heap Space
Summary: The PFxM upgrade fails with an out-of-memory error and a Java heap space error.
Symptoms
When a PowerFlex upgrade is run, it fails with the errors OutOfMemoryError and Java heap space.
From deployment.log:
DEBUG [2025-01-06T22:50:31.482915] 573476: provider/configuration/vxos_update.rb:1082:in `process!': scaleio-block-legacy-gateway: Upgrade LIA
DEBUG [2025-01-06T22:50:31.483170] 573476: provider/configuration/vxos_update.rb:707:in `im_upgrade': scaleio-block-legacy-gateway: Initiating VxOS cluster upgrade
DEBUG [2025-01-06T22:50:31.576972] 573476: provider/configuration/vxos_update.rb:687:in `im_upgrade_staging': scaleio-block-legacy-gateway: Uploading VxOS RPMs for current version
DEBUG [2025-01-06T22:50:58.310637] 573476: provider/configuration/vxos_update.rb:690:in `im_upgrade_staging': scaleio-block-legacy-gateway: Uploading VxOS RPMs for newer versions
DEBUG [2025-01-06T22:50:58.312193] 573476: provider/configuration/vxos_update.rb:404:in `block in vxos_rpms': scaleio-block-legacy-gateway: Downloading rpms from nginx to /tmp: https://http-share.powerflex.svc:443/download//8aaa80939422a51b01943b5873e82d40/os/VxFlex4.5.2SLES15.4Repo/vxflexos_4.5.2000/
DEBUG [2025-01-06T22:50:59.581675] 573476: provider/configuration/vxos_update.rb:418:in `block in vxos_rpms': scaleio-block-legacy-gateway: Download operation result: #
DEBUG [2025-01-06T22:50:59.582232] 573476: provider/configuration/vxos_update.rb:424:in `block in vxos_rpms': scaleio-block-legacy-gateway: Local rpm path: /tmp/d20250106-5032-ukddq7/http-share.powerflex.svc/8aaa80939422a51b01943b5873e82d40/os/VxFlex4.5.2SLES15.4Repo/vxflexos_4.5.2000
DEBUG [2025-01-06T22:50:59.587624] 573476: provider/configuration/vxos_update.rb:404:in `block in vxos_rpms': scaleio-block-legacy-gateway: Downloading rpms from nginx to /tmp: https://http-share.powerflex.svc:443/download//8aaa80939422a51b01943b5873e82d40/os/VxFlex4.5.2RHEL7Repo/vxflexos_4.5.2000/
DEBUG [2025-01-06T22:51:00.312803] 573476: provider/configuration/vxos_update.rb:418:in `block in vxos_rpms': scaleio-block-legacy-gateway: Download operation result: #
DEBUG [2025-01-06T22:51:00.313528] 573476: provider/configuration/vxos_update.rb:424:in `block in vxos_rpms': scaleio-block-legacy-gateway: Local rpm path: /tmp/d20250106-5032-ukddq7/http-share.powerflex.svc/8aaa80939422a51b01943b5873e82d40/os/VxFlex4.5.2RHEL7Repo/vxflexos_4.5.2000
ERROR [2025-01-06T22:56:16.377904] 573476: rule_engine/rule.rb:241:in `process_state': Encountered a critical unrecoverable error while processing #: Java::JavaLang::OutOfMemoryError: Java heap space
ERROR [2025-01-06T22:56:16.378496] 573476: service/processor.rb:54:in `block in process_state_threaded': Encountered a critical unrecoverable error while processing the service: Java::JavaLang::OutOfMemoryError: Java heap space
ERROR [2025-01-06T22:56:16.379412] 573400: rule_engine/rule.rb:241:in `process_state': Encountered a critical unrecoverable error while processing #: Java::JavaLang::OutOfMemoryError: Java heap space
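To check whether a failed upgrade matches this signature, the log can be searched for the heap error string. The following is a minimal sketch, not a Dell-provided tool: the function names are illustrative, and the log path is passed as an argument because this article does not state where deployment.log is located.

```shell
# find_heap_oom and count_heap_oom are hypothetical helper names,
# not part of PowerFlex or PFMP.
# $1: path to a copy of deployment.log (location depends on the environment).

# Print every log line that reports Java heap exhaustion.
find_heap_oom() {
    grep -F 'OutOfMemoryError: Java heap space' "$1"
}

# Print the number of such lines (0 when there are none).
count_heap_oom() {
    grep -cF 'OutOfMemoryError: Java heap space' "$1"
}
```

For the excerpt shown here, count_heap_oom would report the three ERROR lines.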
Impact
The upgrade fails.
Cause
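When PFxM manages a large number of objects (for example, a 64-node/SVM environment), the thin-deployer runs out of memory while orchestrating the upgrade process.
The deployer process has no hard-coded heap size setting, so the JVM uses one quarter of the node's memory. By default, the MVM virtual machines are deployed with 32 GB of memory, which gives the deployer process a maximum heap size of 8 GB.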
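The one-quarter rule can be worked through for the two memory sizes discussed in this article. This is only an arithmetic sketch; default_max_heap_gb is a hypothetical helper name and does not query a running JVM.

```shell
# default_max_heap_gb is a hypothetical helper for illustration only.
# It applies the one-quarter rule to a memory size given in whole GB.
default_max_heap_gb() {
    echo $(( $1 / 4 ))
}

default_max_heap_gb 32   # default MVM memory: prints 8
default_max_heap_gb 64   # MVM memory after an increase to 64 GB: prints 16
```

On a host where a HotSpot java binary is available, the effective value can be read with java -XX:+PrintFlagsFinal -version and the MaxHeapSize line; whether java is directly reachable from the deployer environment is an assumption that this article does not confirm.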
Resolution
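Both options can be used together if needed.
Option 1:
Increase the MVM VM memory to 64 GB. This raises the Java heap available to the deployer process to 16 GB.
Engineering has no specific figure for the memory required per node, but 64 GB has proven successful.
If the upgrade still fails, scale the PowerFlex Gateway (GW) replica set down to 1 GW:
1) Connect to the PFMP server over SSH.
2) Reduce the replica count to 1: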
kubectl scale sts block-legacy-gateway -n powerflex --replicas=1
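3) Perform the GW upgrade. This upgrades the system to 4.x with mTLS, and the issue does not occur.
4) After the back-end PowerFlex system is upgraded to 4.x, scale the GW replica set back to 2: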
kubectl scale sts block-legacy-gateway -n powerflex --replicas=2
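After each scale command it can help to confirm the replica count that the StatefulSet is actually configured with. The sketch below wraps a standard kubectl query; the function name gateway_replicas is illustrative, and it assumes kubectl is configured on the PFMP server as in the steps above.

```shell
# gateway_replicas is a hypothetical helper name.
# Prints the replica count configured on the gateway StatefulSet.
gateway_replicas() {
    kubectl get sts block-legacy-gateway -n powerflex -o jsonpath='{.spec.replicas}'
}
```

Running gateway_replicas should print 1 after step 2 and 2 after step 4.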
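Option 2:
Remove the resource group (RG) before upgrading PowerFlex, then add it back using the Add Existing Resource Group operation. The memory usage described in this article applies only to nodes that belong to an RG; without an RG, the upgrade does not build the objects, and no OOM should occur.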
Affected Versions
PFMP 4.x
Fixed In Version
PFMP 4.8
Affected Products
PowerFlex rack, VxFlex Ready Nodes, PowerFlex custom node, PowerFlex appliance R650, PowerFlex appliance R6525, PowerFlex appliance R660, PowerFlex appliance R6625, PowerFlex appliance R750, PowerFlex appliance R760, PowerFlex appliance R7625, PowerFlex appliance R640, PowerFlex appliance R740XD, PowerFlex appliance R7525, PowerFlex appliance R840
...
Article Properties
Article Number: 000305779
Article Type: Solution
Last Modified: 16 Apr 2025
Version: 3