PowerScale OneFS 9.11 及更高版本:虚拟节点未有效回收,导致节点运行 OOM

Summary: 内存中累积的大量 vnode 可能会导致一个或多个节点进入内存不足 (OOM) 状况。任何使用快照和 SmartPools 且运行 OneFS 9.11 或更高版本的群集都可能存在此问题。

This article applies to This article does not apply to This article is not tied to any specific product. Not all product versions are identified in this article.

Symptoms

OOM 条件可能会导致节点崩溃和无响应,从而导致性能下降和数据不可用事件。内存中的虚拟节点数高于所需的最大虚拟节点数。启用或禁用 vfs.vnlru_reuse_freevnodes (vnode 回收)对此问题没有影响。

可以从 OOM 消息中识别问题, vmlogs和小型转储。具有 48 GB 内存的 F200 节点上的消息输出示例:

2025-06-11T05:31:56.025986+02:00 <0.4> - /boot/kernel.amd64/kernel: OOM: v_wire_count: 11262575, v_active_count: 7 events_since_last_log 674
2025-06-11T05:31:56.025992+02:00 <0.4> - /boot/kernel.amd64/kernel: Malloc Pigs:
2025-06-11T05:31:56.025997+02:00 <0.4> - /boot/kernel.amd64/kernel: Type                   InUse   MemUse   Requests
2025-06-11T05:31:56.026004+02:00 <0.4> - /boot/kernel.amd64/kernel: iaddr_set            15701654  981354K 4078195298
2025-06-11T05:31:56.026010+02:00 <0.4> - /boot/kernel.amd64/kernel: devbuf                171845  649730K   35775475
2025-06-11T05:31:56.026016+02:00 <0.4> - /boot/kernel.amd64/kernel: isi_hash              137112  505157K  760464664
2025-06-11T05:31:56.026022+02:00 <0.4> - /boot/kernel.amd64/kernel: newblk                     4  131072K    1483813
2025-06-11T05:31:56.026027+02:00 <0.4> - /boot/kernel.amd64/kernel: inodedep                   4   65536K     531073
2025-06-11T05:31:56.026035+02:00 <0.4> - /boot/kernel.amd64/kernel: vfscache                   4   32817K          4
2025-06-11T05:31:56.026041+02:00 <0.4> - /boot/kernel.amd64/kernel: bar_owner_vec264         259   32256K     191809
2025-06-11T05:31:56.026047+02:00 <0.4> - /boot/kernel.amd64/kernel: linux                 169103   28170K  142448723
2025-06-11T05:31:56.026052+02:00 <0.4> - /boot/kernel.amd64/kernel: statistics data        13062   20453K    1189902
2025-06-11T05:31:56.026058+02:00 <0.4> - /boot/kernel.amd64/kernel: vfs_hash                   1   16384K          1
2025-06-11T05:31:56.026064+02:00 <0.4> - /boot/kernel.amd64/kernel: pagedep                    4   16384K     151925
2025-06-11T05:31:56.026069+02:00 <0.4> - /boot/kernel.amd64/kernel: sysctloid             269900   14747K     274194
2025-06-11T05:31:56.026075+02:00 <0.4> - /boot/kernel.amd64/kernel: acpica                194000   12817K    3055662
2025-06-11T05:31:56.026081+02:00 <0.4> - /boot/kernel.amd64/kernel: 8kB dinodes             2315   11909K 10694470462
2025-06-11T05:31:56.026086+02:00 <0.4> - /boot/kernel.amd64/kernel: pcb                      136    9230K     470386
2025-06-11T05:31:56.026092+02:00 <0.4> - /boot/kernel.amd64/kernel: Unshown bins account for 100723K
2025-06-11T05:31:56.026103+02:00 <0.4> - /boot/kernel.amd64/kernel: Total: 2628733K
2025-06-11T05:31:56.026109+02:00 <0.4> - /boot/kernel.amd64/kernel: UMA Zalloc Pigs:
2025-06-11T05:31:56.026114+02:00 <0.4> - /boot/kernel.amd64/kernel: NAME              SIZE      LIMIT      COUNT   MEM USED
2025-06-11T05:31:56.026120+02:00 <0.4> - /boot/kernel.amd64/kernel: IFSINODE          616,         0,  17222149,  11481500K
2025-06-11T05:31:56.026126+02:00 <0.4> - /boot/kernel.amd64/kernel: VNODE             584,         0,  17224440,   9842604K
2025-06-11T05:31:56.026132+02:00 <0.4> - /boot/kernel.amd64/kernel: mbuf_jumbo_p     4096,         0,    272268,   1089072K
2025-06-11T05:31:56.026137+02:00 <0.4> - /boot/kernel.amd64/kernel: VM OBJECT         272,         0,     49671,    839200K
2025-06-11T05:31:56.026143+02:00 <0.4> - /boot/kernel.amd64/kernel: UMA Slabs 0        80,         0,   3078132,    246336K
2025-06-11T05:31:56.026149+02:00 <0.4> - /boot/kernel.amd64/kernel: BUF TRIE          144,         0,    122678,    217204K
2025-06-11T05:31:56.026154+02:00 <0.4> - /boot/kernel.amd64/kernel: RADIX NODE        144,         0,    333231,    141808K
2025-06-11T05:31:56.026160+02:00 <0.4> - /boot/kernel.amd64/kernel: vmem btag          56,         0,    143433,    114884K
2025-06-11T05:31:56.026166+02:00 <0.4> - /boot/kernel.amd64/kernel: mbuf              256,         *,         *,     75888K
2025-06-11T05:31:56.026172+02:00 <0.4> - /boot/kernel.amd64/kernel:  mbuf             256,  19265886,    284503,          *
2025-06-11T05:31:56.026177+02:00 <0.4> - /boot/kernel.amd64/kernel:  mbuf_packet      256,         0,        64,          *
2025-06-11T05:31:56.026183+02:00 <0.4> - /boot/kernel.amd64/kernel: lki_mds_ent       160,         0,     62000,     69428K
2025-06-11T05:31:56.026189+02:00 <0.4> - /boot/kernel.amd64/kernel: md3               512,         0,    131072,     65540K
2025-06-11T05:31:56.026194+02:00 <0.4> - /boot/kernel.amd64/kernel: md0               512,         0,    131072,     65540K
2025-06-11T05:31:56.026200+02:00 <0.4> - /boot/kernel.amd64/kernel: pbuf             1024,         *,         *,     38760K
2025-06-11T05:31:56.026206+02:00 <0.4> - /boot/kernel.amd64/kernel:  pbuf            1024,       256,         0,          *
2025-06-11T05:31:56.026211+02:00 <0.4> - /boot/kernel.amd64/kernel:  vnpbuf          1024,       512,         0,          *
2025-06-11T05:31:56.026217+02:00 <0.4> - /boot/kernel.amd64/kernel:  clpbuf          1024,     15872,         0,          *
2025-06-11T05:31:56.026223+02:00 <0.4> - /boot/kernel.amd64/kernel:  mdpbuf          1024,      1638,         0,          *
2025-06-11T05:31:56.026228+02:00 <0.4> - /boot/kernel.amd64/kernel:  nfspbuf         1024,      8192,         0,          *
2025-06-11T05:31:56.026234+02:00 <0.4> - /boot/kernel.amd64/kernel:  swwbuf          1024,      4096,         0,          *
2025-06-11T05:31:56.026240+02:00 <0.4> - /boot/kernel.amd64/kernel:  swrbuf          1024,      8192,         0,          *
2025-06-11T05:31:56.026245+02:00 <0.4> - /boot/kernel.amd64/kernel: lkc_gen_ent        64,         0,     21333,     18100K
2025-06-11T05:31:56.026253+02:00 <0.4> - /boot/kernel.amd64/kernel: MAP ENTRY          96,         0,    138594,     15488K
2025-06-11T05:31:56.026259+02:00 <0.4> - /boot/kernel.amd64/kernel: Top zones:        24321352K
2025-06-11T05:31:56.026265+02:00 <0.4> - /boot/kernel.amd64/kernel: Malloc zones:      2344608K
2025-06-11T05:31:56.026270+02:00 <0.4> - /boot/kernel.amd64/kernel: Other zones:        117512K
2025-06-11T05:31:56.026276+02:00 <0.4> - /boot/kernel.amd64/kernel: UMA total:        26783472K

以下条目数超过 1700 万:

NAME              SIZE      LIMIT      COUNT   MEM USE
IFSINODE          616,         0,  17222149,  11481500
VNODE             584,         0,  17224440,   9842604

其中最大 vnodes 为 2,900,000,可在以下位置找到: vmlogs > kern.maxvnodes.

可以使用以下各项在群集上验证虚拟节点数,并为最大虚拟节点数参考 sysctl

# isi_for_array -s "sysctl vfs.numvnodes"
# isi_for_array -s "sysctl kern.maxvnodes"

Cause

问题正在调查中。

Resolution

OneFS 9.11.0.2 和 9.12.0.0 及更高版本 OneFS 中提供了永久解决方案。

解决方法:
当前的解决方法是每小时刷新一次所有节点上的内存高速缓存。以下示例循环可用于任何节点上的屏幕会话:

# while true; do date; isi_for_array -s 'sysctl vfs.numvnodes; isi_flush '; sleep 3600; done

 

提醒:可在 3600 秒(1 小时)等待期间随时取消脚本,除非 isi_flush 命令正在进行中。不要终止 isi_flush 处理,因为这可能会导致临时服务中断。

 

为了进一步实现自动化并使高速缓存刷新更加可靠, cron 可以计划作业。下面的示例将每小时刷新一次缓存 7 分钟:

编辑 /etc/mcp/override/crontab ,然后输入以下内容:

7 * * * * root pgrep isi_flush || /usr/bin/isi_flush

有关如何编辑的更多信息 crontab 可在文章 Isilon如何编辑 crontab中找到。

SmartLock 合规性模式
对于 SmartLock 合规性模式群集, sudo 应在之前添加 isi_for_arrayisi_flush

# while true; do date; sudo isi_for_array -s 'sysctl vfs.numvnodes; sudo isi_flush '; sleep 3600; done

如果使用 cron 工作 compadmin 无法编辑 /etc/mcp/override/crontabcrontab CLI 工具,使用 -e 应在每个节点上使用选项来添加所需的行,如下所示。该行不包含根所在的“who”列 /etc/mcp/override/crontab,因为它是一个 cron 作业者 compadmin 只。

编辑 cron 作业使用:

# crontab -e

在编辑器中:
i 切换到插入模式。
粘贴 到以下行:

7 * * * *  pgrep isi_flush || sudo /usr/bin/isi_flush

按 Esc
键压 :wq! 以保存更改。

Affected Products

Isilon, PowerScale, Isilon Gen6.5, Isilon Gen6, PowerScale OneFS
Article Properties
Article Number: 000342005
Article Type: Solution
Last Modified: 19 Nov 2025
Version:  6
Find answers to your questions from other Dell users
Support Services
Check if your device is covered by Support Services.