Nodes run OOM from excessive lockdata utilization
Summary: Nodes run out of memory due to lockdata not releasing used memory.
Symptoms
Nodes show signs of being Out of Memory (OOM), with "lockdata" shown as the top memory consumer.
A node may show performance degradation or, in severe cases, panic.
This is more likely to impact lower-memory nodes, such as 16GB A200/2000 nodes, but may also impact higher-memory nodes.
Example of an OOM message in the messages log, with lockdata shown as the top consumer at over 3GB used:
2025-12-28T04:47:47.270517-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: Malloc Pigs:
2025-12-28T04:47:47.273671-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: Type InUse MemUse Requests
2025-12-28T04:47:47.275835-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: lockdata 3389051 3389051K 3716197873
2025-12-28T04:47:47.278575-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: 8kB dinodes 818152 670368K 29437430894
2025-12-28T04:47:47.281406-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: isi_hash 31612 92553K 16882805458
2025-12-28T04:47:47.283209-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: layout_hints 216930 67791K 506523261
2025-12-28T04:47:47.285766-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: iaddr_set 918679 57418K 76306136502
2025-12-28T04:47:47.289540-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: lbm super 8820 42195K 255485113
2025-12-28T04:47:47.291600-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: devbuf 16994 37217K 2699236
2025-12-28T04:47:47.293557-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: bar_owner_vec259 301 32928K 11315960
2025-12-28T04:47:47.295987-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: newblk 4 32768K 18925356
2025-12-28T04:47:47.299318-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: crc_vec200 7986 31867K 863150005
2025-12-28T04:47:47.301157-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: bam_alloc 34903 19302K 2500981635
2025-12-28T04:47:47.303032-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: ptr_llcb_map 99885 19299K 2274118968
2025-12-28T04:47:47.306277-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: inodedep 4 16384K 7051517
2025-12-28T04:47:47.309572-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: ddvec84 216932 13621K 4596579001
2025-12-28T04:47:47.312698-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: sysctloid 216712 11835K 231785
2025-12-28T04:47:47.314522-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: Unshown bins account for 99614K
2025-12-28T04:47:47.316385-05:00 <0.4> EXAMPLE-24(id24) /boot/kernel.amd64/kernel: Total: 4634206K
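To put the dump above in proportion, lockdata's share of the reported Total can be worked out with a small helper. This is a minimal sketch, not a supported tool: the function name lockdata_share is hypothetical, and it assumes the dump is piped in as text with the columns in the order shown (Type, InUse, MemUse, Requests).

```shell
# lockdata_share (hypothetical helper): read a "Malloc Pigs" dump on stdin
# and print lockdata's MemUse as a share of the reported Total.
# Assumes MemUse is the second field after "lockdata" and carries a K suffix.
lockdata_share() {
    awk '
    {
        for (i = 1; i <= NF; i++) {
            if ($i == "lockdata") { used = $(i + 2); sub(/K$/, "", used) }
            if ($i == "Total:")   { total = $(i + 1); sub(/K$/, "", total) }
        }
    }
    END {
        # Print only when a Total line was actually seen
        if (total + 0 > 0)
            printf "lockdata %dK of %dK (%.1f%%)\n", used, total, 100 * used / total
    }'
}
```

Fed the example dump, this reports lockdata at 3389051K of 4634206K, roughly 73% of the total shown.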
Checking kern.malloc_pigs across the nodes may show several nodes with elevated lockdata utilization:
martid32@elvis: /2026-01-12-003 $ grep lockdata */kern.malloc_pigs|sort -k3n
EXAMPLE-21/kern.malloc_pigs:lockdata 57883 57883K 33140621
EXAMPLE-24/kern.malloc_pigs:lockdata 179535 179535K 133252264
EXAMPLE-22/kern.malloc_pigs:lockdata 191116 191116K 134601448
EXAMPLE-15/kern.malloc_pigs:lockdata 201258 201258K 173115576
EXAMPLE-19/kern.malloc_pigs:lockdata 212236 212236K 145973662
EXAMPLE-16/kern.malloc_pigs:lockdata 223574 223574K 200913232
EXAMPLE-14/kern.malloc_pigs:lockdata 270028 270028K 212437579
EXAMPLE-12/kern.malloc_pigs:lockdata 810088 810088K 751687757
EXAMPLE-8/kern.malloc_pigs:lockdata 814480 814480K 820272423
EXAMPLE-20/kern.malloc_pigs:lockdata 861939 861939K 814769447
EXAMPLE-6/kern.malloc_pigs:lockdata 1002016 1002016K 1010428656
EXAMPLE-11/kern.malloc_pigs:lockdata 1043191 1043191K 1112512581
EXAMPLE-13/kern.malloc_pigs:lockdata 1052161 1052161K 1236223736
EXAMPLE-10/kern.malloc_pigs:lockdata 2287836 2287836K 2137296070
EXAMPLE-9/kern.malloc_pigs:lockdata 2287836 2287836K 2130601482
EXAMPLE-5/kern.malloc_pigs:lockdata 2288665 2288665K 2014785292
EXAMPLE-7/kern.malloc_pigs:lockdata 2290701 2290701K 2020741030
EXAMPLE-4/kern.malloc_pigs:lockdata 2291665 2291665K 2069854314
EXAMPLE-3/kern.malloc_pigs:lockdata 2293574 2293574K 2431882220
EXAMPLE-17/kern.malloc_pigs:lockdata 3554690 3554690K 3615677378
EXAMPLE-23/kern.malloc_pigs:lockdata 3554842 3554842K 3785712414
EXAMPLE-2/kern.malloc_pigs:lockdata 3566354 3566354K 3874955814
EXAMPLE-1/kern.malloc_pigs:lockdata 3567030 3567030K 3715078119
Cause
Allocating lockdata for each NFS per-file instance is a normal part of file open operations in the file system.
In some situations, lockdata may not correctly release the memory it has consumed.
This can leave 3GB or more of unreleased memory held by lockdata, leading to memory overutilization on the node.
Resolution
Proactively reboot the affected node or nodes to free the memory held by lockdata.
Flushing the cache has been shown not to make any improvement.
To check lockdata memory utilization on a live cluster, use the following command:
isi_for_array -s sysctl kern.malloc_pigs |grep lockdata
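
The check above can be wrapped in a small filter that lists only the nodes whose lockdata usage exceeds a chosen limit. This is a minimal sketch under stated assumptions: the function name flag_lockdata is hypothetical, the default limit of 1048576K (1 GiB) is an arbitrary illustrative value rather than a recommended threshold, and each input line is assumed to start with a node label followed by a colon, as in the gathered output shown under Symptoms.

```shell
# flag_lockdata (hypothetical helper): read malloc_pigs lines on stdin and
# print the nodes whose lockdata MemUse exceeds a limit given in KB.
# The default limit (1048576K = 1 GiB) is illustrative only.
flag_lockdata() {
    threshold_kb="${1:-1048576}"
    awk -v limit="$threshold_kb" '
    {
        for (i = 1; i <= NF; i++) {
            # Match "lockdata" alone or with a prefix such as "path:lockdata"
            if ($i == "lockdata" || $i ~ /:lockdata$/) {
                mem = $(i + 2)            # MemUse column, e.g. "3567030K"
                sub(/K$/, "", mem)
                label = $1                # assumed node label, e.g. "EXAMPLE-1:"
                sub(/:.*$/, "", label)
                if (mem + 0 > limit)
                    printf "%s lockdata=%dK\n", label, mem
                break
            }
        }
    }'
}
```

For example, piping the output of the command above through "flag_lockdata 2097152" would list only nodes holding more than 2 GiB in lockdata, which can help decide which nodes to reboot first.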