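PowerFlex: Linux SDC kernel panic (memory allocation): kernel: net_sched: page allocation failure
Summary: The Linux SDC loses access to some or all volumes, or experiences a kernel panic related to memory allocation.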
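Symptoms
The SDC is installed on a Linux virtual machine; however, the issue can occur on physical Linux or on any other operating system with the SDC installed.
The SDC disconnects suddenly.
The Linux SDC may kernel panic.
SDC IO errors.
File system IO errors.
Symptoms in the messages file
The messages file on the Linux machine reports an SDC stack trace that includes page allocation (memory) statistics: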
Dec 3 10:40:50 backup7 kernel: net_sched: page allocation failure: order:4, mode:0x104020
Dec 3 10:40:50 backup7 kernel: CPU: 3 PID: 1538 Comm: net_sched Tainted: P OE ------------ 3.10.0-693.21.1.el7.x86_64 #1
Dec 3 10:40:50 backup7 kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015
Dec 3 10:40:50 backup7 kernel: Call Trace:
Dec 3 10:40:50 backup7 kernel: [<ffffffff816ae7c8>] dump_stack+0x19/0x1b
Dec 3 10:40:50 backup7 kernel: [<ffffffff8118cd10>] warn_alloc_failed+0x110/0x180
Dec 3 10:40:50 backup7 kernel: [<ffffffff816aa774>] __alloc_pages_slowpath+0x6b6/0x724
Dec 3 10:40:50 backup7 kernel: [<ffffffff811912a5>] __alloc_pages_nodemask+0x405/0x420
Dec 3 10:40:50 backup7 kernel: [<ffffffff811d5a38>] alloc_pages_current+0x98/0x110
Dec 3 10:40:50 backup7 kernel: [<ffffffff8118bb0e>] __get_free_pages+0xe/0x40
Dec 3 10:40:50 backup7 kernel: [<ffffffff811e146e>] kmalloc_order_trace+0x2e/0xa0
Dec 3 10:40:50 backup7 kernel: [<ffffffff811e5011>] __kmalloc+0x211/0x230
Dec 3 10:40:50 backup7 kernel: [<ffffffffc0530e3e>] mapClass_AllocAndInitObj+0x3e/0x120 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc0531ca6>] mapClass_UpdateAll+0x306/0x760 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc055d54a>] ? mosMitSchedThrd_CurThrdOurs+0x6a/0xa0 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc053df93>] mapMdm_HandleObjUpdate_CK+0x2b3/0x540 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc053e290>] ? mapMdm_SendUpdateReq_CK+0x70/0xcd0 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc053e686>] mapMdm_SendUpdateReq_CK+0x466/0xcd0 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc0547a46>] ? netSock_DoIO+0xe6/0x630 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc05112f0>] ? netChan_SendReq_CK+0x70/0x800 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc0511432>] netChan_SendReq_CK+0x1b2/0x800 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc051a5fe>] netCon_SendReq_CK+0x17e/0x500 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc05158d7>] ? netRPC_SendDone_CK+0x47/0x6f0 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc05159ad>] netRPC_SendDone_CK+0x11d/0x6f0 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc055d7df>] mosMit_RunWithTLS+0x4f/0x60 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc055f0ba>] mosMitSchedThrd_ThrdEntry+0x1aa/0x510 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc055c490>] ? mosTicks_GetCurrentTick+0x20/0x20 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffffc055c4aa>] mosOsThrd_Entry+0x1a/0x40 [scini]
Dec 3 10:40:50 backup7 kernel: [<ffffffff810b4031>] kthread+0xd1/0xe0
Dec 3 10:40:50 backup7 kernel: [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
Dec 3 10:40:50 backup7 kernel: [<ffffffff816c0577>] ret_from_fork+0x77/0xb0
Dec 3 10:40:50 backup7 kernel: [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
Dec 3 10:40:50 backup7 kernel: Mem-Info:
Dec 3 10:40:50 backup7 kernel: active_anon:540198 inactive_anon:192106 isolated_anon:0#012 active_file:526767 inactive_file:908890 isolated_file:0#012 unevictable:0 dirty:2548 writeback:0 unstable:0#012 slab_reclaimable:113189 slab_unreclaimable:12471#012 mapped:4048 shmem:21154 pagetables:2768 bounce:0#012 free:87384 free_pcp:669 free_cma:0
Dec 3 10:40:50 backup7 kernel: Node 0 DMA free:15900kB min:104kB low:128kB high:156kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Dec 3 10:40:50 backup7 kernel: lowmem_reserve[]: 0 2814 9821 9821
Dec 3 10:40:50 backup7 kernel: Node 0 DMA32 free:200976kB min:19336kB low:24168kB high:29004kB active_anon:195676kB inactive_anon:266280kB active_file:292588kB inactive_file:1429216kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129280kB managed:2884228kB mlocked:0kB dirty:1004kB writeback:0kB mapped:5056kB shmem:26680kB slab_reclaimable:405056kB slab_unreclaimable:19648kB kernel_stack:2464kB pagetables:1864kB unstable:0kB bounce:0kB free_pcp:468kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Dec 3 10:40:50 backup7 kernel: lowmem_reserve[]: 0 0 7006 7006
Dec 3 10:40:50 backup7 kernel: Node 0 Normal free:132556kB min:48136kB low:60168kB high:72204kB active_anon:1965116kB inactive_anon:502176kB active_file:1814484kB inactive_file:2206340kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:7340032kB managed:7174724kB mlocked:0kB dirty:9200kB writeback:0kB mapped:11168kB shmem:57936kB slab_reclaimable:47700kB slab_unreclaimable:30224kB kernel_stack:4960kB pagetables:9208kB unstable:0kB bounce:0kB free_pcp:2212kB local_pcp:704kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Dec 3 10:40:50 backup7 kernel: lowmem_reserve[]: 0 0 0 0
Dec 3 10:40:50 backup7 kernel: Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15900kB
Dec 3 10:40:50 backup7 kernel: Node 0 DMA32: 5802*4kB (UEM) 3223*8kB (UEM) 9329*16kB (UEM) 85*32kB (UEM) 2*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 201104kB
Dec 3 10:40:50 backup7 kernel: Node 0 Normal: 29631*4kB (UEM) 1755*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 132564kB
Dec 3 10:40:50 backup7 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Dec 3 10:40:50 backup7 kernel: 1469304 total pagecache pages
Dec 3 10:40:50 backup7 kernel: 12478 pages in swap cache
Dec 3 10:40:50 backup7 kernel: Swap cache stats: add 927451, delete 914973, find 1499563/1552563
Dec 3 10:40:50 backup7 kernel: Free swap = 3295096kB
Dec 3 10:40:50 backup7 kernel: Total swap = 4194300kB
Dec 3 10:40:50 backup7 kernel: ScaleIO R2_5 mapClass_AllocAndInitObj:1212 :Error: Failed to allocate memory 36288.Cannot process MDM response
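At the same time or later (depending on the workload), "NO_RESOURCES" SDC errors, SDC "IO-ERROR" messages, file system IO errors, or any combination of these appear:
Messages file: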
Dec 3 11:23:55 backup7 kernel: ScaleIO R2_5 mapClass_UpdateAll:523 :Error: Object ffff8802aa340000 failed to update in place.status NO_RESOURCES (67)
Dec 3 11:24:45 backup7 kernel: ScaleIO R2_5 mapVolIO_ReportIOErrorIfNeeded:361 :[7567770049] IO-ERROR comb: 0. offsetInComb 0. SizeInLB 0. SDS_ID 0. Comb Gen 0. Head Gen 16da.
Dec 3 11:24:45 backup7 kernel: ScaleIO R2_5 mapVolIO_ReportIOErrorIfNeeded:374 :Vol ID 0x7dfb023900000046. Last fault Status IO_FAULT_NOT_PRI(12).Last error Status NOT_FOUND(3) Reason (failed getting LB-Info) Retry count (0) chan (0)
Dec 3 11:24:45 backup7 kernel: blk_update_request: I/O error, dev scinia, sector 2166028544
Dec 3 11:24:45 backup7 kernel: ScaleIO R2_5 mapVolIO_ReportIOErrorIfNeeded:361 :[7567770056] IO-ERROR comb: 0. offsetInComb 0. SizeInLB 0. SDS_ID 0. Comb Gen 0. Head Gen 16da.
Dec 3 11:24:45 backup7 kernel: ScaleIO R2_5 mapVolIO_ReportIOErrorIfNeeded:374 :Vol ID 0x7dfb023900000046. Last fault Status IO_FAULT_NOT_PRI(12).Last error Status NOT_FOUND(3) Reason (failed getting LB-Info) Retry count (0) chan (0)
Dec 3 11:24:45 backup7 kernel: blk_update_request: I/O error, dev scinia, sector 2166028544
Dec 3 11:24:45 backup7 kernel: ScaleIO R2_5 mapVolIO_ReportIOErrorIfNeeded:361 :[7567770372] IO-ERROR comb: 0. offsetInComb 0. SizeInLB 0. SDS_ID 0. Comb Gen 0. Head Gen 16da.
Dec 3 11:24:45 backup7 kernel: ScaleIO R2_5 mapVolIO_ReportIOErrorIfNeeded:374 :Vol ID 0x7dfb023900000046. Last fault Status IO_FAULT_NOT_PRI(12).Last error Status NOT_FOUND(3) Reason (failed getting LB-Info) Retry count (0) chan (0)
Dec 3 11:24:45 backup7 kernel: blk_update_request: I/O error, dev scinia, sector 2166028552
...
...
Dec 3 11:27:05 backup7 kernel: XFS (dm-2): metadata I/O error: block 0x7dec700 ("xfs_trans_read_buf_map") error 19 numblks 32
Dec 3 11:27:05 backup7 kernel: XFS (dm-2): xfs_imap_to_bp: xfs_trans_read_buf() returned error -19.
Dec 3 11:27:05 backup7 kernel: ScaleIO R2_5 mapVolIO_ReportIOErrorIfNeeded:361 :[7567910448] IO-ERROR comb: 0. offsetInComb 0. SizeInLB 0. SDS_ID 0. Comb Gen 0. Head Gen 16ac.
Dec 3 11:27:05 backup7 kernel: ScaleIO R2_5 mapVolIO_ReportIOErrorIfNeeded:374 :Vol ID 0x7dfb023900000046. Last fault Status IO_FAULT_NOT_PRI(12).Last error Status NOT_FOUND(3) Reason (failed getting LB-Info) Retry count (0) chan (0)
Dec 3 11:27:05 backup7 kernel: blk_update_request: I/O error, dev scinia, sector 132042496
Dec 3 11:27:05 backup7 kernel: XFS (dm-2): metadata I/O error: block 0x7dec700 ("xfs_trans_read_buf_map") error 19 numblks 32
Dec 3 11:27:05 backup7 kernel: XFS (dm-2): xfs_imap_to_bp: xfs_trans_read_buf() returned error -19.
Dec 3 11:27:05 backup7 kernel: ScaleIO R2_5 mapVolIO_ReportIOErrorIfNeeded:361 :[7567910460] IO-ERROR comb: 0. offsetInComb 0. SizeInLB 0. SDS_ID 0. Comb Gen 0. Head Gen 16ac.
Dec 3 11:27:05 backup7 kernel: ScaleIO R2_5 mapVolIO_ReportIOErrorIfNeeded:374 :Vol ID 0x7dfb023900000046. Last fault Status IO_FAULT_NOT_PRI(12).Last error Status NOT_FOUND(3) Reason (failed getting LB-Info) Retry count (0) chan (0)
Dec 3 11:27:05 backup7 kernel: blk_update_request: I/O error, dev scinia, sector 132042496
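The log signatures above can be searched for mechanically. The following is a minimal sketch, not a Dell-provided tool: the pattern names and the `scan` helper are hypothetical, and the regular expressions are derived only from the log lines quoted in this article, so other SDC versions may word these messages differently.

```python
import re

# Hypothetical signature names; patterns are taken from the log lines in this article.
SIGNATURES = {
    "page_alloc_failure": re.compile(r"page allocation failure: order:(\d+)"),
    "sdc_alloc_failed": re.compile(r"Failed to allocate memory (\d+)"),
    "no_resources": re.compile(r"status NO_RESOURCES"),
    "sdc_io_error": re.compile(r"IO-ERROR comb"),
    "block_io_error": re.compile(r"blk_update_request: I/O error, dev (scini\w+)"),
}

def scan(lines):
    """Count how many lines match each signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts

# Sample lines copied from this article. For a real host, pass an open
# messages file instead (its location is distribution-specific).
sample = [
    "Dec 3 10:40:50 backup7 kernel: net_sched: page allocation failure: order:4, mode:0x104020",
    "Dec 3 10:40:50 backup7 kernel: ScaleIO R2_5 mapClass_AllocAndInitObj:1212 :Error: Failed to allocate memory 36288.Cannot process MDM response",
    "Dec 3 11:23:55 backup7 kernel: ScaleIO R2_5 mapClass_UpdateAll:523 :Error: Object ffff8802aa340000 failed to update in place.status NO_RESOURCES (67)",
    "Dec 3 11:24:45 backup7 kernel: blk_update_request: I/O error, dev scinia, sector 2166028544",
]
counts = scan(sample)
print(counts)
```

A page allocation failure followed later by NO_RESOURCES and I/O errors on a scini device matches the sequence described in this article.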
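Impact
The SDC does not work properly.
The SDC disconnects.
One or more volumes are inaccessible.
Cause
The SDC does not have enough contiguous memory.
The host is low on free memory and its memory is fragmented.
Because the Linux machine is low on free memory and that memory is fragmented, the SDC cannot obtain the memory it needs.
By design, the SDC allocates memory in large chunks. In this particular case, the 36k (36288 bytes) that the SDC requested could not be allocated: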
Dec 3 10:40:50 backup7 kernel: ScaleIO R2_5 mapClass_AllocAndInitObj:1212 :Error: Failed to allocate memory 36288.Cannot process MDM response
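To relate the 36288-byte request to the `order:4` reported in the stack trace, the sketch below computes the smallest power-of-two block of pages that can hold a request. It assumes 4 kB pages (typical for x86_64) and assumes the request is rounded up to a power-of-two number of contiguous pages; the function name `allocation_order` is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 kB pages, as is typical on x86_64

def allocation_order(size_bytes, page_size=PAGE_SIZE):
    """Smallest order such that page_size * 2**order >= size_bytes."""
    pages = -(-size_bytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order

order = allocation_order(36288)
block_kb = (PAGE_SIZE << order) // 1024
print(order, block_kb)  # prints "4 64"
```

Under these assumptions the request needs one contiguous 64 kB block, which is consistent with the `order:4` value in the failure message.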
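The messages file shows about 132 MB of free memory in the Normal zone; however, no large blocks (32k, 64k, and so on) are available for allocation, which causes the kernel panic:
There are 29631 free 4 kB blocks plus 1755 free 8 kB blocks = 132564 kB (about 132 MB).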
Dec 3 10:40:50 backup7 kernel: Node 0 Normal: 29631*4kB (UEM) 1755*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 132564kB
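The arithmetic can be checked directly from the free-list line. This is a small illustrative sketch; the helper name `parse_free_blocks` is hypothetical, and the 64 kB threshold assumes the request needs an order-4 block with 4 kB pages.

```python
import re

LINE = ("Node 0 Normal: 29631*4kB (UEM) 1755*8kB (U) 0*16kB 0*32kB 0*64kB "
        "0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 132564kB")

def parse_free_blocks(line):
    """Return {block_size_kB: free_block_count} from a per-zone free-list line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}

blocks = parse_free_blocks(LINE)
total_kb = sum(size * count for size, count in blocks.items())
# Blocks large enough for a 64 kB contiguous allocation (assumed requirement).
usable = sum(count for size, count in blocks.items() if size >= 64)
print(total_kb, usable)  # prints "132564 0"
```

The zone has plenty of free memory in total, but not a single free block of 64 kB or larger.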
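Note: This is unlikely to happen on a machine that has several GB of free memory.
Resolution
Note: Rebooting the host clears the memory fragmentation temporarily, until the issue recurs.
From the SDC side, there is no workaround, because the behavior is by design.
From the host side:
1) Add more memory and make sure that free memory stays at a sufficiently high level.
2) In this particular case, the SDC Linux machine is a virtual machine. Moving the SDC to ESXi resolves the issue, because the ESXi host has several GB of free memory.
3) Check whether any running applications or services may be causing or contributing to memory fragmentation.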
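To watch for fragmentation before it causes a failure, the per-order free block counts that Linux exposes in `/proc/buddyinfo` can be checked periodically. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the sample text is rebuilt from the free-list figures in this article rather than captured from a real host, and it ignores kernel reserves that can prevent an allocation from using a given zone.

```python
def parse_buddyinfo(text):
    """Map (node, zone) to a list of free block counts indexed by order."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: Node 0, zone Normal <count order 0> <count order 1> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(c) for c in parts[4:]]
    return zones

def free_blocks_at_or_above(counts, min_order):
    """Number of free blocks of at least 2**min_order pages."""
    return sum(counts[min_order:])

# Sample rebuilt from the article's log figures (not real /proc output).
# On a live host, read the text with: open("/proc/buddyinfo").read()
SAMPLE = """\
Node 0, zone      DMA      1      1      1      0      2      1      1      0      1      1      3
Node 0, zone    DMA32   5802   3223   9329     85      2      0      0      0      0      0      0
Node 0, zone   Normal  29631   1755      0      0      0      0      0      0      0      0      0
"""

zones = parse_buddyinfo(SAMPLE)
for (node, zone), counts in zones.items():
    # Order 4 is 64 kB with 4 kB pages, the size that failed in this article.
    print(node, zone, free_blocks_at_or_above(counts, 4))
```

A zone that repeatedly shows zero or very few blocks at order 4 and above is a sign that the host is approaching the condition described here.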
Additional Information
Affected versions
Any SIO version.
Fixed in version
N/A
Affected Products
PowerFlex Software, VxFlex Product Family, VxFlex Ready Node
Article Properties
Article Number: 000056228
Article Type: Solution
Last Modified: 31 Oct 2025
Version: 5