VNX: kernel.cpu.utilization.cpuutil is > 90 for 15 minutes

Summary: kernel.cpu.utilization.cpuutil is > 90 for 15 minutes.


Symptoms

CIFS and NFS performance is degraded, and CPU utilization on the Data Movers stays high for extended periods.

The /nas/log/sys_log file is flooded with alerts:

Oct 20 10:19:21 2016:CS_PLATFORM:PERFSTATS:NOTICE:3:::::nas_alerterd: Clearing event for policy: default:server_2:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes due to value = 88
Oct 20 10:20:26 2016:CS_PLATFORM:PERFSTATS:NOTICE:3:::::nas_alerterd: Clearing event for policy: default:server_3:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes due to value = 89
Oct 20 10:35:21 2016:CS_PLATFORM:PERFSTATS:WARNING:2:::::nas_alerterd: Raising event for policy: default:server_2:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes. The last sample value was 94
Oct 20 10:48:26 2016:CS_PLATFORM:PERFSTATS:WARNING:2:::::nas_alerterd: Raising event for policy: default:server_3:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes. The last sample value was 95
Oct 20 11:01:26 2016:CS_PLATFORM:PERFSTATS:NOTICE:3:::::nas_alerterd: Clearing event for policy: default:server_3:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes due to value = 87
Oct 20 11:17:21 2016:CS_PLATFORM:PERFSTATS:NOTICE:3:::::nas_alerterd: Clearing event for policy: default:server_2:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes due to value = 89
Oct 20 11:21:26 2016:CS_PLATFORM:PERFSTATS:WARNING:2:::::nas_alerterd: Raising event for policy: default:server_3:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes. The last sample value was 96
[nasadmin@storage ~]$ server_sysstat server_2
server_2 :
threads runnable = 216
threads blocked = 6859
threads I/J/Z = 1
memory free(kB) = 6925375
cpu idle_% = 2 < ------- 98% utilized

[nasadmin@storage ~]$ server_sysstat server_3
server_3 :
threads runnable = 61
threads blocked = 6940
threads I/J/Z = 1
memory free(kB) = 6683987
cpu idle_% = 1 < ------- 99% utilized
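
The raise/clear pattern in the sys_log excerpt above can be summarized per Data Mover with a short script. This is a minimal sketch: the regular expression is derived only from the alert lines quoted in this article, and the function name `summarize_alerts` is hypothetical, not part of any VNX tooling.

```python
import re
from collections import Counter

# Pattern derived only from the sys_log lines quoted in this article;
# other nas_alerterd message formats are not handled.
ALERT_RE = re.compile(
    r"nas_alerterd: (Raising|Clearing) event for policy: "
    r"[^:]+:(server_\d+):(\S+) is > (\d+) for (\d+) minutes"
    r".*?(?:value was|value =) (\d+)"
)

def summarize_alerts(lines):
    """Count raise/clear events per Data Mover and record the highest sampled value."""
    counts = Counter()
    peak = {}
    for line in lines:
        m = ALERT_RE.search(line)
        if not m:
            continue  # not a CPU utilization alert in the expected format
        action, server, _metric, _threshold, _minutes, value = m.groups()
        counts[(server, action)] += 1
        peak[server] = max(peak.get(server, 0), int(value))
    return counts, peak

# Three lines copied from the excerpt above.
sample = [
    "Oct 20 10:19:21 2016:CS_PLATFORM:PERFSTATS:NOTICE:3:::::nas_alerterd: Clearing event for policy: default:server_2:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes due to value = 88",
    "Oct 20 10:35:21 2016:CS_PLATFORM:PERFSTATS:WARNING:2:::::nas_alerterd: Raising event for policy: default:server_2:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes. The last sample value was 94",
    "Oct 20 10:48:26 2016:CS_PLATFORM:PERFSTATS:WARNING:2:::::nas_alerterd: Raising event for policy: default:server_3:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes. The last sample value was 95",
]

counts, peak = summarize_alerts(sample)
for (server, action), n in sorted(counts.items()):
    print(server, action, n)
print("peak sampled value per server:", peak)
```

Frequent alternation between "Raising" and "Clearing" for the same server, as in the excerpt, suggests that utilization is hovering around the 90% threshold rather than spiking once.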

 

Cause

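An overall system configuration and capacity analysis was completed to determine the types of demand on each Data Mover (replication, deduplication, checkpoint schedules, and so on). It was determined that a VMware application called Mirage was being used for image management. This software backs up thousands of workstations in the customer environment to CIFS and NFS shares by creating many small CVD files.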

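In this example, 200 VMware Mirage sessions were configured to take snapshots of 4,000 machines every hour. Each machine requires one CVD file, which in turn requires 1.5 IOPS to complete a snapshot. While the backups were running, performance was slow, and a large latency figure of 626.78 KB per millisecond was seen on the Mirage side.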
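
As a rough worked example, the figures above can be combined into a back-of-the-envelope load estimate. This is only a sketch: it assumes that all configured sessions are active at the same time and that each session works on one CVD file at a time, neither of which the article states. The variable names are illustrative.

```python
# Figures quoted in the article.
machines = 4000       # workstations snapshotted every hour
sessions = 200        # configured VMware Mirage sessions
iops_per_cvd = 1.5    # IOPS required per CVD file snapshot

# ASSUMPTION: every session is busy simultaneously, one CVD file per session.
concurrent_iops = sessions * iops_per_cvd

# ASSUMPTION: machines are processed in equal waves of `sessions` machines each.
waves_per_hour = machines / sessions
seconds_per_wave = 3600 / waves_per_hour

print(f"IOPS with all sessions active: {concurrent_iops}")
print(f"waves per hour: {waves_per_hour}")
print(f"time budget per wave: {seconds_per_wave} s")
```

Under these assumptions the raw IOPS demand is modest, which is consistent with the finding that the bottleneck was CPU time rather than disk throughput.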

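A Data Mover profile of server_2 was captured while CPU utilization was high and the VMware Mirage application was running. The profile was configured to run for 60 seconds.
Example: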

[nasadmin@storage ~]$ /nas/tools/profile_slot -slot 2 -method function -seconds 60 -output /root_vdm_3/FS_Backup_01/profile_slot2.out
Starting profile on slot 2 with the following params...
Slot = 2
Method = function
Seconds = 60
Ignorebounds = no
Frequency = 256
Outfile = /root_vdm_3/FS_Backup_01/profile_slot2.out
Profile started. Waiting for 60 seconds...
Profile stopped.
Profile output has been written to /root_vdm_3/FS_Backup_01/profile_slot2.out on server in slot 2.

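Analysis of the Data Mover profile showed that the main bottleneck, consuming most of the CPU, was the SMB security encryption routine called "AES_encrypt". This SMB process provides end-to-end encryption of SMB data and protects the data from eavesdropping on untrusted networks.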

[nasadmin@storage ~]$ more profile_slot2.out | grep -i aes
0.0% (          1 ) EVP_aes_128_cbc
0.0% (          2 ) aes_init_key
1.0% (        631 ) aes_cbc_cipher
0.0% (         16 ) private_AES_set_encrypt_key
44.8% (      27454 ) AES_encrypt < ---------------------
0.9% (        581 ) AES_cbc_encrypt
0.0% (          1 ) EVP_aes_128_cbc
0.8% (        123 ) aes_cbc_cipher
0.0% (          1 ) private_AES_set_encrypt_key
37.0% (       5676 ) AES_encrypt < ---------------------
0.8% (        128 ) AES_cbc_encrypt
0.0% (          1 ) aes_init_key
0.9% (        140 ) aes_cbc_cipher
0.0% (          3 ) private_AES_set_encrypt_key
47.1% (       7219 ) AES_encrypt < ---------------------
0.9% (        146 ) AES_cbc_encrypt
0.0% (          1 ) aes_init_key
1.3% (        204 ) aes_cbc_cipher
0.0% (          7 ) private_AES_set_encrypt_key
48.2% (       7388 ) AES_encrypt < ---------------------
0.9% (        151 ) AES_cbc_encrypt
1.0% (        164 ) aes_cbc_cipher
0.0% (          5 ) private_AES_set_encrypt_key
46.8% (       7171 ) AES_encrypt < ---------------------
1.0% (        156 ) AES_cbc_encrypt
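
The grep output above can also be aggregated by function name to make the dominant routine obvious. This is a minimal sketch, assuming each line has the form `percent ( count ) function` as shown above and that the number in parentheses is a sample count; the helper name `ticks_by_function` is hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as "44.8% (      27454 ) AES_encrypt < -----".
# Trailing annotations after the function name are ignored.
LINE_RE = re.compile(r"^\s*([\d.]+)%\s*\(\s*(\d+)\s*\)\s*(\S+)")

def ticks_by_function(lines):
    """Sum the parenthesized counts for each function name."""
    totals = defaultdict(int)
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            totals[m.group(3)] += int(m.group(2))
    return dict(totals)

# A few lines copied from the output above.
sample = [
    "1.0% (        631 ) aes_cbc_cipher",
    "44.8% (      27454 ) AES_encrypt < ---------------------",
    "0.9% (        581 ) AES_cbc_encrypt",
    "37.0% (       5676 ) AES_encrypt < ---------------------",
]

totals = ticks_by_function(sample)
top = max(totals, key=totals.get)
print(top, totals[top])
```

Function names are compared case-sensitively, so `AES_cbc_encrypt` and `aes_cbc_cipher` are counted separately, as they are in the profile output.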

 

Resolution

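Two options are available:

Option 1
Make no changes and leave the maximum protocol for SMB communication at SMB3, accepting the high CPU utilization and poor performance.

Option 2
Implement a workaround that lowers the maximum protocol used for SMB communication from SMB3 to SMB2. The key difference between SMB3 and SMB2 in this case is "AES_encrypt". Lowering the max protocol to SMB2 removes the encryption processing, so CPU utilization drops and performance should improve.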

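To set SMB2 as the maximum protocol on the Data Mover, perform the following steps:

  1. Log in to the primary Control Station as the "root" user using PuTTY/SSH.

  2. Ask the customer for permission to temporarily stop the CIFS service on the Data Mover. This causes a brief interruption in CIFS access while the service is stopped, so it must be scheduled with the customer accordingly.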

    server_setup server_x -P cifs -o stop
  3. Change the maximum protocol for SMB communication from SMB3 to SMB2:

    server_cifs server_x -add security=NT,dialect=SMB2
  4. Restart the CIFS service:

    server_setup server_x -P cifs -o start
  5. Verify that the CIFS service restarted successfully and that the max protocol is set to SMB2:

    server_cifs server_x

    Example:

    [root@Bstorage]# server_cifs server_2
    server_2 :
    384 Cifs threads started
    Security mode = NT
    Max protocol = SMB2.1 < -----
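
The steps above can be sketched as a small helper that lists the commands in order and checks the verification output. This is a hedged sketch, not a supported tool: it only builds the command strings shown in this article (it does not run them), and the function names `workaround_commands`, `max_protocol`, and `is_below_smb3` are hypothetical. The parsing assumes the `Max protocol = ...` line format shown in the example output.

```python
import re

def workaround_commands(server):
    """Return the three commands from the steps above, in order, for one Data Mover."""
    return [
        f"server_setup {server} -P cifs -o stop",
        f"server_cifs {server} -add security=NT,dialect=SMB2",
        f"server_setup {server} -P cifs -o start",
    ]

def max_protocol(server_cifs_output):
    """Extract the value of the 'Max protocol' line, or None if it is absent."""
    m = re.search(r"^\s*Max protocol\s*=\s*(\S+)", server_cifs_output, re.MULTILINE)
    return m.group(1) if m else None

def is_below_smb3(protocol):
    """True when the reported max protocol is an SMB2 dialect (for example SMB2.1)."""
    return protocol is not None and protocol.upper().startswith("SMB2")

# Output copied from the example above.
example_output = """server_2 :
384 Cifs threads started
Security mode = NT
Max protocol = SMB2.1 < -----
"""

for cmd in workaround_commands("server_2"):
    print(cmd)

proto = max_protocol(example_output)
print("max protocol:", proto, "| workaround applied:", is_below_smb3(proto))
```

Keeping the command construction separate from execution makes it easy to review the exact commands with the customer before the scheduled CIFS outage.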

 

Affected Products

VNX2 Series

Article Properties
Article Number: 000056854
Article Type: Solution
Last Modified: 20 Oct 2025
Version:  5