VNX: kernel.cpu.utilization.cpuutil is > 90 for 15 minutes

Summary: The Data Movers repeatedly raise the alert "kernel.cpu.utilization.cpuutil is > 90 for 15 minutes".

Symptoms

CIFS and NFS performance is degraded, and CPU utilization on the Data Movers is high.

The /nas/log/sys_log file is flooded with the following alerts:

Oct 20 10:19:21 2016:CS_PLATFORM:PERFSTATS:NOTICE:3:::::nas_alerterd: Clearing event for policy: default:server_2:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes due to value = 88
Oct 20 10:20:26 2016:CS_PLATFORM:PERFSTATS:NOTICE:3:::::nas_alerterd: Clearing event for policy: default:server_3:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes due to value = 89
Oct 20 10:35:21 2016:CS_PLATFORM:PERFSTATS:WARNING:2:::::nas_alerterd: Raising event for policy: default:server_2:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes. The last sample value was 94
Oct 20 10:48:26 2016:CS_PLATFORM:PERFSTATS:WARNING:2:::::nas_alerterd: Raising event for policy: default:server_3:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes. The last sample value was 95
Oct 20 11:01:26 2016:CS_PLATFORM:PERFSTATS:NOTICE:3:::::nas_alerterd: Clearing event for policy: default:server_3:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes due to value = 87
Oct 20 11:17:21 2016:CS_PLATFORM:PERFSTATS:NOTICE:3:::::nas_alerterd: Clearing event for policy: default:server_2:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes due to value = 89
Oct 20 11:21:26 2016:CS_PLATFORM:PERFSTATS:WARNING:2:::::nas_alerterd: Raising event for policy: default:server_3:kernel.cpu.utilization.cpuutil is > 90 for 15 minutes. The last sample value was 96
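
To see how often each Data Mover flaps between raised and cleared, the alert lines can be tallied. The following is a minimal sketch, assuming the line layout matches the sample above; the regular expression, the function name `summarize_alerts`, and the `SAMPLE` list are illustrative and not part of any VNX tooling.

```python
import re
from collections import Counter

# Pattern inferred from the sample sys_log lines only (an assumption).
ALERT_RE = re.compile(
    r"nas_alerterd: (?P<action>Raising|Clearing) event for policy: "
    r"default:(?P<server>server_\d+):kernel\.cpu\.utilization\.cpuutil"
    r".*?(?:value was|value =) (?P<value>\d+)"
)

# Three of the lines quoted in this article, used as sample input.
SAMPLE = [
    "Oct 20 10:19:21 2016:CS_PLATFORM:PERFSTATS:NOTICE:3:::::nas_alerterd: "
    "Clearing event for policy: default:server_2:kernel.cpu.utilization.cpuutil "
    "is > 90 for 15 minutes due to value = 88",
    "Oct 20 10:35:21 2016:CS_PLATFORM:PERFSTATS:WARNING:2:::::nas_alerterd: "
    "Raising event for policy: default:server_2:kernel.cpu.utilization.cpuutil "
    "is > 90 for 15 minutes. The last sample value was 94",
    "Oct 20 11:21:26 2016:CS_PLATFORM:PERFSTATS:WARNING:2:::::nas_alerterd: "
    "Raising event for policy: default:server_3:kernel.cpu.utilization.cpuutil "
    "is > 90 for 15 minutes. The last sample value was 96",
]


def summarize_alerts(lines):
    """Count raise/clear events and track the highest reported value per server."""
    counts = Counter()
    peak = {}
    for line in lines:
        match = ALERT_RE.search(line)
        if not match:
            continue  # not a cpuutil alert line
        server = match["server"]
        counts[(server, match["action"])] += 1
        peak[server] = max(peak.get(server, 0), int(match["value"]))
    return counts, peak


if __name__ == "__main__":
    counts, peak = summarize_alerts(SAMPLE)
    for (server, action), n in sorted(counts.items()):
        print(server, action, n)
    print(peak)
```

On a Control Station, the same function could be fed the lines of a copy of /nas/log/sys_log instead of `SAMPLE`.
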
[nasadmin@storage ~]$ server_sysstat server_2
server_2 :
threads runnable = 216
threads blocked = 6859
threads I/J/Z = 1
memory free(kB) = 6925375
cpu idle_% = 2 < ------- 98% utilized

[nasadmin@storage ~]$ server_sysstat server_3
server_3 :
threads runnable = 61
threads blocked = 6940
threads I/J/Z = 1
memory free(kB) = 6683987
cpu idle_% = 1 < ------- 99% utilized
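
The "utilized" annotations above are simply 100 minus the reported idle percentage. A small sketch of that calculation follows, assuming the `key = value` layout shown in the server_sysstat output; `parse_sysstat` and `cpu_utilization` are hypothetical helper names.

```python
SAMPLE = """server_2 :
threads runnable = 216
threads blocked = 6859
threads I/J/Z = 1
memory free(kB) = 6925375
cpu idle_% = 2
"""


def parse_sysstat(text):
    """Turn 'key = value' lines into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        key, sep, raw = line.partition("=")
        if not sep:
            continue  # header line such as "server_2 :"
        parts = raw.split()
        # Keep only the leading integer; trailing annotations are ignored.
        if parts and parts[0].isdigit():
            stats[key.strip()] = int(parts[0])
    return stats


def cpu_utilization(stats):
    """CPU utilization in percent, derived from the idle percentage."""
    return 100 - stats["cpu idle_%"]


if __name__ == "__main__":
    print(cpu_utilization(parse_sysstat(SAMPLE)))
```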

 

Cause

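An overall system configuration and capacity analysis was completed to determine the type of demand on each Data Mover (number of replications, deduplication, checkpoint schedules, and so on). It was determined that a VMware application named Mirage was being used for image management. The software backs up thousands of workstations in the customer environment to CIFS and NFS shares by creating many small CVD files.

In this example, 200 VMware Mirage sessions were configured to take snapshots of 4,000 machines every hour. Each machine requires one CVD file, and that file needs 1.5 IOPS to complete its snapshot. While the backups ran, performance was slow, and the Mirage side reported a very large latency figure of 626.78 KB per millisecond.

A Data Mover profile of server_2 was captured while CPU utilization was high and the VMware Mirage application was running. The profile was set to run for 60 seconds:
Example: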
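
As a rough illustration of the load implied by these figures, the arithmetic can be sketched as below. This rests on assumptions the article does not state: that machines are spread evenly across sessions, that each session works on one CVD file at a time, and that 1.5 IOPS is a sustained rate per active CVD file.

```python
MACHINES = 4000      # machines snapshotted per hour (from the article)
SESSIONS = 200       # configured VMware Mirage sessions (from the article)
IOPS_PER_CVD = 1.5   # IOPS needed per CVD file (from the article)


def mirage_load(machines=MACHINES, sessions=SESSIONS, iops_per_cvd=IOPS_PER_CVD):
    """Return illustrative load figures under the assumptions stated above."""
    return {
        # Machines each session must process within the hour.
        "machines_per_session": machines / sessions,
        # Aggregate IOPS if every session has exactly one CVD file active.
        "concurrent_iops": sessions * iops_per_cvd,
        # Upper bound if every CVD file were active at the same moment.
        "all_at_once_iops": machines * iops_per_cvd,
    }


if __name__ == "__main__":
    for name, value in mirage_load().items():
        print(f"{name}: {value}")
```

These numbers only frame the scale of the workload; they say nothing about the CPU cost of each I/O.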

[nasadmin@storage ~]$ /nas/tools/profile_slot -slot 2 -method function -seconds 60 -output /root_vdm_3/FS_Backup_01/profile_slot2.out
Starting profile on slot 2 with the following params...
Slot = 2
Method = function
Seconds = 60
Ignorebounds = no
Frequency = 256
Outfile = /root_vdm_3/FS_Backup_01/profile_slot2.out
Profile started. Waiting for 60 seconds...
Profile stopped.
Profile output has been written to /root_vdm_3/FS_Backup_01/profile_slot2.out on server in slot 2.

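Analysis of the Data Mover profile showed that the main bottleneck, consuming most of the CPU, was an SMB security encryption routine called "AES_encrypt". This routine is used to provide end-to-end encryption of SMB data and to protect the data from eavesdropping on untrusted networks.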

[nasadmin@storage ~]$ more profile_slot2.out | grep -i aes
0.0% (          1 ) EVP_aes_128_cbc
0.0% (          2 ) aes_init_key
1.0% (        631 ) aes_cbc_cipher
0.0% (         16 ) private_AES_set_encrypt_key
44.8% (      27454 ) AES_encrypt < ---------------------
0.9% (        581 ) AES_cbc_encrypt
0.0% (          1 ) EVP_aes_128_cbc
0.8% (        123 ) aes_cbc_cipher
0.0% (          1 ) private_AES_set_encrypt_key
37.0% (       5676 ) AES_encrypt < ---------------------
0.8% (        128 ) AES_cbc_encrypt
0.0% (          1 ) aes_init_key
0.9% (        140 ) aes_cbc_cipher
0.0% (          3 ) private_AES_set_encrypt_key
47.1% (       7219 ) AES_encrypt < ---------------------
0.9% (        146 ) AES_cbc_encrypt
0.0% (          1 ) aes_init_key
1.3% (        204 ) aes_cbc_cipher
0.0% (          7 ) private_AES_set_encrypt_key
48.2% (       7388 ) AES_encrypt < ---------------------
0.9% (        151 ) AES_cbc_encrypt
1.0% (        164 ) aes_cbc_cipher
0.0% (          5 ) private_AES_set_encrypt_key
46.8% (       7171 ) AES_encrypt < ---------------------
1.0% (        156 ) AES_cbc_encrypt
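
To quantify how dominant AES_encrypt is, the matching rows can be extracted and averaged. The sketch below assumes the `percent ( samples ) symbol` layout shown in the grep output; the article does not say what each repeated group of rows represents, so the average is only a rough indicator. `symbol_rows` is a hypothetical helper.

```python
import re

# Row layout inferred from the grep output above (an assumption).
ROW_RE = re.compile(
    r"^\s*(?P<pct>\d+(?:\.\d+)?)%\s*\(\s*(?P<samples>\d+)\s*\)\s*(?P<symbol>\S+)"
)

SAMPLE = """44.8% (      27454 ) AES_encrypt < ---------------------
0.9% (        581 ) AES_cbc_encrypt
37.0% (       5676 ) AES_encrypt < ---------------------
47.1% (       7219 ) AES_encrypt < ---------------------
48.2% (       7388 ) AES_encrypt < ---------------------
46.8% (       7171 ) AES_encrypt < ---------------------
"""


def symbol_rows(text, symbol):
    """Return (percent, samples) for every row whose symbol matches exactly."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match and match["symbol"] == symbol:
            rows.append((float(match["pct"]), int(match["samples"])))
    return rows


if __name__ == "__main__":
    rows = symbol_rows(SAMPLE, "AES_encrypt")
    mean_pct = sum(pct for pct, _ in rows) / len(rows)
    print(len(rows), "rows, mean", round(mean_pct, 2), "%")
```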

 

Resolution

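Two options are available:

Option 1
Make no changes, leave the maximum protocol for SMB communication at SMB3, and accept the high CPU utilization and poor performance.

Option 2
Implement a workaround by lowering the maximum protocol for SMB communication from SMB3 to SMB2. The main difference between SMB3 and SMB2 here is "AES_encrypt". Lowering the maximum protocol to SMB2 reduces the encryption work, so CPU utilization should drop and performance should improve.

To set SMB2 as the maximum protocol on the Data Mover: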

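  1. Log in to the primary Control Station as the "root" user using PuTTY/SSH.

  2. Ask the customer for permission to temporarily stop the CIFS service on the Data Mover. Stopping the service causes a brief interruption to CIFS access, so it must be scheduled with the customer accordingly.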

    server_setup server_x -P cifs -o stop
  3. Change the maximum protocol for SMB communication from SMB3 to SMB2:

    server_cifs server_x -add security=NT,dialect=SMB2
  4. Restart the CIFS service:

    server_setup server_x -P cifs -o start
  5. Verify that the CIFS service restarted successfully and that the maximum protocol is set to SMB2:

    server_cifs server_x

    Example:

    [root@Bstorage]# server_cifs server_2
    server_2 :
    384 Cifs threads started
    Security mode = NT
    Max protocol = SMB2.1 < -----
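
The steps above can be sketched as a small helper that lists the commands in order for a given Data Mover and checks the verification output. This is a dry-run illustration only: it prints the commands taken from this article rather than executing them, and the function names and the "starts with SMB2" check are assumptions, not part of the VNX CLI.

```python
SAMPLE_OUTPUT = """server_2 :
384 Cifs threads started
Security mode = NT
Max protocol = SMB2.1 < -----
"""


def smb2_workaround_commands(server):
    """Commands from steps 2 to 5, in order, for the given Data Mover name."""
    return [
        f"server_setup {server} -P cifs -o stop",
        f"server_cifs {server} -add security=NT,dialect=SMB2",
        f"server_setup {server} -P cifs -o start",
        f"server_cifs {server}",
    ]


def max_protocol(server_cifs_output):
    """Extract the 'Max protocol' value from server_cifs output, or None."""
    for line in server_cifs_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Max protocol":
            parts = value.split()
            return parts[0] if parts else None
    return None


def is_capped_at_smb2(server_cifs_output):
    """True when the reported maximum protocol is an SMB2 dialect."""
    proto = max_protocol(server_cifs_output)
    return proto is not None and proto.startswith("SMB2")


if __name__ == "__main__":
    for command in smb2_workaround_commands("server_2"):
        print(command)  # dry run: review and run manually as root
    print("capped at SMB2:", is_capped_at_smb2(SAMPLE_OUTPUT))
```

Printing rather than executing keeps the customer-approved outage window under the operator's control.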

 

Affected Products

VNX2 Series

Article Properties
Article Number: 000056854
Article Type: Solution
Last Modified: 20 Oct 2025
Version:  5