PowerScale: Isilon: CloudPools Operations Cause High isi_cpool_d CPU Usage

Summary: The isi_cpool_d process can cause high CPU usage on a PowerScale Isilon cluster.


Symptoms

The isi_cpool_d process shows sustained high CPU usage on the cluster.

 

Isilon-1# top -n 10

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
87857 root        124  20    0   595M   173M nanslp  13 1722.5 857.62% isi_cpool_d
 3455 root         29  98 r150   397M    86M sigwai  10 4216.2  62.55% nfs
 3313 root         40  98 r150  1018M   683M sigwai  14 7402.9  47.71% lwio
94259 root         13  52    0   566M   491M usem    18 374.1H  32.57% isi_celog_monitor
18378 root          5  20    0   102M    53M uwait    3  49:57  24.56% isi_job_d
34552 root          1  52    0    37M    15M adv     22 112.6H  20.51% isi_migr_sched
 3144 root         13  20    0    52M    13M select   8 2009.5  15.33% isi_audit_d
98432 root          1  52    0   105M    66M kqread  26 417:47  14.55% isi_celog_analysis
 3213 root         26  52    0    96M    28M uwait   10 1109.2  12.50% isi_avscan_d
51167 root          5  20    0    93M    42M uwait   21  74:37  10.40% isi_job_d
...
..
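To follow this value over time, the WCPU figure for isi_cpool_d can be extracted from the top output. The following is a minimal sketch, not a Dell-provided tool: the function name cpool_wcpu is hypothetical, and it assumes the column layout shown above, where COMMAND is the last field and WCPU is the field just before it.

```shell
#!/bin/sh
# Hypothetical helper: print the WCPU value (without the % sign) of
# isi_cpool_d from `top` output read on stdin.
# Assumes the layout shown in this article: COMMAND is the last field
# and WCPU is the field immediately before it.
cpool_wcpu() {
    awk '$NF == "isi_cpool_d" { sub(/%$/, "", $(NF-1)); print $(NF-1); exit }'
}

# Example on a cluster node, using the same top invocation as above:
#   top -n 10 | cpool_wcpu
```

If isi_cpool_d does not appear in the listing, the function prints nothing, which a caller can treat as "not among the top consumers".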

 

Multiple CloudPools jobs may be running on the cluster, but isi_cpool_d usage remains high even when all jobs are paused.

 

Isilon-1#  isi cloud jobs list
ID   Description                              Effective State  Type
---------------------------------------------------------------------------------------
1    Write updated data to the cloud          paused           cache-writeback
2    Expire CloudPools cache                  paused           cache-invalidation
4    Clean up unreferenced data in the cloud  paused           cloud-garbage-collection
5    Write updated snapshot data to the cloud paused           snapshot-writeback
6    Update SmartLink file formats            paused           smartlink-upgrade
7    Add data to CloudPools cache             paused           cache-pre-populate
959                                           paused           archive
960                                           paused           archive
961                                           paused           archive
962                                           paused           archive
964                                           paused           archive
965                                           paused           archive
966                                           paused           archive
967                                           paused           archive
968                                           paused           archive
---------------------------------------------------------------------------------------

Isilon-1# top -n 5

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
87857 root        124  20    0   588M   180M nanslp   4 1723.5 805.81% isi_cpool_d
 3455 root         28  98 r150   397M    87M sigwai  10 4216.3  69.34% nfs
18378 root          6  20    0   122M    72M uwait    9  53:18  68.36% isi_job_d
 3313 root         49  98 r150  1019M   684M sigwai  14 7403.0  66.16% lwio
51167 root          6  20    0    94M    42M uwait   26  76:02  22.36% isi_job_d
...
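When the job list is long, it can help to confirm mechanically that nothing is still running before attributing the CPU load to background operations. The sketch below is one possible way to do that; cpool_unpaused_jobs is a hypothetical name, and the parsing assumes the layout shown above (ID first, Type last, a single-word Effective State just before Type).

```shell
#!/bin/sh
# Hypothetical helper: read `isi cloud jobs list` output on stdin and print
# the ID of every job whose Effective State is not "paused".
# Assumes the layout shown in this article: ID is the first field, Type is
# the last field, and the state is the single word just before Type.
cpool_unpaused_jobs() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 3 && $(NF-1) != "paused" { print $1 }'
}

# Example on a cluster node:
#   isi cloud jobs list | cpool_unpaused_jobs
```

Empty output means every listed job is paused, which matches the situation described in this article.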

 

Cause

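Some operations, such as cache writeback and cache invalidation, run in the background and are not tied directly to any running CloudPools job. Pausing CloudPools jobs does not stop these operations. Their threads keep running and cause the high CPU utilization.

To confirm this, pause the cache writeback and cache invalidation operations while monitoring CPU usage. isi_cpool_d CPU usage should drop quickly after the pause. Once the operations are resumed, isi_cpool_d CPU utilization climbs again.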
 

To pause the CloudPools operations:

# isi cloud jobs pause cache-writeback
# isi cloud jobs pause cache-invalidation
 


 

To resume the CloudPools operations:

# isi cloud jobs resume cache-invalidation
# isi cloud jobs resume cache-writeback
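The confirmation procedure (observe, pause, observe, resume, observe) can be sketched as a small script. This is an illustration under stated assumptions rather than a supported tool: the function names and the SAMPLES and INTERVAL variables are hypothetical, and it relies only on the isi cloud jobs pause/resume commands and the top -n 10 invocation shown in this article.

```shell
#!/bin/sh
# Sketch: pause the two background operations, sample isi_cpool_d CPU,
# then resume them so they are not left paused.
# SAMPLES and INTERVAL are hypothetical tunables, not OneFS settings.
SAMPLES=${SAMPLES:-5}      # number of readings per phase
INTERVAL=${INTERVAL:-10}   # seconds between readings

sample_cpool_cpu() {
    i=0
    while [ "$i" -lt "$SAMPLES" ]; do
        # Show only the isi_cpool_d line from a one-shot top listing.
        top -n 10 | grep isi_cpool_d
        sleep "$INTERVAL"
        i=$((i + 1))
    done
}

confirm_cpool_cause() {
    echo "Before pause:"
    sample_cpool_cpu
    isi cloud jobs pause cache-writeback
    isi cloud jobs pause cache-invalidation
    echo "While paused:"
    sample_cpool_cpu
    # Resume in the same order as the commands above.
    isi cloud jobs resume cache-invalidation
    isi cloud jobs resume cache-writeback
    echo "After resume:"
    sample_cpool_cpu
}

# Run on a cluster node as root:
#   confirm_cpool_cause
```

If the script is interrupted between the pause and resume steps, the operations stay paused, so run the resume commands manually in that case.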
 


 

Resolution

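Pausing cache writeback and cache invalidation operations for an extended period is not recommended. Outstanding tasks and operations accumulate and make the problem worse.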
 

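High CPU utilization caused by writeback or cache invalidation can indicate that a large amount of caching is taking place. This usually happens because a large amount of data is being archived and then recalled inline. A poorly written archiving criterion in the file pool policy can cause this: archiving without regard to access time can lead to excessive caching of active files.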
 

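The following is an example of a poorly written file pool policy that archives data to ECS through CloudPools. Note that any data within the specified paths is archived to CloudPools immediately: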

--------------------------------------------------------------------------------
                              Name: Bad ECS Cloud Policy
                       Description: Tier to ECS
                  CloudPools State: OK
                CloudPools Details:
                       Apply Order: 3
             File Matching Pattern: Path == APPS/SeaShoreVideo (begins with)
                                    OR
                                    Path == APPS/OceanArchive (begins with)
          Set Requested Protection: -
               Data Access Pattern: -
                  Enable Coalescer: -
                    Enable Packing: -
               Data Storage Target: -
                 Data SSD Strategy: -
           Snapshot Storage Target: -
             Snapshot SSD Strategy: -
                        Cloud Pool: EMC ECS Pool
         Cloud Compression Enabled: Yes
          Cloud Encryption Enabled: No
              Cloud Data Retention: 1W
Cloud Incremental Backup Retention: 5Y
       Cloud Full Backup Retention: 5Y
               Cloud Accessibility: cached
                  Cloud Read Ahead: partial
            Cloud Cache Expiration: 1D
         Cloud Writeback Frequency: 9H
                                ID: Bad ECS Cloud Policy
--------------------------------------------------------------------------------


 

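The following is an example of a well-written file pool policy that accommodates active and recently accessed files. Note that this policy includes an access-time criterion, so only data that has not been accessed for more than 5 weeks and 5 days is archived to CloudPools.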

--------------------------------------------------------------------------------
                              Name: Good ECS Cloud Policy
                       Description: Tier to ECS
                  CloudPools State: OK
                CloudPools Details:
                       Apply Order: 3
             File Matching Pattern: Accessed Time > 5W5D AND Path == APPS/SeaShoreVideo (begins with)
                                    OR
                                    Accessed Time > 5W5D AND Path == APPS/OceanArchive (begins with)
          Set Requested Protection: -
               Data Access Pattern: -
                  Enable Coalescer: -
                    Enable Packing: -
               Data Storage Target: -
                 Data SSD Strategy: -
           Snapshot Storage Target: -
             Snapshot SSD Strategy: -
                        Cloud Pool: EMC ECS Pool
         Cloud Compression Enabled: Yes
          Cloud Encryption Enabled: No
              Cloud Data Retention: 1W
Cloud Incremental Backup Retention: 5Y
       Cloud Full Backup Retention: 5Y
               Cloud Accessibility: cached
                  Cloud Read Ahead: partial
            Cloud Cache Expiration: 1D
         Cloud Writeback Frequency: 9H
                                ID: Good ECS Cloud Policy
--------------------------------------------------------------------------------
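The difference between the two policies can be checked mechanically. The following is a rough sketch, assuming the policy layout shown above: cpool_policy_lacks_atime is a hypothetical helper that succeeds when any clause of the File Matching Pattern has no "Accessed Time" condition. The command in the usage comment is an assumption about how the policy text is obtained.

```shell
#!/bin/sh
# Hypothetical helper: read one file pool policy (layout as shown in this
# article) on stdin. Exit status 0 means at least one clause of the
# File Matching Pattern has no "Accessed Time" condition.
cpool_policy_lacks_atime() {
    awk '
        # Start of the pattern block: drop the label, keep the first clause.
        /File Matching Pattern:/ { inpat = 1; sub(/^.*File Matching Pattern:/, "") }
        # Any other "Label:" line ends the pattern block.
        inpat && /^ *[A-Za-z][A-Za-z ]*:( |$)/ { inpat = 0 }
        # Count clauses (ignoring the OR separators) without an access-time test.
        inpat && NF > 0 && $1 != "OR" && $0 !~ /Accessed Time/ { missing++ }
        END { exit (missing > 0 ? 0 : 1) }
    '
}

# Example (assumes the policy text comes from a command such as
# `isi filepool policies view`):
#   isi filepool policies view "Bad ECS Cloud Policy" | cpool_policy_lacks_atime \
#       && echo "Policy archives without an access-time criterion"
```

The check is deliberately narrow: it looks only for "Accessed Time", so a policy that limits archiving by some other criterion is still flagged and needs a manual review.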

 

 

 

Other causes of high isi_cpool_d CPU usage may vary depending on cluster configuration, settings, and code level. For assistance, contact Dell Technical Support.

Affected Products

PowerScale OneFS

Products

Isilon, Isilon SmartPools
Article Properties
Article Number: 000214130
Article Type: Solution
Last Modified: 04 Mar 2025
Version:  2