Isilon: Mark-on-Write during truncate causing performance issues
Summary: During a MultiScan job, if a file is truncated or deleted, its blocks are marked inline, which can cause performance issues.
Symptoms
While MultiScan is in either a Running or Waiting state, there may be a performance impact, or even a temporary lockup of the cluster, after a file is deleted or truncated.
Cause
When a file is deleted or truncated while MultiScan is running, OneFS performs mark-on-write inline, which can take a long time. This serial behavior can block other processes that are waiting for access to the LIN that was deleted or truncated, so clients experience slower than usual performance or appear to stop responding on that operation.
Resolution
In OneFS 8.0.0.0 and later, the mark-on-write work is moved to a deferred work queue. This eliminates the serial behavior and prevents blocking on that LIN, so other processes can obtain locks on the LIN and continue working without a performance impact from mark-on-write.
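The difference between inline and deferred marking can be illustrated with a minimal Python sketch. This is not OneFS code: the names mark_blocks, truncate_inline, truncate_deferred, and worker, the LIN numbers, and the 50 ms marking cost are all hypothetical stand-ins. It only shows how handing expensive work to a queue shortens the time a per-file lock is held.

```python
import queue
import threading
import time

MARK_COST_SECONDS = 0.05  # hypothetical cost of marking one file's blocks


def mark_blocks(lin):
    """Stand-in for the expensive mark-on-write step (assumption)."""
    time.sleep(MARK_COST_SECONDS)


def truncate_inline(lin, lin_lock):
    """Inline style: marking runs while the per-LIN lock is held."""
    start = time.monotonic()
    with lin_lock:
        mark_blocks(lin)  # other lock waiters are blocked for the whole mark
    return time.monotonic() - start


def truncate_deferred(lin, lin_lock, work_queue):
    """Deferred style: only enqueue under the lock; mark later."""
    start = time.monotonic()
    with lin_lock:
        work_queue.put(lin)  # cheap hand-off, lock is released right away
    return time.monotonic() - start


def worker(work_queue, marked):
    """Drain the deferred queue outside of any per-LIN lock."""
    while True:
        lin = work_queue.get()
        if lin is None:  # sentinel: no more work
            break
        mark_blocks(lin)
        marked.append(lin)


def demo():
    lin_lock = threading.Lock()
    work_queue = queue.Queue()
    marked = []
    thread = threading.Thread(target=worker, args=(work_queue, marked))
    thread.start()
    inline_held = truncate_inline(1001, lin_lock)
    deferred_held = truncate_deferred(1002, lin_lock, work_queue)
    work_queue.put(None)
    thread.join()
    return inline_held, deferred_held, marked


if __name__ == "__main__":
    inline_held, deferred_held, marked = demo()
    print(f"inline lock hold:   {inline_held:.4f}s")
    print(f"deferred lock hold: {deferred_held:.4f}s")
    print(f"marked later by worker: {marked}")
```

In this sketch the deferred path still does the same marking work, but a background worker does it after the lock has been released, so other callers waiting on the same lock are not held up.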
The fix for this issue required a significant architectural change, so there is no fix in any OneFS version before 8.0.
If this issue is encountered in pre-8.0 versions, there are two possible workarounds. Implement the one that works best for your situation.
Workaround 1
Schedule MultiScan to run during off hours.
If there is a time when the cluster is used less or not at all, MultiScan can be run during these hours to minimize the possible performance impact. This does not guarantee that the issue will not occur; it only reduces the impact.
See the System Jobs section in the Administration Guide for the OneFS version that the cluster is running for how to create an impact policy and set MultiScan to run on that policy.
Workaround 2
Disable MultiScan and run AutoBalance and Collect individually as needed.
If there are no times when the cluster is less utilized, MultiScan can be disabled. With MultiScan disabled, AutoBalance and Collect can be used individually to complete balancing and cleanup tasks.
AutoBalance starts automatically when a new node is added, to balance data to the new node and across the cluster. AutoBalance can also be started manually as needed.
Collect starts every 30 days if it has not been run within the past 30 days. Collect can also be started manually as needed.
See the System Jobs section in the Administration Guide for the OneFS version that the cluster is running for how to disable and enable jobs.
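When planning manual runs, the 30-day Collect rule described above can be expressed as a small date check. This is a sketch only: collect_is_due is a hypothetical helper, not a OneFS API, and treating exactly 30 days as "due" is an assumption.

```python
from datetime import datetime, timedelta

COLLECT_INTERVAL = timedelta(days=30)  # interval stated in this article


def collect_is_due(last_run, now):
    """Return True if no Collect run happened within the past 30 days.

    last_run is the datetime of the most recent Collect run, or None if
    Collect has never run. The boundary handling (>=) is an assumption.
    """
    if last_run is None:
        return True
    return now - last_run >= COLLECT_INTERVAL


if __name__ == "__main__":
    last = datetime(2023, 1, 1)
    print(collect_is_due(last, datetime(2023, 1, 15)))
    print(collect_is_due(last, datetime(2023, 2, 5)))
```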
Affected Products
Isilon, PowerScale OneFS
Article Properties
Article Number: 000052420
Article Type: Solution
Last Modified: 28 Jun 2023
Version: 5