Dell Unity: Large or Incrementing Snapshot Queue Causing Performance Issues

Summary: Dell Unity: Large Or Incrementing Snapshot Queue Causing Performance Issues And High Storage Processor (SP) CPU


Symptoms

  • The value for Recovery Point Objective (RPO) was reduced to something that is considered more aggressive (for example, 10 minutes).
    • "Recovery Point Objective (RPO) is an industry accepted term that indicates the acceptable amount of data, which is measured in units of time, that may be lost in a failure. When you set up an asynchronous replication session, you can configure automatic synchronization based on the RPO. You can specify an RPO from a minimum of 5 minutes up to a maximum of 1440 minutes (24 hours). The default RPO is set at 60 minutes (1 hour) interval. For synchronous replication, RPO is fixed at 0."
  • There are many snapshots in a "destroying" state for a LUN.
  • The number of Snapshots in a "destroying" state is incrementing over time.
  • High SP CPU utilization without a corresponding IOPS or bandwidth workload.
  • LUNs and Backend Drives have queuing and elevated response times.
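
The RPO limits quoted above (minimum 5 minutes, maximum 1440 minutes, default 60 minutes) can be turned into a quick sanity check. The following is a minimal sketch, not a Unity API: the function names are hypothetical, and the sync count assumes one automatic synchronization per RPO interval, which is a simplification for illustration.

```python
# Limits quoted in this article for asynchronous replication.
RPO_MIN_MINUTES = 5
RPO_MAX_MINUTES = 1440
RPO_DEFAULT_MINUTES = 60


def check_rpo(rpo_minutes: int) -> str:
    """Classify an RPO value (hypothetical helper, not part of any Unity tool)."""
    if not RPO_MIN_MINUTES <= rpo_minutes <= RPO_MAX_MINUTES:
        raise ValueError(
            f"RPO must be between {RPO_MIN_MINUTES} and {RPO_MAX_MINUTES} minutes"
        )
    if rpo_minutes < RPO_DEFAULT_MINUTES:
        return "below default"
    return "default or above"


def syncs_per_day(rpo_minutes: int) -> float:
    """Syncs per day, ASSUMING one automatic sync per RPO interval."""
    return 24 * 60 / rpo_minutes


if __name__ == "__main__":
    for rpo in (10, 60):
        print(rpo, check_rpo(rpo), syncs_per_day(rpo))
```

Under that assumption, a 10-minute RPO drives six times as many synchronization cycles per day as the 60-minute default, for every replication session.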


    You can look in Unisphere in the "Block" section. Be sure to add the column "Snapshots" to get a read-out per LUN. If you see many snapshots listed for one or more LUNs, this suggests that snapshots may be queuing for deletion.

    Figure: Unisphere UI view of LUNs

    Go to the individual LUN and select the "Snapshots" tab to check the "State" (will be "destroying") and "Taken by" (will be "Replication") for confirmation:

    Figure: Unisphere UI view of snapshots
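
To judge whether the number of "destroying" snapshots is incrementing over time, the count can be noted at regular intervals and compared. The sketch below is only an illustration: the counts are assumed to be recorded by hand from the "Snapshots" tab, and the function name and thresholds are hypothetical.

```python
from typing import Sequence


def destroying_trend(counts: Sequence[int]) -> str:
    """Classify 'destroying' snapshot counts sampled at regular intervals.

    counts: oldest sample first (hypothetical, manually recorded values).
    """
    if len(counts) < 2:
        return "insufficient data"
    # Change between each pair of consecutive samples.
    deltas = [later - earlier for earlier, later in zip(counts, counts[1:])]
    net_change = counts[-1] - counts[0]
    rising_steps = sum(1 for delta in deltas if delta > 0)
    # "Incrementing" means a net increase with most intervals going up.
    if net_change > 0 and rising_steps > len(deltas) / 2:
        return "incrementing"
    if net_change < 0:
        return "draining"
    return "stable"


if __name__ == "__main__":
    print(destroying_trend([12, 15, 21, 30]))
    print(destroying_trend([30, 22, 10]))
```

A result of "incrementing" across several samples matches the symptom described above; "draining" suggests the array is catching up.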

     

Cause

There can be many causes for queuing to build. One of the main causes is attributed to an RPO that is considered 'too aggressive'.

Native Asynchronous Block Replication:
Native asynchronous block replication uses the delta between two snapshots to transfer data. During the replication session's lifespan, multiple snapshot "refreshes" take place as changes are transferred.

When a snapshot is refreshed, it is really being deleted and re-created in the background.

The most notable concerns are SP CPU consumption and additional backend I/O that are associated with snapshot functionality.


The Unity array cannot fully delete the snapshots quickly enough, so the rate at which snapshots enter a "to be deleted" state far exceeds the rate at which snapshots are fully deleted. Decreasing the RPO value increases the number of snapshot creations and deletions within a given amount of time.
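
The imbalance between the two rates can be illustrated with a simple model. This is a sketch under stated assumptions, not a description of Unity internals: it assumes each replication session queues one snapshot for deletion per RPO interval and that the array deletes at a fixed rate. The function name and all numbers are hypothetical.

```python
from typing import List


def simulate_backlog(
    rpo_minutes: int,
    sessions: int,
    deletes_per_hour: float,
    hours: int,
    start: float = 0.0,
) -> List[float]:
    """Return the modeled 'destroying' backlog at the end of each hour.

    ASSUMPTIONS (illustrative only): one snapshot is queued for deletion
    per session per RPO interval, and the delete rate is constant.
    """
    queued_per_hour = sessions * (60 / rpo_minutes)
    backlog = start
    history = []
    for _ in range(hours):
        # The backlog grows by arrivals, shrinks by deletes, never below zero.
        backlog = max(0.0, backlog + queued_per_hour - deletes_per_hour)
        history.append(backlog)
    return history


if __name__ == "__main__":
    # 20 sessions at a 10-minute RPO against 60 deletes per hour.
    print(simulate_backlog(10, 20, 60, hours=3))
    # Same sessions at a 60-minute RPO, starting from the backlog above.
    print(simulate_backlog(60, 20, 60, hours=3, start=180.0))
```

In this model the 10-minute RPO queues 120 snapshots per hour against 60 deletions, so the backlog grows every hour; at 60 minutes only 20 are queued per hour and the backlog drains.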

Resolution

For the LUNs that have the most snapshots in a "destroying" state, set the RPO to at least the default (60 minutes) until snapshot deletion can catch up. Depending on how many snapshots were queued, you may want to leave the RPO at this new value.
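
Choosing which LUNs to adjust first can be sketched as a simple ranking. The example assumes the per-LUN "destroying" counts and current RPO values have been collected manually from Unisphere; the `LunStatus` type, the function name, and the sample data are all hypothetical, and the code does not change any array setting.

```python
from dataclasses import dataclass
from typing import List, Tuple

DEFAULT_RPO_MINUTES = 60  # default RPO stated in this article


@dataclass
class LunStatus:
    name: str                  # LUN name as shown in Unisphere (sample data)
    destroying_snapshots: int  # snapshots currently in the "destroying" state
    rpo_minutes: int           # current RPO of the replication session


def rpo_candidates(luns: List[LunStatus], limit: int = 3) -> List[Tuple[str, int]]:
    """List the LUNs with the largest backlog whose RPO is below the default.

    Returns (LUN name, suggested RPO) pairs, largest backlog first.
    """
    affected = [
        lun for lun in luns
        if lun.destroying_snapshots > 0 and lun.rpo_minutes < DEFAULT_RPO_MINUTES
    ]
    affected.sort(key=lambda lun: lun.destroying_snapshots, reverse=True)
    return [(lun.name, DEFAULT_RPO_MINUTES) for lun in affected[:limit]]


if __name__ == "__main__":
    sample = [
        LunStatus("LUN_A", 240, 10),
        LunStatus("LUN_B", 0, 10),
        LunStatus("LUN_C", 35, 60),
        LunStatus("LUN_D", 410, 5),
    ]
    print(rpo_candidates(sample))
```

The suggested value is the 60-minute default named in the resolution; the actual change is made on the replication session itself.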

 

"Dell Technologies recommends including a Flash tier in a hybrid pool where snapshots are active.

 

Snapshots increase the overall CPU load on the system and increase the overall drive IOPS in the storage pool.  Snapshots also use pool capacity to store the older data being tracked by the snapshot, which increases the amount of capacity used in the pool, until the snapshot is deleted.  Consider the overhead of snapshots when planning both performance and capacity requirements for the storage pool.

 

Before enabling snapshots on a storage object, it is recommended to monitor the system and ensure that existing resources can meet the additional workload requirements (see the Hardware Capability Guidelines section, Table 2).  Enable snapshots on a few storage objects at a time, and then monitor the system to be sure that it is still within recommended operating ranges, before enabling more snapshots.

 

It is recommended to stagger snapshot operations (creation, deletion, and so forth).  This can be accomplished by using different snapshot schedules for different sets of storage objects.  It is also recommended to schedule snapshot operations after any FAST VP relocations have been completed.

 

Snapshots are deleted by the system asynchronously; when a snapshot is in the process of being deleted, it is marked "Destroying".  If the system is accumulating "Destroying" snapshots over time, it may be an indication that existing snapshot schedules are too aggressive; taking snapshots less frequently may provide more predictable levels of performance. 

 

Dell Unity will throttle snapshot delete operations to reduce the impact to host I/O.  Snapshot deletes will occur more quickly during periods of low system utilization."

Source: Dell Unity: Best Practices Guide
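
The recommendation in the quoted guidance to stagger snapshot operations across sets of storage objects can be sketched as evenly spaced start offsets. This is an illustration only: the group names and the function are hypothetical, and the resulting offsets would still need to be entered as separate snapshot schedules.

```python
from typing import Dict, Sequence


def staggered_offsets(groups: Sequence[str], interval_minutes: int) -> Dict[str, int]:
    """Spread schedule start times evenly across one snapshot interval.

    groups: names of sets of storage objects (hypothetical examples).
    Returns the start offset, in minutes, for each group.
    """
    if not groups:
        raise ValueError("at least one group is required")
    return {
        group: index * interval_minutes // len(groups)
        for index, group in enumerate(groups)
    }


if __name__ == "__main__":
    print(staggered_offsets(["databases", "vmware", "file_shares"], 60))
```

With three groups on an hourly interval, the groups start 20 minutes apart rather than all at once.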

Affected Products

Dell EMC Unity Family

Article Properties
Article Number: 000055095
Article Type: Solution
Last Modified: 20 Oct 2025
Version:  5