Dell Unity: Drive command timeout errors may result in performance issues and data unavailability (User Correctable)

Summary: Severe performance issues can occur after flash drives start logging errors that the array does not address automatically.


Symptoms

Potential Data Unavailability
Severity: Critical

The system administrator observes severe performance issues on the array after a drive starts logging errors: Soft Media Errors and 01|18|ff, followed by incidental aborts, select timeouts, and command timeouts.

This affects drives with the part numbers and firmware listed under "Systems May Be Affected". The array may report "Soft media error" and 01|18|ff, followed by soft SCSI bus errors "[IncidentalAbort]", "[Select timeout]", and "[Command timeout]". The drive may be taken offline on one SP but remain active on the other SP, where it continues to report similar errors.


Example SP logs:

>>> drive repeatedly reports 01/18/ff
B       11/15/20 18:05:31.994 Bus0 Enc0 Dsk02   11c4004 [WARN] System: Disk 0_0_2 Soft media error. DrvErrExtStat:0x22 SRT 35ms ST 0x767fd102672 ET 0x767fd10b014 . [Recovered error (on-drive ECC)]
B       11/15/20 18:05:32.009 Bus0 Enc0 Dsk02   11c0006 [INFO] System: Disk 0_0_2 01|18|ff BLBA 0x32d948218 OP 0x88, LBA 0x32d948200, SZ 0x80 .
A       11/15/20 18:06:18.548 Bus0 Enc0 Dsk02   11c4004 [WARN] System: Disk 0_0_2 Soft media error. DrvErrExtStat:0x22 SRT 66ms ST 0x7680628d0f1 ET 0x7680629d1c6 . [Recovered error (on-drive ECC)]
A       11/15/20 18:06:18.566 Bus0 Enc0 Dsk02   11c0006 [INFO] System: Disk 0_0_2 01|18|ff BLBA 0x2d6cce4d8 OP 0x88, LBA 0x2d6cce4d0, SZ 0x10 .

>>> followed by Soft SCSI bus errors (Incidental abort and selection timeout)
A       11/15/20 18:17:33.877 Bus0 Enc0 Dsk02   11c4003 [WARN] System: Disk 0_0_2 Soft SCSI bus error. DrvErrExtStat:0xdd SRT 522ms ST 0x7682e5dd934 ET 0x7682e65cf8b . [IncidentalAbort]
B       11/15/20 18:17:33.892 Bus0 Enc0 Dsk02   11c4003 [WARN] System: Disk 0_0_2 Soft SCSI bus error. DrvErrExtStat:0xdd SRT 535ms ST 0x768280ad284 ET 0x7682812faab . [IncidentalAbort]
A       11/15/20 18:17:33.910 Bus0 Enc0 Dsk02   11c4003 [WARN] System: Disk 0_0_2 Soft SCSI bus error. DrvErrExtStat:0x7 SRT 537ms ST 0x7682e5d9c09 ET 0x7682e65cfc5 . [Select timeout]

>>> followed by repeated command timeout.
A       11/15/20 20:44:30.049 Bus0 Enc0 Dsk02   11c4003 [WARN] System: Disk 0_0_2 Soft SCSI bus error. DrvErrExtStat:0x6 SRT 4340ms ST 0x76a3b63f4df ET 0x76a3ba4175c . [Command timeout]
A       11/15/20 20:44:30.069 Bus0 Enc0 Dsk02   11c4003 [WARN] System: Disk 0_0_2 Soft SCSI bus error. DrvErrExtStat:0x6 SRT 4201ms ST 0x76a3b641f27 ET 0x76a3ba41b53 . [Command timeout]
A       11/15/20 20:44:30.090 Bus0 Enc0 Dsk02   11c4003 [WARN] System: Disk 0_0_2 Soft SCSI bus error. DrvErrExtStat:0x6 SRT 4210ms ST 0x76a3b63e5a2 ET 0x76a3ba41f97 . [Command timeout]

>>> drive reported too many port errors and was logged out, then it could not log in again. It failed on SPB with the Activate timer expired.
B       11/16/20 05:52:47.360 Bus0 Enc0 LccB    1678052 [ERROR] System: LCC is faulted. This failure may be caused by a component other than the LCC (Drive, Cable, Connector, ...).
B       11/16/20 05:54:42.263 Bus0 Enc0 Dsk02     60258 [CRIT] User: Disk 0_0_2 has failed (Part Number 005053578, Serial Number 50L0A01FTT2F)
B       11/16/20 05:54:42.879 Bus0 Enc0 LccB      602bc [CRIT] User: LCC has faulted (Part Number 303-396-000B-00, Serial Number CF2DD201400245)
B       11/16/20 05:55:23.571 Bus0 Enc0 Dsk02   1678058 [ERROR] System: Disk 0_0_2 taken offline. Escalate to support. SN:50L0A01FTT2F TLA:005053578 Rev:PA5H (0x2030001) Reason:Expired.
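
The error progression shown in the example logs can be checked for mechanically. The following Python is a minimal sketch, not a Dell tool: it assumes the SP log has been exported as plain text in the same layout as the lines above, and the function names and the "suspect" rule (at least one 01|18|ff entry plus at least one command timeout on the same disk) are illustrative assumptions rather than a documented threshold.

```python
import re
from collections import Counter, defaultdict

# Disk identifier as it appears in the example lines, e.g. "Disk 0_0_2".
DISK_RE = re.compile(r"Disk (\d+_\d+_\d+)")

# Substrings copied from the example log lines in this article.
SIGNATURES = {
    "soft_media_error": "Soft media error",
    "01|18|ff": "01|18|ff",
    "incidental_abort": "[IncidentalAbort]",
    "select_timeout": "[Select timeout]",
    "command_timeout": "[Command timeout]",
}


def count_drive_errors(lines):
    """Return {disk_id: Counter(signature_name -> count)} for matching lines."""
    counts = defaultdict(Counter)
    for line in lines:
        match = DISK_RE.search(line)
        if not match:
            continue
        for name, needle in SIGNATURES.items():
            if needle in line:
                counts[match.group(1)][name] += 1
    return counts


def suspect_disks(counts):
    """Disks that logged 01|18|ff and a command timeout (assumed rule)."""
    return sorted(
        disk
        for disk, c in counts.items()
        if c["01|18|ff"] > 0 and c["command_timeout"] > 0
    )
```

Usage: read the exported log into a list of lines, pass it to count_drive_errors(), and review the disks returned by suspect_disks() against the affected part numbers and firmware before taking any action.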




Systems May Be Affected:

Product (and version): Dell Unity 300F, Dell Unity 350F, Dell Unity XT 380F, Dell Unity 400F, Dell Unity 450F, Dell Unity XT 480F, Dell Unity 600F, Dell Unity 650F, Dell Unity XT 680F, Dell Unity XT 880F, Dell Unity Family, Dell Unity All Flash
Running this Core Software (Operating System (OS) or Operating Environment (OE)): All Operating Environments
When this condition is true: Array contains any of the following drive part numbers with firmware PA5H:
005052867, 005052866, 005052869, 005052868, 005052871, 005052870, 005053573, 005053572, 005053577, 005053576, 005053579, 005053578, 005052859, 005052858, 005052861, 005052860, 005052863, 005052862, 005053583, 005053582, 005053596, 005053595, 005053598, 005053597, 005053575, 005053574
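
The condition above (an affected part number combined with firmware PA5H) can be expressed as a simple lookup. The following Python is a sketch only: the part numbers and firmware revision are copied from this article, while the function name is_affected and the idea of feeding it values from a drive inventory export are assumptions for illustration.

```python
# Part numbers listed in this article as affected when running firmware PA5H.
AFFECTED_PART_NUMBERS = frozenset({
    "005052867", "005052866", "005052869", "005052868", "005052871",
    "005052870", "005053573", "005053572", "005053577", "005053576",
    "005053579", "005053578", "005052859", "005052858", "005052861",
    "005052860", "005052863", "005052862", "005053583", "005053582",
    "005053596", "005053595", "005053598", "005053597", "005053575",
    "005053574",
})
AFFECTED_FIRMWARE = "PA5H"


def is_affected(part_number, firmware):
    """True if the drive matches an affected part number and runs PA5H."""
    return (
        part_number.strip() in AFFECTED_PART_NUMBERS
        and firmware.strip().upper() == AFFECTED_FIRMWARE
    )
```

For example, the failed drive in the sample SP log (TLA 005053578, Rev PA5H) matches this check, while the same part number running PA5L does not.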

 

Cause

Drives with many data errors run internal error recovery, which, in conjunction with the long command timeout setting in firmware, can cause the drive to have performance issues.

The drive's built-in error recovery normally responds within acceptable time limits. On occasion, however, a NAND defect can make the number of blocks requiring recovery large. In combination with the long command timeout settings, this can cause excessive command timeouts and affect the performance of the array.

Resolution

Action Type: Resolution
Action Needed: Upgrade drive firmware to PA5L to address the issue.
Who Can Complete the Action: Customer
Issue Addressed in this OS, OE or Software: Firmware PA5L, available on www.dell.com/support in Unity drive firmware bundle V18 or greater.

 

SolVe Customer Resolution Procedure: For additional information on upgrading drive firmware, select "Software Upgrade Procedures" in SolVe for Dell Unity, or follow the article Drive Firmware Upgrade Instructions and Information.
Resolution Detail:

Arrays currently experiencing performance issues: For immediate relief of the performance issues, take the offending drive out of the pool. Once the drive is out of the pool, performance should improve immediately. To accomplish this:

If physical access to the system is available:
1. Remove the drive identified as reporting Soft SCSI bus error and [Command timeout].
2. Swap out the removed drive with an equivalent spare. Do not insert the replacement drive for 5 minutes, to allow the system to rebuild to spare from parity.
3. Contact Dell Technical Support, as necessary, to request a drive replacement for the drive causing the performance issues.

If no physical access to the system is immediately available, and to discuss other possible workarounds, contact Dell Technical Support or an Authorized Service Representative and quote this DTA article number.

Upgrade drive firmware to PA5L.
Ensure that new array installations upgrade to drive firmware PA5L.
NOTE:
With the PA5L firmware, drives reporting excessive 01/18/ff and Command Timeout errors are intended to be replaced sooner. Note that a secondary performance impact may be experienced during a drive replacement rebuild or a reshuffle/rebalance operation in dynamic pools. LKB 000055614 will be updated accordingly when this issue is addressed.

Refer to LKB 000021322 for instructions on updating drive firmware.
Unity drive firmware bundles are available for download from www.dell.com/support and can be found by searching for "Unity Drive Firmware Package".

Affected Products

Dell EMC Unity Family
Article Properties
Article Number: 000190983
Article Type: Solution
Last Modified: 27 Mar 2025
Version:  5