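PowerFlex: Unable to Clear SDS Device Errors When the Device Is Offline at the Operating System Level
Summary: The SDS "Clear Device Errors" action has no effect, and the device remains in an Error or Failed state. When many concurrent device errors occur on an SDS device, the operating system that manages the device may take the device offline.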
Symptoms
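Issue
An SDS device fails or reports errors, and after you attempt the SDS "Clear Device Errors" action, the device remains in an Error or Failed state.
Symptoms
ScaleIO system events report a disk device error or failure: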
799 2016-01-22 17:28:39.818 SDS_DEV_ERROR_REPORT ERROR Device error reported on SDS: 10.3.1.21, Device: /dev/sdb. State: NORMAL upDownState: UP processState: DEV_ERR_INPROGRESS devErrState: REPORT
Querying the SDS reports a disk device error or failure:
ScaleIO-10-1-1-202:~ # scli --query_sds --sds_id 9780122600000003
Device information (total 8 devices):
1: Name: ScaleIO-6a0a6209 Path: /dev/sdb Original-path: /dev/sdb ID: 851f01c100030000
Storage Pool: SASPOOL, Capacity: 1675 GB Error-fixes: 0 scanned 0 MB, Compare errors: 0 State: Normal
2: Name: ScaleIO-6a0a620a Path: /dev/sdc Original-path: /dev/sdc ID: 851f01c200030001
Storage Pool: SASPOOL, Capacity: 1675 GB Error-fixes: 0 scanned 0 MB, Compare errors: 0 State: Normal
3: Name: ScaleIO-6a0a620b Path: /dev/sdd Original-path: /dev/sdd ID: 851f01c300030002
Storage Pool: SASPOOL, Capacity: 1675 GB Error-fixes: 0 scanned 0 MB, Compare errors: 0 State: Normal
4: Name: ScaleIO-6a0a620c Path: /dev/sde Original-path: /dev/sde ID: 851f01c400030003
Storage Pool: SSDPOOL, Capacity: 1489 GB Error-fixes: 0 scanned 0 MB, Compare errors: 0 State: Normal
5: Name: ScaleIO-6a0a620d Path: /dev/sdf Original-path: /dev/sdf ID: 851f01c500030004
Storage Pool: SASPOOL, Capacity: 1675 GB Error-fixes: 0 scanned 0 MB, Compare errors: 0 State: Error
6: Name: ScaleIO-6a0a620e Path: /dev/sdg Original-path: /dev/sdg ID: 851f01c600030005
Storage Pool: SSDPOOL, Capacity: 1489 GB Error-fixes: 0 scanned 0 MB, Compare errors: 0 State: Normal
7: Name: ScaleIO-6a0a620f Path: /dev/sdh Original-path: /dev/sdh ID: 851f01c700030006
Storage Pool: SASPOOL, Capacity: 1675 GB Error-fixes: 0 scanned 0 MB, Compare errors: 0 State: Normal
8: Name: ScaleIO-6a0a6210 Path: /dev/sdi Original-path: /dev/sdi ID: 851f01c800030007
Storage Pool: SASPOOL, Capacity: 1675 GB Error-fixes: 0 scanned 0 MB, Compare errors: 0 State: Normal
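On an SDS with many devices, the one device in an Error state is easy to overlook in the query output. The following is a minimal sketch for filtering that output. It assumes the two-line-per-device layout shown above (a line containing "Path:" followed by a "Storage Pool:" line that ends with the state). The function name list_abnormal_devices is hypothetical and is not part of SCLI.

```shell
#!/bin/sh
# Hypothetical helper: reads `scli --query_sds` output on stdin and prints
# "<path> <state>" for every device whose State is not Normal.
# Assumes the two-line-per-device layout shown in the sample output.
list_abnormal_devices() {
    awk '
        # Device line: remember the value that follows the "Path:" label.
        /Path:/ {
            for (i = 1; i < NF; i++) {
                if ($i == "Path:") { path = $(i + 1); break }
            }
        }
        # Detail line: the last field is the device state.
        /Storage Pool:/ && /State:/ {
            if ($NF != "Normal") print path, $NF
        }
    '
}

# Example usage (on a host with an authenticated scli session):
#   scli --query_sds --sds_id 9780122600000003 | list_abnormal_devices
```

The filter only reads text, so it can also be run against a saved copy of the query output.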
The SVM/Linux messages file reports an offline device:
Jan 22 17:28:35 ScaleIO-10-1-1-201 kernel: [45678.865605] end_request: I/O error, dev sdg, sector 1138313984
Jan 22 17:28:37 ScaleIO-10-1-1-201 kernel: [45681.452800] sd 2:0:6:0: [sdg] task abort on host 2, ffff8800b83f6e80
Jan 22 17:28:37 ScaleIO-10-1-1-201 kernel: [45681.452877] sd 2:0:1:0: [sdb] task abort on host 2, ffff8801b7476d80
Jan 22 17:28:37 ScaleIO-10-1-1-201 kernel: [45681.453086] sd 2:0:8:0: [sdh] task abort on host 2, ffff8800b83f6280
Jan 22 17:28:37 ScaleIO-10-1-1-201 kernel: [45681.453109] sd 2:0:8:0: [sdh] task abort on host 2, ffff8800a37a6c80
Jan 22 17:28:37 ScaleIO-10-1-1-201 kernel: [45681.453133] sd 2:0:9:0: [sdi] task abort on host 2, ffff8800b83f6b80
Jan 22 17:28:47 ScaleIO-10-1-1-201 kernel: [45691.537180] sd 2:0:5:0: rejecting I/O to offline device
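The line that matters in this excerpt is the final "rejecting I/O to offline device" message, which identifies the device by its SCSI address rather than by name. As a rough sketch, the addresses can be pulled out of a messages file as shown below. The function name offline_scsi_addresses is hypothetical, and the log path in the usage comment is an assumption that varies by distribution.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints each unique
# SCSI address (host:channel:target:lun) that the kernel reported as offline.
offline_scsi_addresses() {
    sed -n 's|.* sd \([0-9]*:[0-9]*:[0-9]*:[0-9]*\): rejecting I/O to offline device.*|\1|p' \
        | sort -u
}

# Example usage (log location is an assumption; adjust for your system):
#   offline_scsi_addresses < /var/log/messages
```

On most current kernels, a reported address can then be matched to a block device name by listing /sys/class/scsi_device/<address>/device/block/.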
The ESXi VMkernel log reports errors on the disk device:
2016-01-22T09:40:21.801Z cpu1:33420)ScsiDeviceIO: 7024: Could not detect setting of QErr for device naa.614187704f3b47001e34b585468abf85. Error Not supported.
2016-01-22T09:40:21.801Z cpu1:33420)ScsiDeviceIO: 7538: Could not detect setting of sitpua for device naa.614187704f3b47001e34b585468abf85. Error Not supported.
2016-01-22T09:40:21.801Z cpu5:33593)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x28 (0x439e1a830cc0, 0) to dev "naa.614187704f3b47001e34b585468abf85" on path "vmhba1:C2:T1:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0. Act:NONE
2016-01-22T09:40:21.801Z cpu5:33593)ScsiDeviceIO: 2607: Cmd(0x439e1a830cc0) 0x28, CmdSN 0xd62 from world 0 to dev "naa.614187704f3b47001e34b585468abf85" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0.
2016-01-22T09:40:21.801Z cpu5:33593)ScsiCore: 1609: Power-on Reset occurred on naa.614187704f3b47001e34b585468abf85
2016-01-22T09:40:21.844Z cpu5:33593)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x1a (0x439e1a830cc0, 0) to dev "naa.614187704f3b47001e34b585468abf85" on path "vmhba1:C2:T1:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE
2016-01-22T09:40:21.844Z cpu5:33593)ScsiDeviceIO: 2645: Cmd(0x439e1a830cc0) 0x1a, CmdSN 0xd66 from world 0 to dev "naa.614187704f3b47001e34b585468abf85" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2016-01-22T09:40:21.844Z cpu1:33420)ScsiDevice: 3835: Successfully registered device "naa.614187704f3b47001e34b585468abf85" from plugin "NMP" of type 0
2016-01-22T09:40:21.844Z cpu1:33420)NMP: nmp_DeviceUpdateProtectionInfo:569: Set protection info for device 'naa.614187704f3b47001e34b585468abf85', Enabled: 0 ProtType: 0x0 Guard: 0x0 ProtMask: 0x0
2016-01-22T22:27:49.085Z cpu19:33115)WARNING: NMP: nmpDeviceTaskMgmt:2284: Attempt to issue lun reset on device naa.614187704f3b47001e34b585468abf85. This will clear any SCSI-2 reservations on the device.
Impact
The disk device remains in a failed state.
You cannot clear the SDS device errors.
Cause
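When a disk device has a problem and stops responding for any reason, the operating system takes the disk device offline.
Note: If the disk device has failed and does not return to an online state, the disk device may need to be replaced.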
Resolution
Workaround
SVM (Linux) environment:
- Verify the current state of the disk device:
[root@ssltest ~]# cat /sys/block/sdx/device/state
offline
- If the disk device is marked "offline", bring it online with the following command:
echo "running" > /sys/block/sdx/device/state
- Clear the SDS device errors using SCLI or the UI.
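When several devices on the same SDS went offline together, the Linux check and the state change above can be repeated per device. The following is a minimal sketch of that loop, not a supported tool. The function names and the SYSFS_BLOCK variable are hypothetical. SYSFS_BLOCK defaults to /sys/block and exists only so the logic can be rehearsed against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helpers. SYSFS_BLOCK defaults to /sys/block and can be
# overridden to rehearse against a scratch directory.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

# Print the name of every sd* device whose SCSI state is "offline".
list_offline_disks() {
    for state_file in "$SYSFS_BLOCK"/sd*/device/state; do
        # Skip the unexpanded pattern when no device matches.
        [ -r "$state_file" ] || continue
        if [ "$(cat "$state_file")" = "offline" ]; then
            # .../sdx/device/state -> sdx
            basename "$(dirname "$(dirname "$state_file")")"
        fi
    done
}

# Request the "running" state for one device, then print the state read back.
set_disk_running() {
    echo "running" > "$SYSFS_BLOCK/$1/device/state" || return 1
    cat "$SYSFS_BLOCK/$1/device/state"
}

# Example usage (as root on the SDS host):
#   for disk in $(list_offline_disks); do set_disk_running "$disk"; done
```

A device that reads back "offline" after the write, or that drops offline again shortly afterwards, is likely a hardware fault. In that case, clearing the SDS device errors will probably not hold.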
Windows environment:
- Verify the current state of the disk device using "Logical Disk Manager" or DiskPart:
C:\>diskpart
Microsoft DiskPart version 6.1.7601
Copyright (C) 1999-2008 Microsoft Corporation.
On computer: ISENABLOVSL1C
DISKPART> list disk
  Disk ###  Status         Size     Free     Dyn  Gpt
  --------  -------------  -------  -------  ---  ---
  Disk 0    Online          238 GB      0 B
DISKPART>
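- If the disk device is marked "Offline", bring it online with the following command (run with the affected disk selected in DiskPart), or use "Logical Disk Manager":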
DISKPART> online disk
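Alternative option for bringing the disk device online on any operating system:
Note: This option requires downtime and triggers a rebuild/rebalance on the ScaleIO system:
- If possible, place the SDS into maintenance mode
- Reboot the SDS server
- Exit maintenance mode (if it was entered in the first step)
- Clear the SDS device errors (using the UI or CLI)
- Verify the device state in ScaleIO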
Affected Products
PowerFlex rack, ScaleIO
Article Properties
Article Number: 000281632
Article Type: Solution
Last Modified: 06 Mar 2025
Version: 3