PowerFlex: SDC "lost access to volume"
Summary: A PowerFlex SDC can log "lost access to volume" when the local datastore of an SVM does not respond within the expected time.
Symptoms
- On the ESXi host of the problem SVM, the RAID controller driver lsi_mr3 reports task aborts on the underlying disk of the local datastore, and ESXi reports lost access to the volume (a log search sketch follows the excerpts below).
In VMkernel log:
2017-12-03T17:47:01.634Z cpu54:33648)ScsiDeviceIO: 2636: Cmd(0x43be59ec8a00) 0x1a, CmdSN 0x1f6f4 from world 0 to dev "naa.6800733259adcc4f214574350619b91a" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2017-12-03T17:47:44.125Z cpu1:171607)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt abort for device: vmhba2:C2:T0:L0
2017-12-03T17:47:44.125Z cpu1:171607)lsi_mr3: mfi_TaskMgmt:262: ABORT
2017-12-03T17:47:45.125Z cpu34:32905)lsi_mr3: mfi_TaskMgmt:254: Processing taskMgmt virt reset for device: vmhba2:C2:T0:L0
2017-12-03T17:47:45.125Z cpu34:32905)lsi_mr3: mfi_TaskMgmt:258: VIRT_RESET cmd # 273733296
2017-12-03T17:47:45.125Z cpu34:32905)lsi_mr3: mfi_TaskMgmt:262: ABORT
2017-12-03T17:47:45.126Z cpu1:171607)lsi_mr3: fusionWaitForOutstanding:2531: megasas: [ 0]waiting for 1 commands to complete
2017-12-03T17:47:46.877Z cpu29:35817)HBX: 2851: 'datastore3': HB at offset 3691008 - Waiting for timed out HB:
In hostd log:
2017-12-03T17:47:45.126Z info hostd[41B40B70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 219 : Lost access to volume 59b2c23a-98396dd8-aa53-84a9c4b71ca1 (datastore3) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
- The SDS device can report task aborts (found in both /var/log/messages and the VMware logs), so the SDS process on this SVM experiences long in-flight I/O, which further impacts the stability of the PowerFlex (ScaleIO) system.
- SDCs then log I/O errors in the VMkernel log, because the problematic SDS experiences network socket errors caused by the slow response of its local datastore; the application datastores residing on ScaleIO volumes can then report loss of access:
In VMkernel log:
2017-12-03T17:47:52.060Z cpu39:33682)scini: netSock_RcvIntrn:1903: ScaleIO R2_0:Error: Failed Success to receive 128 data PTR 0x4306d2923de4 socket 0x4306d2924200
2017-12-03T17:47:54.061Z cpu1:33476)scini: mapVolIO_ReportIOErrorIfNeeded:361: ScaleIO R2_0:[201590843] IO-ERROR comb: 32ba80000015. offsetInComb 11387944. SizeInLB 1. SDS_ID de31ad4800000001. Comb Gen 39. Head Gen 10199.
2017-12-03T17:47:54.061Z cpu1:33476)scini: mapVolIO_ReportIOErrorIfNeeded:374: ScaleIO R2_0:Vol ID 0x756be73300000017. Last fault Status IO_HARD_ERROR(20).Last error Status NOT_CONN(4) Reason (ABORTED) Retry count (2) chan (4)
In hostd log:
2017-12-03T17:47:54.125Z info hostd[3FAAFB70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 220 : Lost access to volume 59cb2f80-40ad26ac-cf4f-84a9c4b71ce1 (OS_windows_01) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
2017-12-03T17:47:54.125Z info hostd[3FAAFB70] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 221 : Lost access to volume 59cb2f9e-984e3ff8-63e1-84a9c4b71ce1 (OS_Linux_01) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
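To confirm these symptoms, the messages above can be searched for directly in the ESXi logs. A minimal sketch, assuming default ESXi log locations (the patterns match the sample entries in this article; adjust them to your environment):

# On the ESXi host of the problem SVM: RAID controller task aborts and VMFS heartbeat timeouts
grep -E 'lsi_mr3: mfi_TaskMgmt|HBX:.*timed out HB' /var/log/vmkernel.log
# On the SDC hosts: scini-reported I/O errors against the problematic SDS
grep -E 'scini:.*(IO-ERROR|netSock_RcvIntrn)' /var/log/vmkernel.log
# On any affected host: lost-access events reported by hostd
grep 'Lost access to volume' /var/log/hostd.log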
Impact
SDCs may lose access to datastores that reside on PowerFlex volumes, and the applications or VMs on those datastores can be impacted; for example, a guest file system may go read-only.
Cause
- VMFS datastores are monitored through heartbeats, issued as write operations from the hosts to the VMFS volumes approximately once every 3 seconds. When the local datastore of the SVM responds slowly and a heartbeat I/O does not complete within the 16-second timeout window, the datastore is marked offline and hostd generates a "Lost access to volume" message to reflect this behavior. More detail can be found in the VMware KB article Understanding lost access to volume messages in ESXi.
- In this case, the exact behavior of the SDS cannot be predicted: it depends on which keepalive messages are missed and against which other PowerFlex components. Some SDC I/Os that must be served by this SDS may exceed the timeout of the operating system or application, causing an impact. The sketch below shows how to correlate an affected datastore with its backing device and the SDS state.
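The following is a minimal sketch, assuming esxcli on the ESXi host and the scli utility on the primary MDM (output fields can vary by release):

# On the ESXi host: map each VMFS datastore to its backing device (naa.* name)
esxcli storage vmfs extent list
# On the primary MDM, after scli --login: list all SDSs and their connection state
scli --query_all_sds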
Resolution
Engage VMware and the hardware vendor to fix the problem with the RAID controller or its firmware and driver.
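When engaging the vendors, it helps to collect the controller driver and firmware details up front. A minimal sketch, assuming the lsi_mr3 driver named in the logs above:

# List storage adapters and the driver bound to each
esxcli storage core adapter list
# Show the loaded lsi_mr3 module details, including its version
vmkload_mod -s lsi_mr3
# List installed driver VIBs for the controller
esxcli software vib list | grep -i lsi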
Additional Information
A temporary workaround is to remove the problematic SDS, or to migrate it to another healthy local datastore.
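If the SDS is removed as a temporary workaround, a minimal sketch using scli on the primary MDM follows. The SDS name sds_svm01 is an example, and removal triggers a rebuild, so verify spare capacity first:

# Log in to the MDM (prompts for the password)
scli --login --username admin
# Check the SDS and its devices before removing it
scli --query_sds --sds_name sds_svm01
# Remove the problematic SDS; its data is rebuilt onto the remaining SDSs
scli --remove_sds --sds_name sds_svm01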
Affected Products
PowerFlex appliance Intelligent Catalog Software, VxFlex Product Family
Products
PowerFlex rack, VxFlex Ready Nodes, PowerFlex Appliance, PowerFlex custom node, PowerFlex appliance R650, PowerFlex appliance R6525, PowerFlex appliance R660, PowerFlex appliance R6625, PowerFlex appliance R750, PowerFlex appliance R760, PowerFlex appliance R7625, PowerFlex custom node R650, PowerFlex custom node R6525, PowerFlex custom node R660, PowerFlex custom node R6625, PowerFlex custom node R750, PowerFlex custom node R760, PowerFlex custom node R7625, PowerFlex rack connectivity, PowerFlex rack HW, PowerFlex rack RCM Software, VxFlex Product Family, VxFlex Ready Node, VxFlex Ready Node R640, VxFlex Ready Node R740xd, PowerFlex appliance R640, PowerFlex appliance R740XD, PowerFlex appliance R7525, PowerFlex appliance R840, VxFlex Ready Node R840
...
Article Properties
Article Number: 000027267
Article Type: Solution
Last Modified: 22 Sept 2025
Version: 4