ScaleIO: Power-on Reset errors due to datastore not using ATS locks
Summary: When vSphere reports Power-on Reset errors for one or more devices and those devices are ScaleIO volumes, VMFS3 Hardware Accelerated Locking should be enabled. This article describes how to enable it. ...
Instructions
"Power-on Reset" errors can have several causes. This KB covers one of them.
To see whether this KB addresses your issue, check whether Hardware Accelerated Locking (ATS) is enabled on the ScaleIO device.
Retrieve the volume ID or name from the logs or vSphere, then run the following command on the ESXi host that is reporting the Power-on Reset errors:
vmkfstools -Ph -v1 /vmfs/volumes/
If the Mode is set to "Public", change it to "Public ATS-Only".
How to change the mode to Public ATS-Only:
Migrate all VMs off the Datastore, or power them off.
Unmount the Datastore from all but one host.
SSH to the only host that has the Datastore mounted. Run:
vmkfstools --configATSOnly 1 /vmfs/devices/disks/eui.number:partition
(1 is Public ATS-only | 0 is Public)
Verify that the mode is now Public ATS-Only.
vmkfstools -Ph -v1 /vmfs/volumes/
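Both the initial check and the verification step come down to reading the Mode line of the vmkfstools output. The following is a minimal sketch of that check, assuming the output contains a line beginning with "Mode:" as in the sample output at the end of this KB. The function names vmfs_mode and is_ats_only and the variable DATASTORE_PATH are hypothetical helpers for illustration, not part of vmkfstools.

```shell
# Print the locking mode found in `vmkfstools -Ph -v1` output read on stdin.
# Assumption: the output has a line that starts with "Mode:".
vmfs_mode() {
    sed -n 's/^[[:space:]]*Mode:[[:space:]]*//p' | head -n 1
}

# Succeed (exit status 0) only when the given mode string is ATS-only.
is_ats_only() {
    case "$(printf '%s' "$1" | tr 'A-Z' 'a-z')" in
        *ats-only*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage on the ESXi host (DATASTORE_PATH is a placeholder you set):
#   mode=$(vmkfstools -Ph -v1 "$DATASTORE_PATH" | vmfs_mode)
#   if is_ats_only "$mode"; then echo "ATS-only"; else echo "needs change: $mode"; fi
```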
Once this change is made, check the vmkernel logs to confirm that the Power-on Reset errors are no longer occurring.
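One way to check the logs is to count the Power-on Reset events per device before and after the change. The sketch below assumes the message format shown in the vmkernel example under Additional Information. The function name count_por is hypothetical, and the log location in the usage comment (/var/log/vmkernel.log) is an assumption that may differ in your environment.

```shell
# Count "Power-on Reset occurred" events per device in a vmkernel log
# read on stdin. Prints one line per device: "<device> <count>".
count_por() {
    sed -n 's/.*Power-on Reset occurred on \([^[:space:]]*\).*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Possible usage on the ESXi host (log path is an assumption):
#   count_por < /var/log/vmkernel.log
```

If the count for the ScaleIO device stops growing after the mode change, the errors addressed by this KB are no longer occurring.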
Additional Information
The following event is an example of what is seen:
Vmkernel logs:
Line 3135: 2017-01-16T20:02:57.847Z cpu41:33611)ScsiCore: 1609: Power-on Reset occurred on eui.1e43660515bd6ba33eb0809500000000
Line 6741: 2017-01-16T20:03:33.849Z cpu41:33611)ScsiCore: 1609: Power-on Reset occurred on eui.1e43660515bd6ba33eb0809500000000
trc logs (from the Primary MDM):
Line 138: 17/01 14:59:18.386534 f07d2eb8:volMgr_BulkGenUpdateMem:04948: Allow SCSI-2-Reserve to vol 3eb0809500000000 Requester isID fffffffffffffffe Requester sdcId 207d5fbe00000005 Requester vol2SDCIId ffffffff00000000
Line 232: 17/01 14:59:30.117231 f07d2eb8:volMgr_BulkGenUpdateMem:04948: Allow SCSI-2-Reserve to vol 3eb0809500000000 Requester isID fffffffffffffffe Requester sdcId 207daddc0000000b Requester vol2SDCIId ffffffff00000000
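In the example above, the volume ID in the trc log (3eb0809500000000) is the same as the last 16 hex digits of the eui device name in the vmkernel log. The sketch below uses that observation to correlate the two logs. It is an assumption drawn from this one example, not a documented naming rule, and the function names eui_vol_suffix and scsi2_reserve_vols are hypothetical.

```shell
# Print the last 16 hex digits of a 32-digit eui device name.
# Assumption: these digits correspond to the ScaleIO volume ID, as in the
# example in this KB. Prints nothing if the name has a different shape.
eui_vol_suffix() {
    printf '%s\n' "$1" |
        sed -n 's/^eui\.[0-9a-fA-F]\{16\}\([0-9a-fA-F]\{16\}\)$/\1/p'
}

# Count "Allow SCSI-2-Reserve" entries per volume in an MDM trc log read
# on stdin. Prints one line per volume: "<volume ID> <count>".
scsi2_reserve_vols() {
    sed -n 's/.*Allow SCSI-2-Reserve to vol \([0-9a-fA-F]*\).*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Possible usage (TRC_FILE is a placeholder for a trc log you copied):
#   vol=$(eui_vol_suffix eui.1e43660515bd6ba33eb0809500000000)
#   scsi2_reserve_vols < "$TRC_FILE" | grep "^$vol "
```

A volume that appears in both outputs is still being locked with SCSI-2 reservations, which is consistent with its datastore not being in ATS-only mode.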
In vSphere, look up the eui.number that appears in the Power-on Reset errors. Use it to find the Datastore backed by that device and note the Datastore's volume ID:
Example:
583ddd83-0216a301-4048-54ab3a6f9efd
In the VMSupport logs, open the commands folder and look for the file whose name contains the volume ID.
Example:
vmkfstools_-P--v-10-vmfsvolumes583ddd83-0216a301-4048-54ab3a6f9efd
vmkfstools_-P--v-10-vmfsvolumes583ddd83-0216a301-4048-54ab3a6f9efd:
VMFS-5.61 file system spanning 1 partitions.
File system label (if any): sio_sc1bm_gw
Mode: public
Capacity 68451041280 (65280 file blocks * 1048576), 46207598592 (44067 blocks) avail, max supported file size 69201586814976
Volume Creation Time: Tue Nov 29 19:56:51 2016
Files (max/free): 130000/129943
Ptr Blocks (max/free): 64512/64466
Sub Blocks (max/free): 32000/31981
Secondary Ptr Blocks (max/free): 256/256
File Blocks (overcommit/used/overcommit %): 0/21213/0
Ptr Blocks (overcommit/used/overcommit %): 0/46/0
Sub Blocks (overcommit/used/overcommit %): 0/19/0
Volume Metadata size: 804028416
UUID: 583ddd83-0216a301-4048-54ab3a6f9efd
Logical device: 583ddd7e-1fbfcf0d-78ac-54ab3a6f9efd
Partitions spanned (on "lvm"):
	eui.1e43660515bd6ba33eb0809500000000:1
Is Native Snapshot Capable: YES
OBJLIB-LIB: ObjLib cleanup done.
WORKER: asyncOps=0 maxActiveOps=0 maxPending=0 maxCompleted=0
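When a support bundle contains many of these files, it can help to reduce each one to the three fields that matter here: the volume UUID, the backing eui device, and the Mode. The following is a minimal sketch, assuming each field sits on its own line in the saved file and that the backing device is listed on the line after "Partitions spanned". The function name summarize_vmfs is hypothetical.

```shell
# Summarize one saved `vmkfstools -P -v 10` output (read on stdin) as:
#   <volume UUID> <backing eui device> <mode>
# Assumption: "Mode:", "UUID:" and the eui device each start their own line.
summarize_vmfs() {
    awk '
        /^[ \t]*Mode:/ { sub(/^[ \t]*Mode:[ \t]*/, ""); mode = $0 }
        /^[ \t]*UUID:/ { uuid = $2 }
        /^[ \t]*eui\.[0-9a-fA-F]+:[0-9]+/ {
            dev = $1
            sub(/:[0-9]+$/, "", dev)   # drop the partition suffix, e.g. ":1"
        }
        END { print uuid, dev, mode }
    '
}

# Possible usage from the commands folder of an extracted bundle:
#   for f in vmkfstools_-P--v-10-*; do summarize_vmfs < "$f"; done
```

Any line in the result whose device matches the eui.number from the Power-on Reset errors and whose mode is "public" identifies a Datastore that this KB applies to.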