RecoverPoint voor VM's: Journaalvolumes die zijn vergrendeld op specifieke ESXi-hosts kunnen ertoe leiden dat datareplicatie niet beschikbaar is en dat regelgeving voor opnieuw opstarten wordt gereguleerd.

Summary: Journaalvolumes die zijn vergrendeld op specifieke ESXi-hosts kunnen ertoe leiden dat datareplicatie niet beschikbaar is en dat regelgeving voor opnieuw opstarten wordt gereguleerd.

This article applies to This article does not apply to This article is not tied to any specific product. Not all product versions are identified in this article.

Symptoms

RecoverPoint for Virtual Machines gebruikt logboekvolumes om point-in-time-images bij te houden en te beheren.
Binnen een RP4VMs-omgeving kan er een probleem optreden waardoor een journaalvolume wordt vergrendeld op één ESXi-host.
Als het vRPA-paar dat is toegewezen om de VM's te repliceren, wordt gevMotioned naar een andere ESXi-host, kan er geen datareplicatie beschikbaar (DRU) optreden.

Dit kan van invloed zijn op alle consistentiegroepen (CG's) die aan een specifieke RPA zijn toegewezen, waardoor de status DISABLE of ERROR wordt.

 

Er kunnen crashes van het replicatieproces optreden:

replication Sun May 19 20:12:01 UTC 2024

replication Sun May 19 20:37:55 UTC 2024

replication Sun May 19 20:38:21 UTC 2024

replication Sun May 19 21:03:10 UTC 2024

replication Sun May 19 21:32:18 UTC 2024

replication Sun May 19 22:03:39 UTC 2024

replication Sun May 19 22:32:26 UTC 2024

replication Sun May 19 23:03:56 UTC 2024

replication Sun May 19 23:28:46 UTC 2024

Hierdoor wordt de RecoverPoint Appliance (RPA) losgekoppeld van het cluster.

Sitecontrole vRPA replicatielogboeken met deadlock:

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21659/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=501, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21658/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=288, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21657/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=493, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21656/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=503, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21655/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=775, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21654/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=501, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21653/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=799, cmd=N6Kashya33SerializableMethodRequestRefCountE

RPA-controlelogboeken waaruit blijkt dat replicatie wordt gedood:

RecoverPoint-0x5d0aa2fff4d45cf4-RPA6:/home/kos/control# grep killRep result.log.latest

2024/05/20 02:54:41.348 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 02:54:41.936 - #2 - 23464/23290 - RepHandler: killRep called. isKill=0 reason=Rep disconnected. Verifying it is down isCoreDump=0

2024/05/20 02:54:48.974 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 02:54:49.172 - #2 - 23464/23290 - RepHandler: killRep called. isKill=0 reason=Rep disconnected. Verifying it is down isCoreDump=0

2024/05/20 03:18:43.263 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 03:18:43.265 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 03:18:43.266 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 03:18:43.268 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 03:18:43.269 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 03:18:43.807 - #2 - 23464/23290 - RepHandler: killRep called. isKill=0 reason=Rep disconnected. Verifying it is down isCoreDump=0

2024/05/20 03:59:26.445 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

 

ESXi host xxxxxxx iofilterd-emcjiraf.log VSAN-objectvergrendelingen weergeven:

2024-05-23T17:15:33.517Z iofilterd-emcjiraf[2104572]: FiltLib: Opening virtual disk ("/vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/RPvStorage/5d0aa2fff4d45cf4_IOFilter_JVOL_00668.vmdk") failed: "Failed to lock the file" (16392).

2024-05-23T17:15:33.517Z iofilterd-emcjiraf[2104572]: RPVS_iof_LunInfo_OpenLun: Failed to open vmdk file of lun 668 with status Busy path /vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/RPvStorage/5d0aa2fff4d45cf4_IOFilter_JVOL_00668.vmdk

2024-05-23T17:15:33.517Z iofilterd-emcjiraf[2104572]: execute_open_cmd: Cluster id: 0x5d0aa2fff4d45cf4 lun 668 open failed. open_rc: Busy con 0xc6db9c9340

2024-05-23T17:15:33.518Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 668 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:33.519Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 48 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:33.519Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 507 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:33.519Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 558 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:33.520Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 359 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:34.228Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'b05fe864-d249-c108-0391-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:36.581Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'e6ebf164-9c47-0fd7-6c17-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:37.526Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'b2e10165-ee40-1674-3ead-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:37.530Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object '681efa64-16f0-8bb4-ab56-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:38.537Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Could not lock object 'b05fe864-d249-c108-0391-6cfe544f26e4': Failed to lock the file (4).

2024-05-23T17:15:38.537Z iofilterd-emcjiraf[2104572]: DISKLIB-VMFS  : "vsan://529b87d1fe9acb29-b0cbb19ee7de1e57/b05fe864-d249-c108-0391-6cfe544f26e4" : failed to open (Failed to lock the file): ObjLib_Open failed. Type 3

2024-05-23T17:15:38.537Z iofilterd-emcjiraf[2104572]: DISKLIB-LINK  : "/vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/f9f9b364-aadc-f567-41a4-6cfe544ef870/5d0aa2fff4d45cf4_IOFilter_JVOL_00116.vmdk" : failed to open (Failed to lock the file).

2024-05-23T17:15:38.537Z iofilterd-emcjiraf[2104572]: DISKLIB-CHAIN : "/vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/f9f9b364-aadc-f567-41a4-6cfe544ef870/5d0aa2fff4d45cf4_IOFilter_JVOL_00116.vmdk" : failed to open (Failed to lock the file).

2024-05-23T17:15:38.545Z iofilterd-emcjiraf[2104572]: DISKLIB-VMFS  : "vsan://529b87d1fe9acb29-b0cbb19ee7de1e57/b05fe864-d249-c108-0391-6cfe544f26e4" : open successful (524293) size = 25769803776, hd = 0. Type 3

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: PluginLdr_Load: Loaded plugin 'libvmiof-disk-emcjiraf.so' from '/usr/lib64/vmware/plugin/libvmiof-disk-emcjiraf.so'

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: DISKLIB-VMFS  : "vsan://529b87d1fe9acb29-b0cbb19ee7de1e57/b05fe864-d249-c108-0391-6cfe544f26e4" : closed.

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: DISKLIB-LIB   : Failed to open '/vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/RPvStorage/5d0aa2fff4d45cf4_IOFilter_JVOL_00116.vmdk' with flags 0xa Failed to lock the file (16392).

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: FiltLib: Opening virtual disk ("/vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/RPvStorage/5d0aa2fff4d45cf4_IOFilter_JVOL_00116.vmdk") failed: "Failed to lock the file" (16392).

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: RPVS_iof_LunInfo_OpenLun: Failed to open vmdk file of lun 116 with status Busy path /vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/RPvStorage/5d0aa2fff4d45cf4_IOFilter_JVOL_00116.vmdk

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: execute_open_cmd: Cluster id: 0x5d0aa2fff4d45cf4 lun 116 open failed. open_rc: Busy con 0xc6dc2deaa0

2024-05-23T17:15:38.547Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 457 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:38.548Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 250 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:38.548Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 623 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:38.548Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 436 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:38.549Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 62 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:40.886Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'e6ebf164-9c47-0fd7-6c17-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:41.837Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'b2e10165-ee40-1674-3ead-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:41.842Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object '681efa64-16f0-8bb4-ab56-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:42.561Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'bcdce564-9ed7-ff34-42df-6cfe544f26e4' locked, retrying.

In de host waar de betreffende RPA in de vmkernel.log is geplaatst, worden de balgvermeldingen afgedrukt:

2024-10-21T13:08:13.487Z cpu31:2101016)WARNING: NFSLock: 4084: File is being locked by a consumer on host HOSTNAME0503.corp.qbe.com with exclusive lock.
2024-10-21T13:08:13.487Z cpu31:2101016)NFSLock: 5032: failed to get lock on file 38233f05329208c8_IOFilter_JVOL_00244-flat.vmdk(.lck-d84c0c0000000000) 0x430b567b2f60 on DATASTORE xxx.xxx.xxx.xxx (xxx.xxx.xxx.xxx): Busy
2024-10-21T13:08:23.795Z cpu31:2101016)WARNING: NFSLock: 4084: File is being locked by a consumer on host HOSTNAME0503.corp.qbe.com with exclusive lock.
2024-10-21T13:08:23.795Z cpu31:2101016)NFSLock: 5032: failed to get lock on file 38233f05329208c8_IOFilter_JVOL_00244-flat.vmdk(.lck-d84c0c0000000000) 0x430b567b7330 on DATASTORE xxx.xxx.xxx.xxx (xxx.xxx.xxx.xxx): Busy
2024-10-21T13:08:33.854Z cpu7:2099418)WARNING: NFSLock: 4084: File is being locked by a consumer on host HOSTNAME0503.corp.qbe.com with exclusive lock.
2024-10-21T13:08:33.854Z cpu7:2099418)NFSLock: 5032: failed to get lock on file 38233f05329208c8_IOFilter_JVOL_00069-flat.vmdk(.lck-8d7b030000000000) 0x430b567db0b0 on DATASTORE xxx.xxx.xxx.xxx (xxx.xxx.xxx.xxx): Busy
2024-10-21T13:08:44.163Z cpu2:2099418)WARNING: NFSLock: 4084: File is being locked by a consumer on host HOSTNAME0503.corp.qbe.com with exclusive lock.

Cause

De oorzaak van deze situatie wordt momenteel onderzocht.
Er is momenteel geen permanente oplossing beschikbaar; het is echter een situatie die specifiek is voor RP4VM's.

De situatie kan zich in verschillende scenario's voordoen:

1. Wanneer vRPA die verantwoordelijk is voor het beschermen van een VM, wordt vMotioned naar een andere ESXi-host als de oorspronkelijke ESXi-host het logboek naar die specifieke ESXi-host vergrendelt.

2. Wanneer een consistentiegroep het eigendom verplaatst van het ene vRPA-paar naar het andere, waardoor de bovenstaande situatie wordt aangeroepen.

Resolution

Resolutie:
Er bestaat een oplossing voor dit probleem, maar hiervoor is hulp van de technische support van Dell Technologies vereist.  De juiste RP4VM's, ESXi- en vCenter-logboeken moeten worden verzameld om deze situatie te beoordelen en zo nodig worden geëscaleerd. Neem contact op met het Dell Technologies Customer Support Center of uw servicevertegenwoordiger voor technische support en raadpleeg de ID van deze Dell Technologies Knowledge Base-oplossing.

Affected Products

RecoverPoint for Virtual Machines

Products

RecoverPoint for Virtual Machines
Article Properties
Article Number: 000053722
Article Type: Solution
Last Modified: 31 Oct 2024
Version:  4
Find answers to your questions from other Dell users
Support Services
Check if your device is covered by Support Services.