RecoverPoint for VMs: Journal Volumes Locked to a Specific ESXi Host Can Cause Data Replication Unavailable and Reboot Regulation

Summary: When journal volumes become locked to a specific ESXi host, data replication can become unavailable and reboot regulation can occur.

Symptoms

RecoverPoint for Virtual Machines (RP4VM) uses journal volumes to track and manage point-in-time images.
In an RP4VM environment, journal volumes can become locked to a single ESXi host.
When the vRPA pair assigned to replicate a VM is vMotioned to a different ESXi host, Data Replication Unavailable (DRU) can occur.

This can affect every Consistency Group (CG) assigned to a particular RPA, leaving those groups in a DISABLE or ERROR state.

 

Replication process crashes may be observed:

replication Sun May 19 20:12:01 UTC 2024

replication Sun May 19 20:37:55 UTC 2024

replication Sun May 19 20:38:21 UTC 2024

replication Sun May 19 21:03:10 UTC 2024

replication Sun May 19 21:32:18 UTC 2024

replication Sun May 19 22:03:39 UTC 2024

replication Sun May 19 22:32:26 UTC 2024

replication Sun May 19 23:03:56 UTC 2024

replication Sun May 19 23:28:46 UTC 2024

These crashes cause the RecoverPoint Appliance (RPA) to detach from the cluster.
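
To gauge how frequently the replication process is crashing, the crash timestamps listed above can be converted into gaps between consecutive crashes. The following is a minimal Python sketch, not a Dell tool. The function names are hypothetical, and it assumes every entry has exactly the `replication <weekday> <month> <day> <time> UTC <year>` shape shown in the listing.

```python
from datetime import datetime

def parse_crash_times(lines):
    """Return datetimes for lines shaped like
    'replication Sun May 19 20:12:01 UTC 2024' (format assumed from the listing)."""
    times = []
    for line in lines:
        line = line.strip()
        if not line.startswith("replication "):
            continue  # skip blank or unrelated lines
        stamp = line[len("replication "):]
        # %a and %b expect English day and month names (the default C locale).
        times.append(datetime.strptime(stamp, "%a %b %d %H:%M:%S UTC %Y"))
    return times

def crash_gaps_seconds(times):
    """Return the number of seconds between consecutive crashes."""
    ordered = sorted(times)
    return [int((b - a).total_seconds()) for a, b in zip(ordered, ordered[1:])]
```

Short, recurring gaps suggest the process is restarting and crashing again rather than failing once.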

Replication logs on the site control vRPA show suspected deadlocks:

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21659/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=501, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21658/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=288, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21657/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=493, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21656/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=503, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21655/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=775, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21654/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=501, cmd=N6Kashya33SerializableMethodRequestRefCountE

2024/05/20 01:27:54.039 - #1 - 21594/21577 - DLManager: deadlock suspected at N6Kashya26ReplicationControl_AO_IMPLE, pid=21653/21577, config=(alertThreshold=10,alertLevel=1,killThreshold=0,coreThreshold=0,full=1), deltaTime=799, cmd=N6Kashya33SerializableMethodRequestRefCountE
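
When a replication log contains many of these entries, it can help to reduce them to one row per `pid` field. The sketch below is illustrative only. The regular expression is inferred from the sample lines above, the function name is hypothetical, and no assumption is made about the unit of `deltaTime`.

```python
import re

# Pattern inferred from the sample lines; other RP4VM builds may log differently.
DEADLOCK_RE = re.compile(
    r"DLManager: deadlock suspected at (?P<where>[^,]+), "
    r"pid=(?P<pid>\d+/\d+),.*?deltaTime=(?P<delta>\d+)"
)

def max_delta_by_pid(lines):
    """Map each pid field to the largest deltaTime value reported for it."""
    worst = {}
    for line in lines:
        m = DEADLOCK_RE.search(line)
        if m is None:
            continue  # not a DLManager deadlock entry
        pid, delta = m.group("pid"), int(m.group("delta"))
        worst[pid] = max(delta, worst.get(pid, 0))
    return worst
```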

RPA control logs show replication being killed:

RecoverPoint-0x5d0aa2fff4d45cf4-RPA6:/home/kos/control# grep killRep result.log.latest

2024/05/20 02:54:41.348 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 02:54:41.936 - #2 - 23464/23290 - RepHandler: killRep called. isKill=0 reason=Rep disconnected. Verifying it is down isCoreDump=0

2024/05/20 02:54:48.974 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 02:54:49.172 - #2 - 23464/23290 - RepHandler: killRep called. isKill=0 reason=Rep disconnected. Verifying it is down isCoreDump=0

2024/05/20 03:18:43.263 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 03:18:43.265 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 03:18:43.266 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 03:18:43.268 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 03:18:43.269 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1

2024/05/20 03:18:43.807 - #2 - 23464/23290 - RepHandler: killRep called. isKill=0 reason=Rep disconnected. Verifying it is down isCoreDump=0

2024/05/20 03:59:26.445 - #2 - 23464/23290 - RepHandler: killRep called. isKill=1 reason=Replication process didn't handle last command in a timely manner. cmd=Vector(busyType=BUSY_WITH_RCS(busyWithWhat= RC (RC_preOpenTask),busyWithWhichRCs= VectorMap([KBox6 => 1]))) isCoreDump=1
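
The `grep killRep` output above can be condensed into a count per `isKill` value and reason. This is a hedged Python sketch that reads lines already extracted from `result.log.latest`. The field layout is assumed from the sample output, and `summarize_killrep` is a hypothetical helper name.

```python
import re
from collections import Counter

# Layout assumed from the sample output: isKill, reason, optional cmd, isCoreDump.
KILLREP_RE = re.compile(
    r"RepHandler: killRep called\. isKill=(?P<kill>\d) "
    r"reason=(?P<reason>.*?)(?: cmd=.*)? isCoreDump=(?P<core>\d)"
)

def summarize_killrep(lines):
    """Count killRep entries grouped by (isKill, reason)."""
    counts = Counter()
    for line in lines:
        m = KILLREP_RE.search(line)
        if m:
            counts[(int(m.group("kill")), m.group("reason"))] += 1
    return counts
```

A high count for the "didn't handle last command in a timely manner" reason matches the pattern shown in this article.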

 

On ESXi host xxxxxxx, iofilterd-emcjiraf.log shows vSAN object locks:

2024-05-23T17:15:33.517Z iofilterd-emcjiraf[2104572]: FiltLib: Opening virtual disk ("/vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/RPvStorage/5d0aa2fff4d45cf4_IOFilter_JVOL_00668.vmdk") failed: "Failed to lock the file" (16392).

2024-05-23T17:15:33.517Z iofilterd-emcjiraf[2104572]: RPVS_iof_LunInfo_OpenLun: Failed to open vmdk file of lun 668 with status Busy path /vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/RPvStorage/5d0aa2fff4d45cf4_IOFilter_JVOL_00668.vmdk

2024-05-23T17:15:33.517Z iofilterd-emcjiraf[2104572]: execute_open_cmd: Cluster id: 0x5d0aa2fff4d45cf4 lun 668 open failed. open_rc: Busy con 0xc6db9c9340

2024-05-23T17:15:33.518Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 668 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:33.519Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 48 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:33.519Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 507 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:33.519Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 558 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:33.520Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 359 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:34.228Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'b05fe864-d249-c108-0391-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:36.581Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'e6ebf164-9c47-0fd7-6c17-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:37.526Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'b2e10165-ee40-1674-3ead-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:37.530Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object '681efa64-16f0-8bb4-ab56-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:38.537Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Could not lock object 'b05fe864-d249-c108-0391-6cfe544f26e4': Failed to lock the file (4).

2024-05-23T17:15:38.537Z iofilterd-emcjiraf[2104572]: DISKLIB-VMFS  : "vsan://529b87d1fe9acb29-b0cbb19ee7de1e57/b05fe864-d249-c108-0391-6cfe544f26e4" : failed to open (Failed to lock the file): ObjLib_Open failed. Type 3

2024-05-23T17:15:38.537Z iofilterd-emcjiraf[2104572]: DISKLIB-LINK  : "/vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/f9f9b364-aadc-f567-41a4-6cfe544ef870/5d0aa2fff4d45cf4_IOFilter_JVOL_00116.vmdk" : failed to open (Failed to lock the file).

2024-05-23T17:15:38.537Z iofilterd-emcjiraf[2104572]: DISKLIB-CHAIN : "/vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/f9f9b364-aadc-f567-41a4-6cfe544ef870/5d0aa2fff4d45cf4_IOFilter_JVOL_00116.vmdk" : failed to open (Failed to lock the file).

2024-05-23T17:15:38.545Z iofilterd-emcjiraf[2104572]: DISKLIB-VMFS  : "vsan://529b87d1fe9acb29-b0cbb19ee7de1e57/b05fe864-d249-c108-0391-6cfe544f26e4" : open successful (524293) size = 25769803776, hd = 0. Type 3

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: PluginLdr_Load: Loaded plugin 'libvmiof-disk-emcjiraf.so' from '/usr/lib64/vmware/plugin/libvmiof-disk-emcjiraf.so'

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: DISKLIB-VMFS  : "vsan://529b87d1fe9acb29-b0cbb19ee7de1e57/b05fe864-d249-c108-0391-6cfe544f26e4" : closed.

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: DISKLIB-LIB   : Failed to open '/vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/RPvStorage/5d0aa2fff4d45cf4_IOFilter_JVOL_00116.vmdk' with flags 0xa Failed to lock the file (16392).

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: FiltLib: Opening virtual disk ("/vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/RPvStorage/5d0aa2fff4d45cf4_IOFilter_JVOL_00116.vmdk") failed: "Failed to lock the file" (16392).

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: RPVS_iof_LunInfo_OpenLun: Failed to open vmdk file of lun 116 with status Busy path /vmfs/volumes/vsan:529b87d1fe9acb29-b0cbb19ee7de1e57/RPvStorage/5d0aa2fff4d45cf4_IOFilter_JVOL_00116.vmdk

2024-05-23T17:15:38.546Z iofilterd-emcjiraf[2104572]: execute_open_cmd: Cluster id: 0x5d0aa2fff4d45cf4 lun 116 open failed. open_rc: Busy con 0xc6dc2deaa0

2024-05-23T17:15:38.547Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 457 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:38.548Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 250 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:38.548Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 623 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:38.548Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 436 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:38.549Z iofilterd-emcjiraf[2104572]: execute_close_cmd: cluster id: 0x5d0aa2fff4d45cf4 lun 62 is not opened (openState: 0). Ignore close request

2024-05-23T17:15:40.886Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'e6ebf164-9c47-0fd7-6c17-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:41.837Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'b2e10165-ee40-1674-3ead-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:41.842Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object '681efa64-16f0-8bb4-ab56-6cfe544f26e4' locked, retrying.

2024-05-23T17:15:42.561Z iofilterd-emcjiraf[2104572]: OBJLIB-VSANOBJ: VsanObjLock: Object 'bcdce564-9ed7-ff34-42df-6cfe544f26e4' locked, retrying.
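
To list which journal volumes the filter daemon could not open, the `RPVS_iof_LunInfo_OpenLun` entries can be extracted from the log. The sketch below is an illustration only. The pattern is taken from the sample lines above, and `busy_journal_luns` is a hypothetical name.

```python
import re

# Pattern taken from the sample iofilterd-emcjiraf.log lines shown above.
OPEN_FAIL_RE = re.compile(
    r"RPVS_iof_LunInfo_OpenLun: Failed to open vmdk file of lun (?P<lun>\d+) "
    r"with status (?P<status>\w+) path (?P<path>\S+)"
)

def busy_journal_luns(lines):
    """Map LUN number to VMDK path for every open that failed with status Busy."""
    found = {}
    for line in lines:
        m = OPEN_FAIL_RE.search(line)
        if m and m.group("status") == "Busy":
            found[int(m.group("lun"))] = m.group("path")
    return found
```

The resulting paths are the journal VMDKs to mention when providing logs to support.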

The vmkernel.log on the ESXi host where the affected RPA resides contains entries like the following:

2024-10-21T13:08:13.487Z cpu31:2101016)WARNING: NFSLock: 4084: File is being locked by a consumer on host HOSTNAME0503.corp.qbe.com with exclusive lock.
2024-10-21T13:08:13.487Z cpu31:2101016)NFSLock: 5032: failed to get lock on file 38233f05329208c8_IOFilter_JVOL_00244-flat.vmdk(.lck-d84c0c0000000000) 0x430b567b2f60 on DATASTORE xxx.xxx.xxx.xxx (xxx.xxx.xxx.xxx): Busy
2024-10-21T13:08:23.795Z cpu31:2101016)WARNING: NFSLock: 4084: File is being locked by a consumer on host HOSTNAME0503.corp.qbe.com with exclusive lock.
2024-10-21T13:08:23.795Z cpu31:2101016)NFSLock: 5032: failed to get lock on file 38233f05329208c8_IOFilter_JVOL_00244-flat.vmdk(.lck-d84c0c0000000000) 0x430b567b7330 on DATASTORE xxx.xxx.xxx.xxx (xxx.xxx.xxx.xxx): Busy
2024-10-21T13:08:33.854Z cpu7:2099418)WARNING: NFSLock: 4084: File is being locked by a consumer on host HOSTNAME0503.corp.qbe.com with exclusive lock.
2024-10-21T13:08:33.854Z cpu7:2099418)NFSLock: 5032: failed to get lock on file 38233f05329208c8_IOFilter_JVOL_00069-flat.vmdk(.lck-8d7b030000000000) 0x430b567db0b0 on DATASTORE xxx.xxx.xxx.xxx (xxx.xxx.xxx.xxx): Busy
2024-10-21T13:08:44.163Z cpu2:2099418)WARNING: NFSLock: 4084: File is being locked by a consumer on host HOSTNAME0503.corp.qbe.com with exclusive lock.
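
The `NFSLock` entries name both the host holding the lock and the journal files that could not be locked. The following Python sketch pulls those two sets out of vmkernel.log lines. The patterns are assumed from the sample entries above, and `nfs_lock_summary` is a hypothetical helper, not a VMware or Dell utility.

```python
import re

# Both patterns are assumed from the sample vmkernel.log entries above.
HOLDER_RE = re.compile(
    r"NFSLock: \d+: File is being locked by a consumer on host "
    r"(?P<host>\S+) with (?P<mode>\w+) lock"
)
FAILED_RE = re.compile(
    r"NFSLock: \d+: failed to get lock on file (?P<file>[^\s(]+)"
)

def nfs_lock_summary(lines):
    """Return (hosts holding locks, files that could not be locked)."""
    holders, files = set(), set()
    for line in lines:
        m = HOLDER_RE.search(line)
        if m:
            holders.add(m.group("host"))
            continue
        m = FAILED_RE.search(line)
        if m:
            files.add(m.group("file"))
    return holders, files
```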

Cause

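The cause of this condition is currently under investigation.
No permanent fix is available at this time. However, this condition is specific to RP4VM.

This condition can occur in several scenarios:

1. The vRPA responsible for protecting the VM is vMotioned to a different ESXi host, while the journal remains locked to the original ESXi host.

2. A Consistency Group moves ownership from one vRPA pair to another, which leads to the situation described above.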

Resolution

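A workaround exists for this issue, but it requires the involvement of Dell Technologies Technical Support. Collect the appropriate RP4VM, ESXi, and vCenter logs so that the condition can be triaged and escalated as needed. Contact the Dell Technologies Customer Support Center or your service representative for technical assistance, and reference this Dell Technologies Knowledge Base solution ID.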

Affected Products

RecoverPoint for Virtual Machines

Article Properties
Article Number: 000053722
Article Type: Solution
Last Modified: 31 Oct 2024
Version:  4