RP4VM: Consistency Group in Error state
Summary: The RecoverPoint plugin is not functional, and as a result the consistency group cannot replicate.
Symptoms
The RecoverPoint plugin is not functional, and as a result the consistency group cannot replicate.
Symptoms found in the logs:
In connector logs: /files/home/kos/connectors/logs/connectors.log
2016-04-12 09:29:18,784 [pool-6-thread-1] (VCUpdater.java:471) DEBUG - Unlocking full sync
2016-04-12 09:29:18,784 [pool-6-thread-1] (VCUpdater.java:405) ERROR - Exception caught
java.lang.NullPointerException
    at com.emc.recoverpoint.connectors.vi.internal.SplitterUtils.isEsxSplitterInstalled(SplitterUtils.java:46) ~[vi_connector_commons.jar:?]
    at com.emc.recoverpoint.connectors.vi.builder.ESXStateBuilder.calcSplitters(ESXStateBuilder.java:68) ~[vi_connector_commons.jar:?]
    at com.emc.recoverpoint.connectors.vi.builder.ESXStateBuilder.create(ESXStateBuilder.java:32) ~[vi_connector_commons.jar:?]
    at com.emc.recoverpoint.connectors.vi.builder.ESXClusterStateBuilder.createEsxStateMap(ESXClusterStateBuilder.java:40) ~[vi_connector_commons.jar:?]
    at com.emc.recoverpoint.connectors.vi.builder.ESXClusterStateBuilder.create(ESXClusterStateBuilder.java:28) ~[vi_connector_commons.jar:?]
    at com.emc.recoverpoint.connectors.vi.builder.DataCenterStateBuilder.createESXClusterStateMap(DataCenterStateBuilder.java:121) ~[vi_connector_commons.jar:?]
    at com.emc.recoverpoint.connectors.vi.builder.DataCenterStateBuilder.createESXClusterStateMap(DataCenterStateBuilder.java:127) ~[vi_connector_commons.jar:?]
    at com.emc.recoverpoint.connectors.vi.builder.DataCenterStateBuilder.create(DataCenterStateBuilder.java:43) ~[vi_connector_commons.jar:?]
    at com.emc.recoverpoint.connectors.vi.builder.VCStateBuilder.createDataCenterStateMap(VCStateBuilder.java:72) ~[vi_connector_commons.jar:?]
    at com.emc.recoverpoint.connectors.vi.builder.VCStateBuilder.create(VCStateBuilder.java:26) ~[vi_connector_commons.jar:?]
    at com.emc.recoverpoint.connectors.vi.builder.VCViewBuilder.create(VCViewBuilder.java:24) ~[vi_connector_commons.jar:?]
    at com.emc.recoverpoint.connectors.vi.infra.VCUpdater.buildNewVcView(VCUpdater.java:500) ~[vc_connector.jar:?]
    at com.emc.recoverpoint.connectors.vi.infra.VCUpdater.performSync(VCUpdater.java:460) ~[vc_connector.jar:?]
    at com.emc.recoverpoint.connectors.vi.infra.VCUpdater.syncAndLogAsNeeded(VCUpdater.java:163) [vc_connector.jar:?]
    at com.emc.recoverpoint.connectors.vi.infra.VCUpdater.updateVCView(VCUpdater.java:135) [vc_connector.jar:?]
    at com.emc.recoverpoint.connectors.vi.infra.VCUpdaterConnectedState.getView(VCUpdaterConnectedState.java:16) [vc_connector.jar:?]
    at com.emc.recoverpoint.connectors.vi.infra.VCUpdaterNotInitializedState.getView(VCUpdaterNotInitializedState.java:14) [vc_connector.jar:?]
    at com.emc.recoverpoint.connectors.vi.infra.VCUpdater.getView(VCUpdater.java:122) [vc_connector.jar:?]
    at com.emc.recoverpoint.connectors.vi.infra.VCUpdater.run(VCUpdater.java:107) [vc_connector.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.FutureTask.runAndReset(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.7.0_80]
    at java.lang.Thread.run(Unknown Source) [?:1.7.0_80]
Affected versions: 4.3, 4.3.0.1, 4.3.1, 4.3.1.1
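To confirm the symptom without reading the log by eye, the signature from the stack trace above can be searched for programmatically. The following is a minimal sketch, not a supported tool: the function name has_splitter_npe and the 500-character search window are assumptions made for illustration; only the two marker strings come from the log excerpt.

```python
# Marker strings copied from the stack trace in connectors.log.
NPE_MARKER = "java.lang.NullPointerException"
FRAME_MARKER = "SplitterUtils.isEsxSplitterInstalled"

# Hypothetical window size: how far after the exception name to look
# for the SplitterUtils frame (it is the first frame in the trace).
WINDOW_CHARS = 500


def has_splitter_npe(log_text: str) -> bool:
    """Return True if a NullPointerException is closely followed by the
    SplitterUtils.isEsxSplitterInstalled frame in log_text."""
    idx = log_text.find(NPE_MARKER)
    while idx != -1:
        window = log_text[idx: idx + WINDOW_CHARS]
        if FRAME_MARKER in window:
            return True
        # Move on to the next NullPointerException, if any.
        idx = log_text.find(NPE_MARKER, idx + 1)
    return False
```

A typical use would be to read the contents of connectors.log into a string and pass it to has_splitter_npe; a True result suggests the issue described here, but the 'serviceInfo' check in the Resolution section is still needed to find the offending host.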
Cause
While VCUpdater builds the 'vi_view', it iterates over all ESXi hosts and inspects each host's 'serviceInfo' object to check whether a splitter is installed. If any ESXi host lacks a 'serviceInfo' object (not necessarily a host running RecoverPoint; it may be an unused host that is still exposed to the vCenter), that host triggers a 'NullPointerException' and the 'vi_view' is not built. As a result, the plugin is not installed and replication stops.
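The failure mode described above can be illustrated with a small model. This is a sketch under stated assumptions, not the product's code: the host dictionaries, the service_info and services keys, and both function names are hypothetical, and the guarded variant is only one possible way to tolerate a missing object, not necessarily what the permanent fix does. In Python the unguarded access raises TypeError, which stands in for Java's NullPointerException.

```python
def build_view_unguarded(hosts):
    # Dereferences service_info on every host. A single host whose
    # service_info is None aborts the whole build, so no view is produced.
    return {h["name"]: "splitter" in h["service_info"]["services"] for h in hosts}


def build_view_guarded(hosts):
    view = {}
    for h in hosts:
        info = h.get("service_info")
        # A host without service_info is treated as having no splitter
        # instead of failing the build for all hosts.
        view[h["name"]] = info is not None and "splitter" in info["services"]
    return view


# Hypothetical inventory: esx-b is exposed to the vCenter but has no service_info.
hosts = [
    {"name": "esx-a", "service_info": {"services": ["splitter", "ntpd"]}},
    {"name": "esx-b", "service_info": None},
]
```

With this inventory, build_view_unguarded(hosts) raises an exception because of esx-b alone, while build_view_guarded(hosts) still reports esx-a correctly. This mirrors why one unrelated host is enough to stop the view from being built.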
Resolution
Workaround:
- First, identify the problematic ESXi host or hosts.
Go to the vCenter MOB (Managed Object Browser) and navigate: rootFolder -> childEntity (datacenter) -> hostFolder -> childEntity (cluster) -> (optional) childEntity (go over all domains if there are any) -> host.
Note the number of every host, and for each host XXX, go to: https://*IP*/mob/?moid=serviceSystem-XXX
For example: https://10.76.2.241/mob/?moid=serviceSystem-10
Check 'serviceInfo' there. A problematic ESXi host does not have the 'serviceInfo' link in the value column.
- Once identified, disconnect the problematic ESXi host or hosts, or reboot them.
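When there are many hosts, the list of MOB pages to check in the identification step can be generated rather than typed by hand. The sketch below only builds the URLs in the pattern given above; it does not query the vCenter. The function names are hypothetical, and accepting an identifier of the form "host-10" as well as a bare number is an assumption about how the host numbers were noted down.

```python
def service_system_url(vcenter_ip: str, host_number: int) -> str:
    """Build the MOB URL for one host, following the pattern
    https://*IP*/mob/?moid=serviceSystem-XXX from the workaround."""
    return f"https://{vcenter_ip}/mob/?moid=serviceSystem-{host_number}"


def host_number_from(identifier) -> int:
    """Accept a bare number (10, "10") or an identifier such as "host-10"
    (assumed form) and return the numeric part."""
    text = str(identifier)
    return int(text.rsplit("-", 1)[-1])


def urls_to_check(vcenter_ip: str, identifiers) -> list:
    """Return one serviceSystem URL per host, in the order given."""
    return [service_system_url(vcenter_ip, host_number_from(i)) for i in identifiers]
```

For example, urls_to_check("10.76.2.241", [10, "host-12"]) yields the two pages to open in a browser; any page without a 'serviceInfo' link in the value column points at a problematic host.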
Permanent Fix:
Fixed in version 4.3.1.2.