Article Number: 530815


RecoverPoint with VMware SRM: SRM Failover/Test Failover operations may fail after some time on "Create writable storage" step: Error - Failed to create snapshots of replica devices. SRA command 'testFailoverStart' failed. Failed opening session ...

Summary: SRM Failover/Test Failover operations fail after some time on the "Create writable storage" step: Error - Failed to create snapshots of replica devices. SRA command 'testFailoverStart' failed. Failed opening session for user to site mgmt IP.

Primary Product: RecoverPoint

Product: RecoverPoint

Last Published: 10 Mar 2020

Article Type: Break Fix

Published Status: Online

Version: 3


Article Content

Issue


SRM Test Failover or Failover operations may fail at Step 4, "Create writable storage snapshot", after the step has been running for some time (31 minutes by default). The SRM recovery plan steps show the following error:
Error - Failed to create snapshots of replica devices. SRA command 'testFailoverStart' failed. Failed opening session for user to site mgmt IP.

Errors from SRM logs (C:\ProgramData\VMware\VMware vCenter Site Recovery Manager\Logs):
--> Feb 20, 2019 3:12:52 PM com.emc.santorini.log.KLogger log
--> INFO: Starting to run: TestFailoverStart command
--> Feb 20, 2019 3:43:53 PM com.emc.santorini.log.KLogger logWithException
--> WARNING: Caught SocketTimeoutException. Please check your network connection to the RPAs.
--> javax.xml.ws.WebServiceException: java.net.SocketTimeoutException: Read timed out
-->     at com.sun.xml.internal.ws.transport.http.client.HttpClientTransport.readResponseCodeAndMessage(Unknown Source)
-->     at com.sun.xml.internal.ws.transport.http.client.HttpTransportPipe.createResponsePacket(Unknown Source)
-->     at com.sun.xml.internal.ws.transport.http.client.HttpTransportPipe.process(Unknown Source)
-->     at com.sun.xml.internal.ws.transport.http.client.HttpTransportPipe.processRequest(Unknown Source)
-->     at com.sun.xml.internal.ws.transport.DeferredTransportPipe.processRequest(Unknown Source)
-->     at com.sun.xml.internal.ws.api.pipe.Fiber.__doRun(Unknown Source)
-->     at com.sun.xml.internal.ws.api.pipe.Fiber._doRun(Unknown Source)
-->     at com.sun.xml.internal.ws.api.pipe.Fiber.doRun(Unknown Source)
-->     at com.sun.xml.internal.ws.api.pipe.Fiber.runSync(Unknown Source)
-->     at com.sun.xml.internal.ws.client.Stub.process(Unknown Source)
-->     at com.sun.xml.internal.ws.client.sei.SEIStub.doProcess(Unknown Source)
-->     at com.sun.xml.internal.ws.client.sei.SyncMethodHandler.invoke(Unknown Source)
-->     at com.sun.xml.internal.ws.client.sei.SyncMethodHandler.invoke(Unknown Source)
-->     at com.sun.xml.internal.ws.client.sei.SEIStub.invoke(Unknown Source)
-->     at com.sun.proxy.$Proxy36.testFailoverStartWithOpaques(Unknown Source)
-->     at com.emc.santorini.handlers.SantoriniLogic.testFailoverStart(SantoriniLogic.java:278)
-->     at com.emc.santorini.commands.TestFailoverStartCommand.execute(TestFailoverStartCommand.java:40)
-->     at com.emc.santorini.handlers.SantoriniCommandDispatcher.handleCommandAction(SantoriniCommandDispatcher.java:105)
-->     at com.emc.santorini.main.SantoriniMain.main(SantoriniMain.java:57)
--> Caused by: java.net.SocketTimeoutException: Read timed out

...

No errors are reported on the RecoverPoint side.

Cause

By default, SRM timeouts are set to 5 minutes and can be increased. Once they are increased beyond 31 minutes, a different timeout can be hit instead: the SRA timeout WEB_SERVICE_REQUEST_TIMEOUT, which is set to 1860 seconds (31 minutes) by default. This happens if the image access enable process takes longer than 1860 seconds.
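
The 31-minute figure can be cross-checked against the log excerpt in the Issue section. The following is a minimal sketch that uses only the two timestamps shown there. The variable names are illustrative, and the month and AM/PM parsing assumes an English (C) locale.

```python
from datetime import datetime

# Timestamp layout used by the SRA log lines, e.g. "Feb 20, 2019 3:12:52 PM".
# %b and %p are locale dependent; an English/C locale is assumed here.
LOG_TIME_FORMAT = "%b %d, %Y %I:%M:%S %p"

# "Starting to run: TestFailoverStart command"
started = datetime.strptime("Feb 20, 2019 3:12:52 PM", LOG_TIME_FORMAT)
# "Caught SocketTimeoutException"
timed_out = datetime.strptime("Feb 20, 2019 3:43:53 PM", LOG_TIME_FORMAT)

elapsed = int((timed_out - started).total_seconds())
print(elapsed)  # 1861
```

The gap is one second more than 1860, which is consistent with the request being cut off by the default WEB_SERVICE_REQUEST_TIMEOUT rather than by an SRM-side timeout.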

Resolution


Change the SRA timeouts to match the timeout values configured in SRM.
SRA timeouts are set on the SRM server, in the file C:\Program Files\VMware\VMware vCenter Site Recovery Manager\storage\sra\array-type-recoverpoint\conf\cancun_run.properties.
Make the changes on both SRM servers.

Change the following properties to match the SRM timeouts (values are in seconds). In this example they are set to 1 hour (3600 seconds):
VERIFY_PAUSED_TSP_TIMEOUT=3600
VERIFY_REPLICATING_TIMEOUT=3600
VERIFY_TRANSFER_SNAP_SHIPPING_IDLE_TIMEOUT=3600

WEB_SERVICE_REQUEST_TIMEOUT=3600

Note: There is no need to restart services after this change; the next SRM operation will use the new settings.
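
The property change above can also be scripted. The following is a minimal sketch, assuming the file uses plain KEY=value lines as the example properties suggest. The function name set_sra_timeouts and the sample text are hypothetical and are not part of the SRA. The function works on text in memory, so the real cancun_run.properties file would still need to be backed up, read, and written back separately on each SRM server.

```python
# The four properties named in this article.
SRA_TIMEOUT_KEYS = (
    "VERIFY_PAUSED_TSP_TIMEOUT",
    "VERIFY_REPLICATING_TIMEOUT",
    "VERIFY_TRANSFER_SNAP_SHIPPING_IDLE_TIMEOUT",
    "WEB_SERVICE_REQUEST_TIMEOUT",
)


def set_sra_timeouts(properties_text, seconds):
    """Return properties_text with the four SRA timeout keys set to `seconds`.

    Assumes simple KEY=value lines (an assumption about the file format).
    Lines for other keys, comments, and blank lines are kept unchanged.
    Keys that are not present are appended at the end.
    """
    seen = set()
    updated_lines = []
    for line in properties_text.splitlines():
        key, separator, _old_value = line.partition("=")
        key = key.strip()
        if separator and key in SRA_TIMEOUT_KEYS:
            updated_lines.append(f"{key}={seconds}")
            seen.add(key)
        else:
            updated_lines.append(line)
    for key in SRA_TIMEOUT_KEYS:
        if key not in seen:
            updated_lines.append(f"{key}={seconds}")
    return "\n".join(updated_lines) + "\n"


# Hypothetical sample content, not copied from a real SRM server.
sample = "SOME_OTHER_SETTING=1\nWEB_SERVICE_REQUEST_TIMEOUT=1860\n"
print(set_sra_timeouts(sample, 3600))
```

Keeping the function free of file access makes it easy to try on a copy of the file before touching either SRM server.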

Article Properties

First Published

Thu Feb 28 2019 03:31:21 GMT
