Unsolved

July 8th, 2016 04:00

Data domain VTL restarting due to Keep_alive timeout

Hello All,

We have a customer with a DD640. They recently had an issue where the VTL keeps restarting. The issue appears to be due to a keep-alive timeout. Below is a screenshot from the messages file.

Inline images 3

Kindly let me know how to check the current keep-alive value on the Data Domain.

Regards,

62 Posts

July 8th, 2016 06:00

Hi Prassi,

The default VTL keep-alive timeout is 5 or 6 minutes, depending on the operation being performed by VTL. Normally this is sufficient, so hitting it indicates that your VTL process is restarting unexpectedly.

Unfortunately the inline images you provided fail to display, at least for me.

So if the VTL process is restarting unexpectedly, do you see any core files in /ddvar/core?

Do you have any other alerts? The VTL process depends on the filesystem being up, so if the filesystem is also experiencing issues, this can have a knock-on effect on the VTL process.

cheers, Rich.

18 Posts

July 8th, 2016 06:00

DD640_VTL restarting.PNG.png

Here are the errors from the messages log.

The only error we received was the one saying that VTL is restarting. It kept doing that, and the customer had to reboot.

We basically need to know the root cause for this issue.

I have the autosupport logs. Is there a way I can send them to you for analysis?

18 Posts

July 8th, 2016 07:00

Hi,

Is it possible for you to look into this and respond to me ASAP?

Regards,

62 Posts

July 8th, 2016 07:00

Hi Prassi,

Thanks for the update. Do you have support via a 3rd party for this product? As I mentioned above, this really needs in-depth triage to understand what issue VTL is encountering.

If you have no support available to you, you could log a time and materials case with EMC, but this will be on a chargeable basis. Or just consider an upgrade to a later DDOS version in case you are encountering a known problem - but again, without understanding why it's restarting as it is, this may not resolve the issue.

I think you misunderstand slightly: this is not a TCP keep-alive timeout. This is an internal timeout that the VTL process uses to ascertain if there is a problem when it believes a crucial operation has hung; if the timeout is reached, the VTL process is terminated so that a core file is created for analysis, which also causes the process to restart so as to resume normal operation. So increasing the TCP keep-alive timeout won't actually help.
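To make the difference from a TCP keep-alive concrete, here is a minimal Python sketch of an internal watchdog of the kind described above. It is purely illustrative and is not how DDOS implements it; the class name `OperationWatchdog`, the `check` helper and the 300-second value (5 minutes) are assumptions made for the example.

```python
import time


class OperationWatchdog:
    """Hypothetical watchdog: tracks when an operation last made progress."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_progress = clock()

    def progress(self):
        # Called by the worker whenever the operation moves forward.
        self.last_progress = self.clock()

    def expired(self):
        # True once no progress has been recorded for longer than the timeout.
        return self.clock() - self.last_progress > self.timeout_s


def check(watchdog, on_hang):
    """Run on_hang() if the watchdog has expired; report whether it fired."""
    if watchdog.expired():
        on_hang()
        return True
    return False


# Assumed 5-minute timeout, matching the lower default mentioned in this thread.
watchdog = OperationWatchdog(timeout_s=300)
```

In a real daemon, `on_hang` could be something like `os.abort`, which raises SIGABRT and, where core dumps are enabled, leaves a core file behind before a supervisor restarts the process. Nothing here involves the network stack, which is why tuning TCP keep-alive settings would not change the behaviour.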

cheers, Rich.

18 Posts

July 8th, 2016 07:00

Hi,

We are not under EMC support anymore. Is there a way that you can provide me the command to check the current TCP keep-alive value?

Regards,

62 Posts

July 8th, 2016 07:00

Hi Prassi,

Thanks for the above message extracts. This indicates that a core file is being created, which, as I mentioned above, can be found in /ddvar/core.

The autosupport will not be enough to triage this issue, as it's going to require an in-depth investigation.

Therefore please:

- Log a support case with EMC for the serial number in question so that our Data Domain VTL team can be engaged.

- Once the support case is raised, please attach/upload the support bundle from the system and the core files from /ddvar/core that are around the date/time above (they should start with the string 'vtl.core.').
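If it helps with picking out the right core files, here is a rough Python sketch that lists files starting with 'vtl.core.' whose modification time falls near a given date/time. It assumes /ddvar is reachable from an admin host (for example via a share mounted there); the function name `find_vtl_cores`, the example mount path and the 24-hour window are hypothetical choices for illustration only.

```python
import os
from datetime import datetime, timedelta


def find_vtl_cores(core_dir, around, window_hours=24, prefix="vtl.core."):
    """Return names of files in core_dir that start with prefix and were
    modified within window_hours of `around`, oldest first."""
    window = timedelta(hours=window_hours)
    matches = []
    for entry in os.scandir(core_dir):
        if not entry.is_file() or not entry.name.startswith(prefix):
            continue
        # Compare the file's local modification time with the target time.
        mtime = datetime.fromtimestamp(entry.stat().st_mtime)
        if abs(mtime - around) <= window:
            matches.append((mtime, entry.name))
    return [name for _, name in sorted(matches)]


# Example call (the mount point is an assumption, adjust to your environment):
# find_vtl_cores("/mnt/ddvar/core", datetime(2016, 7, 8, 4, 0))
```

The timestamp passed in should be the time of the restart messages in the log, so that only cores from the incident in question are uploaded.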

cheers, Rich.
