Moderator

October 9th, 2013 07:00

CarolFerris,

What is the host OS on the server? Also, is there more to the error than just the watchdog timer expiring? Lastly, do you have any third-party hardware in the server?

Let me know what you find out.

October 9th, 2013 08:00

The OS is Windows Server 2008 R2 Enterprise. Server Administrator 7.3 is installed; however, when I connect, it no longer brings up any data about the server. It doesn't even show the host name. I do not see anything in the iDRAC log except the messages about the reboot, and then 10 minutes later we got the watchdog timer error. We do not have any third-party hardware in the server.

Thanks.

October 10th, 2013 06:00

The server still shows 'critical' in the blade chassis CMC. I rebooted the server this morning and have been looking through the logs in Server Administrator (7.3). The entry below was logged this morning when I rebooted, and it reports that an ASR was performed yesterday, at the time we got the watchdog error. It showed up as another error with a red X.

1006 Thu Oct 10 06:23:41 2013 Instrumentation Service Automatic System Recovery (ASR) action was performed Action performed was: Unknown Date and time of action: Wed Oct 09 04:28:25 2013
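For anyone comparing the two timestamps in that entry, here is a minimal Python sketch that pulls them apart and works out how long after the ASR action the entry was actually written. The function name `parse_asr_entry` and the regular expression are my own assumptions based only on the single line quoted above; other Server Administrator versions or export formats may lay the entry out differently.

```python
import re
from datetime import datetime

# The log line exactly as quoted in the post above.
ENTRY = ("1006 Thu Oct 10 06:23:41 2013 Instrumentation Service "
         "Automatic System Recovery (ASR) action was performed "
         "Action performed was: Unknown "
         "Date and time of action: Wed Oct 09 04:28:25 2013")

# Assumed timestamp layout, e.g. "Thu Oct 10 06:23:41 2013".
STAMP = r"[A-Z][a-z]{2} [A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} \d{4}"
FMT = "%a %b %d %H:%M:%S %Y"


def parse_asr_entry(line):
    """Return (event_id, logged_at, action_at) for one ASR log line.

    Hypothetical helper: the layout is inferred from one sample entry.
    """
    pattern = rf"(\d+) ({STAMP}) .*Date and time of action: ({STAMP})"
    match = re.match(pattern, line)
    if match is None:
        raise ValueError("line does not look like the sample ASR entry")
    event_id = int(match.group(1))
    logged_at = datetime.strptime(match.group(2), FMT)
    action_at = datetime.strptime(match.group(3), FMT)
    return event_id, logged_at, action_at


event_id, logged_at, action_at = parse_asr_entry(ENTRY)
delay = logged_at - action_at  # gap between the ASR action and the log write
print(event_id, action_at, logged_at, delay)
```

The gap between the two timestamps is the point here: the entry is only written on the next boot, so the time it is logged says nothing about when the watchdog actually fired.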

October 11th, 2013 04:00

Hi,

I have also experienced this problem on 3 Dell R610s.


The servers are running VMware ESXi 5.1. We had the watchdog timer expire on all 3 boxes within 24 hours of each other last week.

I rebooted each of the hosts manually today and, on each reboot, received a delayed ASR error dating from when the timers expired last week.

The reboots, followed by a rescan and inventory, have corrected the entries in the OME devices list, and all appears to be well.


The servers ran for approximately 65 days before we received the error. I'm not sure whether it's a time-based issue, in which case we will see the errors again in another 65 days' time.

One thing is for sure: it was pretty scary receiving those alert emails for hosts that, from the VMware side of things, were running perfectly fine.

November 20th, 2013 06:00

I'm glad I found this forum since I am having almost the exact same issue!  I have 5 M620 PowerEdge blade servers that are returning critical errors with the details as "Watchdog Timer has expired".  These are all running ESXi 5.1 and, like yours, are returning the Watchdog Timer error at what appears to be specific intervals (one server will experience it and then 24 hours later, a different one will have the same issue).

I also agree that having OpenManage return critical errors on your VMware cluster is not a fun thing.

Thanks!

November 22nd, 2013 13:00

To fix our issue, I think all I did was clear the log, and the error has not reappeared.
