I use IPMI to monitor one of my R330s and noticed that once in a while it throws a "failed to run command" error. Looking in the iDRAC logs, I found the following:
PSU0800 Power Supply 1: Status = 0x00, IOUT = 0x0, VOUT= 0x0, TEMP= 0x0, FAN = 0x0, INPUT= 0x0.
RAC0708 Previous reboot was due to a firmware watchdog timeout.
The iDRAC firmware was rebooted with the following reason: watchdog.
LOG007 The previous log entry was repeated 1 times.
DIS002 Auto Discovery feature disabled.
It happens at random, about once or twice a week, sometimes more.
|Lifecycle Controller Firmware||188.8.131.52|
|iDRAC Firmware Version||184.108.40.206|
Any help would be greatly appreciated.
edit - Forgot to mention, I can also tell when the error is happening because all the fans spin up to full speed.
Would you clarify if you have recently updated the server, or if this issue started after a recent update?
Let me know
Dell | Social Outreach Services - Enterprise
Sorry for the late reply. This started recently, after installing OMSA on VMware 6.7. It happens completely at random.
By the way, it happened again today and Server Administrator shows this message right around the same time:
Current sensor detected a warning value
Sensor location: System Board Pwr Consumption
Chassis location: Main System Chassis
Previous state was: OK (Normal)
Current sensor value (in Watts): 658.000
Not sure if it's related. It is followed by ID 1202.
2019-01-24T20:07:02-0500 RAC0708 Previous reboot was due to a firmware watchdog timeout.
2019-01-24T20:06:26-0500 PSU0800 Power Supply 1: Status = 0x00, IOUT = 0x0, VOUT= 0x0, TEMP= 0x0, FAN = 0x0, INPUT= 0x0.
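In case it helps anyone line these entries up against the OMSA alerts, here is a rough Python sketch that parses timestamped log text like the two entries above and reports how long after the PSU0800 entry the RAC0708 watchdog entry was logged. The function names (`parse_entries`, `seconds_between`) and the assumed "timestamp, message ID, text" layout are my own, based only on what is pasted in this thread; this is not an official Dell log format or tool.

```python
import re
from datetime import datetime

# Assumed layout, taken from the entries pasted in this thread:
# "<timestamp with UTC offset> <message ID> <free text>"
ENTRY_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{4})\s+([A-Z]{3}\d{3,4})\s+"
)

SAMPLE_LOG = (
    "2019-01-24T20:07:02-0500 RAC0708 Previous reboot was due to a firmware "
    "watchdog timeout. 2019-01-24T20:06:26-0500 PSU0800 Power Supply 1: "
    "Status = 0x00, IOUT = 0x0, VOUT= 0x0, TEMP= 0x0, FAN = 0x0, INPUT= 0x0."
)


def parse_entries(text):
    """Split pasted log text into (timestamp, message_id, message) tuples, oldest first."""
    matches = list(ENTRY_RE.finditer(text))
    entries = []
    for i, match in enumerate(matches):
        # The message runs until the next timestamp, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        when = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S%z")
        entries.append((when, match.group(2), text[match.end():end].strip()))
    return sorted(entries, key=lambda entry: entry[0])


def seconds_between(entries, first_id, second_id):
    """Seconds from the first `first_id` entry to the next `second_id` entry, or None."""
    start = None
    for when, message_id, _ in entries:
        if start is None and message_id == first_id:
            start = when
        elif start is not None and message_id == second_id:
            return (when - start).total_seconds()
    return None


if __name__ == "__main__":
    log_entries = parse_entries(SAMPLE_LOG)
    for when, message_id, message in log_entries:
        print(when.isoformat(), message_id, message)
    print(seconds_between(log_entries, "PSU0800", "RAC0708"))
```

Running it over a longer export of the log would show whether the PSU0800 entry always comes shortly before the watchdog reboot or whether that was a coincidence in this one case.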