Eluich
1 Nickel

Random Reboot R740

Hello,

We currently run a Windows Server 2016 Datacenter failover cluster with two PowerEdge R740 nodes.


The hardware configuration of each node is as follows:
2x Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz, Model 85, Stepping 4
RAM: 196608 MB (192 GB)
NVIDIA Tesla M60 video card
SAS connection to a PowerVault® 3420 SAN

The video cards are assigned to virtual machines through Discrete Device Assignment (DDA).

The nodes reboot abruptly and at random, with no error message in the logs other than Event ID 41 (Kernel-Power): "The system has rebooted without cleanly shutting down first".

EventData 
BugcheckCode 0 
BugcheckParameter1 0x0 
BugcheckParameter2 0x0 
BugcheckParameter3 0x0 
BugcheckParameter4 0x0 
SleepInProgress 0 
PowerButtonTimestamp 0 
BootAppStatus 0 
Checkpoint 0 
ConnectedStandbyInProgress false 
SystemSleepTransitionsToOn 0 
CsEntryScenarioInstanceId 0
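
For anyone comparing several of these events, here is a minimal Python sketch of how the fields above can be read. It assumes the EventData has been copied out as plain "Name Value" lines, as in this post; the helper names `parse_event_data` and `summarize` are hypothetical and not part of any Windows or Dell tool. The interpretation is only the usual reading of Event ID 41 as I understand it: a BugcheckCode of 0 means no stop error was recorded, and a PowerButtonTimestamp of 0 means no long power-button press was recorded.

```python
# Sketch only: the helper names are made up for illustration.
# Field values are the ones quoted in this post.
EVENT_DATA = """\
EventData
BugcheckCode 0
BugcheckParameter1 0x0
BugcheckParameter2 0x0
BugcheckParameter3 0x0
BugcheckParameter4 0x0
SleepInProgress 0
PowerButtonTimestamp 0
BootAppStatus 0
Checkpoint 0
ConnectedStandbyInProgress false
SystemSleepTransitionsToOn 0
CsEntryScenarioInstanceId 0
"""


def parse_event_data(text):
    """Turn 'Name Value' lines into a dict; lines without exactly two tokens are skipped."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            fields[parts[0]] = parts[1]
    return fields


def summarize(fields):
    """Give a rough reading of an Event ID 41 record (assumed interpretation)."""
    # Base 0 lets int() accept both decimal ("0") and hex ("0x0") notation.
    bugcheck = int(fields.get("BugcheckCode", "0"), 0)
    power_button = int(fields.get("PowerButtonTimestamp", "0"), 0)
    if bugcheck != 0:
        return "stop error recorded (bugcheck 0x%X)" % bugcheck
    if power_button != 0:
        return "power button press recorded"
    return "no stop error and no power button press recorded"


if __name__ == "__main__":
    print(summarize(parse_event_data(EVENT_DATA)))
```

With the values from our event, this falls into the last case, which matches what we see: the operating system had no chance to record anything before the node went down.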

The nodes do not reboot at the same time, and the reboots follow no apparent pattern.

Hardware tests report no errors, and OpenManage shows no explicit events.

Do you have any idea what could be causing this problem?

Best Regards
