treganmorris

R7425 Random OS Restarts (EPYC chipset)

We have an 8-node cluster running Windows Server 2016 Datacenter, configured for Hyper-V, that has been in place for the past year. During this time we have experienced random OS reboots across all nodes.
 
iDRAC simply reports an OEM S event, and the event log records only an unexpected shutdown.
 
Based on other user threads of a similar nature, the following changes have been made to the system profile, but to no avail.
 
CPU:                 Max Performance
Memory Frequency:    Max Performance
C1E:                 Disabled
C States:            Disabled
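
To rule out one node quietly differing from the rest, the settings above could be compared across all nodes with something like the following. This is only a minimal sketch: the key names are illustrative labels rather than real BIOS attribute names, the sample node data is made up, and it assumes each node's current settings have already been exported by some other means (not shown).

```python
# Desired system profile settings, as listed above.
# Key names are illustrative labels, not actual BIOS attribute names.
EXPECTED = {
    "CPU": "Max Performance",
    "Memory Frequency": "Max Performance",
    "C1E": "Disabled",
    "C States": "Disabled",
}


def find_drift(node_settings, expected=EXPECTED):
    """Return {node: {setting: (actual, expected)}} for every mismatch."""
    drift = {}
    for node, settings in node_settings.items():
        mismatches = {}
        for key, want in expected.items():
            have = settings.get(key)
            # Compare case-insensitively so "Max performance" still matches.
            if have is None or have.strip().lower() != want.lower():
                mismatches[key] = (have, want)
        if mismatches:
            drift[node] = mismatches
    return drift


if __name__ == "__main__":
    # Hypothetical sample data for two nodes; replace with real exports.
    sample = {
        "node1": dict(EXPECTED),
        "node2": {**EXPECTED, "C States": "Enabled"},
    }
    for node, bad in find_drift(sample).items():
        for key, (have, want) in bad.items():
            print(f"{node}: {key} is {have!r}, expected {want!r}")
```

A node missing from the result matches the expected profile; a setting reported with an actual value of None was absent from that node's export.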
 
Server drivers and firmware are up to date.
Dell have analysed our logs and have been unable to identify any hardware issues.
 
Hyper-V is integral to our business and we are desperate to resolve this and stabilise the cluster, to the point that we are considering buying new hardware and rebuilding.
 
Any suggestions would be gratefully received.
