Unsolved
R7425 Random OS Restarts (EPYC chipset)
We have an 8-node cluster running Windows Server 2016 Datacenter, configured for Hyper-V, that has been in place for the past year. During this time we have experienced random OS reboots across all nodes.
iDRAC simply reports an OEM S event, and the event log records an unexpected shutdown.
Based on other user threads of a similar nature, the following changes have been made to the system profile, but to no avail:
CPU: Max performance
Memory Frequency: Max Performance
C1E: Disabled
C States: Disabled
Server drivers and firmware are up to date.
Dell have analysed our logs and have been unable to identify any hardware issues.
Hyper-V is integral to our business and we are desperate to resolve this and stabilise the cluster, to the point that we are considering buying new hardware and rebuilding.
Any suggestions would be gratefully received.
Dell-DylanJ
4 Operator
2.9K Posts
May 7th, 2019 12:00
With it affecting all 8 nodes, I wouldn't expect it to be a hardware failure. Have you tried running Microsoft's Best Practices Analyzer (BPA) to make sure your design conforms to their recommendations?
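
Since the reboots are spread across every node, it may also help to line up the unexpected-shutdown timestamps from each node and see whether any of them coincide, which could hint at a shared cause rather than a per-server fault. Below is a minimal Python sketch of that idea, assuming the timestamps have already been collected from each node's System event log (for example, Kernel-Power Event ID 41 or EventLog Event ID 6008 entries). The node names, the sample timestamps, and the 10-minute window are all hypothetical placeholders, not data from this cluster.

```python
from datetime import datetime, timedelta


def group_reboots(events, window=timedelta(minutes=10)):
    """Group (node, timestamp) pairs in time order.

    An event joins the current group when it occurs within `window`
    of the previous event; otherwise it starts a new group.
    """
    groups = []
    for node, ts in sorted(events, key=lambda e: e[1]):
        if groups and ts - groups[-1][-1][1] <= window:
            groups[-1].append((node, ts))
        else:
            groups.append([(node, ts)])
    return groups


def multi_node_groups(groups):
    """Keep only the groups that involve more than one distinct node."""
    return [g for g in groups if len({node for node, _ in g}) > 1]


# Hypothetical sample data: replace with timestamps gathered from each node.
sample = [
    ("node1", datetime(2019, 5, 1, 2, 14)),
    ("node4", datetime(2019, 5, 1, 2, 17)),
    ("node2", datetime(2019, 5, 3, 11, 40)),
    ("node1", datetime(2019, 5, 6, 23, 5)),
]

groups = group_reboots(sample)
shared = multi_node_groups(groups)

for group in shared:
    nodes = ", ".join(node for node, _ in group)
    print(f"{group[0][1]:%Y-%m-%d %H:%M}: reboots close together on {nodes}")
```

If several nodes regularly restart within minutes of each other, shared infrastructure (power, storage, network, or a scheduled task) would be worth a closer look. If the restarts never overlap, that would point more towards something in the common build or configuration of each host.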