PowerEdge: CPU Machine Check Errors
Summary: This article explains what CPU machine check errors are, describes their common causes, and outlines how to handle them when they appear.
Symptoms
What are CPU Machine Check Errors?
On PowerEdge servers, and on solutions built on them that use standard BIOS and iDRAC firmware, machine checks are captured in the System Event Log (SEL).
These entries are also reflected in the Lifecycle Controller log (LCL) under various Enhanced Error Message Initiative (EEMI) event codes.
| Event code | Event message |
|---|---|
| CPU0011 | Uncorrectable machine check exception detected on CPU # |
| CPU0012 | Correctable machine check exception detected on CPU # |
| CPU0704 | CPU # machine check detected |
| UEFI0076 | One or more corrected machine check errors have occurred |
| UEFI0078 | One or more machine check errors occurred in the previous boot |
Log Examples:
2022-10-22 22:12:35 506 CPU9000 An OEM diagnostic event occurred.
2022-10-22 22:12:34 505 CPU9000 An OEM diagnostic event occurred.
2022-10-22 22:12:33 504 CPU9000 An OEM diagnostic event occurred.
2022-10-22 22:12:31 503 CPU0704 CPU 2 machine check error detected.
2022-10-22 22:12:31 502 UEFI0078 One or more Machine Check errors occurred in the previous boot.

2025-05-21 03:42:32 320 CPU9000 An OEM diagnostic event occurred.
2025-05-21 03:42:30 319 CPU0704 CPU 1 machine check error detected.
2025-05-21 03:42:29 318 PST0090 A problem was detected related to the previous server boot.
2025-05-21 03:42:29 317 UEFI0078 One or more Machine Check errors occurred in the previous boot.

2021-09-02 16:02:18 712 UEFI0078 One or more Machine Check errors occurred in the previous boot.
2021-09-02 16:02:18 711 CPU0000 Internal error has occurred check for additional logs.
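When a log export is long, it can help to pull out the machine check entries programmatically. The following is a minimal Python sketch, not a Dell tool. It assumes each entry has the layout seen in the examples above (date, time, sequence number, event code, message), and the names `LogEntry`, `parse_log`, and `machine_check_entries` are hypothetical helpers introduced only for illustration. Real exports may use a different layout and would need an adjusted pattern.

```python
import re
from datetime import datetime
from typing import NamedTuple

# Event codes listed in the table above as machine-check related.
MCE_CODES = {"CPU0011", "CPU0012", "CPU0704", "UEFI0076", "UEFI0078"}


class LogEntry(NamedTuple):
    timestamp: datetime
    sequence: int
    code: str
    message: str


# Assumed layout: "<YYYY-MM-DD HH:MM:SS> <sequence> <event code> <message>".
# The lookahead ends a message where the next entry starts (or at end of text),
# so entries are found whether they are one per line or run together.
ENTRY_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(\d+)\s+([A-Z]+\d+)\s+(.*?)"
    r"(?=\s+\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\s+\d+\s+[A-Z]+\d+\s|\s*\Z)",
    re.DOTALL,
)


def parse_log(text):
    """Split raw log text into LogEntry records."""
    entries = []
    for ts, seq, code, msg in ENTRY_RE.findall(text):
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        entries.append(LogEntry(when, int(seq), code, " ".join(msg.split())))
    return entries


def machine_check_entries(entries):
    """Keep only the entries whose event code is machine-check related."""
    return [entry for entry in entries if entry.code in MCE_CODES]


# Sample text copied from the second log example above.
sample = (
    "2025-05-21 03:42:32 320 CPU9000 An OEM diagnostic event occurred. "
    "2025-05-21 03:42:30 319 CPU0704 CPU 1 machine check error detected. "
    "2025-05-21 03:42:29 318 PST0090 A problem was detected related to the "
    "previous server boot. "
    "2025-05-21 03:42:29 317 UEFI0078 One or more Machine Check errors "
    "occurred in the previous boot."
)
mce_codes_found = [e.code for e in machine_check_entries(parse_log(sample))]
# mce_codes_found == ["CPU0704", "UEFI0078"]
```

The set of codes is taken from the event code table. Extend it if other codes should count as machine check entries in your environment.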
Cause
Understanding Causes of CPU Machine Check Errors
CPU Machine Check Errors (MCEs) have multiple possible causes, ranging from hardware to software triggers. These errors can be attributed to various factors, including:
- BIOS Firmware or CPU Microcode
- Motherboard CPLD Firmware
- Memory Errors
- PCIe Fatal Bus Errors
- OS Crash or Software and Driver Faults (BSOD, PSOD, or Kernel Panics)
- CPU Faults
The hardware logs can be used to help identify possible causes by checking if other component errors accompany the CPU Machine Check Errors.
Example CPU MCEs triggered from a Memory Error:

Example CPU MCE triggered from a Fatal Bus Error:
Example CPU MCE triggered from an OS crash:
Resolution
General guidance
It is always helpful to ask these questions:
- Have there been recent changes to the system, like updates or changes to hardware or configuration?
- Are there other errors in the logs nearby that may be more informative than the machine check itself?
- How frequently does the machine check happen? Was it a one-off? Can it be readily reproduced?
- Are there environmental factors involved, such as specific workloads or power and thermal scenarios?
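To put a number on the frequency question, one option is to count machine check entries per calendar day. The sketch below is illustrative only. It assumes events are already available as `(timestamp, event code)` pairs with timestamps formatted as in the log examples, and `machine_checks_per_day` is a hypothetical helper name. It counts only the CPU event codes because, in the examples, a UEFI0078 entry is logged alongside the CPU entry for the same event. Treating them as one event is an assumption.

```python
from collections import Counter
from datetime import datetime

# CPU-level machine check codes from the event code table.
# UEFI0076/UEFI0078 are left out to avoid counting one event twice (assumption).
CPU_MCE_CODES = frozenset({"CPU0011", "CPU0012", "CPU0704"})


def machine_checks_per_day(events, codes=CPU_MCE_CODES):
    """Return {ISO date: count} for events whose code is in `codes`.

    `events` is an iterable of (timestamp_string, event_code) pairs, with
    timestamps formatted as "YYYY-MM-DD HH:MM:SS".
    """
    counts = Counter()
    for timestamp, code in events:
        if code in codes:
            day = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").date()
            counts[day.isoformat()] += 1
    return dict(counts)


# Timestamps reused from the log examples purely for illustration.
events = [
    ("2022-10-22 22:12:31", "CPU0704"),
    ("2022-10-22 22:12:31", "UEFI0078"),
    ("2025-05-21 03:42:30", "CPU0704"),
]
per_day = machine_checks_per_day(events)
# per_day == {"2022-10-22": 1, "2025-05-21": 1}
```

A single count on one day suggests a one-off. Repeated counts across days, or many on the same day, suggest a recurring or reproducible fault.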
Firmware and drivers
Outdated or incompatible firmware and drivers are among the most common machine check culprits, as they work together to implement and control device behavior. Reviewing the installed versions is therefore an essential part of any machine check investigation.
Among firmware, BIOS updates are critical:
- Most BIOS releases incorporate updates provided by the respective processor vendor, many of which include explicit fixes for machine checks.
- These UEFI updates for servers include microcode, reference code, and other module updates that control system functionality, including all reliability, availability, and serviceability (RAS) features.
- At the same time, do not overlook other firmware in the system.
- Virtually any device in the system may be the culprit, including, on rare occasions, the iDRAC.
Identifying and Resolving CPU Machine Check Errors
To identify CPU Machine Check Errors, start by checking the hardware logs, either the Lifecycle (LC) log or the System Event Log (SEL), directly from the iDRAC, or gather a TSR or SupportAssist Collection to review the logs.
- PowerEdge: Export a SupportAssist Collection Using an iDRAC
- PowerEdge: How to View or Clear the System event log
- iDRAC9 User's Guide - Viewing Lifecycle Log from the Web Interface
Check whether the CPU MCE entries are preceded by any other errors. If they are, focus troubleshooting on those components.
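The check for preceding errors can be sketched in Python as a simple time-window search. This is a hedged illustration, not a supported tool. It assumes entries are available as `(timestamp, event code, message)` tuples in the format of the log examples. The function name `events_before_machine_checks`, the five-minute default window, and the decision to ignore CPU9000 and UEFI0076/UEFI0078 as companion entries are all assumptions made for this example.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# CPU-level machine check codes from the event code table.
MCE_CODES = {"CPU0011", "CPU0012", "CPU0704"}
# Entries that accompany a machine check in the log examples and are
# therefore not treated as independent clues (assumption).
COMPANION_CODES = {"CPU9000", "UEFI0076", "UEFI0078"}


def events_before_machine_checks(entries, window=timedelta(minutes=5)):
    """For each machine check, list other events logged at or shortly before it.

    `entries` is a list of (timestamp_string, event_code, message) tuples.
    Returns a list of ((mce_code, mce_message), [(code, message), ...]) pairs.
    """
    parsed = [
        (datetime.strptime(ts, TIME_FORMAT), code, msg)
        for ts, code, msg in entries
    ]
    ignored = MCE_CODES | COMPANION_CODES
    results = []
    for mce_time, mce_code, mce_msg in parsed:
        if mce_code not in MCE_CODES:
            continue
        nearby = [
            (code, msg)
            for when, code, msg in parsed
            if code not in ignored
            and timedelta(0) <= mce_time - when <= window
        ]
        results.append(((mce_code, mce_msg), nearby))
    return results


# Entries copied from the second log example in this article.
entries = [
    ("2025-05-21 03:42:32", "CPU9000", "An OEM diagnostic event occurred."),
    ("2025-05-21 03:42:30", "CPU0704", "CPU 1 machine check error detected."),
    ("2025-05-21 03:42:29", "PST0090",
     "A problem was detected related to the previous server boot."),
    ("2025-05-21 03:42:29", "UEFI0078",
     "One or more Machine Check errors occurred in the previous boot."),
]
findings = events_before_machine_checks(entries)
# findings[0][1] == [("PST0090",
#     "A problem was detected related to the previous server boot.")]
```

An empty list for a machine check means no other event was logged in the window, which points the investigation back toward firmware, the OS, or the CPU itself. Because machine checks from a previous boot are often logged at the next boot, a wider window may be needed to reach the events that preceded the original failure.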
Troubleshooting Steps
- Update all available firmware and monitor the results for any changes in error behavior.
- If only one CPU is showing errors, swap the CPUs to determine if the error follows the CPU to the other socket.
- If the MCE is triggered by another component's error, focus the troubleshooting on that component.
- Check what components are controlled by the CPU with the MCE.
- For example, if it is a CPU1 MCE, check all risers and PCIe slots that are controlled by CPU1 and any devices installed in those slots. Also check the memory on the CPU1 side: check all A-DIMMs for errors.
- To verify which CPU controls each riser or slot, see the server's Installation and Service Manual and look under Installing and removing system components > Expansion cards and expansion card risers > Expansion card installation guidelines.
- For more information about identifying which CPU controls the risers or slots, see: PowerEdge: Troubleshooting PCIe device detection issues
- To rule out OS-related MCE triggers, test outside of the installed OS to see whether the errors still occur.
- Run ePSA diagnostics to see if any errors are triggered during the tests.
- Boot the Support Live Image (SLI) media to test if errors are generated in that OS environment.
Run Stress Tests In Support Live Image