Summary: This article provides recommended steps for troubleshooting memory-related events on Dell PowerEdge servers.
Your server may report memory events such as (but not limited to):
MEM0802
MEM6102
MEM6101
MEM5100
MEM5104
UEFI0103 - Memory initialization error on slot:
MEM6101 - Diagnostic warning in memory device at Check device and system configuration. (Extended ID: )
MEM0001 - Uncorrectable event consumed; may cause server reboot if OS cannot recover.
MEM9072 - Patrol scrub found uncorrectable error (not consumed); no impact unless OS uses the memory.
MEM6104 - Uncorrectable error; extended bytes show if the address was consumed or identified by patrol scrub.
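When reviewing exported log text, it can help to count how often each of the event IDs listed above appears. The following is a minimal sketch, not a Dell tool: the function name `count_memory_events`, the set `ARTICLE_EVENT_IDS`, and the sample log lines are hypothetical, and the real log layout on your server may differ.

```python
import re
from collections import Counter

# Event IDs copied from the list in this article; extend as needed.
ARTICLE_EVENT_IDS = {
    "MEM0001", "MEM0802", "MEM5100", "MEM5104",
    "MEM6101", "MEM6102", "MEM6104", "MEM9072", "UEFI0103",
}

# Matches an ID such as MEM6101 or UEFI0103 as a whole word.
EVENT_ID_PATTERN = re.compile(r"\b(?:MEM|UEFI)\d{4}\b")


def count_memory_events(lines):
    """Count how often each article-listed event ID appears in the given lines."""
    counts = Counter()
    for line in lines:
        for event_id in EVENT_ID_PATTERN.findall(line):
            if event_id in ARTICLE_EVENT_IDS:
                counts[event_id] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only (assumed format).
    sample = [
        "Warning MEM6101 Diagnostic warning in memory device",
        "Critical MEM0001 Uncorrectable event consumed",
        "Warning MEM6101 Diagnostic warning in memory device",
    ]
    for event_id, n in sorted(count_memory_events(sample).items()):
        print(event_id, n)
```

IDs that are not in the article's list are ignored, so unrelated events do not affect the counts.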
Initial Troubleshooting Steps
Most of the events above can be resolved, or diagnosed more accurately, by updating the firmware of specific components. Firmware updates contain fixes for known issues as well as enhancements, which makes them a critical first step toward resolution.
Update the firmware of the following components:
CPLD
iDRAC
BIOS
Note: If CPLD firmware is not available for the server model, skip it and proceed with the remaining updates.
After the initial steps are complete, the issue might be resolved. Otherwise, further troubleshooting might be required, guided by the information in the TSR logs, to identify the defective component.
Upon reviewing the TSR logs, the following error messages might be identified:
Single-bit memory events (degraded memory) found in the logs:
Turn the system off, disconnect the power, and press and hold the power button for 10 seconds to drain all flea power.
Check whether the machine is in a supported memory configuration. If it is not, remove additional DIMMs until a supported configuration is reached.
Close the system and reconnect the power.
Turn the system on.
Collect a new TSR and check for memory events again
Depending on the outcome of the advanced troubleshooting steps, a part must be replaced: the memory DIMM if the memory event moved to a different slot, or the motherboard if the memory event remains on the same slot.
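The replacement rule above can be written as a small decision function. This is only an illustrative sketch: the function name `recommend_replacement` and the slot labels are hypothetical, and the "no event recurred" branch is an assumption, because the article does not describe that outcome.

```python
def recommend_replacement(slot_before, slot_after):
    """Suggest which part to replace from where the memory event was logged.

    slot_before: slot reported in the original TSR (for example "A1").
    slot_after:  slot reported in the new TSR, or None if no memory
                 event was logged again (assumption, not from the article).
    """
    if slot_after is None:
        # Assumption: nothing recurred, so this check indicates no part.
        return "none indicated"
    if slot_after != slot_before:
        # The event changed slot: replace the memory DIMM.
        return "DIMM"
    # The event stayed on the same slot: replace the motherboard.
    return "motherboard"


if __name__ == "__main__":
    # Hypothetical slot labels, for illustration only.
    print(recommend_replacement("A1", "B1"))
    print(recommend_replacement("A1", "A1"))
```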