Data Domain: Memory Card or DIMM With Failed or Faulty Error

Summary: This document helps identify the memory error or fault and provides a resolution path.


Symptoms

Data Domain (DD) systems monitor the status of system memory hardware (DIMMs). If any DIMM-related errors are encountered, an appropriate Alert notification is posted.

Applies to:
  • All Data Domain systems
  • All software versions of Data Domain Operating System (DDOS)
Possible Alert Notifications posted by DDOS:
DIMM-00001: Correctable ECC logging limit reached
DIMM-00002: Multibit Uncorrectable ECC error
DIMM-00003: A memory card has failed
ENVIRONMENT-00009: Memory correctable ECC errors exceed warning threshold
ENVIRONMENT-00013: Memory uncorrectable ECC error alert
ENVIRONMENT-00044: Memory riser fault has been detected
MEM-00001: DIMM failure detected after install. DDFS will not be started.
MEM-00002: Memory size(nnnnnnnnKB) goes below the configured size(nnnnnnnnKB)
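When reviewing saved alert output or log excerpts, it can help to pick out the memory-related alert IDs listed above automatically. The following is a minimal Python sketch, not a Dell tool: the function names are hypothetical, and the regular expression for MEM-00002 assumes the message appears exactly in the form shown above, with the current size first and the configured size second.

```python
import re

# Alert IDs and messages as listed in this article.
MEMORY_ALERTS = {
    "DIMM-00001": "Correctable ECC logging limit reached",
    "DIMM-00002": "Multibit Uncorrectable ECC error",
    "DIMM-00003": "A memory card has failed",
    "ENVIRONMENT-00009": "Memory correctable ECC errors exceed warning threshold",
    "ENVIRONMENT-00013": "Memory uncorrectable ECC error alert",
    "ENVIRONMENT-00044": "Memory riser fault has been detected",
    "MEM-00001": "DIMM failure detected after install. DDFS will not be started.",
    "MEM-00002": "Memory size goes below the configured size",
}

# Matches any ID with one of the three prefixes used above.
ALERT_ID = re.compile(r"\b(?:DIMM|ENVIRONMENT|MEM)-\d{5}\b")

# Assumption: MEM-00002 reports the current size first, then the configured size.
MEM_SIZES = re.compile(
    r"Memory size\((\d+)KB\) goes below the configured size\((\d+)KB\)"
)


def find_memory_alerts(text):
    """Return known memory alert IDs found in text, in order, without duplicates."""
    found = []
    for alert_id in ALERT_ID.findall(text):
        if alert_id in MEMORY_ALERTS and alert_id not in found:
            found.append(alert_id)
    return found


def missing_memory_kb(text):
    """Return configured minus current size (KB) from a MEM-00002 message, or None."""
    match = MEM_SIZES.search(text)
    if match is None:
        return None
    current_kb, configured_kb = (int(group) for group in match.groups())
    return configured_kb - current_kb
```

For MEM-00002, the difference between the two sizes gives a rough indication of how much memory is no longer visible to the system, which can then be compared against the capacity of a single DIMM.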

Cause

The DIMMs installed in Data Domain systems use Error Correcting Code (ECC), which allows correctable memory errors to be fixed on the fly. If an error threshold is exceeded, DDOS identifies the fault and generates an appropriate alert on the system.

Uncorrectable memory errors are considered hard memory faults and may cause a system reboot. Total failure of any single DIMM or memory riser may result in a System Down event and prevent the file system from being enabled, because the Data Domain File System (DDFS) process uses most of the physical memory.
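To illustrate the difference between a correctable and an uncorrectable ECC error, here is a toy Python sketch. It is not the code used by the DIMMs or the memory controller in these systems. It uses a small Hamming(7,4) code with one extra overall parity bit (a single-error-correct, double-error-detect scheme) chosen purely for illustration. A single flipped bit is located and repaired, while two flipped bits are detected but cannot be repaired, which corresponds to the "multibit uncorrectable" case.

```python
def encode(nibble):
    """Encode 4 data bits (0..15) into an 8-bit codeword, returned as a list of bits."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0 = overall parity, indexes 1..7 = Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]  # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # parity over positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2            # makes the total number of 1 bits even
    return code


def decode(code):
    """Return (nibble, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    code = list(code)
    # The syndrome is the XOR of the positions of all set bits; 0 for a valid word.
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position
    parity_ok = sum(code) % 2 == 0

    if syndrome == 0 and parity_ok:
        status = "ok"
    elif not parity_ok:
        # Odd number of flipped bits: treated as a single-bit error.
        # The syndrome names the flipped position (0 = the overall parity bit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return None, "uncorrectable"

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status
```

In this sketch, flipping one bit of `encode(0b1011)` still decodes to `0b1011` with status `corrected`, while flipping two bits yields status `uncorrectable` and no data.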

NOTE: Other symptoms or alerts, for example a CPU Machine Check Error, may mask memory errors. Deeper log analysis and troubleshooting may be required.

Resolution

NOTE: If a DIMM error is reported on a Dell PowerEdge based system, the first recovery action is to reboot the Data Domain unit. The reboot initiates Post Package Repair (PPR) to recover the DIMM.

Determine the cause of the alert, identify the affected component (DIMM, CPU, or motherboard), and replace parts as needed.

If possible, gather a Support Bundle and create a Service Request with your contracted Service Provider. The following video shows how to gather a Support Bundle: Gather a Support Bundle

Resolution Guidelines:

  • For Dell PowerEdge based systems, initiate a system reboot so that automatic Post Package Repair (PPR) can recover the DIMM.
    • Improvements in BIOS firmware allow PPR to recover from both correctable and uncorrectable DIMM errors (Reference)
  • Compare current system state with an Auto-Support from BEFORE the DIMM failure or alert
  • Useful DD-CLI (SSH) commands for checking memory:
# alerts show current
# system show meminfo
# enclosure show memory
# log view debug/messages.engineering  ('q' to quit)
  • Use DDOS Offline Diagnostics to test for and determine the fault. The Dell EMC Data Domain Operating System 6.x Offline Diagnostics Suite User Guide is available from Dell Support.
  • If possible, perform physical troubleshooting to isolate the faulty component by elimination (using documented replacement guides and procedures).
  • Reseat the DIMM - ensure that both sides are latched properly.
  • Swap it with a known good DIMM from another slot, channel, bank, or controller.
  • If a system is down (no boot) due to a suspected memory or DIMM fault, try a minimal boot configuration (remove peripheral devices or cards, and leave one DIMM in slot '0').
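The non-interactive DD-CLI commands from the guidelines above can also be collected in one pass for later comparison. The following Python sketch assumes the OpenSSH client is installed locally and that the system accepts a command passed on the ssh command line. The host name and user in the usage note are placeholders, and the function names are hypothetical. The `log view` command is left out because it is interactive (quit with 'q').

```python
import subprocess

# Commands as listed in this article; 'log view' is omitted because it is interactive.
MEMORY_CHECK_COMMANDS = [
    "alerts show current",
    "system show meminfo",
    "enclosure show memory",
]


def build_ssh_argv(host, user, command):
    """Build the argument list for running one DD-CLI command through ssh."""
    return ["ssh", f"{user}@{host}", command]


def collect_memory_status(host, user, run=subprocess.run):
    """Run each memory check command and return {command: captured output}."""
    results = {}
    for command in MEMORY_CHECK_COMMANDS:
        completed = run(
            build_ssh_argv(host, user, command),
            capture_output=True,  # keep stdout/stderr instead of printing them
            text=True,            # decode output as text
            timeout=60,           # assumption: one minute per command is enough
        )
        results[command] = completed.stdout
    return results
```

For example, `collect_memory_status("dd01.example.com", "sysadmin")` would return a dictionary keyed by command. Saving that output alongside the date makes it easier to compare the current state with the state before the DIMM alert.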

Affected Products

Data Domain, Integrated Data Protection Appliance Family

Products

PowerProtect Data Protection Appliance, Data Domain, Data Domain Deduplication Storage Systems, PowerProtect Data Protection Hardware
Article Properties
Article Number: 000204330
Article Type: Solution
Last Modified: 03 Mar 2025
Version:  11