Data Domain General Health Check

Summary: This document lists the actions that Tech Support performs during a general health check on a Data Domain (DD) system. It includes general commands and example outputs to help identify alerts or misconfigurations.


Instructions

Applies to:

  • All Data Domain Operating System (DDOS) versions
  • All current models 

 

Note: DDOS >=8.3 includes the HealthCheck Tool - "support healthcheck hardware" (See Step 12).

12-Step Health Check:


Step 1 - Connect to the DD system over SSH (for example, with PuTTY) as an administrative user.
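The read-only commands from the steps in this article can also be collected in one pass instead of being typed interactively. The following is a minimal sketch, assuming an OpenSSH client on the workstation, working key-based or agent authentication, and that the DD system accepts a single command as an SSH argument. The names `CHECK_COMMANDS`, `ssh_argv`, and `collect`, and the host name in the usage note, are hypothetical and are not part of DDOS.

```python
import subprocess

# Read-only commands taken from the steps in this article.
CHECK_COMMANDS = [
    "filesys status",
    "system show version",
    "alerts show current",
    "df",
    "disk show state",
    "disk show reliability-data",
    "enclosure show misconfiguration",
]


def ssh_argv(host, user, command):
    """Build the argument list for one non-interactive SSH command."""
    return ["ssh", f"{user}@{host}", command]


def collect(host, user, run=subprocess.run):
    """Run each check command and return {command: captured stdout}.

    `run` defaults to subprocess.run; it can be swapped for a stub in tests.
    """
    results = {}
    for command in CHECK_COMMANDS:
        done = run(ssh_argv(host, user, command),
                   capture_output=True, text=True, timeout=120)
        results[command] = done.stdout
    return results
```

A call such as `collect("dd01.example.com", "sysadmin")` would return the raw text of each command, which can then be reviewed by hand or saved with the service request.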

Step 2 - Ensure that the Filesystem is enabled.

# system show serialno
# date
# filesys status
The filesystem is enabled and running.


Step 3 - Ensure that the DDOS version is supported for the DD model.

# system show model
# system show version


Article 81247: DDOS Software Versions

Step 4 - Any alerts that impact the health of the system must be addressed.

# alerts show current

Article: 14723: Data Domain - How to Check Alerts on a Data Domain System.

Step 5 - Ensure that /data usage is below 90%.
To maintain expected performance levels, Data Domain recommends always keeping 'Use%' below 90%.

# df


Example Output:

Active Tier:
Resource           Size GiB    Used GiB   Avail GiB   Use%   Cleanable GiB*
----------------   --------   ---------   ---------   ----   --------------
/data: pre-comp           -   7259347.5           -      -                -
/data: post-comp   304690.8    251252.4     53438.5    82%           51616.1 
/ddvar                 29.5        12.5        15.6    44%                -
----------------   --------   ---------   ---------   ----   --------------

Article 54303: Data Domain: How to resolve capacity issues.
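The 90% rule can be checked mechanically against saved `df` output. The following is a minimal sketch, assuming the column layout shown in the example output above (columns separated by two or more spaces, with Use% in the fifth column). The function name `capacity_warnings` is hypothetical.

```python
import re


def capacity_warnings(df_output, threshold=90):
    """Return (resource, use_percent) pairs whose Use% is at or above threshold."""
    warnings = []
    for line in df_output.splitlines():
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 5 or not fields[0].startswith("/"):
            continue  # header, separator, or tier title
        use = fields[4]
        if use.endswith("%") and use[:-1].isdigit():
            percent = int(use[:-1])
            if percent >= threshold:
                warnings.append((fields[0], percent))
    return warnings


SAMPLE_DF = """\
Active Tier:
Resource           Size GiB    Used GiB   Avail GiB   Use%   Cleanable GiB*
----------------   --------   ---------   ---------   ----   --------------
/data: pre-comp           -   7259347.5           -      -                -
/data: post-comp   304690.8    251252.4     53438.5    82%           51616.1
/ddvar                 29.5        12.5        15.6    44%                -
"""

print(capacity_warnings(SAMPLE_DF))  # [] because no resource has reached 90%
```

With the example output, nothing is reported because `/data: post-comp` is at 82%. Lowering `threshold` gives earlier warning if that is preferred.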

 



Step 6a - Ensure that there are no Failed (F), Reconstructing (R), or Absent (A) disks.

# disk show state


Example Output:

sysadmin## disk show state
Enclosure   Disk
            1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
---------   ------------------------------------------------
1           .  .  .  .  s  .  .  .  .  .  .  .
2           .  .  .  .  .  .  .  .  .  A  .  .  .  .  S  R
3           E  .  .  .  .  .  .  .  .  C  .  .  .  .  .  .
---------   ------------------------------------------------
Legend   State                          Count
------   ----------------------------   -----
.        In Use Disks                   25
s        Spare Disks                    1
R        Spare (reconstructing) Disks   1
C        Copy Recovery Disks            1
A        Absent Disks                   1
E        Exceeded Error Threshold
------   ----------------------------   -----

Article: 21916: Data Domain - Disk State Description
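The state grid from `disk show state` can be scanned for the three states that Step 6a names. The following is a minimal sketch, assuming the grid layout shown in the example output, where each row starts with the enclosure number and lists one state character per disk slot with no gaps. The names `PROBLEM_STATES` and `problem_disks` are hypothetical.

```python
# States that Step 6a says must not be present.
PROBLEM_STATES = {"F": "Failed", "R": "Reconstructing", "A": "Absent"}


def problem_disks(state_output):
    """Return (enclosure.disk, state name) pairs for F, R, and A disks."""
    found = []
    for line in state_output.splitlines():
        tokens = line.split()
        # Grid rows start with an enclosure number.
        if len(tokens) < 2 or not tokens[0].isdigit():
            continue
        states = tokens[1:]
        # Skip the column-number header, whose entries are digits.
        if any(len(s) != 1 or s.isdigit() for s in states):
            continue
        for slot, state in enumerate(states, start=1):
            if state in PROBLEM_STATES:
                found.append((f"{tokens[0]}.{slot}", PROBLEM_STATES[state]))
    return found


SAMPLE_STATE = """\
Enclosure   Disk
            1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
---------   ------------------------------------------------
1           .  .  .  .  s  .  .  .  .  .  .  .
2           .  .  .  .  .  .  .  .  .  A  .  .  .  .  S  R
3           E  .  .  .  .  .  .  .  .  C  .  .  .  .  .  .
"""

print(problem_disks(SAMPLE_STATE))
# [('2.10', 'Absent'), ('2.16', 'Reconstructing')]
```

Other states in the example, such as E and C, are not flagged here because Step 6a does not list them; add them to `PROBLEM_STATES` if they should also be reported.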

Step 6b - Check the disk reliability output to see whether proactive disk replacement is needed.
Ensure that no disk has a "Reallocated Sectors" count that is above 1000 or increasing daily.

# disk show reliability-data

Example Output:

Disk Show Reliability-Data
--------------------------
Disk         ATA Bus   Reallocated   Temperature
 (enc/disk)   CRC Err   Sectors
----------   -------   -----------   -----------
1.1          0         0             29 C   84 F
1.2          0         0             29 C   84 F
1.3          0         0             29 C   84 F
1.4          0         0             27 C   81 F
2.1          0         0             26 C   79 F
2.2          0         0             25 C   77 F
2.3          0         0             24 C   75 F
2.4          0         0             24 C   75 F
2.5         89         0             25 C   77 F
2.6          0         0             25 C   77 F
2.7          0         3156          24 C   75 F
2.8          0         0             23 C   73 F
2.9          0         0             24 C   75 F
2.10         0         0             24 C   75 F
2.11         0         0             23 C   73 F
2.12         0         0             23 C   73 F
2.13         0         0             25 C   77 F
2.14         0         0             24 C   75 F
2.15         0         0             22 C   72 F
2.16         0         0             22 C   72 F
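Both conditions in Step 6b (a count above 1000, or a count that grows from day to day) can be checked by comparing two saved copies of the output. The following is a minimal sketch, assuming the row layout shown in the example output (disk, ATA Bus CRC Err, Reallocated Sectors, temperature). The names `reallocated_sectors` and `replacement_candidates` are hypothetical.

```python
import re

# disk, CRC errors, reallocated sectors, then the temperature columns
ROW = re.compile(r"^\s*(\d+\.\d+)\s+(\d+)\s+(\d+)\s")


def reallocated_sectors(output):
    """Map each disk (enc.disk) to its Reallocated Sectors count."""
    counts = {}
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            counts[match.group(1)] = int(match.group(3))
    return counts


def replacement_candidates(today, yesterday=None, limit=1000):
    """Return disks whose count exceeds limit or grew since yesterday."""
    yesterday = yesterday or {}
    flagged = []
    for disk, count in today.items():
        grew = disk in yesterday and count > yesterday[disk]
        if count > limit or grew:
            flagged.append(disk)
    return flagged


SAMPLE_RELIABILITY = """\
2.5         89         0             25 C   77 F
2.6          0         0             25 C   77 F
2.7          0         3156          24 C   75 F
"""

today = reallocated_sectors(SAMPLE_RELIABILITY)
print(replacement_candidates(today))  # ['2.7']
```

In the example output, disk 2.7 is flagged because its count of 3156 is above 1000. Disk 2.5 shows 89 CRC errors, which this sketch does not evaluate because Step 6b only names Reallocated Sectors.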
 

Step 7 - Test communications for 5 minutes on each port that has cables connected. If the test reports an error, reseating the cables or LCCs is recommended.

# enclosure show topology
# enclosure test topology port 5 minutes


Article: 35680: Data Domain: SAS Cable Configuration, Topology checks and Testing

Step 8 - System misconfiguration: If the output indicates one or more component errors, they must be addressed.

# enclosure show misconfiguration


Example Output:

Enclosure Show Misconfiguration
-------------------------------
Memory Risers:
    No misconfiguration found.
Memory DIMMs:
    No misconfiguration found.
IO Cards:
    No misconfiguration found.
CPUs:
    No misconfiguration found.
Disks:
    No misconfiguration found.
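On systems with many components, the misconfiguration report can be reduced to only the sections that need attention. The following is a minimal sketch, assuming the layout shown in the example output: an unindented component heading ending in a colon, followed by indented detail lines. How DDOS words an actual misconfiguration is not shown in this article, so the sketch simply treats any detail line other than "No misconfiguration found." as a finding. The function name `misconfigured_components` is hypothetical.

```python
CLEAN = "No misconfiguration found."


def misconfigured_components(output):
    """Return {component: [detail lines]} for sections that are not clean."""
    problems = {}
    component = None
    for line in output.splitlines():
        text = line.strip()
        if not text or set(text) == {"-"}:
            continue  # blank line or separator
        if not line[0].isspace() and text.endswith(":"):
            component = text[:-1]  # unindented heading such as "Disks:"
        elif component is not None and line[0].isspace() and text != CLEAN:
            problems.setdefault(component, []).append(text)
    return problems


SAMPLE_MISCONFIG = """\
Enclosure Show Misconfiguration
-------------------------------
Memory Risers:
    No misconfiguration found.
Memory DIMMs:
    No misconfiguration found.
IO Cards:
    No misconfiguration found.
CPUs:
    No misconfiguration found.
Disks:
    No misconfiguration found.
"""

print(misconfigured_components(SAMPLE_MISCONFIG))  # {} because every section is clean
```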

 

 

Step 9 - If replication is configured, check for any errors. If there is an error, it must be addressed.

# replication status


Article: 43349:  Data Domain - Replication Status

Step 10 - If a VTL library is in use, check the VTL status.

# vtl status


Article: 12128: Troubleshooting Data Domain VTL Target Visibility

Step 11 - If the system is a High Availability (HA) system, check the HA status.

# ha status


Example Output:

SE@apollo-440-n1-p0(active:0)## ha status
HA System name: apollo-440-n1.chaos.local
HA System status: highly available
 
Node Name                         Node id   Role      HA State
-------------------------------   -------   -------   --------
apollo-440-n1-p0.chaos.local      0         active    online
apollo-440-n1-p1.chaos.local      1         standby   online
-------------------------------   -------   -------   --------

# ha status detailed

Example Output:

SE@apollo-440-n1-p0(active:0)## ha status detailed
HA System name: apollo-440-n1.chaos.local
HA System Status: highly available
Interconnect Status: ok
Primary Heartbeat Status:  ok
External LAN Heartbeat Status: not ok
Hardware compatibility check: ok
Software Version Check:   ok
 
Node apollo-440-n1-p0.chaos.local:
        Role:      active
        HA State:  online
        Node Health: ok
 
Node apollo-440-n1-p1.chaos.local:
        Role:     standby
        HA State: online
        Node Health: ok
 
Mirroring Status:
Component Name   Status
--------------   ------
nvram            ok
registry         ok
sms              ok
ddboost          ok
cifs             ok
--------------   ------

Article 17861: Healthcheck for Data Domain HA (DDHA) appliances 
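In the detailed HA output, a single "not ok" line is easy to miss among the "ok" lines (the example above contains one, for External LAN Heartbeat Status). The following is a minimal sketch that lists every item reported as "not ok" and also reports an HA System status other than "highly available". It assumes the layout shown in the example output; the function name `ha_issues` is hypothetical.

```python
import re

NOT_OK = re.compile(r"^\s*(.+?):?\s+not ok\s*$", re.IGNORECASE)
SYSTEM_STATUS = re.compile(r"^\s*HA System status:\s*(.+?)\s*$", re.IGNORECASE)


def ha_issues(output):
    """Return the names of checks or components that are not healthy."""
    issues = []
    for line in output.splitlines():
        status = SYSTEM_STATUS.match(line)
        if status and status.group(1).lower() != "highly available":
            issues.append("HA System status: " + status.group(1))
            continue
        flagged = NOT_OK.match(line)
        if flagged:
            issues.append(flagged.group(1))
    return issues


SAMPLE_HA = """\
HA System name: apollo-440-n1.chaos.local
HA System Status: highly available
Interconnect Status: ok
Primary Heartbeat Status:  ok
External LAN Heartbeat Status: not ok
Hardware compatibility check: ok
Software Version Check:   ok
nvram            ok
registry         ok
"""

print(ha_issues(SAMPLE_HA))  # ['External LAN Heartbeat Status']
```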

 

Step 12 - Run the hardware health check (DDOS >= 8.3.x).

# support hardware healthcheck


HARDWARE Health Check Summary:
+-------------------+--------+
| Component         | Status |
+-------------------+--------+
| Storage Disk      | PASS   |
| Power-Supply Unit | PASS   |
| FAN               | PASS   |
| SAS Controller    | PASS   |
| QAT               | PASS   |
| NvRAM             | PASS   |
| DIMMs             | PASS   |
| IO Cards          | PASS   |
| CPU               | PASS   |
| NIC H/W Errors    | PASS   |
+-------------------+--------+
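The summary table can be filtered so that only components needing follow-up remain. The following is a minimal sketch, assuming the two-column table layout shown above. The article only shows the PASS status, so the sketch reports any component whose status is anything other than PASS rather than guessing at other status names. The function name `failed_components` is hypothetical.

```python
def failed_components(summary):
    """Return (component, status) pairs whose status is not PASS."""
    failed = []
    for line in summary.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # border lines start with "+", titles with text
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells == ["Component", "Status"]:
            continue
        if cells[1] != "PASS":
            failed.append((cells[0], cells[1]))
    return failed


SAMPLE_SUMMARY = """\
HARDWARE Health Check Summary:
+-------------------+--------+
| Component         | Status |
+-------------------+--------+
| Storage Disk      | PASS   |
| Power-Supply Unit | PASS   |
| FAN               | PASS   |
| NIC H/W Errors    | PASS   |
+-------------------+--------+
"""

print(failed_components(SAMPLE_SUMMARY))  # [] because every component reports PASS
```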


TSR logs:
Special consideration for Dell PowerEdge-based Data Domain systems (for example: DD6900, DD9400, DD9900, DD3300, and newer):
Connect to iDRAC and check system status and health; gather a TSR log if necessary.

Article 21925: Data Domain: How to Collect a TSR Log.

Final step for a recertification request - Reboot the system and, once it is back online, check for current alerts. Any alerts that impact the health of the system must be addressed.

If any further assistance is required, please open a Service Request with your Support Provider.
 


Affected Products

Data Domain

Products

Data Domain, Data Domain Deduplication Storage Systems, Data Domain Replicator, DD OS, DD6300 Appliance, DD6800 Appliance, DD6900 Appliance, DD7200 Appliance, DD9300 Appliance, DD9400 Appliance, DD9800 Appliance, DD9900 Appliance
Article Properties
Article Number: 000197930
Article Type: How To
Last Modified: 16 Sept 2025
Version:  7