ECS: How to interpret the different disk states and their meaning from ECS UI - Hardware Monitoring

Summary: You can use the Hardware Health tab in the ECS UI to obtain the health of disks and nodes. This knowledge article describes the different states: Good, Suspect, Bad, Missing, Removed, Not Accessible ...

This article is not tied to any specific product. Not all product versions are identified in this article.

Instructions

The Hardware Health tab is accessed from the ECS Portal at Monitor > System Health > Hardware Health. The following states describe node health:

  • Good: The node is in normal operating condition.
  • Suspect: The node is transitioning from Good to Bad because of declining hardware metrics, a lower-level hardware component has a problem, or the system cannot detect the hardware because of connectivity problems.
  • Bad: The node needs replacement.

Disk states have the following meanings:

  • Good: The system is reading from and writing to the disk.
  • Suspect: The system no longer writes to the disk but still reads from it. Swarms of suspect disks are likely caused by connectivity problems at a node. These disks transition back to Good when the connectivity problems clear up.
  • Bad: The system neither reads from nor writes to the disk. Replace the disk. Once the ECS system has identified a disk as bad, the disk cannot be reused anywhere in the ECS system. Because of ECS data protection, when a disk fails, copies of the data that was on the disk are re-created on other disks in the system. A bad disk represents only a loss of capacity to the system, not a loss of data. When the disk is replaced, no data is restored to the new disk. It becomes raw capacity for the system.
  • Missing: The disk is a known disk that is unreachable. The disk may be transitioning between states, disconnected, or pulled.
  • Removed: The system has completed recovery on the disk and removed it from the storage engine's list of valid disks. The ECS UI displays the history of all removed disks.
  • Not Accessible: If a node is not accessible, then all its disks have this status. It indicates that the actual status of the disk is not available to ECS.
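The read and write behavior described above can be summarized in a small lookup. The following is a minimal sketch for illustration only, not ECS code or an ECS API: the names `DiskState`, `IO_BEHAVIOR`, `describe_io`, and `needs_replacement` are hypothetical. It records I/O behavior only for the three states where this article states it explicitly (Good, Suspect, Bad) and leaves the others unspecified.

```python
from enum import Enum


class DiskState(Enum):
    """Disk states as listed in this article (hypothetical Python model)."""
    GOOD = "Good"
    SUSPECT = "Suspect"
    BAD = "Bad"
    MISSING = "Missing"
    REMOVED = "Removed"
    NOT_ACCESSIBLE = "Not Accessible"


# (reads, writes) for the states whose I/O behavior the article states explicitly.
IO_BEHAVIOR = {
    DiskState.GOOD: (True, True),      # reads from and writes to the disk
    DiskState.SUSPECT: (True, False),  # reads only; writes have stopped
    DiskState.BAD: (False, False),     # neither reads nor writes
}


def describe_io(state: DiskState) -> str:
    """Describe the I/O the system performs for a state, if the article says."""
    io = IO_BEHAVIOR.get(state)
    if io is None:
        return f"{state.value}: I/O behavior not stated"
    reads, writes = io
    return (
        f"{state.value}: reads={'yes' if reads else 'no'}, "
        f"writes={'yes' if writes else 'no'}"
    )


def needs_replacement(state: DiskState) -> bool:
    """Only the Bad state carries an explicit 'replace the disk' instruction."""
    return state is DiskState.BAD


if __name__ == "__main__":
    for state in DiskState:
        print(describe_io(state))
```

Keeping Missing, Removed, and Not Accessible out of the lookup is deliberate: for those states the article describes reachability or lifecycle, not read/write behavior, so the sketch does not guess.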

Procedure

  1. Select Monitor > System Health, and then select the Hardware Health tab.
    By default, the Offline Nodes subtab is displayed. This table may be empty if all nodes are online. Similarly, the Offline Disks subtab may be empty if all disks are online.
  2. Select the Offline Nodes and Offline Disks subtabs to view a summary.
  3. Select the All Nodes and Disks subtab to drill down to nodes and disks.
  4. Click the node name to drill down to its disk health page.
Note: The Slot Info value always matches the physical slot ID in ECS U-Series, C-Series, and D-Series Appliances. This makes Slot Info useful for quickly locating a disk during a disk replacement service. Some Certified Hardware installations with ECS Software may not report useful or reliable data for Slot Info.
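If the disk states seen in the UI are copied into a list by hand, the same node-by-node drill-down can be mimicked offline. The sketch below is illustrative only and does not call any ECS interface: the record layout (node name, Slot Info, state), the sample data, the function names, and the threshold of three suspect disks are all assumptions. It groups disks that are not in the Good state by node and flags nodes where several disks are Suspect at once, which this article says is likely a connectivity problem at the node.

```python
from collections import defaultdict

# Hypothetical records: (node name, Slot Info, disk state as shown in the UI).
SAMPLE = [
    ("node1", 0, "Good"),
    ("node1", 1, "Bad"),
    ("node2", 0, "Suspect"),
    ("node2", 1, "Suspect"),
    ("node2", 2, "Suspect"),
    ("node3", 0, "Good"),
]


def non_good_disks_by_node(records):
    """Group every disk that is not in the Good state under its node."""
    by_node = defaultdict(list)
    for node, slot, state in records:
        if state != "Good":
            by_node[node].append((slot, state))
    return dict(by_node)


def nodes_with_suspect_swarm(records, threshold=3):
    """Return nodes with at least `threshold` Suspect disks.

    The threshold is an arbitrary illustrative value, not an ECS setting.
    """
    counts = defaultdict(int)
    for node, _slot, state in records:
        if state == "Suspect":
            counts[node] += 1
    return sorted(node for node, count in counts.items() if count >= threshold)


if __name__ == "__main__":
    print(non_good_disks_by_node(SAMPLE))
    print(nodes_with_suspect_swarm(SAMPLE))
```

On appliances where Slot Info matches the physical slot ID, the slot values in the grouped output point directly at the disks to inspect.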

Additional Information

This knowledge article is an extract of the ECS online Help, which is accessible from the ECS UI Help page and links to the "ECS Monitoring Guide". To open the ECS UI Help page, click the question mark (?) in the upper right-hand corner.
The "ECS Monitoring Guide" is also available for review on the Dell support website.

Affected Products

ECS Appliance

Products

ECS Appliance, Elastic Cloud Storage

Article Properties

Article Number: 000021514
Article Type: How To
Last Modified: 09 Feb 2024
Version: 4