Avamar: Gen 5a Physical Disk Failure EventID 52807

Summary: A fault has occurred on a hard drive. Use the table below to determine the correct action.

Symptoms

A message like the one below may appear in the output of mccli event show, on the events page of the Avamar User Interface (AUI), or in /var/log/messages.
Note the message ID for the next step. It follows the MessageID: label (PDR1001 in the example below).

HARDWARE: Jun  2 11:37:02 hostname hardware_monitor[36324]: MessageID:PDR1001,  Created:2023-06-02T07:36:52-04:00,  Severity:Critical,  Message:Fault detected on drive 17 in disk drive bay 1.
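
When many events have to be reviewed, the message ID can be pulled out of each line programmatically. The following is a minimal sketch that assumes every line uses the same "MessageID:" and "Severity:" labels as the sample above; the function name parse_hardware_event is hypothetical and is not part of any Avamar tool.

```python
import re

# Assumption: the field labels match the sample hardware_monitor line exactly.
MESSAGE_ID_RE = re.compile(r"MessageID:\s*([A-Za-z]+\d+)")
SEVERITY_RE = re.compile(r"Severity:\s*(\w+)")


def parse_hardware_event(line):
    """Return the message ID and severity found in one log line.

    A field that is not present in the line is returned as None.
    """
    message_id = MESSAGE_ID_RE.search(line)
    severity = SEVERITY_RE.search(line)
    return {
        "message_id": message_id.group(1) if message_id else None,
        "severity": severity.group(1) if severity else None,
    }


# The sample line from this article, split only for readability.
sample = (
    "HARDWARE: Jun  2 11:37:02 hostname hardware_monitor[36324]: "
    "MessageID:PDR1001,  Created:2023-06-02T07:36:52-04:00,  "
    "Severity:Critical,  Message:Fault detected on drive 17 in disk drive bay 1."
)
event = parse_hardware_event(sample)
print(event["message_id"], event["severity"])
```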

Cause

Several causes are possible. Each is described under its MessageID in the Resolution section below.

Resolution

Locate the MessageID in the table below and follow the recommended action.
 
PDR1001

Detailed Description
The controller detected a failure on the disk and has taken the disk offline.
Recommended Action
Remove and re-seat the failed drive. If the problem persists, contact technical support.
 
PDR88
 
Detailed Description
When hard drives in the spun-down power state are configured, they should transition to the spun-up power state. If a drive is not functioning properly, this transition can fail.
Recommended Action
Replace the hard drive and try again. Contact technical support if the issue persists.
 
PDR1101
 
Detailed Description
The controller detected a failure on the disk and has taken the disk offline.
Recommended Action
Remove and reseat the failed disk. If the problem persists, contact technical support. See the product documentation to choose a convenient contact method.
 
PDR1016
 
Detailed Description
The controller detected that the drive was removed.
Recommended Action
Verify drive installation. Remove and reseat the failed drive. If the problem persists, contact technical support. See the product documentation to choose a convenient contact method.
 
PDR1116

Detailed Description
The controller detected a drive removal.
Recommended Action
If unintended, verify drive installation. Remove and reseat the indicated disk. If the problem persists, contact technical support. See the product documentation to choose a convenient contact method.
 
PDR12

Detailed Description
The hard drive has failed or is corrupt.
Recommended Action
Replace the failed or corrupt disk. Identify a disk that has failed by locating the disk that has a red "X" for its status. Restart the initialization.
 
PDR13

Detailed Description
A hard drive in the virtual disk has failed or is corrupted. In addition, you may have cancelled the rebuild.
Recommended Action
Replace the failed or corrupt disk, and then start the rebuild operation.
 
PDR20

Detailed Description
A disk has received a SMART alert (predictive failure) after a configuration change. The disk is likely to fail soon.
Recommended Action
Replace the disk that has received the SMART alert. If the hard drive is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing a hard drive that is included in a non-redundant virtual disk causes the virtual disk to fail and may cause data loss.
 
PDR3
 
Detailed Description
The RAID controller may not be able to read data from or write data to the hard drive indicated in the message. This may be due to a failure of the hard drive or because the hard drive was removed from the system.
Recommended Action
Remove and reinsert the hard drive identified in the message, and make sure that it is inserted properly. If the issue persists, replace the hard drive.
 
PDR47

Detailed Description
The controller encountered an unrecoverable medium error when attempting to read a block on the hard drive and marked that block as invalid. If the controller encountered the unrecoverable medium error on a source hard drive during a rebuild or reconfigure operation, it punctures the corresponding block on the target hard drive. The invalid block clears on a write operation.
Recommended Action
Back up the data from the disk. Start disk initialization and wait for it to complete, and then restore the data from a backup copy.
 
PDR57
 
Detailed Description
The bad block table is the table used for remapping bad disk blocks. This table fills as bad disk blocks are remapped. When the table is full, bad disk blocks are no longer remapped, which means that disk errors are no longer corrected. At this point, data loss can occur.
Recommended Action
Replace the disk generating this message and restore from a backup copy. You may have lost data.
 
PDR64
 
Detailed Description
The rebuild or recovery operation encountered an unrecoverable disk media error.
Recommended Action
Replace the disk.
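
For scripted triage, the table above can be condensed into a simple lookup. The sketch below is illustrative only: the action strings are shortened paraphrases of the Recommended Action entries, and the fallback text for an unlisted MessageID is an assumption rather than guidance from this article.

```python
# First recommended step per MessageID, condensed from the table in this article.
FIRST_ACTION = {
    "PDR1001": "Remove and reseat the failed drive",
    "PDR88": "Replace the hard drive and try again",
    "PDR1101": "Remove and reseat the failed disk",
    "PDR1016": "Verify drive installation, then remove and reseat the drive",
    "PDR1116": "If unintended, verify drive installation and reseat the disk",
    "PDR12": "Replace the failed or corrupt disk, then restart the initialization",
    "PDR13": "Replace the failed or corrupt disk, then start the rebuild operation",
    "PDR20": "Replace the disk that received the SMART alert",
    "PDR3": "Remove and reinsert the drive; replace it if the issue persists",
    "PDR47": "Back up the data, initialize the disk, then restore from backup",
    "PDR57": "Replace the disk and restore from a backup copy",
    "PDR64": "Replace the disk",
}

# Assumption: this fallback wording is not taken from the article.
UNKNOWN = "MessageID not listed in this article"


def recommended_action(message_id):
    """Look up the condensed first action for a MessageID, ignoring case."""
    return FIRST_ACTION.get(message_id.strip().upper(), UNKNOWN)


print(recommended_action("PDR1001"))
```

The full Recommended Action text in the table remains the reference; in particular, the PDR20 entry carries a data-loss warning for non-redundant virtual disks that a one-line summary does not capture.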
 

Additional Information

A Field Engineer can usually identify the drive to replace by looking for the drive with a blinking amber light in the second position. However, you may be asked to assist a Field Engineer with the replacement by manually blinking the alert light on the disk. The commands for blinking and unblinking are below.
root@avanode:~/#: omconfig storage pdisk controller=0 action=blink pdisk=0:1:0
Command successful!

root@avanode:~/#: omconfig storage pdisk controller=0 action=unblink pdisk=0:1:0
Command successful!
It is important to unblink the disk afterwards. There is no timeout value, so the light keeps blinking until the unblink command is run.
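
Because the blink has no timeout, one way to avoid leaving a light blinking is to pair the two commands so that unblink always runs. The following is a minimal sketch, not a supported tool: the helper names omconfig_cmd and blinking are hypothetical, and it assumes the omconfig syntax shown above is used unchanged. The example at the end is a dry run that only records the commands; passing no run argument would execute omconfig for real.

```python
import subprocess
from contextlib import contextmanager


def omconfig_cmd(action, pdisk, controller=0):
    """Build the omconfig argument list for a blink or unblink request."""
    if action not in ("blink", "unblink"):
        raise ValueError("action must be 'blink' or 'unblink'")
    return [
        "omconfig", "storage", "pdisk",
        f"controller={controller}",
        f"action={action}",
        f"pdisk={pdisk}",
    ]


@contextmanager
def blinking(pdisk, controller=0, run=subprocess.run):
    """Blink the disk for the duration of the with-block, then unblink it."""
    run(omconfig_cmd("blink", pdisk, controller), check=True)
    try:
        yield
    finally:
        # Runs even if the with-block raises, so the light is not left blinking.
        run(omconfig_cmd("unblink", pdisk, controller), check=True)


# Dry run: record the commands instead of executing them.
calls = []


def fake_run(cmd, check=True):
    calls.append(" ".join(cmd))


with blinking("0:1:0", run=fake_run):
    pass  # the Field Engineer would locate the drive here

print(calls)
```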

The following webpage contains a guide to all possible light blink combinations.

https://internal.software/blink/#/reference/R740xd/Hard%20Drive

Affected Products

Avamar Data Store Gen5A

Article Properties
Article Number: 000218950
Article Type: Solution
Last Modified: 18 Dec 2023
Version:  3