Unsolved

1 Rookie

51 Posts

November 24th, 2021 19:00

R410 RAM Single bit warning error rate exceeded

Hello,

I have an R410 which is displaying that 6 of the 8 memory modules have the following faults:

Single bit warning error rate exceeded
Single-bit failure error rate exceeded

The two positions that are OK are DIMM_A1 and DIMM_B1; the others all show this error.

I know the server is using these memory modules, because the VMs wouldn't be able to run without that amount of memory available, so I'm wondering whether these are latched errors.

My questions are...

1) What would cause the server to mark them as faulty?
2) As I'm confident the server is making use of the modules marked as faulty, is there any other way of trying to 'clear' the error without rebooting?
3) Could the errors be latched and could they potentially clear after a reboot?
4) If I have to reboot, should I perform any other tasks at the same time (reseating the modules, other checks)?

Any other tips for troubleshooting?

Many thanks in advance for any help.

 

Moderator

3K Posts

November 25th, 2021 01:00

Hello,
Thanks for reaching out to us. I would like to provide some general information and suggestions about SBE (single-bit error) warnings. They are logged when a memory device's correction rate exceeds an acceptable value, a memory spare bank is activated, or a multibit ECC error occurs. The system continues to function normally (except in the case of a multibit error).
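
To make "error rate exceeded" and the idea of a latched status more concrete, here is a minimal sketch in Python. It is only an illustration of the concept: the class name, the thresholds, and the time window are hypothetical and are not taken from Dell firmware or documentation. It counts corrected errors in a sliding window and keeps the worst status reached until it is explicitly reset.

```python
from collections import deque

# Severity order used to decide whether a new level is worse than the current one.
RANK = {"ok": 0, "warning": 1, "failure": 2}


class EccRateMonitor:
    """Hypothetical per-DIMM monitor: NOT Dell's actual algorithm."""

    def __init__(self, warn_threshold=3, fail_threshold=6, window_seconds=3600):
        # All three values are made-up defaults for illustration only.
        self.warn_threshold = warn_threshold
        self.fail_threshold = fail_threshold
        self.window_seconds = window_seconds
        self.events = deque()   # timestamps of corrected single-bit errors
        self.status = "ok"      # latched: only gets worse until reset()

    def record_error(self, timestamp):
        """Record one corrected error and return the (latched) status."""
        self.events.append(timestamp)
        # Drop errors that have fallen out of the sliding window.
        while self.events and timestamp - self.events[0] > self.window_seconds:
            self.events.popleft()

        count = len(self.events)
        if count >= self.fail_threshold:
            level = "failure"
        elif count >= self.warn_threshold:
            level = "warning"
        else:
            level = "ok"

        # Latching: the status never improves on its own.
        if RANK[level] > RANK[self.status]:
            self.status = level
        return self.status

    def reset(self):
        """Clear the counters and the latched status."""
        self.events.clear()
        self.status = "ok"
```

In this sketch a module can keep working normally while its status stays at "warning" or "failure", because the status reflects past error counts rather than the current state of the module.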

I would recommend clearing the SEL first and then updating the server's BIOS and iDRAC firmware, because some bugs are fixed by system firmware updates.

You can find out whether the problem lies with the slot or with the memory by swapping DIMM positions (for example, a cross test, or "X test", against a DIMM from a known good slot). In some cases, this warning goes away once the DIMMs are reseated.

If the errors do not improve after the firmware update and the problem is not caused by the slot, the usual next step is to replace the affected memory modules.
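
The reasoning behind the swap test can be written down as a small decision function. This is only a sketch of the logic, not a Dell tool: the function name and the result strings are made up for illustration, and DIMM_A2 in the usage example is just an example slot name.

```python
def interpret_swap_test(suspect_slot, good_slot, faulty_slot_after_swap):
    """Interpret the result of swapping two DIMMs.

    Before the swap, the error was reported on `suspect_slot`. The DIMM from
    that slot was exchanged with the DIMM from `good_slot`.
    `faulty_slot_after_swap` is the slot that reports the error afterwards,
    or None if no error is reported any more.
    """
    if faulty_slot_after_swap is None:
        # Nothing reported after the swap: reseating may have fixed a poor contact.
        return "cleared: reseating may have helped, keep monitoring"
    if faulty_slot_after_swap == good_slot:
        # The error moved together with the module.
        return "follows DIMM: the memory module is the likely cause"
    if faulty_slot_after_swap == suspect_slot:
        # The error stayed in place although a different module is installed.
        return "stays with slot: suspect the slot or system board"
    # The error appeared somewhere that was not part of the swap.
    return "inconclusive: error reported on a slot not involved in the swap"


# Example: the DIMM from DIMM_A2 (faulty) was swapped with the one from DIMM_A1 (good).
print(interpret_swap_test("DIMM_A2", "DIMM_A1", "DIMM_A1"))
```

With six of eight slots flagged, it would make sense to swap only one pair at a time so that each result can be interpreted on its own.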

Hope this helps!

1 Rookie

51 Posts

November 25th, 2021 02:00

Hello Erman,

Thanks for your reply, much appreciated. This BIOS is already the latest version.

Can this be done via OMSA? The host is running Windows and a reboot is not desirable.

Thanks.

Moderator

3K Posts

November 25th, 2021 03:00

Hi, I understand there are a few ways to do it. I can quote from this article: https://dell.to/3xjLE2R

To access the System Event Logs in OMSA:
 

    1. Open OMSA.
    2. Click on "System" on the left side.
    3. Click on the "Logs" tab in the middle of the page. The Server Event Log is now shown directly (Figure 7, English only).

1 Rookie

51 Posts

November 25th, 2021 04:00

Hello Erman,

Thank you again for your reply.

I have done as you suggested, but this seems to only clear down the logs in the history, not clear down the alert status of the memory modules.

What I was hoping to do was 'reset' or 'clear' the alert to see if it was a latched alarm, as the memory modules seem to be working correctly regardless of the critical status.

Thank you.
