4 Posts

January 4th, 2016 13:00

PowerEdge T610 - ECC Single-Bit Correction Warning Rate Exceeded

I am wondering if I need to be concerned about this issue.

I have a few of these servers reporting the following memory errors:

Single bit warning error rate exceeded
Single-bit failure error rate exceeded

Would clearing these errors help, or what should my next steps be? Is there Dell firmware that will fix this?

Thanks in advance

Dyron

11 Legend

16.3K Posts

January 4th, 2016 16:00

The server can correct single-bit errors (SBEs), but it keeps track of how many it corrects. If it corrects "too many", it triggers an alert. You can clear the errors by clearing the ESM logs or rebooting, but they'll just come back once the limit for SBE corrections is reached again. You need to determine the faulty DIMM(s) (or slots) and replace them.
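To illustrate why clearing the log only buys time, here is a minimal Python sketch of a corrected-error counter with two alert levels. It is a simplified model, not Dell's firmware logic: the class name and both threshold values are hypothetical, since the actual limits are not stated in this thread.

```python
from dataclasses import dataclass

# Hypothetical limits for illustration only; the real firmware values
# are not given in this thread.
WARNING_THRESHOLD = 10
FAILURE_THRESHOLD = 100


@dataclass
class DimmCounter:
    """Tracks corrected single-bit errors (SBEs) for one DIMM."""
    name: str
    corrected: int = 0

    def record_correction(self) -> str:
        """Count one corrected SBE and return the resulting status."""
        self.corrected += 1
        return self.status()

    def status(self) -> str:
        if self.corrected >= FAILURE_THRESHOLD:
            return "failure rate exceeded"
        if self.corrected >= WARNING_THRESHOLD:
            return "warning rate exceeded"
        return "ok"

    def clear(self) -> None:
        """Model clearing the log: the count resets, the hardware does not change."""
        self.corrected = 0


dimm = DimmCounter("DIMM 3")
for _ in range(WARNING_THRESHOLD):
    dimm.record_correction()
print(dimm.status())   # alert is raised

dimm.clear()
print(dimm.status())   # alert is gone for now

for _ in range(WARNING_THRESHOLD):
    dimm.record_correction()
print(dimm.status())   # same DIMM, same errors, alert returns
```

Clearing resets the count but does nothing about whatever is producing the errors, so a marginal DIMM or slot will trip the alert again.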

4 Posts

January 5th, 2016 12:00

Hello,

Thanks for the response. All 3 DIMMs have an error: DIMMs 1 and 2 show "Single bit warning error rate exceeded", and DIMM 3 shows both "Single bit warning error rate exceeded" and "Single-bit failure error rate exceeded".

So I am thinking either all the slots have issues or all the DIMMs are bad. If I move them around, the problems will most likely follow.

Any other thoughts, please let me know

Thanks

11 Legend

16.3K Posts

January 10th, 2016 07:00

Yes, most likely the problem will follow the DIMM, BUT ... 1) you shouldn't just assume the most likely scenario is yours, and 2) sometimes the act of simply reseating the memory will fix the issue.

1. Reseat all the memory.

2. Clear the ESM/hardware log.

3. Boot to 32-bit Diagnostics and run MPMemory to test the RAM.

4. Replace any DIMMs that fail.

Skipping steps 1-3 may lead to a misdiagnosis, causing you to replace memory needlessly.
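If you also move a suspect DIMM to a different slot while reseating, the question of whether the problem follows the DIMM or stays with the slot can be reasoned through as below. This is only a sketch in Python: the function name, slot labels, and result strings are hypothetical, and it assumes a single suspect DIMM was moved and the log was cleared before retesting.

```python
from typing import Optional


def classify_fault(old_slot: str, new_slot: str, error_slot: Optional[str]) -> str:
    """Interpret a swap test for one suspect DIMM.

    old_slot:   slot that reported errors before the move
    new_slot:   slot the suspect DIMM was moved to
    error_slot: slot reporting errors after retesting, or None if clean
    """
    if old_slot == new_slot:
        raise ValueError("the DIMM must be moved to a different slot")
    if error_slot is None:
        # Reseating alone sometimes resolves the issue.
        return "no error after reseat"
    if error_slot == new_slot:
        # The error moved with the module.
        return "DIMM"
    if error_slot == old_slot:
        # The error stayed behind with the socket.
        return "slot"
    return "inconclusive"


print(classify_fault("DIMM 3", "DIMM 1", "DIMM 1"))
print(classify_fault("DIMM 3", "DIMM 1", "DIMM 3"))
print(classify_fault("DIMM 3", "DIMM 1", None))
```

With errors on all three DIMMs, as in this thread, one swap will rarely be conclusive on its own, which is why running the memory diagnostics after reseating is still the deciding step.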
