GS

June 13th, 2018 07:00

Single-bit overflow ECC errors PERC H710

Our PowerEdge T620 restarted last night for Windows updates. This morning the system was stuck at the following prompts:

Memory/battery problems were detected. We pressed "any key" to continue. Then...

Single-bit ECC errors were detected. We pressed "X" to continue. Then...

Single-bit overflow ECC errors were detected. If you continue, data corruption can occur. We have not moved on from this screen yet.

What are the chances of corruption? And what is the best way to resolve this permanently?

Unfortunately, this server is at a remote location about 2 hours away, and I can't access it remotely at this point. Thanks for any advice.

Greg

Moderator

June 13th, 2018 09:00

Hello

Single-bit errors are memory errors. They may come from either the cache memory on the PERC or from system memory. The error message should mention the slot that is reporting the correctable errors. If the errors are reported against the PERC, then the controller should be replaced. If they are reported against a system memory module, then you should move the module to a different slot to determine whether it is a slot, module, or memory controller issue.

Data corruption can occur if the PERC battery is not functional. Data corruption can also occur with faulty memory. Single-bit errors are correctable, but they are a sign of problems. The warning is letting you know that a module is showing signs of failure. If a multi-bit or uncorrectable error occurs, the system will halt. This could cause data loss/corruption.
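
For anyone wondering why a single-bit error is survivable while a multi-bit error is not, here is a rough sketch in Python. It is a toy extended Hamming (8,4) code, chosen only for illustration. It is an assumption, not the actual ECC scheme used by the PERC cache or the system DIMMs, which work on much wider words. The function names `encode` and `decode` are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming (SECDED) code word."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                              # index 0 = overall parity bit
    code[3], code[5], code[6], code[7] = d      # data bits at positions 3, 5, 6, 7
    code[1] = code[3] ^ code[5] ^ code[7]       # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]       # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]       # parity over positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2                 # makes the total number of 1s even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the error is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                     # XOR of the positions holding a 1
    parity_bad = sum(code) % 2 == 1             # an odd number of bits flipped

    if syndrome == 0 and not parity_bad:
        status = "ok"
    elif parity_bad:
        code[syndrome] ^= 1                     # single flip: syndrome names the position
        status = "corrected"
    else:
        return "uncorrectable", None            # two flips: detected, cannot be repaired

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


word = encode(0b1011)
word[5] ^= 1                                    # one flipped bit
print(decode(word))                             # ('corrected', 11)
word[2] ^= 1                                    # a second flipped bit
print(decode(word))                             # ('uncorrectable', None)
```

The first flip is repaired silently, which is what a correctable single-bit error looks like. The second flip can only be detected, which corresponds to the case where the system halts.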

False errors can be reported after an improper shutdown. If there were issues during the restart, then I suggest choosing the option to continue, letting the system updates complete, and then performing another restart to see if the issues persist.

Thanks

August 22nd, 2023 18:44

Update to the latest BIOS and firmware, then check again. That will probably fix your issue.
