PowerEdge R720 memory error limit reached
I have a PowerEdge R720 running XenServer 6.2. We have received this error for the second time in the last 3 weeks: "MEM0005 Persistent correctable memory error limit reached for DIMM1,DIMM2,DIMM3,DIMM4,DIMM5,DIMM6,DIMM7,DIMM8 reseat memory".
Two weeks ago, when we got this error the first time, we powered off the server and made sure all the memory was properly seated (it was). We powered the server back on and the error was gone. Since this is a production box requiring HA, we did not take the time to run extra diagnostic tests.
The error came back this morning. Does anyone have an idea of how we can resolve this issue? Thanks
DELL-Chris H
Moderator
April 7th, 2014 11:00
WeadonJ,
The error you are receiving states that the memory is operational, but it is an early indicator of a possible future uncorrectable error. My suggestion would be to reseat the DIMMs, then run diagnostics on them to see if anything shows up.
I would suggest booting to this diagnostic for the memory test: http://www.dell.com/support/home/us/en/555/Drivers/DriversDetails?driverId=TRWYD&fileId=3332974416&osCode=CXS03&productCode=poweredge-r720&languageCode=EN&categoryId=DI
Also, what revisions are the BIOS and ESM/DRAC at? Is either up to date?
Let me know what you find out.
weadonj
April 7th, 2014 14:00
weadonj
April 7th, 2014 15:00
DELL-Chris H
Moderator
April 8th, 2014 05:00
Both the BIOS and the DRAC are a few updates out of date. You can find the updates needed on this page: http://downloads.dell.com/published/Pages/poweredge-r720.html
I would suggest walking the updates up to current, installing each intermediate version in order, rather than jumping straight to the latest update.
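As a rough sketch of planning that sequence, the snippet below sorts a list of available versions and returns the ones newer than what is installed, oldest first. It assumes dotted numeric version strings; the function names and the version numbers are placeholders for illustration, not actual R720 releases.

```python
def parse_version(version):
    """Turn a dotted version string such as '1.6.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def update_path(installed, available):
    """Return every available version newer than `installed`, oldest first."""
    current = parse_version(installed)
    newer = [v for v in available if parse_version(v) > current]
    return sorted(newer, key=parse_version)


# Placeholder version numbers, for illustration only.
available = ["2.2.0", "1.4.0", "2.0.0", "1.6.0"]
print(update_path("1.4.0", available))
```

Comparing tuples of integers rather than raw strings keeps a version such as 1.10.0 sorted after 1.9.0.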
weadonj
April 9th, 2014 16:00
DELL-Chris H
Moderator
April 10th, 2014 06:00
WeadonJ,
Swap DIMM B8 with another matching DIMM in the server. Then boot and see whether the error follows the DIMM or remains at the slot. If the error follows the DIMM, I would replace the DIMM. If the error remains at the same location after swapping the DIMM, then the issue is in the slot on the motherboard.
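The swap test above can be written down as a small decision function. This is only a sketch: the function name is made up, slot B4 is a hypothetical swap partner, and the two "inconclusive" outcomes are an added assumption for cases the steps above do not cover.

```python
def interpret_swap_test(original_slot, swap_slot, error_slot_after_swap):
    """Interpret where the error shows up after moving the suspect DIMM.

    original_slot: slot that first reported the error (B8 in this thread)
    swap_slot: slot the suspect DIMM was moved to (hypothetical here)
    error_slot_after_swap: slot reporting the error afterwards, or None
    """
    if error_slot_after_swap is None:
        # Assumption: no recurrence yet, so the test says nothing either way.
        return "inconclusive: error did not recur"
    if error_slot_after_swap == swap_slot:
        # The error moved with the DIMM.
        return "replace DIMM"
    if error_slot_after_swap == original_slot:
        # The error stayed put even though the DIMM changed.
        return "suspect slot on motherboard"
    # Assumption: an error at a third slot is outside the scope of this test.
    return "inconclusive: error at unrelated slot"


print(interpret_swap_test("B8", "B4", "B4"))
print(interpret_swap_test("B8", "B4", "B8"))
```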
weadonj
April 22nd, 2014 08:00
adeel7454
September 7th, 2015 04:00
The issue we are facing is that the server does not show the full amount of installed RAM, yet it passes through the boot process without showing any error. For example, with 16x20=320 GB of RAM installed, it sometimes shows 288 GB and sometimes 272 GB. The board has been replaced with a new one and the BIOS has been upgraded to the latest version. We have sent it to various vendors, but no one has been able to diagnose it. Please help me in this regard.
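One way to look at those numbers, as a rough sketch: if "16x20" means twenty 16 GB modules (an assumption, since the module size and count are not spelled out), the shortfall can be checked against whole modules. The function name and constants below are illustrative only.

```python
MODULE_GB = 16     # assumed size of each module
MODULE_COUNT = 20  # assumed number of installed modules


def missing_modules(reported_gb, module_gb=MODULE_GB, module_count=MODULE_COUNT):
    """Return (shortfall in GB, whole modules missing, leftover GB)."""
    installed_gb = module_gb * module_count
    shortfall = installed_gb - reported_gb
    modules, remainder = divmod(shortfall, module_gb)
    return shortfall, modules, remainder


for reported in (288, 272):
    shortfall, modules, remainder = missing_modules(reported)
    print(f"{reported} GB reported: {shortfall} GB short, "
          f"{modules} module(s), {remainder} GB left over")
```

Under that assumption, both readings are short by a whole number of modules (two and three), which would be consistent with individual DIMMs not being counted at boot rather than a partial loss of capacity.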