Dell-JimmyP
311 Posts
January 23rd, 2018 12:00
Hi,
A good first step is to swap DIMMs as suggested by your supplier to see whether the error follows the DIMM. I would suggest swapping with DIMM1. Also, you need to clear the SBE log. The hardware log will pick up on the SBE log and re-report the error if it is still in the memory log. Steps can be found at:
http://www.dell.com/support/article/sln131078/
I recommend making sure that the BIOS and BMC are current. Often there will be updates and fixes. The updates can be found at:
http://www.dell.com/support/home/product-support/product/poweredge-2950/
You can either run diagnostics on the memory to try to trigger errors or monitor the server to see if the error comes back. If the error returns to DIMM4, the problem is most likely with the system board. If the error follows the DIMM to slot 1, the problem will be with the DIMM.
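The swap test described above boils down to a small decision rule. Here is a minimal Python sketch of it; the function name `diagnose_swap` and its parameters are made up for illustration and are not part of any Dell tool.

```python
from typing import Optional


def diagnose_swap(original_slot: str, swap_slot: str,
                  error_slot_after_swap: Optional[str]) -> str:
    """Interpret a DIMM swap test (hypothetical helper).

    original_slot: slot that first reported the error (e.g. "DIMM4").
    swap_slot: slot the suspect module was moved to (e.g. "DIMM1").
    error_slot_after_swap: slot reporting the error after the swap,
        or None if no error has come back yet.
    """
    if error_slot_after_swap is None:
        return "inconclusive: keep monitoring or run memory diagnostics"
    if error_slot_after_swap == original_slot:
        # The error stayed with the slot, not the module.
        return "system board"
    if error_slot_after_swap == swap_slot:
        # The error followed the module to its new slot.
        return "DIMM"
    return "inconclusive: error appeared in an unrelated slot"


print(diagnose_swap("DIMM4", "DIMM1", "DIMM1"))  # prints "DIMM"
```

The rule only gives a useful answer if the SBE log was cleared before the swap; otherwise an old entry can be re-reported and look like a new error.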
przemo.w
6 Posts
July 11th, 2018 01:00
Hello,
I swapped the modules and everything was OK for about a month... then the warning moved to DIMM5 - looks like it went with the module.
But I did not check the logs afterwards because the PE2950 display didn't show anything (steady blue, no warning or anything else...) - yes, my bad...
A few days ago I noticed the display had turned orange again...
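Now I got a nuke dropped on me :( The display shows "E2119 Fatal SB Mem CRC", and the logs have an even more dreadful description:
Multi-bit memory errors detected on a memory device at location(s) DIMM1,DIMM2,DIMM3,DIMM4,DIMM5,DIMM6,DIMM7,DIMM8.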
My supplier advised me to run the server on one memory pair and then check the other pairs in the first slots. If the error continues, then the motherboard is broken (worst-case scenario).
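The supplier's pair-by-pair procedure can be written down as a short Python sketch. The pair labels and the test results below are invented for illustration, and `isolate_by_pairs` is a hypothetical helper, not a real diagnostic tool; the real "test" is running the server on one pair and watching the logs.

```python
from typing import Callable, Dict, List


def isolate_by_pairs(pairs: List[str],
                     shows_errors: Callable[[str], bool]) -> Dict[str, object]:
    """Test each memory pair alone in the first slots and summarise."""
    failing = [pair for pair in pairs if shows_errors(pair)]
    if not pairs:
        verdict = "nothing tested"
    elif len(pairs) > 1 and len(failing) == len(pairs):
        # Every pair fails in the same slots: the slots or board are suspect.
        verdict = "every pair fails in the first slots: suspect the motherboard"
    elif failing:
        verdict = "suspect the failing pair(s)"
    else:
        verdict = "no pair fails alone: inconclusive"
    return {"failing": failing, "verdict": verdict}


# Made-up observations: True means errors appeared with only that pair installed.
observed = {"pair A": False, "pair B": True, "pair C": False, "pair D": False}
result = isolate_by_pairs(list(observed), lambda pair: observed[pair])
print(result["failing"])  # prints ['pair B']
```

If no single pair fails on its own, the sketch reports the result as inconclusive rather than guessing, since the fault could then depend on the other slots or on a fully populated configuration.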
So for now I take backups every day and am looking for a moment to get the memory swapping done.
The BIOS I have (from OMSA):
Version 2.7.0
Release Date 10/30/2010
And the BMC:
Name Baseboard Management Controller
Version 2.37.00
I found that the newest BIOS listed is from 2013, v2.3.1, but its version number is much lower than the one I currently have flashed, v2.7.0.
BMC is up to date: v2.37.00
PS. I checked for updates using the server's service tag.
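The BIOS version comparison above (v2.3.1 against v2.7.0) can be checked mechanically by comparing the dotted parts as numbers rather than going by release date. A minimal Python sketch, assuming the versions are plain dotted integers; `parse_version` is a hypothetical helper.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as 'v2.7.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


installed = parse_version("2.7.0")   # BIOS version reported by OMSA
listed = parse_version("v2.3.1")     # version found on the support site

print(listed < installed)  # prints True
```

Comparing tuples of integers avoids the trap of plain string comparison, where "2.10.0" would sort before "2.7.0".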