DELL-Chris H
Moderator
9.7K Posts
February 10th, 2015 07:00
JamesM85,
I would first check whether the server is up to date on the BIOS and DRAC firmware specifically. Both play a part in keeping the system stable, and both handle error reporting for the server. Since the BIOS is current, I would look at the iDRAC/BMC next. What version is it at currently?
In the future I would suggest swapping the DIMM with another DIMM in the same server. That lets you see whether the error follows the DIMM, which indicates a problem with the module, or stays with the slot, which points to the board.
Lastly, make sure the processor is firmly seated in its socket.
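To make the swap test easy to record across several servers, the logic can be written down as a small Python sketch. This is only an illustration: the function name, the result strings, and the slot numbers in the example are made up, and the verdict is a starting point for troubleshooting rather than a definitive diagnosis.

```python
FOLLOWS_DIMM = "error followed the DIMM: suspect the module"
STAYS_AT_SLOT = "error stayed at the slot: suspect the slot or board"
INCONCLUSIVE = "inconclusive: error cleared or moved elsewhere, keep monitoring"


def classify_dimm_swap(original_slot, new_slot, error_slot_after):
    """Interpret a DIMM swap test (hypothetical helper, not a Dell tool).

    original_slot    -- slot that reported the error before the swap
    new_slot         -- slot the suspect DIMM was moved to
    error_slot_after -- slot reporting the error after the swap,
                        or None if no error has been logged since
    """
    if original_slot == new_slot:
        raise ValueError("the suspect DIMM must be moved to a different slot")
    if error_slot_after == new_slot:
        # The fault moved together with the module.
        return FOLLOWS_DIMM
    if error_slot_after == original_slot:
        # A different module now sits here and the slot still reports errors.
        return STAYS_AT_SLOT
    return INCONCLUSIVE


if __name__ == "__main__":
    # Example: suspect DIMM moved from slot 7 to slot 8, error now logged on slot 8.
    print(classify_dimm_swap(7, 8, 8))
```

Recording one result per server makes it easier to see whether several machines show the same pattern.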
Let me know what you see.
JamesM85
3 Posts
February 10th, 2015 07:00
Sorry, I should have been clearer. I replaced DIMMs 7 and 8 with "new" ones after swapping two of the modules around seemed to make the issue follow the DIMM.
The BIOS and BMC are both up to date. All of these servers have recently been updated using the latest Dell SUU.
I have tried removing and re-seating all of the memory modules in the affected servers. I have not tried the processors yet, as I didn't want to disturb too much.
Could it be that the newer BIOS / BMC firmware is causing false or more sensitive reports?
JamesM85
3 Posts
February 13th, 2015 07:00
Does anyone else have any suggestions?
We now have 6 PowerEdge 1950 servers showing this issue within the space of 4 days. To me that is more than a coincidence.
Has anyone else started having these issues? Are there any known firmware / hardware bugs?