February 25th, 2019 20:00
R610 MEMORY ISSUE - MEMBIST Failure and no BIOS changes
I recently got a Dell R610 server and have some strange memory issues. The BIOS and iDRAC are updated to the latest versions.
The machine came with 96GB, and there was always a MEMBIST failure when booting. I took all the memory out and tested with just one 8GB DIMM of the same type on each CPU side.
I also unplugged the machine, held the button for 30 seconds, and took out the BIOS battery.
The error I get is always: MEMBIST failure - the following DIMM has been disabled by BIOS: DIMM B1. Since I have plenty of RAM left, I tried different DIMMs, swapped them around and so on, but got the same issue.
I also swapped the CPU, but got the same error on B1. Now, even more interesting: when I put 3x8GB on each CPU side (in the white slots), the system boots and the B1 error suddenly disappears, but now it shows that DIMM B3 has an issue.
I also went into the BIOS and tried to change the Memory Operating Mode, but it's grayed out and I cannot change it to any other mode.
The only settings I can change there are System Memory Testing, which is on, and Node Interleaving, which is off.
Everything else is grayed out and cannot be changed. How come?
After spending hours on this issue, swapping the CPU and testing many different DIMMs, I have no clue what is going on or why I can't even change the settings in the BIOS. Please advise. Thanks a lot!


Dell-DylanJ
February 26th, 2019 07:00
Hello,
The matrix that I am looking at shows a 12x8GB RDIMM configuration as supported in Optimized mode for the R610.
Based on your description, it sounds like you might have a motherboard issue. Correct my understanding if I'm wrong, but the issue didn't appear to follow an individual DIMM and switching the processors didn't result in the error moving to the A side. The processor will have the memory controller, so we know that we've tested different memory and different controllers. I'd be inclined to suspect the motherboard.
What you might do to test is strip the server down to a single-processor configuration with memory in only the A side. This way you can use the A side as a control group and add single components until an error is flagged on the A side. If nothing ever flags as you switch hardware out, then that should further point to the B side of the board.
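
The elimination reasoning above can be sketched in a few lines of Python. This is only an illustration of the swap-test logic, not a Dell tool: the function name `suspect_component` and its labels are hypothetical, and it assumes the simplified case where the error either moves with a swapped part or stays on the same slot side.

```python
def suspect_component(follows_dimm: bool, follows_cpu: bool) -> str:
    """Swap-test elimination (hypothetical helper, for illustration only).

    follows_dimm: the error moved when the DIMM was moved to another slot.
    follows_cpu:  the error moved to the other side when the CPUs were swapped.
    """
    if follows_dimm:
        # The fault travels with the memory module itself.
        return "DIMM"
    if follows_cpu:
        # The memory controller lives in the processor.
        return "CPU (memory controller)"
    # The fault stays on the same side of the board regardless of parts.
    return "motherboard"


# Observations from this thread: the error stayed on the B side after
# swapping both the DIMMs and the CPU.
print(suspect_component(follows_dimm=False, follows_cpu=False))
```

With the observations reported here (error stays on the B side after both swaps), the sketch lands on the motherboard, which matches the suspicion above.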