Unsolved
4 Posts
0
1270
October 13th, 2021 12:00
R720 Replacing bad memory stick
We have an R720 with a single CPU (E5-2697 v2) and 8x16GB memory sticks. One of the sticks is failing. It says "Single-bit failure error rate exceeded". So we need to replace it but can't find any 16GB memory that matches and is readily available. So we're thinking of adding 8GB memory.
Can we fill 6 slots with 8GB and 6 slots with 16GB sticks? I'm not seeing that as one of the listed configurations in the manual.
Thanks in advance for your help,
John


DELL-Shine K
6 Operator
•
3K Posts
1
October 13th, 2021 18:00
You can mix memory modules of different sizes. Please make sure you follow the guidelines in the link below:
https://www.dell.com/support/manuals/en-us/poweredge-r720/720720xdom/general-memory-module-installation-guidelines?guid=guid-0f97a40c-2a63-4dc8-a8b2-607006d75804&lang=en-us
The link below has a specific combination where you can install eight 16GB and three 8GB DIMMs. I believe the combination you mentioned will also work. You can give it a try.
2R, x4, 1333 MT/s
A1, A2, A3, A4, A5, A6, A7, A8, A9, A11
https://www.dell.com/support/manuals/en-us/poweredge-r720/720720xdom/sample-memory-configurations?guid=guid-afe58651-3836-4641-8d7b-463afac05ea6&lang=en-us
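As a quick sanity check on the capacities being discussed, here is a minimal Python sketch. It only adds up DIMM sizes for the current layout (8x16GB) and the proposed one (6x16GB plus 6x8GB). It does not validate slot placement against the Dell guidelines, and the function name is made up for illustration.

```python
def total_capacity_gb(dimms):
    """Sum the capacity of a list of (count, size_gb) pairs."""
    return sum(count * size_gb for count, size_gb in dimms)


# Layouts taken from the question in this thread.
current = [(8, 16)]           # eight 16GB sticks, as installed today
proposed = [(6, 16), (6, 8)]  # six 16GB plus six 8GB sticks

current_gb = total_capacity_gb(current)
proposed_gb = total_capacity_gb(proposed)

print(f"current:  {current_gb} GB")
print(f"proposed: {proposed_gb} GB")
```

So the mixed layout would give slightly more total memory than the current one, provided the slot population rules in the linked guidelines are met.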
DELL-Young E
Moderator
•
5.4K Posts
•
37 Points
1
October 13th, 2021 18:00
Shine's right. You technically can. Would it give you optimal performance? That we would have reservations about. Have a good one!
beauars21
4 Posts
0
October 16th, 2021 10:00
Thanks Shine and Young. We found some 16GB memory and it should arrive in the next few days. So I think we'll wait and see if the production server fails before trying the 6x16GB plus 6x8GB configuration.
beauars21
4 Posts
0
October 25th, 2021 12:00
We replaced the bad memory stick today but OpenManage is still saying it's faulty. We have the same error we had with the original stick. It says "Single-bit failure error rate exceeded" for DIMM_A5.
The new memory stick snapped right in. It couldn't have been smoother. When the server booted up, it recognized the new memory, or at least that there was a change to the memory, and it gave a message about optimizing and restarting again. It did that. The BIOS said there was 128 GB and we have 8x16GB, so it looked good. Also, the performance monitor shows 128 GB now.
Any suggestions?
I'm attaching some screenshots: first the Task Manager showing 128 GB, then the OpenManage screen for the bad stick, then an OpenManage screen for a good stick (so you can confirm it's the same memory specs).
DELL-Charles R
Moderator
•
4.7K Posts
•
25.5K Points
0
October 25th, 2021 13:00
Hello beauars21,
Have you checked that you are on the latest BIOS, 2.9.0?
Does the DIMM meet the General Memory Module Installation Guidelines that ShineK posted?
https://dell.to/2ZldLS8
If you clear the System Event Log (SEL) in the DRAC, does that clear the single-bit error?
I'd also recommend running the built-in hardware diagnostics.
Press F11 at the Dell splash screen, then select Boot Manager -> System Utilities -> Launch Dell Diagnostics. Note any messages and continue testing.
If you still have the SBE error, split the DIMMs in A1 and A5 to different slots:
Can you confirm you have slots A1, A5, A2, A6, A3, A7 and A4, A8 populated?
Swap:
A1 with A2
A5 with A8
Clear the SEL log and run diags again to check results.
Memory slot numbers:
https://dell.to/3vHuVpo
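The point of the swap is to see whether the error follows the DIMM or stays with the slot. A minimal Python sketch of that reasoning, assuming all eight slots A1 to A8 are populated; the DIMM labels and function names are hypothetical, and the interpretation is a rule of thumb rather than a Dell procedure:

```python
def apply_swaps(layout, swaps):
    """Return a new slot -> DIMM mapping after exchanging DIMMs pairwise."""
    new_layout = dict(layout)
    for a, b in swaps:
        new_layout[a], new_layout[b] = new_layout[b], new_layout[a]
    return new_layout


def interpret(suspect_slot, layout_before, layout_after, error_slot_after):
    """Guess whether the error followed the DIMM or stayed with the slot."""
    if error_slot_after is None:
        return "no error reported after the swap"
    suspect_dimm = layout_before[suspect_slot]
    if error_slot_after == suspect_slot:
        return "error stayed in the slot: suspect the slot or mainboard"
    if layout_after[error_slot_after] == suspect_dimm:
        return "error followed the DIMM: suspect the DIMM"
    return "error appeared elsewhere: inconclusive"


# Hypothetical labels: each DIMM is named after the slot it started in.
slots = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"]
before = {slot: f"dimm_from_{slot}" for slot in slots}

# The swaps suggested above: A1 with A2, A5 with A8.
after = apply_swaps(before, [("A1", "A2"), ("A5", "A8")])

print(after["A8"])
print(interpret("A5", before, after, "A8"))
print(interpret("A5", before, after, "A5"))
```

If the error is still reported on A5 after the swap, the slot itself becomes the more likely culprit, since a different DIMM now sits there.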
beauars21
4 Posts
0
October 25th, 2021 14:00
We are on BIOS version 2.2 (from 2014) and SMBIOS version 2.7. We'll get that updated.
Yes, our memory setup meets the guidelines.
The same error is appearing in the Event Viewer with a timestamp right after we booted up. I assume that's where OpenManage is getting its info?
It's a production server so we don't have much time to take it down and test stuff but we'll try your plan to swap sticks in A1 and A5 with A2 and A8. Also, we bought 2 replacement DIMMs but only put one in. Should we use the 2nd new DIMM too?
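For anyone scripting a check like this, a small Python sketch of comparing the installed BIOS version (2.2) against the one mentioned above (2.9.0). It assumes purely numeric, dot-separated version strings; the helper name is made up for illustration.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.9.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


installed = parse_version("2.2")   # version reported in this thread
latest = parse_version("2.9.0")    # version the moderator asked about

# Tuples compare element by element, so (2, 2) sorts before (2, 9, 0).
needs_update = installed < latest
print("BIOS update needed:", needs_update)
```

Comparing as integer tuples rather than as strings avoids mistakes such as treating "2.10" as older than "2.9".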
DELL-Joey C
Moderator
•
4.2K Posts
•
20.9K Points
0
October 25th, 2021 19:00
Hi @beauars21,
OpenManage gets its event logs from both the OS and the Lifecycle Controller. Try clearing the logs in iDRAC/LCC. Most memory errors will clear after a BIOS update, unless there is a problem with the mainboard slot. After you have updated the BIOS and cleared the logs, try swapping the memory as suggested and check whether the issue persists. Draining residual power can sometimes help clear the error too; try a hard reset: https://dell.to/3pCzYqb
Since you have two replacement DIMMs and one is already installed, keep the other as a spare.
Ultimately, if you have done most of the troubleshooting suggested above and you still have the error, you might need a mainboard replacement, since this is a production server and you would not want any disruption.
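If you prefer to clear the SEL from a script rather than the iDRAC web interface, a hedged Python sketch follows. It assumes the Dell `racadm` utility is installed on the host and uses its `getsel` and `clrsel` subcommands. It defaults to a dry run that only prints what it would do, and it does not touch the separate Lifecycle Controller log. The function names are made up for illustration.

```python
import shutil
import subprocess


def sel_commands():
    """Commands to list, then clear, the System Event Log via racadm."""
    return [["racadm", "getsel"], ["racadm", "clrsel"]]


def run_sel_commands(dry_run=True):
    """Run the commands, or just describe them when dry_run is True."""
    results = []
    for cmd in sel_commands():
        if dry_run or shutil.which(cmd[0]) is None:
            # Nothing is executed in a dry run or when racadm is missing.
            results.append("would run: " + " ".join(cmd))
        else:
            done = subprocess.run(cmd, capture_output=True, text=True)
            results.append(done.stdout)
    return results


for line in run_sel_commands():
    print(line)
```

Call `run_sel_commands(dry_run=False)` only after saving a copy of the log, since clearing it discards the history of the DIMM_A5 errors.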