2 Intern

 • 

548 Posts

August 8th, 2016 00:00

T610 upgrade issues

My T610 has 2x Dell-supplied Nanya NT4GC72B8PB0NL-CG sticks, which are 4GB dual-rank DDR3 RDIMMs capable of running at 1333MHz with CL-9 timing. These RDIMMs were installed in my T610 by Dell and occupied slots A2 & A3. Only one CPU was installed by Dell, an E5605. The BIOS Memory Mode was set to "Advanced ECC". All this worked without issue or fuss.

I have since acquired 6 Hynix HMT151R7TFR4C-H9 sticks, which are 4GB dual-rank DDR3 RDIMMs capable of running at 1333MHz with CL-9 timing. And like a bull at a gate, I simply removed the Dell RDIMMs and slot blanks and installed the Samsung RDIMMs.

Doing this resulted in configuration warnings and MEMTEST lane failure errors (associated with slots A1 & A4), and as a result I could only see 16GB of RAM. Changing the BIOS Memory Mode from "Advanced ECC" to "Optimizer Mode" removed the configuration warnings, but I still got MEMTEST failures.

So I then removed the A1 & A4 RDIMMs and changed the BIOS Memory Mode back to "Advanced ECC", and all was well with the server. There were no POST errors, and the Server Configurator Hardware diagnostics Memory tests (EvLog, Stress, Integ1, Integ2, XMATS32, WCMATS, WCHch, Scan) all passed.

Feeling that these 4x Samsung RDIMMs were OK, I changed the BIOS Memory Mode to "Optimizer Mode" and moved these RDIMMs from slots A2, A5, A3 & A6 to slots A1, A4, A2 & A5 (leaving the 2 slots closest to the CPU free). Unfortunately I got MEMTEST errors and could only see 8GB of RAM.

As a result I don't think the memory is at fault, but I'm not sure of the best way to verify that the memory is indeed 100% OK.

How can I further fault-find and isolate the issue to the motherboard or processor (remembering I only have one CPU installed)?

Could I install all RDIMMs into A2, A5, A3, A6, B2, B5, B3 & B6 with the BIOS set to "Advanced ECC" even though I only have one CPU? No, see this post

Lastly, how can I get rid of the following messages during POST?
Message PR1: Replaced parts detected... replacement action(s) will not be performed.
Message PR11: Parts Replacement license is not present... replacement action(s) will not be performed.

I understand that this "Parts Replacement license" is associated with the iDRAC6 and requires a Dell-branded uSD card (which I do not have) to be installed in the iDRAC Enterprise (which is installed) for it to be "licensed". A consumer uSD card costs $5, but no doubt the Dell-branded uSD card costs $50!

7 Practitioner

 • 

9.7K Posts

 • 

48K Points

August 8th, 2016 14:00

Skylarking,

The issue looks like it may be with one of the motherboard's DIMM slots. To verify, I would suggest setting the memory mode to Optimizer, removing all the DIMMs from the server, and testing each DIMM on its own in slot A1. If they work, then test them in pairs in A1 and A4.
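The procedure above can be written out as an ordered checklist. The following is only a sketch: the labels `DIMM1` to `DIMM6` and the function name `isolation_plan` are hypothetical, and pairing the sticks as (1,2), (3,4), (5,6) is one assumed reading of "test them in pairs".

```python
def isolation_plan(dimms, single_slot="A1", pair_slots=("A1", "A4")):
    """Build an ordered list of slot layouts to test, one dict per boot."""
    # Phase 1: each DIMM alone in the single slot, so a failure points at
    # one stick (or at that slot if every stick fails there).
    steps = [{single_slot: dimm} for dimm in dimms]
    # Phase 2: disjoint pairs across the two slots (assumed pairing).
    for i in range(0, len(dimms) - 1, 2):
        steps.append({pair_slots[0]: dimms[i], pair_slots[1]: dimms[i + 1]})
    return steps


# Hypothetical labels for the six replacement RDIMMs.
dimms = [f"DIMM{n}" for n in range(1, 7)]
plan = isolation_plan(dimms)

for number, step in enumerate(plan, start=1):
    layout = ", ".join(f"{slot}={dimm}" for slot, dimm in step.items())
    print(f"step {number}: Optimizer mode, {layout}")
```

With six sticks this gives six single-DIMM boots followed by three paired boots, which is why the process is slow when each memory test takes many minutes.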

Let me know what you find.

2 Intern

 • 

548 Posts

August 10th, 2016 03:00

Thanks for the reply, Chris.

As mentioned, I've already tested 4 Samsung DIMMs in Advanced ECC mode when installed in A2, A5, A3 & A6, and these 4 DIMMs passed the tests.

After my post and before your reply, I also tested the remaining 2 Samsung DIMMs (the ones initially flagged as 'faulty' when I put all 6 in at once). This later test was with the BIOS in Optimizer mode and the DIMMs installed in A1 & A2. All tests passed.

However, I've been running the "express" tests, which still take quite some time to complete. And since it's the first time I'm using Dell's MpMemory test, I don't have any experience to draw upon. I've had no luck finding any docs that describe these MpMemory tests in any detail, which doesn't help. Meanwhile some of the individual tests, like XMATS32, XMATS & WCMch, seem to pause part way through and/or utilise just 1 CPU for part of the test while again pausing for a while! It's all rather confusing, and as a result I'm not sure I understand enough about the tests and what they verify. At the moment, it seems all I can rely on is the green "Pass" (and presumably the red "Fail" that I haven't seen yet, touch wood).

So, are long test times of >10 min for one 4GB DIMM, with periodic pauses and switching between a single CPU and multiple CPUs, normal for the MpMemory "express" tests?

Should I be doing a full test (EvLog, Stress, Integ1, Integ2, XMATS32, XCMATS, WCMch, Scan, + MATS, MarchB, TestY, ECC), which will take much, much longer than before?

After reading your post, I retested just one of the 'faulty' DIMMs on its own in A1 with the BIOS set to Optimizer mode, and it passed all "express" tests. But as these tests take a rather long time, I got impatient, and rather than continue testing individual DIMMs, I tested 3 DIMMs in A1, A2 & A3, which returned a Pass.

I'm reasonably confident that all 6 Samsung DIMMs are OK, but my confidence may simply be due to a lack of understanding.

What difference is there between testing individual DIMMs in A1 and testing 3 DIMMs in A1, A2 & A3 when pass results are seen? How certain can one be that the DIMMs are good?

Meanwhile, I'm confused by the second part of the test, where you suggested I run DIMMs in A1 & A4 (Optimizer mode). This is not the recommended configuration for 2 DIMMs according to the Technical Guide Book and results in "Warning: Unsupported memory configuration is detected" on boot. In any case, such a test with DIMMs in A1 & A4 passed!

As a result, with all DIMMs having passed and all slots having passed in one test or another, I again installed all 6 Samsung DIMMs in Optimizer mode, and this time all 24GB of ECC RAM was found without warning or error when booting the system!
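The "every DIMM and every slot has passed somewhere" reasoning can be checked mechanically by logging each run. This is a rough sketch, not a record of the actual tests: the labels `D1` to `D6` are hypothetical, and which two sticks sat in A1 & A4 is not stated above, so that row is an assumption.

```python
ALL_SLOTS = {"A1", "A2", "A3", "A4", "A5", "A6"}
ALL_DIMMS = {"D1", "D2", "D3", "D4", "D5", "D6"}  # hypothetical labels

# (BIOS memory mode, {slot: dimm}, passed?)
runs = [
    ("Advanced ECC", {"A2": "D1", "A5": "D2", "A3": "D3", "A6": "D4"}, True),
    ("Optimizer", {"A1": "D5", "A2": "D6"}, True),
    ("Optimizer", {"A1": "D5", "A4": "D6"}, True),  # DIMM choice assumed
]


def untested(runs):
    """Return (slots, dimms) that never appeared in a passing run."""
    passed_slots, passed_dimms = set(), set()
    for _mode, layout, passed in runs:
        if passed:
            passed_slots.update(layout.keys())
            passed_dimms.update(layout.values())
    return ALL_SLOTS - passed_slots, ALL_DIMMS - passed_dimms


missing_slots, missing_dimms = untested(runs)
print("slots never seen in a passing run:", sorted(missing_slots))
print("DIMMs never seen in a passing run:", sorted(missing_dimms))
```

A passing multi-DIMM run vouches for every slot and stick in it, whereas a failing one cannot be pinned on a single item without the one-at-a-time tests. Full coverage here still says nothing about why the first six-DIMM boot failed.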

It's hardly reassuring that I found no faults, yet it didn't work before and now does. As such, I'm now running a full barrage of tests with all sub-tests selected (I guess it will take all night).

The end result is that it's a long and time-consuming process, and I'm not sure I'm doing things in an effective and efficient way (lack of knowledge and such), but I think it's working.

2 Intern

 • 

548 Posts

August 10th, 2016 08:00

Well, a full MpMemory test on the 24GB of Samsung ECC DDR3 RAM took just over 3 hours with no errors found. That's just over 30 minutes per 4GB DIMM.
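For anyone planning a similar run, the arithmetic behind that figure is below. It is a sketch only: "just over 3 hours" is rounded down to 180 minutes, and the 12-DIMM projection assumes test time scales linearly with the number of DIMMs, which was not measured here.

```python
def minutes_per_dimm(total_minutes, dimm_count):
    """Average full-test time per DIMM."""
    return total_minutes / dimm_count


full_run_minutes = 3 * 60   # "just over 3 hours", rounded down
dimm_count = 6
dimm_size_gb = 4

per_dimm = minutes_per_dimm(full_run_minutes, dimm_count)
per_gb = per_dimm / dimm_size_gb

# Assumption: linear scaling, e.g. for a hypothetical 12 x 4GB configuration.
projected_12_dimms = per_dimm * 12

print(f"{per_dimm:.1f} min per DIMM, {per_gb:.1f} min per GB")
print(f"projected full test for 12 DIMMs: {projected_12_dimms / 60:.1f} hours")
```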

So all DIMMs are good and all slots are good. The end result is a working PE T610.

Why it didn't work the first time is anyone's guess.

7 Practitioner

 • 

9.7K Posts

 • 

48K Points

August 10th, 2016 12:00

Glad to hear it. The reason I asked for the test in A1 individually, and then in A1 and A4, was to test the slots as well as the DIMMs. Sorry for the confusion on that. Glad you are up and running.
