Unsolved

CA

1 Rookie

 • 

47 Posts

668

October 30th, 2022 13:00

Upgraded RAM and CPU on a Dell Precision T3610, now I get WHEA errors on specific tests

I'm not even sure whether this is a CPU problem, a RAM problem, or something else. I originally assumed CPU only because I had just installed a new one, but I am not 100% sure, and it looks like it might be a RAM problem. The issue is that I upgraded the RAM a month ago and the CPU just yesterday, but I'm only now noticing what appear to be RAM errors.

I have a Dell Precision T3610. About a month ago I upgraded its RAM to an 8x16GB DDR3 ECC configuration. I ran several RAM and memory tests and got no errors, but I wasn't very thorough, as I didn't think to stress the CPU that much or check the Event Log.

Yesterday I replaced its Xeon E5-1620 v2 CPU with an E5-2667 v2. I again ran Memtest86 (which took about 10 hours) and got no errors, then I ran the latest Memtest86+ (which is finally out of beta) overnight and got no errors.

Then I booted into Windows and ran several 10-30 minute tests in Prime95 in various CPU, RAM, and combined stress-testing configurations, half with and half without Furmark also running... no errors.

So just to be thorough, I then got the latest OCCT and ran a RAM test on all of my RAM... no errors. I then ran the CPU test on Extreme... and that's when I noticed I got a WHEA error.

According to the Event Viewer it says "Event 47: A corrected hardware error has occurred. Component: Memory. Error Source: Unknown Error Source". The details were mostly 0 in every field and had a Physical Address listed. I then tried running OCCT's 2021 Linpack test with its default 2GB of RAM usage and had no issues. I ran it again, setting it to use as much of my RAM as possible, and I again got a WHEA 47 error. Both of these seemed to happen at the same physical address according to the details. I tried immediately running this same test again, expecting it to give me an error at the same time and memory address again... but the third time it passed with no errors.

I tried the built-in Dell Diagnostics, and even in thorough test mode it found no errors.

Is this something to be worried about? Can this used CPU I got possibly be damaged and it survived all of those tests but then crapped out during a random test on OCCT? Is it even the CPU or the RAM that's at fault here? Or could it even be a hardware or BIOS incompatibility? (Sadly, the last BIOS update was in 2019 so I doubt an update to fix it will ever happen if it's a BIOS issue.)

I tried Googling this and most of the answers I got are that one should not be getting any WHEA errors whatsoever... but almost all of those were from people overclocking their CPUs with an unstable OC. Usually the advice was to turn down the OC and/or increase voltage, neither of which I can do since I am not OCing and the BIOS does not let me adjust any such settings. All of these were about consumer CPUs/RAM as well.

At first I found it hard to believe that my RAM passed days of testing when I installed it a month ago, as well as the PC being on 24/7 for that whole month without any errors, and then with the new CPU all those tests still passed but a single OCCT CPU test managed to catch a possible defect in either my RAM or my CPU.

On the other hand though, now that I am checking my Event Log I see that there were a ton of "Event 2: WHEA-Logger" entries while I was doing the Prime95 testing (Prime95 itself never showed any errors though), with very little detail. The event log just says "A corrected hardware error has occurred" and to check the data section for details, which was pretty sparse anyway.

It's starting to sound like the new CPU, but then I checked further and the only other times I can find a few WHEA errors in my event log, which are all Event 2s, are around the time I installed that new RAM about a month ago and did Prime95 stress testing on it (although not as often as I am getting now). I didn't do additional CPU stress testing at the time since I still had the same CPU I had been using for nearly two years and didn't think to stress the older CPU with the new RAM.

So now I can't tell if it's the CPU, the RAM, or a combination of both and the new CPU just made it even worse.... or what I can even do about this (I kinda need this computer and use it daily so I can't keep it in pieces for days for testing, I don't really have a backup left). 

Not even sure what tools or software I can use to try to pinpoint the problem better.

Here are screenshots of the errors: https://imgur.com/a/lWaRti9

Can anyone give me any ideas on what to try?

1 Rookie

 • 

47 Posts

October 31st, 2022 00:00

How can I determine which stick it is? Most of the errors are Event 2, which gives almost no information; it's just the two times I got an Event 47 that had a physical address listed.
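
One low-tech way to check whether the Event 47 errors keep landing on the same spot is to jot down each event and tally the physical addresses. Here is a minimal Python sketch, assuming the records are copied by hand from Event Viewer; the timestamps and addresses below are made-up placeholders, not values from the screenshots, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical records transcribed by hand from Event Viewer:
# (timestamp, event_id, physical_address or None).
# All values here are placeholders for illustration only.
records = [
    ("2022-10-30 08:01", 2, None),           # Event 2: no address reported
    ("2022-10-30 09:12", 47, 0x1A2B3C000),   # Event 47 with a physical address
    ("2022-10-30 10:40", 47, 0x1A2B3C000),   # same address a second time
]

def tally_addresses(records):
    """Count corrected-error events per reported physical address."""
    return Counter(addr for _, _, addr in records if addr is not None)

def repeated_addresses(records, minimum=2):
    """Return addresses (as hex strings) seen at least `minimum` times."""
    counts = tally_addresses(records)
    return {hex(addr): n for addr, n in counts.items() if n >= minimum}

print(repeated_addresses(records))
```

A repeated address points toward one memory location rather than random errors, but mapping an address back to a physical slot depends on how the platform interleaves memory channels, which this sketch does not attempt.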

1 Rookie

 • 

41 Posts

October 31st, 2022 00:00

I don't know the different types of errors, but I got errors reported with my ECC RAM when I upgraded my T3610 with used RAM from eBay. It was one bad stick. I got a partial refund and ordered a few more sticks, and all is well now. I would think that if it's the same location and it started right after the RAM upgrade, then it's a bad stick.

4 Operator

 • 

1.1K Posts

November 1st, 2022 04:00

Well, one way would be to test with just 1 DIMM at a time... it's a supported configuration.
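
If testing eight sticks one by one takes too long, splitting them in half each round narrows it down faster. A rough Python sketch of that idea follows. It assumes exactly one stick is bad, that the error reproduces reliably in each test run (which may not hold for intermittent corrected errors), and that every subset used is a supported slot configuration. The stick labels and the `shows_errors` callback are hypothetical stand-ins for physically swapping DIMMs and running a stress test.

```python
def find_bad_stick(sticks, shows_errors):
    """Narrow down a single bad stick by testing half of the candidates each round.

    `shows_errors(installed)` stands in for installing only those sticks,
    running a stress test, and reporting whether WHEA errors appeared.
    Returns (suspect_stick, number_of_test_rounds).
    """
    candidates = list(sticks)
    rounds = 0
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        rounds += 1
        if shows_errors(half):
            candidates = half                      # bad stick is in the tested half
        else:
            candidates = candidates[len(half):]    # bad stick is in the other half
    return candidates[0], rounds

# Simulated run: eight labelled sticks, with "F" pretending to be the bad one.
sticks = list("ABCDEFGH")
result = find_bad_stick(sticks, lambda installed: "F" in installed)
print(result)
```

With eight sticks this takes three test rounds instead of up to eight, at the cost of relying on the error showing up every time the bad stick is installed.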

1 Rookie

 • 

47 Posts

November 1st, 2022 12:00

I tried looking it up but I can't find what slots I am supposed to use first if I only have 1, 2, or 4 modules.

4 Operator

 • 

1.1K Posts

November 1st, 2022 13:00

Try the leftmost white slot for a single DIMM.

1 Rookie

 • 

47 Posts

November 1st, 2022 14:00

So, do I fill the first white port from the bottom first then?

 

Do you know which ports I have to fill if I only populate four of them for quad channel?

1 Rookie

 • 

47 Posts

November 1st, 2022 14:00

Left? My slots are above and below the CPU, not to the right or left of it.

4 Operator

 • 

1.1K Posts

November 1st, 2022 14:00

My bad, I was looking at the motherboard in a different orientation than when it's in operation. Below, the bottom one.

4 Operator

 • 

1.1K Posts

November 1st, 2022 16:00

The first white port from the bottom is an assumption I'm making. It's a very quick thing to try and test anyway.

As for the other question, I'm not sure... there seem to be threads on this forum asking about the RAM, and the results are a bit vague, I think.
