December 14th, 2017 08:00

CPU 1 MEM VDDQ PG voltage is outside of range

CPU 1 MEM VDDQ PG voltage is outside of range

and

CPU 1 MEM VTT PG voltage is outside of range

I have three Dell PowerEdge R520s. They all had 24 GB of RAM from the factory. We did a virtualization project and needed to add more RAM, so I bought 8 GB HMT31GR7CFR4A-H9 modules, which are 1.35 V, to upgrade the first server, and added the six sticks. Everything boots up, but as soon as it gets to loading VMware it reboot-loops with the errors above. Okay, I did something wrong. I tried various things but ultimately had to remove the two 4 GB modules, even though you're supposed to be able to mix and match, and it's been up and running ever since. But it did throw these errors initially.

So now I'm on the second server and I ordered RAM. This time I ordered six 8 GB HMT31GR7BFR4C-H9 modules, which are 1.5 V, but I read the manual and it supposedly supports 1.5 V. If I just run three sticks per CPU, which is one 8 GB stick per channel, I have no issues; as soon as I add A4 and B4, I start getting these errors.

Now I'm on my third server. I moved the exact same RAM to it, and it throws the same errors. So you'd think, well, maybe it's a bad RAM stick. Well, I've tried a bunch of different configurations with no luck in either server.

So I guess my big questions are: does the R520 support 1.5 V RAM? Is it something with this RAM and these servers?

Moderator • 8.7K Posts

December 14th, 2017 11:00

Hi,

It does support 1.5 V memory. Is the BIOS up to date? What is the existing memory in the system?
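If it's easier than rebooting into setup, the installed BIOS version can also be read remotely with racadm (a quick sketch; the IP and credentials are placeholders, and it assumes the remote RACADM tools are installed on your workstation):

# Print the system summary, including the running BIOS version
racadm -r <idrac-ip> -u <user> -p <password> getsysinfo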

3 Posts

December 14th, 2017 12:00

Yes, the BIOS is up to date. I'm pretty sure the new RAM just isn't Dell compatible. I did a chat with Dell. I took the second CPU out and booted with four sticks of the new RAM, and it crashes. But if I boot it with four sticks of the old 1.35 V RAM, it boots just fine.


1 Rookie • 25 Posts

October 5th, 2018 04:00

I have the same errors on my R520 (2x CPU, 128 GB RAM).

Each time this error is logged, Windows crashes to a BSOD and reboots (Kernel-Power event 41: "The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.").
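For reference, the iDRAC System Event Log can be dumped with racadm to match the VLT0304 timestamps against the Kernel-Power 41 events (a sketch; the IP and credentials are placeholders):

# Dump the iDRAC System Event Log, with timestamps, to the console
racadm -r <idrac-ip> -u <user> -p <password> getsel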

[Screenshots attached: hardware diagnostics, System Event Log, firmware inventory, CPU details, and individual memory details]

Memory modules are all the same:

CurrentOperatingSpeed     1066 MHz
DeviceType                Memory
LastSystemInventoryTime   2018-10-05T04:48:03
LastUpdateTime            2018-09-26T22:22:59
ManufactureDate           Mon Nov 01 07:00:00 2010 UTC
Manufacturer              Samsung
MemoryType                DDR-3
Model                     DDR3 DIMM
PartNumber                M393B2K70CM0-CF8
PrimaryStatus             OK
Rank                      Quad Rank
SerialNumber              42633DFF
Size                      16384 MB
Speed                     1066 MHz

 

Please advise.

October 18th, 2018 10:00

I have the same error messages on a PowerEdge R520 running VMware ESXi 6.5.0 U2, using Crucial RAM (CT3361848, DDR3-1600 ECC RDIMM) in combination with the stock Dell memory (V89WF):

VLT0304 CPU 1 MEM VDDQ PG voltage is outside of range.
VLT0304 CPU 1 MEM VTT PG voltage is outside of range.

The server crashes completely and reboots anywhere from a few minutes to several hours after system startup.
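The events can also be confirmed from the ESXi shell (a sketch; it assumes SSH is enabled, and I'm not certain every 6.5 build exposes the ipmi namespace):

# Physical memory summary as the hypervisor sees it
esxcli hardware memory get

# On builds that expose the ipmi namespace, read the BMC's System
# Event Log directly; the VLT0304 entries show up here with timestamps
esxcli hardware ipmi sel list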

The interesting thing is that this server ran for over a year with a mix of Dell/Crucial RAM with no problems at all. The errors began immediately after upgrading the BIOS from 2.5.1 to 2.6.0. Unfortunately, rolling back to the 2.5.1 BIOS didn't help, as the voltage errors seem to be permanent with the Crucial RAM. The system is unstable running either BIOS 2.5.1 or 2.6.0.

I had the manufacturer exchange the Crucial DIMMs, but the replacements exhibited the same problems, and the errors occur no matter which DIMMs are populated. I have worked extensively to troubleshoot this issue, both internally and with Dell support.

At this point our only recourse is to populate the system exclusively with Dell memory.

1 Rookie • 25 Posts

October 19th, 2018 09:00

 

Since you mention that the issue started when upgrading the BIOS from 2.5.1 to 2.6.0, but rolling back the BIOS to 2.5.1 did not solve the problem, maybe the problem was introduced by the Xeon microcode update included with 2.6.0?

That would explain why rolling back the BIOS did not help: the Xeon microcode is not downgraded with the BIOS rollback. (And the new Xeon microcode might not be correctly supported by the current hardware or BIOS?)

My two CPUs are Xeon E5-2430, stepping C2, microcode version 0x713 (normally not in the change list below).

 

Dell Server BIOS PowerEdge R520 Version 2.6.0
Fixes & Enhancements
Fixes
-None for this release.

Enhancements
-Enhancement to address security vulnerability CVE-2018-3639 (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-3639).
-Enhancement to address security vulnerability CVE-2018-3640 (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-3640).
-Updated the Intel Xeon Processor E5-2600 v2 Product Family Processor Microcode to version 0x42D.
-Updated the Intel Xeon Processor E5-2600 Product Family Processor (C-1 Stepping) Microcode to version 0x61D.
-Updated the Intel Xeon Processor E5-2600 Product Family Processor (C-2 Stepping) Microcode to version 0x714.

October 25th, 2018 08:00

Thanks for the suggestion! I hadn't considered that the Intel microcode update included with BIOS 2.6.0 wouldn't be rolled back along with the BIOS downgrade. That could certainly be the cause of this problem, because the system was stable for a year with the mixed memory and only exhibited problems after the 2.6.0 update. The ProSupport technicians swore up and down that the BIOS update wasn't the problem, but if it included a separate CPU microcode update, that would certainly be applicable.

This particular PowerEdge R520 has a pair of Intel Xeon CPU E5-2430 0 @ 2.20GHz, Model 45, Stepping 7. I'm not sure how to find the microcode version for the CPU; the rest I found in the iDRAC inventory.

 

1 Rookie • 25 Posts

October 25th, 2018 09:00


@networksvc wrote:

This particular PowerEdge R520 has a pair of Intel Xeon CPU E5-2430 0 @ 2.20GHz, Model 45, Stepping 7. I'm not sure how to find the microcode version for the CPU; the rest I found in the iDRAC inventory.

 


Mine has exactly the same pair of Intel Xeon E5-2430 @ 2.2 GHz, Model 45, Stepping 7.
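On finding the microcode version: on a Linux host it is reported in /proc/cpuinfo, and from the ESXi shell, vsish exposes per-CPU details (a sketch; the vsish node path is from memory and may differ between builds):

# Linux: loaded microcode revision for the first logical CPU
grep -m1 microcode /proc/cpuinfo

# ESXi shell: per-CPU details, including the microcode version
vsish -e cat /hardware/cpu/cpuList/0 | grep -i microcode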

So this seems to be an issue with at least every R520 using this dual-CPU configuration. Maybe a design issue?

Help from Dell would be appreciated!

1 Rookie • 25 Posts

October 26th, 2018 03:00


@networksvc wrote:

Thanks for the suggestion! I hadn't considered that the Intel microcode update included with BIOS 2.6.0 wouldn't be rolled back along with the BIOS downgrade. That could certainly be the cause of this problem, because the system was stable for a year with the mixed memory and only exhibited problems after the 2.6.0 update. The ProSupport technicians swore up and down that the BIOS update wasn't the problem, but if it included a separate CPU microcode update, that would certainly be applicable.

This particular PowerEdge R520 has a pair of Intel Xeon CPU E5-2430 0 @ 2.20GHz, Model 45, Stepping 7. I'm not sure how to find the microcode version for the CPU; the rest I found in the iDRAC inventory.

 


 

Did you disable Hyperthreading (virtual cores) in the BIOS?

October 26th, 2018 06:00


Did you disable Hyperthreading (virtual cores) in the BIOS?


I have not, but do you have reason to believe this would resolve the issue? What kind of performance hit might be expected?

This machine is a hypervisor running VMware ESXi 6.5.

1 Rookie • 25 Posts

October 28th, 2018 05:00


@networksvc wrote:

Did you disable Hyperthreading (virtual cores) in the BIOS?


I have not, but do you have reason to believe this would resolve the issue? What kind of performance hit might be expected?


Mine has HT disabled, so it is not related.

I suspect the issue could arise with the number of memory modules installed.

Maybe the total power consumption exceeds what the server's power delivery was designed to handle, which would explain why the voltage drops below normal and is logged as an error by iDRAC.

This could explain why some memory modules are fine: other manufacturers' modules may simply draw more power. The total electrical load from the number of CPUs plus the number of memory modules could result in the voltage dropping.

This is just an idea, as Dell is not responding.

November 2nd, 2018 13:00

UPDATE:

I received a new set of Dell-branded DIMMs that are manufactured by Samsung. If I install them alongside the stock Dell-branded Micron DIMMs, the system crashes with the CPU voltage error. The Samsung memory works fine on its own, though... AND it works alongside the Crucial DIMMs that were causing the problem previously. So the only issue I have is with the original Dell/Micron RAM. I'm asking Dell to replace those DIMMs under warranty.

1 Message

May 5th, 2020 13:00

@kariboo it looks like you are using quad-rank 16 GB RDIMMs. This is unsupported in the R520 according to the Owner's Manual (see the note in yellow at the top of the memory section). A colleague of mine saw the same voltage errors described in this thread while using quad-rank 16 GB RDIMMs in an R520 with a single E5-2430L v2. After discovering the note in the manual, he switched to dual-rank 8 GB LV RDIMMs and the errors went away.
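If you want to confirm the rank of your modules without pulling them, dmidecode reports it on most Linux systems (a sketch; the Rank field depends on the SMBIOS version the firmware implements):

# SMBIOS type 17 describes each DIMM; Rank distinguishes
# dual-rank from quad-rank modules
sudo dmidecode -t memory | grep -E 'Locator|Part Number|Rank|Size'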

In my case, I am using dual-rank 16 GB HP/Micron DDR3-1600 PC3-12800 ECC Reg CL11 1.5 V modules (MT36JSF2G72PZ-1G6E1HI) in an R520 with dual E5-2430L v2 CPUs. The part number is consistent across all sticks, and this configuration is supported as far as I can tell. However, I too was seeing the same errors: the machine would POST and recognize the RAM, yet about 15 seconds into OS initialization the machine would reboot, and MEM VTT and VDDQ errors would end up in the log.

Rotating the DIMMs made no difference. The machine and OS would start fine as long as the A3/B3 pair were not populated. Oddly, with all six DIMMs installed in A1, A2, A3, B1, B2, and B3, memtest86 ran the full 24-hour test regimen with zero errors.

Ultimately, at the behest of another colleague, I moved the A3/B3 DIMMs to A4/B4 (leaving A3 and B3 empty) and, lo and behold, the OS started. It's not clear from the documentation that this is a supported layout, and it's not clear what's wrong with the third channel in my R520, but it seems to have worked.

Rob
https://impetus.us/

1 Message

October 12th, 2022 11:00

I was having these same kinds of issues with my eBay R320 after upgrading from BIOS 2.1.3 to 2.9.0. I tried multiple sets of RAM that I swapped from other Dell servers I own, so I know the RAM was good. On BIOS 2.9.0, the system could not go more than 12 hours without some kind of MEM voltage error causing a flat-out reboot.

Anyway, after reading several threads on the Dell community forums, I figured it must be either a bad CPU or a bad combination of CPU and BIOS. The server originally had a quad-core Xeon E5-1410. I replaced it with a ten-core Xeon E5-2470 v2, let the system run on the new CPU at BIOS version 2.1.3 for several days, then "rolled back" to 2.9.0 using the iDRAC Lifecycle Controller. It has been over 24 hours on the 2.9.0 BIOS and the E5-2470 v2, so I'm hopeful this will work.
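In case anyone wants to script the flashing instead of clicking through the UI, I believe recent racadm builds can push a Dell Update Package directly (a sketch; the filename is a placeholder and I'm going from memory on the syntax):

# Upload and apply a BIOS update package; the server reboots to flash it
racadm -r <idrac-ip> -u <user> -p <password> update -f <BIOS_update_package.EXE>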

 
