Unsolved

December 26th, 2013 20:00

M18xR2 Graphics card failure (1033) and tentative replacement

Hello,

My M18x R2 has started BSODing in the past few days. It seems to crash under graphical stress only (including during the Nvidia graphics stress test) and gives me a reboot with a 1033 error. I've also received the beeps indicating a graphics card issue. Recently, I started seeing CPU temps reach the 100C ceiling. Though I don't know that the GPUs were reaching such heights, I feel that the GPUs (hopefully just one) have overheated. Much of what I researched seems to point in this direction. I have updated drivers, run Autopsy (though it said everything was fine), run a system scan (also fine) and tried a few other ways to see if it could be software or malware related.

The computer boots up fine and does tasks fine, with no graphical tears or anything abnormal, until the screen goes black and becomes unresponsive. I have two Nvidia 675Ms in SLI in there and I'm wondering if maybe one has become damaged and the other is still salvageable. I turned off SLI (or so I think: I disabled 3D performance and the indicator doesn't say SLI anymore) and it still crashes, so I have a feeling it is just the first GPU.

All that being said, here are my questions:

1. Is there a way to make it so only the second GPU is in use, so that I can test whether it still crashes while exclusively using that GPU? From my understanding there is not, and the GPU in slot 1 has to be working.

2. I have MINOR experience in disassembly (I've gone as far as removing the hard drives prior to this). After looking through guides, it seems like you can pull out the GPU, but will I need to retape cords or reapply thermal paste to the GPU? Or will the previous adhesives hold after I inspect the GPU?

3. For the past 4-5 months, I've been using HWiNFO64 to help regulate fan speeds, as I was noticing increased CPU temps. Could having all 3 fans on generate MORE heat than they push out, and thus have ruined the card? Or could it be dirty heat sinks or something of that ilk?

I currently don't have the money to send it directly to Alienware, nor to buy another GPU, as I'm traveling. But if I could remove GPU 2, place it in slot 1, and have the computer run without crashing for the immediate future, I could go from there.

Any help is appreciated!!!

Kojak

December 27th, 2013 11:00

bump

Moderator

2.7K Posts

December 27th, 2013 13:00

Hello Kojak!

Those video cards are definitely overheating. With this tool you will be able to check the temperature of each video card. If you are in warranty, send me a PM with your service tag so I can check if there is something I can do for you. If you are out of warranty, I would recommend cleaning all the fans and heat sinks and applying some fresh thermal paste. There are disassembly videos on YouTube that you can use as a guide on how to open the system.

December 27th, 2013 17:00

Thanks for the reply!

Sadly I am out of warranty by 6 months to a year and don't think I can afford to send it in or have the card replaced at the moment : (  I supplied additional information to some other forums and thought I'd supply you with some too, just in case it helps.


FIRST UPDATE:

Problem signature:
Problem Event Name: BlueScreen
OS Version: 6.1.7601.2.1.0.768.3
Locale ID: 1033

Additional information about the problem:
BCCode: 116
BCP1: FFFFFA8012356010
BCP2: FFFFF8800F645E64
BCP3: 0000000000000000
BCP4: 0000000000000002
OS Version: 6_1_7601
Service Pack: 1_0
Product: 768_1



I have a few others, but they are all 1033 with BCCode 116, though BCP1-4 do vary slightly (I assume that's normal).
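For anyone reading along, here is a minimal sketch of how a problem signature like the one above could be turned into something readable. Two facts it relies on: Locale ID 1033 is the Windows language identifier for English (United States) rather than an error code, and BCCode 116 is a hexadecimal bug check value (0x116, VIDEO_TDR_FAILURE, a display driver timeout). The function names and the tiny lookup tables are hypothetical and only cover the values seen in this thread.

```python
# Illustrative lookup tables only; they cover just the values in this thread.
KNOWN_BUGCHECKS = {0x116: "VIDEO_TDR_FAILURE"}
KNOWN_LOCALES = {1033: "en-US"}


def parse_problem_signature(text):
    """Collect 'Key: value' lines into a dict, skipping headings with no value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def describe(fields):
    """Translate the bug check code and locale ID into readable names."""
    code = int(fields["BCCode"], 16)  # BCCode is printed in hexadecimal
    locale = int(fields["Locale ID"])  # Locale ID is printed in decimal
    return {
        "bugcheck": f"0x{code:X}",
        "bugcheck_name": KNOWN_BUGCHECKS.get(code, "unknown"),
        "locale": KNOWN_LOCALES.get(locale, "unknown"),
    }


SAMPLE = """Problem signature:
Problem Event Name: BlueScreen
OS Version: 6.1.7601.2.1.0.768.3
Locale ID: 1033
BCCode: 116
BCP1: FFFFFA8012356010
"""

if __name__ == "__main__":
    print(describe(parse_problem_signature(SAMPLE)))
```

Read this way, the number that identifies the crash is the BCCode, and the 1033 only says the system language is US English.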

SECOND UPDATE:
In addition, I downloaded GPU-Z and am using HWiNFO64 to monitor heat as I test it now. Currently, the CPU is sitting at around 65C at the lobby screen of SMITE. This isn't a graphically stressful screen, though it does have action moving across it. GPU-Z, however, is showing a whopping 90-94C resting temp for GPU1 and it isn't even playing the game. Could this mean that it's the TEMP and not necessarily damage that is 1033ing me?
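If it helps to log temperatures without a GUI tool, the readings can also be pulled from the command line. This is a rough sketch that assumes the `nvidia-smi` utility shipped with the Nvidia driver is on the PATH and that these mobile GPUs report a temperature through it, which has not been checked here. The function names are made up for illustration.

```python
import subprocess

# Query one "index, temperature" line per GPU, with no header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_temps(output):
    """Map GPU index to temperature in C from lines such as '0, 92'."""
    temps = {}
    for line in output.strip().splitlines():
        parts = [part.strip() for part in line.split(",")]
        # Skip malformed lines and sensors that report a non-numeric value.
        if len(parts) == 2 and parts[0].isdigit() and parts[1].isdigit():
            temps[int(parts[0])] = int(parts[1])
    return temps


def read_gpu_temps():
    """Run nvidia-smi once and return the parsed temperatures."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_temps(result.stdout)


if __name__ == "__main__":
    print(read_gpu_temps())
```

Calling `read_gpu_temps()` every few seconds while a game is running would show whether the primary GPU climbs before each crash.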

I should also note that I do force all fans on (I've opened the case and checked that they are all spinning). With all the fans maxing out at 5000 when the CPU is at 90C, I was never getting the CPU higher than 90C, though it looks like the GPU could have been reaching much higher. The upper left portion of my laptop (where GPU1 sits) used to get very hot to the touch: not hot enough to burn, but hot enough to make me hesitate to touch it. I didn't chalk it up to much, as my old MacBook Pro could become molten hot lava (which isn't an excuse, just an experience).

THIRD UPDATE:
The computer's GPU doesn't go over 95C and probably averages 90C, which I've read isn't critical-damage heat. It continues to run with no tearing at all, and seems to BSOD (with no actual blue screen, just a restart and a message) about every 30 minutes with no warning while playing, or having recently played, a game that spins it up. I have had the laptop on and running Chrome for around an hour now and it hasn't crashed. It crashed at around 30 minutes every time when I was spinning it up.


Lastly,

I still need to blow the computer out and check the heat sinks. I'm currently not in a place where I can easily and readily find one, but I hope I can do it in the next 2 days.

December 27th, 2013 20:00

Hey Andrewsi!

Wow, thanks for such a helpful post, the most helpful by far.

I must have misled you: resting GPU temp is NOT 90C. It's resting now and sits at 60-65C; it was reaching 90C under load and hit a ceiling of about 95C. Sadly, I was not using a GPU monitoring app before now, so I can't tell if it went higher than that prior to this issue. A few people have said that I may have chipped a good 1 to 2 years of lifespan off the lap

I've been finding constantly conflicting reports of whether or not the temps could have caused permanent damage.

And I do have minor experience in disassembly. Most of my experience was way back in the mid-2000s though, so I don't know how much crosses over. I've watched a few videos on how to apply thermal paste (it looks fairly straightforward if you are accurate and careful), but I'm not sure I know how to reseat the pads, or which pads are appropriate for this specific laptop. I also don't know if I'd need to buy other forms of adhesive to retape wires (I noticed that to take the keyboard off you have to untape a cord path of some kind).

Would you happen to know if this particular error report (1033, BCCode 116) could be caused by the laptop shutting down to save itself from dangerous temps? Is that a thing? Haha. Also, the temps that you have suggested are for cards inside a laptop, correct? Not for towers and such.

Thanks so much man!

December 27th, 2013 20:00

Sorry for so many replies, but! When not playing games or doing other graphical things, I assume the laptop uses integrated graphics, so the cards are not in use at all? I ask because if the graphics cards are under even minor stress while using Chrome and such, then I can hope that they haven't received damage, as there has been no BSOD.

December 27th, 2013 20:00

And yes, I meant the primary GPU :) upper left-hand corner.

901 Posts

December 27th, 2013 20:00

Hi Kojakattack.

Firstly, 90 degrees C at idle is way too high for your GPU.

Can I confirm that when you say GPU1 you mean the primary GPU (left side, usually called GPU0)?

If that is the case then it should be idling somewhere between 40 and 55 degrees, going up to the high 70s or low 80s under load and playing games.

All is not lost just yet. You also mentioned you have some experience with disassembly? If you're up to it, then it may be a good idea to remove the primary GPU and repaste it. If it has been hot for a while it may also need the thermal pads repositioned and/or replaced. At least 50% of the GPUs I have removed from Alienware computers have thermal pad issues; they look like they have been assembled by a blind person. The pads are supposed to be sitting nicely on each individual component, but 50% of them are an absolute disgrace: pads mashed together, sitting half on half off, missing, broken, dried out - even screws not firmly tightened on the GPU heatsink. These are new laptops I'm talking about, not second-hand or refurbs.

So before you go swapping or spending up big try:

1. Cleaning your fans and heatsink fins.

2. Repasting your GPU and repositioning the thermal pads. (IC Diamond is the best but hard to use/apply; Arctic Silver 5 is good and a lot easier to use.)

 

See how you go after cleaning and fresh thermal paste. Here are some links you may find useful:

 

ftp://ftp.dell.com/Manuals/all-products/esuprt_laptop/esuprt_alienware_laptops/alienware-m18x-r2_owner%27s%20manual_EN-US.pdf

 

http://forum.notebookreview.com/alienware-17-m17x/561599-m17x-r3-cpu-gpu-re-pasting-guide-w-pics.html

 

http://esports.alienwarearena.com/forums/topic/10060/hardware/-thermal-paste-re-paste-lounge-/

901 Posts

December 30th, 2013 19:00

Hi again kojakattack - I forgot to ask what kind of GPUs you have, Nvidia or AMD.

60-65 degrees C is, in my opinion, still about 10-15% too high if you are running Nvidia GPUs. Personally I'm not happy unless I can get idle temps under 60 C, and I much prefer low-mid 50s.

Low-mid 90s C when playing games is too high (unless it's something like Metro: Last Light or BF4, or you're in the desert).
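The rule of thumb above can be written down as a quick check. This is only a sketch of the figures quoted in this thread: the 60 C idle limit comes from the "idle temps under 60 C" remark, and the 85 C load limit is an assumed reading of "high 70s or low 80s", not a manufacturer specification. The names are hypothetical.

```python
# Thresholds taken from the rule of thumb in this thread, not from a datasheet.
IDLE_MAX_C = 60  # "not happy unless I can get idle temps under 60 C"
LOAD_MAX_C = 85  # assumed cutoff for "high 70s or low 80s" under load


def assess(temp_c, under_load):
    """Return 'ok' or 'too hot' for a GPU reading in degrees C."""
    limit = LOAD_MAX_C if under_load else IDLE_MAX_C
    return "ok" if temp_c < limit else "too hot"


if __name__ == "__main__":
    # Readings reported earlier in the thread.
    print("idle 62C:", assess(62, under_load=False))
    print("load 94C:", assess(94, under_load=True))
```

By this measure both the 60-65 C idle and the 90-95 C gaming temps reported in this thread count as too hot, which fits the advice to clean and repaste before anything else.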

Mid 90s won't shorten your GPU life unless you consistently keep it there for quite a while. I've had Nvidia and AMD GPUs reach 105 degrees C during initial testing. Of course I fixed that as soon as I found it, and those laptops have not had any trouble. You should see some of my previous posts: I've had Alienware laptops that have had the thermal pads EATEN by insects.

http://en.community.dell.com/owners-club/alienware/f/3746/t/19457149.aspx

As you can imagine, the temps on that laptop were cooking eggs!!!

That was over 18 months ago, and the guy who purchased that laptop came back in November this year to upgrade his GPUs to GTX 680s. He told me the temps were fine all the time he had it.

The long and the short of it is, you won't know until you repaste and reseat the GPUs.

If you've had a bit of prior experience then you should be fine. There are step-by-step guides on how to do everything I've mentioned. The thermal paste application is like spreading cement when laying bricks: you start with a small amount in the middle of the GPU die and spread it out evenly across the whole surface. Do not miss any part, and use an old (CLEAN) laminated plastic card to spread it evenly. Use methylated spirits to clean the die and heatsink of any and all old compound, and use tweezers and possibly a razor blade to remove and re-apply old thermal tape. You will have to make a judgement call on the thermal tape; if it's been mashed together you may not be able to reuse it. There are pictures available on the internet to help you apply the thermal tape to the right component. They are hard to find but they ARE out there.

Don't worry about the "untape cord path". They are just ribbon cables with double-sided tape on them to keep them in position. There's nothing that is not reusable during disassembly/reassembly except the GPU thermal paste, and the thermal tape if it is damaged.

Sorry, I don't know if the error codes you have are associated with your current fault. Try sending a PM to Telsa1856 on this forum; he's pretty clued in with that sort of stuff.

Your laptop will not operate on integrated graphics unless you tell it to by hitting the Function and F7 keys to reboot. The default graphics are always the main GPUs.

Feel free to PM me if you want me to help you through the dis-assembly
