E171F PCIe fatal error on R710

I have two PowerEdge R710 servers, bought in 2010. The first one came up with an error message about a year ago, and today, the other one went down with the exact same error.

E171F PCIe fatal error on Bus 0 Device 4 Function 0.

On the first occasion, I found descriptions telling me to shut down the server, drain the power, and remove and reinstall all the add-in cards. This did not help at all on the first server, and I haven't bothered doing it on the second one. There is obviously a severe weakness in these servers. Can anyone tell me which component this error refers to, so that I can replace it?

6 Replies
Moderator

RE: E171F PCIe fatal error on R710

Thorvald,

What is the OS on the server? I ask because different OSes have different methods. Bus 0 Device 4 Function 0 refers to something residing on the center riser. Is there anything installed in slot 1 or 2 on the riser? If not, then the issue may be with the last slot, the RAID controller.

Let us know what you see on the riser and we can go from there.
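
For cross-checking against a PCI device listing, the triple in the error message can be rewritten in the bus:device.function notation that tools such as lspci use. The snippet below is only a minimal sketch: the helper name `to_bdf` is hypothetical, the message format is taken from the error quoted in this thread, and it assumes the numbers in the message are decimal (for 0, 4, 0 the result is the same either way).

```python
import re

# Hypothetical helper; the pattern matches the wording of the error quoted in this thread.
_TRIPLE = re.compile(r"Bus\s+(\d+)\s+Device\s+(\d+)\s+Function\s+(\d+)", re.IGNORECASE)


def to_bdf(message: str) -> str:
    """Convert 'Bus B Device D Function F' into bus:device.function notation."""
    match = _TRIPLE.search(message)
    if match is None:
        raise ValueError("no Bus/Device/Function triple found in message")
    # Assumption: the numbers in the message are decimal.
    bus, device, function = (int(group) for group in match.groups())
    # Conventional notation: two hex digits for bus and device, one for function.
    return f"{bus:02x}:{device:02x}.{function:x}"


print(to_bdf("E171F PCIe fatal error on Bus 0 Device 4 Function 0."))
```

The resulting address can then be compared with whatever device list the OS provides, to see which slot or bridge sits at that location.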

Chris Hawk

Dell | Social Outreach Services - Enterprise


RE: E171F PCIe fatal error on R710

Both servers are running VMware ESXi 5.1.

I'll have to come back later regarding what's on the riser card.


RE: E171F PCIe fatal error on R710

Hi

Riser 2 is empty. Riser 1 has the RAID controller at the bottom and a network card in the uppermost slot.

It seems to me that the server fails whenever I put load on it. It may run for weeks without error, but if I try to actually use it, it fails.

Regards
Thorvald


Dylank

RE: E171F PCIe fatal error on R710

I would say your best bet would be to try pulling that lone network card and seeing if the error keeps getting flagged. If it does, my view is that it's likely either the risers themselves or the RAID controller, as the error is clearly referencing the expansion slots.


RE: E171F PCIe fatal error on R710

Since the error occurs whenever I put load on the server, and the network card is not used (I use only the four onboard NICs), I suppose there is not much of a chance that the error disappears if I remove the NIC.

Two identical servers + two identical errors = hw sucks + my last Dell!

😞
