I have two PowerEdge R710 servers, bought in 2010. The first one came up with an error message about a year ago, and today, the other one went down with the exact same error.
E171F PCIe fatal error on Bus 0 Device 4 Function 0.
On the first occasion, I found descriptions telling me to shut down the server, drain the power, and remove and reseat all the add-in cards. This did not help at all on the first server, and I haven't bothered doing it on the second one. There is obviously a severe weakness in these servers. Can anyone tell me which component this error refers to, so that I can replace it?
What OS is on the server? I ask because different OSes have different methods. Bus 0 Device 4 Function 0 refers to something residing on the center riser. Is there anything installed in slot 1 or 2 on the riser? If not, the issue may be with the last slot, which holds the RAID controller.
Let us know what you see on the riser and we can go from there.
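If the OS turns out to be Linux, one way to see what sits at that address is to convert the bus/device/function numbers from the error into the bb:dd.f form that lspci expects. The snippet below is only a rough sketch: the function name bdf_from_message is made up for illustration, and it assumes the numbers in the message are decimal.

```python
import re

def bdf_from_message(message):
    """Turn 'Bus 0 Device 4 Function 0' into the 'bb:dd.f' form used by lspci.

    Assumption: the numbers in the error message are decimal.
    """
    match = re.search(r"Bus (\d+) Device (\d+) Function (\d+)", message, re.IGNORECASE)
    if match is None:
        raise ValueError("no bus/device/function triple found in message")
    bus, device, function = (int(part) for part in match.groups())
    # lspci addresses are hexadecimal: two digits each for bus and device, one for function
    return f"{bus:02x}:{device:02x}.{function:x}"

print(bdf_from_message("E171F PCIe fatal error on Bus 0 Device 4 Function 0."))
```

The resulting address can then be passed to lspci, for example lspci -s 00:04.0 -vv, to show which device or bridge the system has at that location.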
Riser 2 is empty. Riser 1 has the RAID controller at the bottom, and a network card in the uppermost slot.
It seems to me that the server fails whenever I put load on it. It may run for weeks without error, but if I try to actually use it, it fails.
I would say your best bet is to pull that lone network card and see whether the error keeps getting flagged. If it does, my view is that the cause is likely either the risers themselves or the RAID controller, as the error is clearly referencing the expansion slots.
Since the error occurs whenever I put load on the server, and the network card is not used (I use only the four onboard NICs), I suppose there is not much chance that the error will disappear if I remove the NIC.
Two identical servers + two identical errors = hw sucks + my last Dell!