Unsolved
This post is more than 5 years old
29 Posts
R720 HW error causes reboot or core crash
Hi!
I have the following error on an R720 server.
The iDRAC reports: A PCI parity error was detected on a component at bus 0 device 5 function 0.
It also reports a problem with PSU1, which is alternately detected and not detected.
The diagnostics tool reports:
Error Code:2000-0251
Validation: 74526
DELL-Daniel My
Moderator
6.2K Posts
October 2nd, 2017 11:00
Hello
What device is in that slot? This is most likely a PCIe card.
That just says that there are errors in the hardware log. You must clear the log before running diagnostics if it contains errors. I recommend saving the logs before clearing.
Thanks
raprop
29 Posts
October 2nd, 2017 12:00
It is a XenServer host, and I ran lspci -vv.
I think this is the device:
00:05.0 System peripheral: Intel Corporation Xeon E5/Core i7 Address Map, VTd_Misc, System Management (rev 07)
Subsystem: Dell Device 048c
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
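Output like the above can be checked mechanically for error bits. The following is only a rough sketch, assuming the full `lspci -vv` text has been saved to a string; the function name `asserted_error_flags` and the list of flag names are illustrative choices, not part of any Dell or pciutils tooling. It reports flags shown with `+` on the Status and DevSta lines, and ignores Control and DevCtl lines because those only show what is enabled.

```python
import re

# Flag names that indicate a logged error when followed by '+'.
# Assumption: this list is illustrative, not exhaustive; both older and
# newer lspci spellings are included.
ERROR_FLAGS = ("ParErr", "SERR", "CorrErr", "UncorrErr", "NonFatalErr",
               "FatalErr", "UnsuppReq", "UnsupReq")

# A device header starts at column 0, e.g. "00:05.0 ..." or "0000:00:05.0 ...".
HEADER = re.compile(
    r"^((?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s")


def asserted_error_flags(lspci_text):
    """Map each device address to the error flags set ('+') in Status/DevSta."""
    results = {}
    device = None
    for line in lspci_text.splitlines():
        header = HEADER.match(line)
        if header:
            device = header.group(1)
            results[device] = []
            continue
        stripped = line.strip()
        # Only status lines say what happened; control lines say what is enabled.
        if device and stripped.startswith(("Status:", "DevSta:")):
            for flag in ERROR_FLAGS:
                # The lookbehind keeps "FatalErr" from matching inside "NonFatalErr".
                if re.search(r"(?<![A-Za-z])" + flag + r"\+", stripped):
                    results[device].append(flag)
    return results
```

For the device shown above every flag ends in `-`, so this sketch would return an empty list for 00:05.0, meaning the OS-visible status registers show no error latched on that device at the time the command was run.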
All firmware is up to date: BIOS, PSU, NIC daughter card, NIC PCI card, RAID, iDRAC, etc.
I cleared the hardware log. It contained many errors of this type:
2017-09-28T19:06:41-0300 RDU0011 The power supplies are redundant.
2017-09-28T19:06:36-0300 RDU0012 Power supply redundancy is lost.
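To see how often and for how long redundancy drops, entries like these can be paired up. This is a minimal sketch, assuming the log has been exported as text lines in the "timestamp, message ID, message" layout shown above; the function name `redundancy_outages` is hypothetical, and the only message IDs it knows are the two that appear in this thread.

```python
from datetime import datetime


def redundancy_outages(log_lines):
    """Return (lost_at, restored_at) pairs from RDU0012/RDU0011 entries."""
    events = []
    for line in log_lines:
        parts = line.split(None, 2)
        # Keep only the two redundancy message IDs seen in this thread.
        if len(parts) < 2 or parts[1] not in ("RDU0011", "RDU0012"):
            continue
        # Timestamps look like 2017-09-28T19:06:41-0300 (UTC offset without colon).
        stamp = datetime.strptime(parts[0], "%Y-%m-%dT%H:%M:%S%z")
        events.append((stamp, parts[1]))

    # The log is listed newest first, so sort into chronological order.
    events.sort()

    outages = []
    lost_at = None
    for stamp, code in events:
        if code == "RDU0012" and lost_at is None:
            lost_at = stamp          # redundancy lost
        elif code == "RDU0011" and lost_at is not None:
            outages.append((lost_at, stamp))  # redundancy restored
            lost_at = None
    return outages


if __name__ == "__main__":
    lines = [
        "2017-09-28T19:06:41-0300 RDU0011 The power supplies are redundant.",
        "2017-09-28T19:06:36-0300 RDU0012 Power supply redundancy is lost.",
    ]
    for lost, restored in redundancy_outages(lines):
        print(lost.isoformat(), "lost for", (restored - lost).total_seconds(), "s")
```

For the two entries above, redundancy was lost for 5 seconds. Many short gaps like this would fit a PSU that is repeatedly dropping out and being detected again.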
The iDRAC Lifecycle log has many errors from the past few days.
Today I set the BIOS profile to high performance. Before that it was set to something else ("radc", I think).
Maybe it is some parameter in the latest BIOS?
Thx
DELL-Daniel My
Moderator
6.2K Posts
October 2nd, 2017 16:00
That doesn't seem to be a PCIe slot. It looks like the systems management bus.
All of the power messages could be caused by improper shutdowns. If the system is crashing due to PCIe fatal errors then the PSU errors could be a symptom of the problem.
I would suggest that you start with whatever changed recently. If anything was changed when this issue started then that is the likely cause. This could be a hardware or software issue.
Thanks