Unsolved
This post is more than 5 years old
PCIE Fatal Err: Critical Event sensor, bus fatal error (Bus 0 Device 1 Function 0) was asserted
I found these errors in /var/log/messages and the System Event Log:
May 21 11:14:23 Server Administrator: Instrumentation Service EventID: 1014 System software event: Description: Err Reg Pointer: Link Tuning sensor, OEM Diagnostic data event was asserted Date and time of action: Tue Jul 7 07:54:11 1970
May 21 11:14:23 Server Administrator: Instrumentation Service EventID: 1014 System software event: Description: PCIE Fatal Err: Critical Event sensor, bus fatal error (Bus 0 Device 1 Function 0) was asserted Date and time of action: Tue Jul 7 07:54:11 1970
May 21 11:14:23 Server Administrator: Instrumentation Service EventID: 1014 System software event: Description: PCIE Fatal Err: Critical Event sensor, bus fatal error (Bus 1 Device 0 Function 0) was asserted Date and time of action: Tue Jul 7 07:54:11 1970
System: PowerEdge R710
Server OS: CentOS 5.5
lspci
00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 13)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 13)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 13)
00:04.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 4 (rev 13)
00:05.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 5 (rev 13)
00:06.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 6 (rev 13)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 13)
00:09.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 9 (rev 13)
00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 13)
00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 13)
00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 13)
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801IB (ICH9) 2 port SATA IDE Controller (rev 02)
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
03:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)
07:00.0 PCI bridge: Intel Corporation 6702PXH PCI Express-to-PCI Bridge A (rev 09)
08:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
09:03.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200eW WPCM450 (rev 0a)
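The bus/device/function triples in the logged events correspond to the `bus:device.function` address at the start of each lspci line. Here is a minimal Python sketch of that lookup. The helper names (`parse_bdf`, `to_lspci_address`, `find_device`) are made up for illustration, and it assumes the numbers in the event text are decimal while lspci prints hexadecimal, which nothing in this thread confirms.

```python
import re

# Matches the "(Bus 0 Device 1 Function 0)" part of an event description.
BDF_PATTERN = re.compile(r"Bus (\d+) Device (\d+) Function (\d+)")


def parse_bdf(event_text):
    """Return (bus, device, function) from an event description, or None."""
    match = BDF_PATTERN.search(event_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def to_lspci_address(bus, device, function):
    """Format a triple the way lspci prints it, e.g. (0, 1, 0) -> '00:01.0'.

    Assumption: the logged numbers are decimal; lspci prints hexadecimal.
    """
    return f"{bus:02x}:{device:02x}.{function:x}"


def find_device(lspci_output, bus, device, function):
    """Return the lspci line for the given address, or None if absent."""
    address = to_lspci_address(bus, device, function)
    for line in lspci_output.splitlines():
        if line.strip().startswith(address + " "):
            return line.strip()
    return None


# Two lines copied from the lspci listing above.
LSPCI_SAMPLE = """\
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 13)
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
"""

event = (
    "PCIE Fatal Err: Critical Event sensor, bus fatal error "
    "(Bus 1 Device 0 Function 0) was asserted"
)
bdf = parse_bdf(event)
print(find_device(LSPCI_SAMPLE, *bdf))
```

Under that assumption, the two logged addresses resolve to PCI Express Root Port 1 (00:01.0) and a BCM5709 Ethernet port (01:00.0).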
Is this something I should worry about? I rebooted several times and the error didn't come back.
Not sure if I should log a job with Dell support.
Let me know if you need more info.
Thanks in advance.
coafark
38 Posts
0
December 10th, 2010 06:00
I found my PowerEdge R710 server (ESX 4.1i installed) hung during startup today, showing the same PCIE Fatal Err as above. Does anyone have a clue for the original poster?
vivekrs
2 Posts
0
March 1st, 2012 01:00
coafark: Was this resolved? If yes, how?
I got the exact same error on my PE R710, followed by a hardware error. My PCIe SSD drive is not working anymore:
System software event:
Description: Err Reg Pointer: OEM sensor, OEM Diagnostic data event was asserted
Date and time of action: Sun Mar 01 10:28:45 1970
System software event:
Description: PCIE Fatal Err: Critical Event sensor, bus fatal error (Bus 0 Device 9 Function 0) was asserted
Date and time of action: Sun Mar 01 10:28:45 1970
System software event:
Description: PCIE Fatal Err: Critical Event sensor, bus fatal error (Slot 3) was asserted
Date and time of action: Sun Mar 01 10:28:45 1970
-----
A fatal hardware error has occurred.
Component: PCI Express Root Port
Error Source: Generic
Bus:Device:Function: 0x0:0x9:0x0
Vendor ID:Device ID: 0x8086:0x3410
Class Code: 0x60400
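The Class Code field in a report like this packs three bytes defined by the PCI specification: base class, subclass, and programming interface. This rough Python sketch splits them apart. The function names and the small lookup tables are illustrative only and deliberately incomplete; they cover just the values relevant here.

```python
def decode_class_code(class_code):
    """Split a 24-bit PCI class code into (base class, subclass, prog IF)."""
    base_class = (class_code >> 16) & 0xFF
    subclass = (class_code >> 8) & 0xFF
    prog_if = class_code & 0xFF
    return base_class, subclass, prog_if


# Partial tables for illustration; the full lists are much longer.
BASE_CLASSES = {
    0x01: "Mass storage controller",
    0x02: "Network controller",
    0x03: "Display controller",
    0x06: "Bridge",
}
BRIDGE_SUBCLASSES = {
    0x00: "Host bridge",
    0x01: "ISA bridge",
    0x04: "PCI-to-PCI bridge",
}
VENDORS = {0x8086: "Intel Corporation"}


def describe(vendor_id, class_code):
    """Return a short human-readable summary of vendor and device class."""
    base_class, subclass, _prog_if = decode_class_code(class_code)
    vendor = VENDORS.get(vendor_id, f"unknown vendor 0x{vendor_id:04x}")
    kind = BASE_CLASSES.get(base_class, f"unknown class 0x{base_class:02x}")
    if base_class == 0x06:
        kind = BRIDGE_SUBCLASSES.get(subclass, kind)
    return f"{vendor}: {kind}"


# Values taken from the error report above.
print(describe(0x8086, 0x60400))
```

For the values in the report, this gives an Intel PCI-to-PCI bridge, which is consistent with the "PCI Express Root Port" component the report names at 0x0:0x9:0x0.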
abhishek.singh
1 Message
0
July 3rd, 2013 00:00
I am seeing a similar error on an R620. My system hangs with the message "PCI1320 Bus fatal error on bus 7 device 0 function 1. Power cycle system". Has anyone found a solution yet?
zsstorm
3 Posts
0
December 3rd, 2014 06:00
Running lspci on Linux will normally show you what is at Bus 7 Device 0 Function 1. Most systems have different cards installed, so the addresses will differ from one machine to another. I recommend running a DSET report (support.dell.com/dset) and calling support. In the report's data folder there is a RAWXML folder containing slots.xml. You can get a pretty good idea of which device is causing the error by searching that file for busnumber or devicenumber. As a best practice, keep the system updated to the latest versions, and remove any third-party cards to test.
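As a rough sketch of the lspci lookup, the Python below builds the selector that `lspci -s` accepts and runs the command. The function names are hypothetical. It assumes lspci (from pciutils) is installed, Python 3.7 or later, and that the numbers in the error message are decimal, which the message itself does not say.

```python
import subprocess


def lspci_selector(bus, device, function):
    """Build the 'bus:device.function' selector that `lspci -s` accepts.

    Assumption: the inputs are decimal; lspci expects hexadecimal fields.
    """
    return f"{bus:02x}:{device:02x}.{function:x}"


def describe_device(bus, device, function):
    """Run `lspci -s <selector>` and return its output.

    The result is an empty string when no device exists at that address.
    """
    result = subprocess.run(
        ["lspci", "-s", lspci_selector(bus, device, function)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

On the affected host, `describe_device(7, 0, 1)` would query address 07:00.1.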
zsstorm
3 Posts
0
December 3rd, 2014 06:00
That normally points to the network. One of your NICs, and possibly the motherboard, needs an update. I'd recommend either performing a platform update through the Lifecycle Controller or booting from Dell's latest repository ISO (dell.app.box.com/Bootabler710). The error may come back, and it may not. If it does, it may be an onboard NIC failure that, in the worst case, needs parts replaced.
zsstorm
3 Posts
0
December 3rd, 2014 07:00
vivekrs, your issue is whatever card you have in slot 3. Whether it's Dell hardware or third-party, update the firmware on the card and on the motherboard. If the error persists, pull the card. If it's Dell hardware, call support. They may ask you to swap card slots to see whether the error follows the card or only occurs when the card is in slot 3.
vivekrs
2 Posts
0
December 3rd, 2014 10:00
Looks like somebody is on a spring-cleaning spree... :)
This thread is from eons ago - I don't even remember posting that message! Whatever I did (or did not) fixed (or circumvented) the error I was having at the time.
Thanks for your help though zsstorm!