PowerEdge Hardware General

Last reply by 01-21-2015 Unsolved
Start a Discussion
2 Bronze
2 Bronze

PowerEdge 730 fatal error randomly !!

Hello everyone,

my fresh new server crash randomly on these errors :

A fatal error was detected on a component at bus 0 device 1 function 0.

A fatal error was detected on a component at bus 3 device 0 function 0.

Details of my usage :

- RAID 6 on my 8 disks local storage

- Firmware on Lifecycle controller are all updates

- Health status of the server on IDRAC8 : all good

- I'm using this hypervisor on my server VMware ESXi : 

VMware-VMvisor-Installer-5.5.0.update02-2068190.x86_64-Dell_Customized-A00.iso

I don't know how to do.

Someone could help me on this case please ?

Thank you in advance

Replies (3)

Hi,

We would need to find out what devices they are referring to, this will be different on every system depending on what hardware is installed. We will need to determine what the devices are either by booting to a live linux disk and running lspci or getting a DSET report. http://www.dell.com/support/article/us/en/19/SLN155926/EN


Thanks,

DELL-Josh Cr
Social Media and Communities Professional
Dell Technologies | Enterprise Support Services
#IWork4Dell

Did I answer your query? Please click on ‘Accept as Solution’. ‘Kudo’ the posts you like!

2 Bronze
2 Bronze

Hello, thank you for your quick answer.

So, when i'm using lspci on my ESXi SSH console, it's say that :

~ # lspci -v
0000:00:00.0 Host bridge Bridge: Intel Corporation Haswell-E DMI2 [PCIe RP[0000:00:00.0]]
Class 0600: 8086:2f00

0000:00:01.0 PCI bridge Bridge: Intel Corporation Haswell-E PCI Express Root Port 1 [PCIe RP[0000:00:01.0]]
Class 0604: 8086:2f02

0000:00:02.0 PCI bridge Bridge: Intel Corporation Haswell-E PCI Express Root Port 2 [PCIe RP[0000:00:02.0]]
Class 0604: 8086:2f04

0000:00:02.2 PCI bridge Bridge: Intel Corporation Haswell-E PCI Express Root Port 2 [PCIe RP[0000:00:02.2]]
Class 0604: 8086:2f06

0000:00:03.0 PCI bridge Bridge: Intel Corporation Haswell-E PCI Express Root Port 3 [PCIe RP[0000:00:03.0]]
Class 0604: 8086:2f08

0000:00:03.1 PCI bridge Bridge: Intel Corporation Haswell-E PCI Express Root Port 3 [PCIe RP[0000:00:03.1]]
Class 0604: 8086:2f09

0000:00:03.2 PCI bridge Bridge: Intel Corporation Haswell-E PCI Express Root Port 3 [PCIe RP[0000:00:03.2]]
Class 0604: 8086:2f0a

0000:00:05.0 System peripheral Generic system peripheral: Intel Corporation Haswell-E Address Map, VTd_Misc, System Management
Class 0880: 8086:2f28

0000:00:05.1 System peripheral Generic system peripheral: Intel Corporation Haswell-E Hot Plug
Class 0880: 8086:2f29

0000:00:05.2 System peripheral Generic system peripheral: Intel Corporation Haswell-E RAS, Control Status and Global Errors
Class 0880: 8086:2f2a

0000:00:05.4 PIC Generic system peripheral: Intel Corporation Haswell-E I/O Apic
Class 0800: 8086:2f2c

0000:00:11.0 : Intel Corporation Wellsburg SPSR
Class ff00: 8086:8d7c

0000:00:11.4 SATA controller Mass storage controller: Intel Corporation Wellsburg AHCI Controller [vmhba1]
Class 0106: 8086:8d62

0000:00:16.0 Communication controller Communication controller: Intel Corporation Wellsburg MEI Controller #1
Class 0780: 8086:8d3a

0000:00:16.1 Communication controller Communication controller: Intel Corporation Wellsburg MEI Controller #2
Class 0780: 8086:8d3b

0000:00:1a.0 USB controller Serial bus controller: Intel Corporation Wellsburg USB Enhanced Host Controller #2
Class 0c03: 8086:8d2d

0000:00:1c.0 PCI bridge Bridge: Intel Corporation Wellsburg PCI Express Root Port #1 [PCIe RP[0000:00:1c.0]]
Class 0604: 8086:8d10

0000:00:1c.7 PCI bridge Bridge: Intel Corporation Wellsburg PCI Express Root Port #8 [PCIe RP[0000:00:1c.7]]
Class 0604: 8086:8d1e

0000:00:1d.0 USB controller Serial bus controller: Intel Corporation Wellsburg USB Enhanced Host Controller #1
Class 0c03: 8086:8d26

0000:00:1f.0 ISA bridge Bridge: Intel Corporation Wellsburg LPC Controller
Class 0601: 8086:8d44

0000:00:1f.2 SATA controller Mass storage controller: Intel Corporation Wellsburg AHCI Controller [vmhba2]
Class 0106: 8086:8d02

0000:01:00.0 Ethernet controller Network controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet [vmnic0]
Class 0200: 14e4:165f

0000:01:00.1 Ethernet controller Network controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet [vmnic1]
Class 0200: 14e4:165f

0000:02:00.0 Ethernet controller Network controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet [vmnic2]
Class 0200: 14e4:165f

0000:02:00.1 Ethernet controller Network controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet [vmnic3]
Class 0200: 14e4:165f

0000:03:00.0 RAID bus controller Mass storage controller: LSI / Symbios Logic Dell PERC H730 Mini [vmhba0]
Class 0104: 1000:005d

I've no more information at this time.

For more details, i attached screen of my RAID configuration, etc.

Thanks in advance

So based on that the devices are 0000:00:01.0 PCI bridge Bridge: Intel Corporation Haswell-E PCI Express Root Port 1 [PCIe RP[0000:00:01.0]]

And 0000:03:00.0 RAID bus controller Mass storage controller: LSI / Symbios Logic Dell PERC H730 Mini [vmhba0]

 

So that would be the RAID controller that is causing it. Since it is a new system I would try to reseat the controller and make sure that everything is contacting well. The RAID config you posted looks fine.


Thanks,

DELL-Josh Cr
Social Media and Communities Professional
Dell Technologies | Enterprise Support Services
#IWork4Dell

Did I answer your query? Please click on ‘Accept as Solution’. ‘Kudo’ the posts you like!

Top Contributor
Latest Solutions