How to configure an ESX/ESXi host to capture a VMkernel coredump from a purple diagnostic screen
Summary: How to configure an ESX/ESXi host to capture a VMkernel coredump from a purple diagnostic screen
Symptoms
This article provides information on how to configure an ESX/ESXi host to capture a VMkernel coredump from a purple diagnostic screen.
- Configure ESX/ESXi to capture a VMkernel coredump
- Network Dump Collection for ESXi 5.0
- Disk Dump Collection for ESX/ESXi 3.x to 5.x
- Interpreting an ESX/ESXi host purple diagnostic screen
Configure ESX/ESXi to capture a VMkernel coredump
When a VMware ESX/ESXi host encounters a critical error and halts, it attempts to write diagnostic information to disk and/or the network, depending on its configuration. If the coredump feature was previously configured but no coredump was saved, there may be a problem with connectivity to the storage or with writing to it. In that case, consider saving the coredump to a different location.
Coredumps can be saved via Network Dump Collection or Disk Dump Collection.
Network Dump Collection for ESXi 5.0
- Network Dump Collector in VMware vSphere - VMware KB (1032051)
- Configuring ESXi 5.0 to capture a VMkernel coredump via Network Dump Collector - VMware KB (2002955)
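As a rough illustration of the network configuration described in KB 2002955, the following Python sketch only assembles and prints the `esxcli system coredump network` commands so they can be reviewed before being run in the ESXi Shell. The interface name `vmk0` and the collector address `192.0.2.10` are placeholder assumptions, and port 6500 is assumed to be the collector's listening port; substitute the values for your environment.

```python
def netdump_commands(vmk_interface, server_ipv4, server_port=6500):
    """Build the esxcli command lines that point an ESXi 5.0 host at a dump collector.

    All argument values are supplied by the caller; nothing is executed here.
    """
    base = "esxcli system coredump network"
    return [
        # Tell the host which VMkernel interface and collector to use.
        "%s set --interface-name %s --server-ipv4 %s --server-port %d"
        % (base, vmk_interface, server_ipv4, server_port),
        # Enable network coredump collection.
        "%s set --enable true" % base,
        # Display the resulting configuration for verification.
        "%s get" % base,
    ]


if __name__ == "__main__":
    # Placeholder values (assumptions): replace with your own interface and collector.
    for command in netdump_commands("vmk0", "192.0.2.10"):
        print(command)
```

Printing the commands rather than executing them keeps the sketch safe to run on any workstation; the final `get` command is included so the setting can be confirmed on the host.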
Disk Dump Collection for ESX/ESXi 3.x to 5.x
- Configuring ESXi 5.0 to capture a VMkernel coredump via Disk Dump - VMware KB (2004299)
- Configuring ESX/ESXi 3.x to 4.x to capture a VMkernel coredump via Disk Dump - VMware KB (2004297)
Note: ESX 3.x to 4.1 can also generate a coredump via the service console Linux kernel. The VMKcore diagnostic partition is used to store the VMkernel coredump, but the service console is placed on a VMFS datastore. Depending on the type of failure, a coredump can be generated from one or both components. See Configuring an ESX host to capture a Service Console coredump - VMware KB (1032962).
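For ESXi 5.0, the disk dump setup referenced in KB 2004299 uses the `esxcli system coredump partition` namespace. The Python sketch below only builds and prints those commands for review; it does not apply to ESX/ESXi 3.x to 4.x, which use different tooling. The partition name in the usage example is a hypothetical placeholder; when no partition is given, the sketch assumes you want the host to choose one automatically with `--smart`.

```python
def diskdump_commands(partition=None):
    """Build the esxcli command lines that enable a diagnostic coredump partition.

    partition: name of an existing diagnostic partition, or None to let the
    host select an accessible one automatically (--smart).
    """
    base = "esxcli system coredump partition"
    commands = [base + " list"]  # show the diagnostic partitions the host can see
    if partition is None:
        commands.append(base + " set --enable true --smart")
    else:
        commands.append(base + ' set --partition="%s"' % partition)
        commands.append(base + " set --enable true")
    commands.append(base + " get")  # confirm the active and configured partition
    return commands


if __name__ == "__main__":
    # Hypothetical partition name (assumption): take the real one from the "list" output.
    for command in diskdump_commands("mpx.vmhba2:C0:T0:L0:7"):
        print(command)
```

Run the printed commands in the ESXi Shell, and check the `get` output to confirm that a partition is both configured and active.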
Interpreting an ESX/ESXi host purple diagnostic screen
- Knowing where and what to look for can provide a great deal of insight as to what the issue *might* be.
- Some errors, such as Exception 14 (Page Fault), can have causes ranging from hardware to software. When in doubt, verify that the hardware is working as expected and that all PowerEdge BIOS and firmware updates have been reviewed and applied. Run hardware diagnostics and document any issues for reference.
- One check available from the local console is to press Num Lock, Scroll Lock, or Caps Lock and see whether the corresponding keyboard LEDs light up. If the LEDs respond, the issue is not a CPU or system board lockup. Start by checking the ESX/ESXi build information and release notes, and refer to the purple screen information.
- Take a screenshot or picture of the purple screen for later reference.
- Check for any error messages in the PowerEdge System Event Log (SEL) and note the status of the LCD and hardware lights.
- SEL information can be obtained via the iDRAC web interface.
Reference: VMware KB 1000328