[Isilon] Gen6 node split from cluster after reboot
Summary: Gen6 nodes in the cluster log events reporting that they cannot communicate with the BMC, and the nodes may appear in a read-only (RO) state. Rebooting an affected node causes it to split from the cluster.
Symptoms
On the cluster, isi event may show the following error:
14.99844 03/14 13:38 C 17 203764 The Baseboard Management Controller (BMC) located in chassis xxxxxxxxxxx, slot 4 is not responding. This controller monitors hardware components such as batteries and power supplies. To ensure these hardware components continue to be monitored, service the BMC as soon as possible.
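As a rough aid for spotting this event across many saved event lines, the sketch below pulls the chassis and slot out of the message text. It matches only the message wording shown above; the function name and the idea of feeding it saved event output are assumptions for illustration, not part of any OneFS tooling.

```python
import re

# Matches the message text of the BMC event shown above.
# Only the wording of the message is relied on, not the column layout.
BMC_EVENT = re.compile(
    r"Baseboard Management Controller \(BMC\) located in chassis "
    r"(?P<chassis>[^,\s]+), slot (?P<slot>\d+) is not responding"
)


def find_bmc_events(lines):
    """Return (chassis, slot) pairs for every BMC-not-responding event line."""
    hits = []
    for line in lines:
        match = BMC_EVENT.search(line)
        if match:
            hits.append((match.group("chassis"), int(match.group("slot"))))
    return hits
```

Passing the lines of a saved event listing to find_bmc_events gives a quick list of which chassis slots have reported the condition.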
If the node is rebooted while a serial connection is attached, the following appears in the ePOST output:
Copyright (c) EMC Corporation , 2021
Disk Array Subsystem Controller
Model: Infinity Banshee: Isilon
DiagName: Extended POST
DiagRev: Rev 28.15
Build Date: Mon May 17 22:52:58 2021
BiosRev: 37.41
UEFIFWVolRev: Rev 03.43
FixedSERDESRev: Rev 08.00
BMCMainAppRev: 00.00
BMCSSPRev: 00.00
BMCEMCBBRev: 00.00
StartTime: 03/10/2022 22:10:45
SaSerialNo:
ABCDabcEaFaGHIabJabcKabcdefLabcdefMabcdefNOabPabcQRS
SPI_Buffer_Mgmt::Initialize(): Could not read NVRAM mem persistence struct
TUVWXYZAABBCCabDDabEEabFFabGGHHIIJJKKLLMMNNabOOPPabcQQRRSSTTUUVVWWXXYYZZAAABBBCCCDDD
************************************************************
* Extended POST Messages
************************************************************
WARNING:Failed to set fault/status code value:0x01000000 (offset:0x02A8)
INFORMATION:POST Start
WARNING:INIT: IPMI Error Reading Chassis Resume PROM (0x03E0)
WARNING:INIT: IPMI Error Reading SP Resume PROM (0x03E0)
WARNING:Failed to read Boot Options structure from Virtual EEPROM (0x03E0)
WARNING:FRU capability Register in SP resume is not set correctly, VRD components are not recognized
WARNING:Error reading SLIC status sensor or sensor scanning could be disabled for HBA0 card (0x03E0)
WARNING:Error reading SLIC status sensor or sensor scanning could be disabled for HBA1 card (0x03E0)
WARNING:Error reading SLIC status sensor or sensor scanning could be disabled for Disk Interface Card (0x03E0)
WARNING:Unable to read VEEPROM Shared Data Region (0x03E0)
WARNING:Failed to read Boot Options structure from Virtual EEPROM (0x03E0)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Couldn't read Chassis Status
WARNING:Couldn't read sensor or sensor scanning could be disabled (0x90)
WARNING:Couldn't read sensor or sensor scanning could be disabled (0x98)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Couldn't read CMD Sequencer board ID (0x03E0)
WARNING:CMD Sequencer: Failed to read Board ID. No +/-6CMD tolerance applied (BMC code: 0x00)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000025 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000026 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000027 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000003 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x0100002B (offset:0x02A8)
WARNING:Couldn't read Infinity DIB CMD board ID (0x03E0)
WARNING:Infinity DIB CMD: Failed to read Board ID. No +/-6CMD tolerance applied (BMC code: 0x00)
WARNING:Failed to set fault/status code value:0x01000015 (offset:0x02A8)
WARNING:PSA not present
WARNING:Failed to set fault/status code value:0x01000016 (offset:0x02A8)
WARNING:PSB not present
WARNING:Failed to set fault/status code value:0x01000017 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000025 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000025 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000026 (offset:0x02A8)
WARNING:M.2 SATA 0 FW FileSystem Inaccessible (0x00FF)
WARNING:Failed to set fault/status code value:0x01000027 (offset:0x02A8)
WARNING:Firmware Update Skipped Due To PSA Status
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x0100002B (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x0100002B (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x0100002B (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000002 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000015 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000016 (offset:0x02A8)
WARNING:Failed to set fault/status code value:0x01000017 (offset:0x02A8)
************************************************************
EndTime: 03/10/2022 22:10:58
.... Storage System Failure - Contact your Service Representative ...
ErrorCode: 0x000003E0
ErrorDesc:
FRU: Motherboard
Device: BMC
Description: IPMI Transport Protocol Not Found Error!
Rev: 28.15
Send_IPMI_CMD()
Failed to get BMC self test results
Self Test Results*
P/N:
S/N:
EndError:
ErrorTime: 03/10/2022 22:10:45
ErrorCode: 0x000003EA
ErrorDesc:
FRU: Motherboard
Device: NVRAM
Description: NVRAM IPMI Protocol Not Found Error!
Rev: 28.15
EMC NVRAM IPMI Protocol not found!
Initialize NVRAM*
P/N:
S/N:
EndError:
ErrorTime: 03/10/2022 22:10:45
ErrorCode: 0x000001BB
ErrorDesc:
FRU: Motherboard
Device: CMOS Bank 1
Description: BIOS SMI Handler is disabled Error!
Rev: 28.15
CMOS Init*
P/N:
S/N:
EndError:
ErrorTime: 03/10/2022 22:10:45
ErrorCode: 0x000003E0
ErrorDesc:
FRU: Motherboard
Device: LAN Management Port
Description: IPMI Transport Protocol Not Found Error!
Rev: 28.15
Send_IPMI_CMD()
Failed to get/verify BMC MAC address
Verify-Set BMC MAC Addr*
P/N:
S/N:
EndError:
ErrorTime: 03/10/2022 22:10:46
ErrorCode: 0x000003E0
ErrorDesc:
FRU: Motherboard
Device: BIOS ROM
Description: IPMI Transport Protocol Not Found Error!
Rev: 28.15
Send_IPMI_CMD()
Failed to set SSP power reset request(code: 0x01)
EndError:
ErrorTime: 03/10/2022 22:10:57
ErrorCode: 0x000003E0
ErrorDesc:
FRU: Motherboard
Device: M.2 SATA 1
Description: IPMI Transport Protocol Not Found Error!
Rev: 28.15
Send_IPMI_CMD()
Failed to get sensor reading
Sensor ID: 0x90
EndError:
ErrorTime: 03/10/2022 22:10:53
EMC Extended POST End: 03/10/2022 22:11:08
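The ePOST capture above is long, so a small parser can help summarize it. The sketch below collects the BMC firmware revision fields and the ErrorCode blocks from a saved serial log. The check in looks_like_unresponsive_bmc (all BMC revisions reported as 00.00 plus at least one 0x000003E0 error) is an assumption drawn from this one capture, not a documented Dell diagnostic rule, and the function names are hypothetical.

```python
import re


def parse_epost(text):
    """Extract BMC revision fields and error blocks from ePOST output."""
    # Fields such as "BMCMainAppRev: 00.00" near the top of the output.
    revs = dict(re.findall(r"^[ \t]*(BMC\w+Rev):[ \t]*(\S+)", text, re.MULTILINE))

    errors, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("ErrorCode:"):
            # Each error block opens with ErrorCode and closes with EndError.
            current = {"code": line.split(":", 1)[1].strip(),
                       "device": "", "description": ""}
        elif current is None:
            continue
        elif line.startswith("Device:"):
            current["device"] = line.split(":", 1)[1].strip()
        elif line.startswith("Description:"):
            current["description"] = line.split(":", 1)[1].strip()
        elif line.startswith("EndError:"):
            errors.append(current)
            current = None
    return revs, errors


def looks_like_unresponsive_bmc(text):
    """Heuristic (an assumption): zeroed BMC revisions plus a 0x3E0 error."""
    revs, errors = parse_epost(text)
    all_zero = bool(revs) and all(value == "00.00" for value in revs.values())
    ipmi_error = any(int(err["code"], 16) == 0x3E0 for err in errors)
    return all_zero and ipmi_error
```

Run against the capture above, parse_epost would report the three BMC revision fields as 00.00 and list each error block with its device and description, which is easier to scan than the raw output.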
The node continues to boot, but at the end of the boot sequence the following message appears:
Secondary backup method None is unknown
None backup is invalid.
Initializing with a new backup of current data.
Secondary backup method None is unknown
None backup does not exist.
Installation config method is PSF: Pulling PSI config files from backup
Secondary backup method None is unknown
Unable to pull files from None backup
Installation config method is PSF: Unable to pull secondary backup files
Installation config method is PSF: Copying PSI config files
PSF file path /mfg/psi/psf.json does not exist in primary backup
Failed to copy PSI config files: Chassis is missing a PSI receipt.
A receipt must be provided before the boot will be allowed to continue.
To prevent a DL situation, contact Dell EMC Customer Support immediately:
United States: 1 800 782 4362 (1 800 SVC 4EMC)
Canada: 1 800 543 4782 (1 800 543 4SVC)
Worldwide Country Code: 1 508 497 7901
Command Options:
1) Enter recovery shell
2) Continue booting
3) Reboot
option>
Cause
The BMC on the affected nodes stopped responding because of a UDP storm on the 1GbE network.
The issue is specific to Gen6 (A200/A2000/H400/H500/H600/H5600/F800/F810) only. It does not affect PowerScale or PowerScale Hybrid nodes.
Resolution
Apply Node Firmware Package (NFP) 11.6 or newer.
Workaround:
- Unplug the 1GbE interface from the affected nodes.
- Enter the recovery shell.
- Reboot the node.
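When checking a node against the NFP 11.6 minimum stated in this section, compare the version numerically, component by component, because a plain string or float comparison would rank 11.10 below 11.6. The sketch below assumes the installed NFP version is already known as a dotted string such as "11.6"; how that string is obtained is not covered here, and the function name is hypothetical.

```python
def nfp_at_least(installed, minimum="11.6"):
    """Return True if a dotted NFP version string meets the minimum."""
    def parts(version):
        # "11.10" -> (11, 10), so components compare as integers.
        return tuple(int(piece) for piece in version.strip().split("."))

    have, need = parts(installed), parts(minimum)
    # Pad the shorter tuple with zeros so "11.6" and "11.6.0" compare equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need
```

For example, nfp_at_least("11.10") is true while nfp_at_least("11.5") is false.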
Affected Products
Isilon A200, Isilon A2000, Isilon F800, Isilon F810, Isilon Gen6, Isilon H400, Isilon H500, Isilon H5600, Isilon H600
Article Properties
Article Number: 000213534
Article Type: Solution
Last Modified: 08 Sept 2023
Version: 2