PowerStore: Landing Page for In Market off-release System Health Checks
Summary: Occasionally, health checks must be added after a PowerStoreOS version is released. These health checks are delivered through the thin package mechanism and identify various known issues in the PowerStore cluster. ...
Instructions
Background
Occasionally, issues are found after a PowerStoreOS release that are not detected by the OS's integrated health check and alert features. The Health Check thin package feature is used to deliver new health checks to an installed PowerStoreOS.
The Health check package contains health checks that are performed prior to a Non-Disruptive Upgrade (NDU). The package also includes general system health checks that are invoked, on demand, from the PowerStore Manager (Monitoring > System Checks > Run System Check).
The health check package must be uploaded to the PowerStore cluster and then installed.
| IMPORTANT |
|---|
| For detailed instructions on installing and using the health check package, see one of these KB articles: |
Table of Contents
- Health Check package for 4.2.x
- Health Check package for 4.0.x and 4.1.x
- Health Check package for 3.x
- Health Check package for 2.1.x
Health Check package for 4.2.x
The table below lists the health checks that are in the PowerStore-health_check-4.2.0.0-2598072-retail.tgz.bin Health Check thin package. The package is compatible with PowerStoreOS version 4.2.x only.
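As a rough illustration, the package file name above appears to encode a four-part version and a build number. The sketch below parses a name of that shape and checks the "4.2.x only" rule stated above. The naming pattern is an assumption inferred only from the file names listed in this article, and the function names are hypothetical helpers, not part of any PowerStore tooling.

```python
import re
from typing import NamedTuple, Tuple

# Assumed pattern, inferred from the package names shown in this article:
# PowerStore-health_check-<a.b.c.d>-<build>-retail.tgz.bin
PACKAGE_NAME = re.compile(
    r"^PowerStore-health_check-(?P<version>\d+(?:\.\d+){3})-(?P<build>\d+)-retail\.tgz\.bin$"
)


class PackageInfo(NamedTuple):
    version: Tuple[int, ...]  # for example (4, 2, 0, 0)
    build: int                # for example 2598072


def parse_package_name(filename: str) -> PackageInfo:
    """Split a health check package file name into version and build (hypothetical helper)."""
    match = PACKAGE_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized package name: {filename!r}")
    version = tuple(int(part) for part in match.group("version").split("."))
    return PackageInfo(version, int(match.group("build")))


def is_4_2_family(os_version: str) -> bool:
    """Return True when the PowerStoreOS version string starts with 4.2 (hypothetical helper)."""
    return os_version.split(".")[:2] == ["4", "2"]


info = parse_package_name("PowerStore-health_check-4.2.0.0-2598072-retail.tgz.bin")
print(info.version, info.build)
print(is_4_2_family("4.2.0.1"), is_4_2_family("4.1.0.0"))
```

A check like this can help catch an attempt to upload a package to a cluster running a different PowerStoreOS family before the upload starts.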
The Health Check Package contains validations used by System Checks, Pre-Upgrade Health Check (PUHC) and RxDefinitions.
Note: Please refer to PowerStore: RxDefinitions Issues Landing Page for information on tests that are included with this package.
Distribution:
Posted in Drivers & Downloads: (Requires login to the Dell support site to view the document)
The package must be downloaded from this site unless automatic download is enabled. The package is uploaded automatically to the cluster if the automatic download option is enabled (PowerStore Manager: Settings > Upgrades > Automatic download is enabled). The automatic download feature is disabled by default.
How to Install:
After the package is uploaded, it must be installed (PowerStore Manager: Upgrades > Upgrade).
How to run:
- Health checks are run from the PowerStore Manager UI (Monitoring > System Checks > Run System Check).
- Alternatively, the installed health check can be run using the service script svc_health_check.
- PUHC checks are run from the PowerStore Manager UI Upgrades page. They are run when the Health Check button is pressed and when the Upgrade button is pressed.
Pre-Upgrade Health Checks (PUHC)
LEGEND: In this table, the symbol ♦ indicates checks introduced or enhanced in this most recent Health Check thin package.
| Test name | Description | KB Article |
|---|---|---|
| PS Redundancy | Detects non-redundant power supplies. | 000214821 PowerStore: Pre-Upgrade Health Check (PUHC) detects non-redundant power supply |
| scheduled_vm_snapshot | Detects if there are any VM scheduled snapshots in progress | 000214504 PowerStore: Pre-Upgrade Health Check (PUHC) checks if all snapshot commands are all in completed state. |
| off_release_ens24_drive_missing_path_check | Checks the system for non-redundant NVMe drive paths that are either failed or missing | 000242170 PowerStore: Pre-Upgrade Health Check (PUHC) to detect non-redundant NVMe drive paths |
| off_release_incomplete_commands_check | PUHC detects incomplete Control Path (CP) commands that are not in the approved list of commands allowed during NDU | 000269892 PowerStore: Pre upgrade health check (PUHC) to detect incomplete CP commands |
| unsupported_drive_check | PUHC detects a system with drives not covered by DriveDB.json v0.6.6.0 | 000316788 PowerStore Health Checks to Identify a System with Drives not covered by DriveDB.json v0.6.6.0 |
| nvram_full_fips_check | PUHC detects NVRAM disk's full FIPS is not set correctly | 000296977 PowerStore: System Health Check and/or Pre-Upgrade Health Check 3.x/4.x \| Replacement of NVMe NVRAM disks may incorrectly raise FIPS alerts |
| kernel_slab_check | Detect if any node has slab allocation over the skbuff_fclone_cache limit | 000261124 PowerStore: Health Checks Detect if any Node has Slab Allocation Over the skbuff_fclone_cache Limit |
System Checks
LEGEND: In this table, the symbol ♦ indicates checks introduced/enhanced in this most recent Health Check thin package.
| Test name | Description | KB Article |
|---|---|---|
| four_port_iom_state_check | Detects if 4PortCard is incorrectly set to fru_is_not_in_use | 000347815 PowerStore: System Health Check detects 4PortCard Incorrectly Set to fru_is_not_in_use |
| indus_encryption_offset_check | Detects an invalid drive encryption band location on NVMe expansion enclosures. | 000220624 PowerStore: System Check detects drive invalid encryption band location for ENS24 enclosure |
| unfinished_ndu_check | Detects the existence of an unfinished upgrade. | 000213265 PowerStore: System Health Check has detected remnants of failed NDU commits |
| indus_drive_paths_check | Detects unstable paths to ENS24 NVMe expansion enclosure | 000212444 PowerStore: System Health Check detects unstable paths to the ENS24 NVMe expansion enclosure. |
| cpu_ierr_check | Checks for CPU internal error | 000196192 PowerStore: System Health Checks detects an issue in the CPU IERR Check |
| active_system_alert_check | Detects active Major and Critical alerts. | 000192609 PowerStore: Active alerts detected by health checks |
| cyc_node_space_check | Detects node's /cyc_node directory has insufficient space. | 000198173 PowerStore: System Health Checks Detects Lack of Space in cyc_node |
| time_skew_check | Detects unsupported large time skew | 000196199 PowerStore: Health Checks detects high time skew on Nodes and BMC |
| component_sn_check | Detects inconsistent serial number for BBU or PSU | 000196196 PowerStore: System Health Checks detects component Serial Numbers are not consistent: Component SN Check |
| | Detects if a volume group with no member in Control Path (CP) exists | 000238653 PowerStore: Health Checks detect if a volume group with no members exists |
| kernel_slab_check | Detect if any node has slab allocation over the skbuff_fclone_cache limit | 000261124 PowerStore: Health Checks Detect if any Node has Slab Allocation Over the skbuff_fclone_cache Limit |
| dp_mem_allocation_override_check | Detects wrong Data Path (DP) memory settings | 000253246 PowerStore: System Health Checks detects wrong Data Path memory settings |
| unsupported_drive_check | Detects a system with drives not covered by DriveDB.json v0.6.6.0 | 000316788 PowerStore Health Checks to Identify a System with Drives not covered by DriveDB.json v0.6.6.0 |
Health Check package for 4.0.x and 4.1.x
The table below lists the health checks that are in the PowerStore-health_check-4.1.0.0-2606757-retail.tgz.bin Health Check thin package. The package is compatible with PowerStoreOS versions 4.0.x and 4.1.x only.
The Health Check Package contains validations used by both System Checks and Pre-Upgrade Health Check (PUHC).
Distribution:
Posted in Drivers & Downloads: (Requires login to the Dell support site to view the document)
The package must be downloaded from this site unless automatic download is enabled. The package is uploaded automatically to the cluster if the automatic download option is enabled (PowerStore Manager: Settings > Upgrades > Automatic download is enabled). The automatic download feature is disabled by default.
How to Install:
After the package is uploaded, it must be installed (PowerStore Manager: Upgrades > Upgrade).
How to run:
- Health checks are run from the PowerStore Manager UI (Monitoring > System Checks > Run System Check).
- Alternatively, the installed health check can be run using the service script svc_health_check.
- PUHC checks are run from the PowerStore Manager UI Upgrades page. They are run when the Health Check button is pressed and when the Upgrade button is pressed.
Note: When using this Health Check package, also upload and install the RxDefinitions package. For further details on installing the RxDefinitions package, see article PowerStore: Landing Page for RxDefinitions Issues.
Pre-Upgrade Health Checks (PUHC)
LEGEND: In this table, the symbol ♦ indicates checks introduced or enhanced in this most recent Health Check thin package.
| Test name | Description | KB Article |
|---|---|---|
| off_release_drive_firmware_check_oe | Prevents system instability caused by an issue during disk firmware upgrade | 000367346 PowerStore: Pre-Upgrade Health Check (PUHC) to Prevent System Instability Due to Issue During Disk Firmware Upgrade |
| off_release_volume_metro_reservation_mode_check | Detects if one or more Metro volumes are stuck in a synchronization (sync DB) state. | 000346226 PowerStore: Pre-upgrade health check (PUHC) to detect 4.0.X metro system with SYM metro volumes stuck in sync DB status |
| off_release_cpdb_locale_check | Detects whether the locale or encoding settings in the PowerStore management or local database are set correctly | 000334747 PowerStore: Database encoding or locale settings of CP database server(s) (localdb/managementdb) are not correctly set |
| off_release_dpe_drive_check | Check if a PowerStore500T with Indus has enough DPE drives. | 000227055 PowerStore: A Pre-Upgrade Health Check assesses if a PowerStore500T with an NVMe expansion enclosure (ENS24) has enough of DPE of drives. |
| off_release_check_iscsi_rep_block_size_failed | Detects if there are volumes of 4096 bytes size being replicated over iSCSI protocol. | 000221547 PowerStore: Pre-Upgrade Health Check (PUHC) detects volumes of 4096 VOLUME SECTOR SIZE being replicated over iSCSI protocol |
| off_release_rba_configuration_check | Determines if the RBA tier is configured. | 000218438 PowerStore: Pre-Upgrade Health Check to detect if RBA tier is enabled. |
| iom_activation_check ♦ | Prevent NDU for IOM/SLIC without activation. | 000216558 PowerStore: Health check has detected that a NVMe Expansion Enclosure (ENS24) was added but not recognized. |
| PS Redundancy | Detects non-redundant power supplies. | 000214821 PowerStore: Pre-Upgrade Health Check (PUHC) detects non-redundant power supply |
| SAS drives with firmware port locked | Detects a locked firmware port | 000207951 PowerStore: Pre-Upgrade Health Check for locked firmware port in Samsung SAS drives |
| off_release_ssd_in_rg_check | Detects if an SSD is not in a DRE group. | 000218650 PowerStore: Pre-Upgrade Heath Check Detects that Not All the SSDs are in a RAID Group |
| scheduled_vm_snapshot | Detects if there are any VM scheduled snapshots in progress | 000214504 PowerStore: Pre-Upgrade Health Check (PUHC) checks if all snapshot commands are all in completed state. |
| off_release_scsi3_reservation_check | Checks the system for any stale SCSI3 reservations | 000246358 PowerStore: Pre-Upgrade Health Check to Detect a Stale SCSI-3 Reservation Issue |
| off_release_ens24_drive_missing_path_check | Checks the system for non-redundant NVMe drive paths that are either failed or missing | 000242170 PowerStore: Pre-Upgrade Health Check (PUHC) to detect non-redundant NVMe drive paths |
| pd_manifest_version_check | PUHC to ensure Compatibility and Version Control for Rx-Definitions Package in PowerStore | 000228279 PowerStore: Pre upgrade health check (PUHC) to ensure Compatibility and Version Control for Rx-Definitions Package in PowerStore |
| off_release_incomplete_commands_check | PUHC detects incomplete Control Path (CP) commands that are not in the approved list of commands allowed during NDU | 000269892 PowerStore: PUHC to detect incomplete CP commands - off_release_incomplete_commands_check |
| unsupported_drive_check | PUHC detects a system with drives not covered by DriveDB.json v0.6.6.0 | 000316788 PowerStore: Health Checks to Identify a System with Drives not covered by DriveDB.json v0.6.6.0 |
| nvram_full_fips_check | PUHC detects NVRAM disk's full FIPS is not set correctly | 000296977 PowerStore: System Health Check and/or Pre-Upgrade Health Check 3.x/4.x \| Replacement of NVMe NVRAM disks may incorrectly raise FIPS alerts |
| off_release_removed_third_party_certificate_check ♦ | Detects a missing third-party certificate chain | 000261401 PowerStore: Pre upgrade health check (PUHC) to detect the reset_certificates REST endpoint issue |
System Checks
LEGEND: In this table, the symbol ♦ indicates checks introduced/enhanced in this most recent Health Check thin package.
| Test name | Description | KB Article |
|---|---|---|
| four_port_iom_state_check | Detects if 4PortCard is incorrectly set to fru_is_not_in_use | 000347815 PowerStore: System Health Check detects 4PortCard Incorrectly Set to fru_is_not_in_use |
| indus_encryption_offset_check | Detects an invalid drive encryption band location on NVMe expansion enclosures. | 000220624 PowerStore: System Check detects drive invalid encryption band location for ENS24 enclosure |
| unfinished_ndu_check | Detects the existence of an unfinished upgrade. | 000213265 PowerStore: System Health Check has detected remnants of failed NDU commits |
| indus_drive_paths_check | Detects unstable paths to ENS24 NVMe expansion enclosure | 000212444 PowerStore: System Health Check detects unstable paths to the ENS24 NVMe expansion enclosure. |
| cpu_ierr_check | Checks for CPU internal error | 000196192 PowerStore: System Health Checks detects an issue in the CPU IERR Check |
| active_system_alert_check | Detects active Major and Critical alerts. | 000192609 PowerStore: Active alerts detected by health checks |
| cyc_node_space_check | Detects node's /cyc_node directory has insufficient space. | 000198173 PowerStore: System Health Checks Detects Lack of Space in cyc_node |
| time_skew_check | Detects unsupported large time skew | 000196199 PowerStore: Health Checks detects high time skew on Nodes and BMC |
| component_sn_check ♦ | Detects inconsistent serial number for BBU or PSU | 000196196 PowerStore: System Health Checks detects component Serial Numbers are not consistent: Component SN Check |
| component_stale_fw_check | Detects if the firmware is up to date and if it is compatible with the Dell X.509 signature | 000201500 PowerStore: System Health Checks detects if a firmware upgrade is required |
| symmd_on_disk_check | Detects an out-of-date System Manager Metadata on Disk (SYMMD) data on disks | 000228110 PowerStore: System Health Checks detects an out-of-date System Manager Metadata on Disk (SYMMD) data on disks |
| | Detects if a volume group with no member in Control Path (CP) exists | 000238653 PowerStore: Health Checks detect if a volume group with no members exists |
| stale_scsi3_reservation_check | Detects any stale SCSI3 reservations | 000259473 PowerStore: System Health Checks detects a stale SCSI3 reservation issue |
| kernel_slab_check | Detect if any node has slab allocation over the skbuff_fclone_cache limit | 000261124 PowerStore: Health Checks Detect if any Node has Slab Allocation Over the skbuff_fclone_cache Limit |
| dp_mem_allocation_override_check | Detects wrong Data Path (DP) memory settings | 000253246 PowerStore: System Health Checks detects wrong Data Path memory settings |
| nvme_tcp_dmc_protection_check | Detects if an appliance running 4.1.0.0 may need follow-up to prevent a potential data integrity issue | 000325258 PowerStore Health Check to Prevent Data Integrity Scenario on PowerStoreOS 4.1.0.0 clusters with NVMe-TCP LUNs |
| unsupported_drive_check | Detects a system with drives not covered by DriveDB.json v0.6.6.0 | 000316788 PowerStore Health Checks to Identify a System with Drives not covered by DriveDB.json v0.6.6.0 |
| nvram_full_fips_check | Detects NVRAM disk's full FIPS is not set correctly | 000296977 PowerStore: System Health Check and/or Pre-Upgrade Health Check 3.x/4.x \| Replacement of NVMe NVRAM disks may incorrectly raise FIPS alerts |
Health Check package for 3.x
The table below lists the health checks that are in the PowerStore-health_check-3.6.1.5-2613754-retail.tgz.bin Health Check thin package. The package is compatible with PowerStoreOS versions 3.0.x, 3.2.x, 3.5.x, and 3.6 (including 3.6.1). It is not compatible with 2.x or 4.x.
The Health Check Package contains validations used by both System Checks and Pre-Upgrade Health Check (PUHC).
Distribution:
Posted in Drivers & Downloads: (Requires login to the Dell support site to view the document)
The package must be downloaded from this site unless automatic download is enabled. The package is uploaded automatically to the cluster if the automatic download option is enabled (PowerStore Manager: Settings > Upgrades > Automatic download is enabled). The automatic download feature is disabled by default.
How to Install:
After the package is uploaded, it must be installed (PowerStore Manager: Upgrades > Upgrade).
How to run:
- Health checks are run from the PowerStore Manager UI (Monitoring > System Checks > Run System Check).
- Alternatively, the installed health check can be run using the service script svc_health_check.
- PUHC checks are run from the PowerStore Manager UI Upgrades page. They are run when the Health Check button is pressed and when the Upgrade button is pressed.
Pre-Upgrade Health Checks (PUHC)
LEGEND: In this table, the symbol ♦ indicates checks introduced/enhanced in this most recent Health Check thin package.
| Test name | Description | KB Article |
|---|---|---|
| off_release_missing_pg_hba_conf_template_check ♦ | Detects if pg_hba config template file is missing | 000380454 PowerStore: Pre-Upgrade Health Check (PUHC) detects missing database pg_hba config template file |
| off_release_nvme_discovered_initiators_check | Detects if unassigned NVME_FC initiators are zoned-in and connected to the PowerStore appliance | 000347818 PowerStore: Pre-Upgrade Health to Identify Initiators with no Corresponding Host |
| off_release_cp_db_location_check | In a multi-appliance cluster, detects if some internal PowerStore components are hosted by the same appliance | 000359310 PowerStore: Pre-Upgrade Health Check (PUHC) to Confirm that Primary ControlPath (CP) and Primary Database (DB) are Hosted by the Same Appliance |
| off_release_cpdb_locale_check | Detects whether the locale or encoding settings in the PowerStore management or local database are set correctly | 000334747 PowerStore: Database encoding or locale settings of CP database server(s) (localdb/managementdb) are not correctly set |
| off_release_check_ndu_pause_rule | Detects if the "NDU Pause" feature is enabled on the source OS and, if applicable, validates that the destination OS also supports it | 000318226 PowerStore: Block Non-Disruptive Upgrade (NDU) to Unsupported Build |
| off_release_locked_drive_check | Detects if a drive is in a locked state | 000294377 PowerStore: Per-Upgrade Health Check to Detect if a Drive is Locked |
| off_release_nvram_full_fips_mode | Detects if full FIPS configuration of NVRAM disks is set correctly | 000296977 PowerStore: System Health Check and/or Pre-Upgrade Health Check 3.x/4.x \| Replacement of NVMe NVRAM disks may incorrectly raise FIPS alerts |
| off_release_unsupported_drive_check | Detects if all drives installed are found in the system driveDB.json file | 000316788 PowerStore Health Checks to Identify a System with Drives not covered by DriveDB.json v0.6.6.0 |
| off_release_check_db_rep_mode_failed | Detects and stops clusters of 3 or 4 appliances from upgrading to PowerStoreOS 4.1 | 000286668 PowerStore: After an upgrade to PowerStoreOS 4.1 on clusters with three or more appliances, management database replication may remain in async mode |
| off_release_drive_wear_check | Determines if there are drives whose wear level is too high. | 000227058 PowerStore: The Pre-Upgrade Health Check (PUHC) determines if there are drives whose wear level is too high. |
| off_release_dpe_drive_check | Check if a PowerStore500T with Indus has enough DPE drives. | 000227055 PowerStore: A Pre-Upgrade Health Check assesses if a PowerStore500T with an NVMe expansion enclosure (ENS24) has enough of DPE of drives. |
| off_release_drivedb_check | Detects incorrect drive database file. | 000224852 PowerStore: Pre-Upgrade Health Check (PUHC) detects incorrect drive Database signature. |
| off_release_check_iscsi_rep_block_size_failed | Detects if there are volumes of 4096 bytes size being replicated over iSCSI protocol. | 000221547 PowerStore: Pre-Upgrade Health Check (PUHC) detects volumes of 4096 VOLUME SECTOR SIZE being replicated over iSCSI protocol |
| efi_boot_check, off_release_efi_boot_check | Check that the correct boot entry option is used. | 000222187 PowerStore: Pre-Upgrade Heath Check detects if reboot was using an incorrect boot entry option |
| off_release_rba_configuration_check | Determines if the RBA tier is configured. | 000218438 PowerStore: Pre-Upgrade Health Check to detect if RBA tier is enabled. |
| iom_activation_check | Prevent NDU for IOM/SLIC without activation. | 000216558 PowerStore: Health check has detected that a NVMe Expansion Enclosure (ENS24) was added but not recognized. |
| silent_drive_failure_check | Detects the missing SSD issue (see KB 000216381 PowerStore: SSD Failed Without an Alert Being Displayed). | 000216659 PowerStore: Health check detects the missing SSD issue |
| off_release_check_proc_install_disk_firmware | Detects if an underlying firmware upgrade process is running. | 000218391 PowerStore: Pre-Upgrade Heath Check detects an underlying firmware upgrade process is running |
| off_release_ssd_in_rg_check | Detects if an SSD is not in a DRE group. | 000218650 PowerStore: Pre-Upgrade Heath Check Detects that Not All the SSDs are in a RAID Group |
| SAS drives with firmware port locked | Detects a locked firmware port | 000207951 PowerStore: Pre-Upgrade Health Check for locked firmware port in Samsung SAS drives |
| PS Redundancy | Detects non-redundant power supplies. | 000214821 PowerStore: Pre-Upgrade Health Check (PUHC) detects non-redundant power supply |
| replication_session_state | Detects replication session is in progress. | 000214505 PowerStore: Pre-upgrade Health Check (PUHC) detects replication is in a state that prevents NDU. |
| scheduled_vm_snapshot | Detects if there are any VM scheduled snapshots in progress. | 000214504 PowerStore: Pre-Upgrade Health Check (PUHC) checks if all snapshot commands are all in completed state. |
| off_release_check_chap_authentication | Detects if the CHAP transit connection is properly configured. | 000214503 PowerStore: Pre-Upgrade Health Check (PUHC) detects if the CHAP transit connection is properly configured. |
| The maintenance window is configured. | Detects if a maintenance window has been configured. Only relevant for OS versions where the system does not automatically enable the maintenance window before NDU. | 000212508 PowerStore: Pre-Upgrade Health Check (PUHC) detects that the maintenance window has not been configured |
| Detect a secondary IP issue. | Detects secondary IP issue on NVMe expansion enclosure | 000215560 PowerStore: Pre-upgrade Health Check (PUHC) detects an issue with the secondary IP setting on an ENS24 NVMe expansion enclosure. |
| SDNAS snapshot limit | Detects if SDNAS snapshots exceeded their limit. | 000206131 PowerStore: System Health Check detects that the SDNAS snapshot limit has been exceeded |
| Duplicate FW entry check | Detects duplicate component firmware entries in a node's resume (registry). | 000203390 PowerStore: System Health Checks detects duplicate firmware entries. |
| Initiator connectivity check | Detects if any nonredundant initiators exist. | 000196194 PowerStore: System Health Checks Detected Nonredundant Initiators. |
| Reboot flag set | Detects if the reboot flag is set. | 000205908 PowerStore: System Health Checks detects the reboot flag is set. |
| Recovery partition image check | Detects incorrect filename in recovery partition. | 000200075 PowerStore: System Health Checks for incorrect filename in recovery partition. |
| off_release_sdnas_remote_network_alert_check | Detects NAS replication interface mismatch | 000201904 PowerStore: Pre-Upgrade Health Check (PUHC) detects Interface mismatch for NAS replication sessions |
| off_release_stale_scsi3_reg_check | Checks the system for any stale SCSI3 reservations | 000246358 PowerStore: Pre-Upgrade Health Check (PUHC) to detect a stale SCSI3 reservation issue |
| empty_vg_no_memebers | Detects if a volume group with no member in Control Path (CP) exists | 000238653 PowerStore: Health Checks detect if a volume group with no members exists |
| off_release_ens24_drive_missing_path_check | Checks the system for non-redundant NVMe drive paths that are either failed or missing | 000242170 PowerStore: Pre-Upgrade Health Check (PUHC) to detect non-redundant NVMe drive paths |
| off_release_scsi3_reservation_check | Checks the system for any existing SCSI3 reservations | 000233544 PowerStore: Pre-Upgrade Heath Check Detects existing SCSI3 reservations |
| off_release_sdnas_last_event_id_check | Detects if the SdnasLastProcessedEventId number is large enough to cause an Out of Memory issue | 000227865 PowerStore: Pre upgrade health check (PUHC) to detect if the SdnasLastProcessedEventId number is large enough to cause an Out of Memory issue |
| off_release_new_firmware_zip_existence_check | Detects a lost pre-staged FW zip file issue to prevent NDU failure | 000228388 PowerStore: Pre upgrade health check (PUHC) to detect if new FW zip is wiped out from the stage of FW partition |
| off_release_sdnas_memory_config | Detects if there is a mismatch between the platform's limit for the NAS container versus what the NAS attempts to internally allocate | 000250803 PowerStore: Pre upgrade health check (PUHC) to detect an SDNAS memory configuration issue |
| off_release_nvme_reservation_check | Detects if NVMe Reservations are present on the cluster | 000269902 PowerStore: Pre upgrade health check (PUHC) to detect if NVMe reservations are present on the cluster |
| off_release_dp_mem_override_file_exists | Detects the presence of an existing Data Path (DP) memory override file | 000271492 PowerStore: Pre upgrade health check (PUHC) to detect OOM override for Data Path Memory |
| off_release_removed_third_party_certificate_check | Detects a missing third-party certificate chain | 000261401 PowerStore: Pre upgrade health check (PUHC) to detect the reset_certificates REST endpoint issue |
| user_db_fewer_records | Detects if necessary user.db entries are lost or different between the nodes of the affected appliance | 000263789 PowerStore: Pre upgrade health check (PUHC) to detect if necessary user.db entries are lost, or different between nodes |
| off_release_vol_in_destroying_state_check | Detects volumes in the “Destroying” state | 000258991 PowerStore: Pre upgrade health check (PUHC) to detect volumes in “Destroying” state |
| off_release_incomplete_commands_check | Detects incomplete Control Path (CP) commands that are not in the approved list of commands allowed during NDU | 000269892 PowerStore: Pre upgrade health check (PUHC) to detect incomplete CP commands not in approved list of commands allowed during NDU |
| off_release_inter_cluster_tcp_conn_check | Detects partial inter-cluster TCP connections | 000285925 PowerStore: Pre upgrade health check (PUHC) to detect partial inter-cluster TCP connections |
| off_release_stale_gsips_in_remote_rtps_check | Detects any non-existing global storage discovery IP (GSIP) left over in the remote relative target ports (RTPs) | 000266610 PowerStore: Pre-Upgrade Health Check to Detect a stale GSIP in remote RTPs |
System Checks
LEGEND: In this table, the symbol ♦ indicates checks introduced/enhanced in this most recent Health Check thin package.
| Test name | Description | KB Article |
|---|---|---|
| off_release_nvram_full_fips_mode | Detects if full FIPS configuration of NVRAM disks is set correctly | 000296977 PowerStore: System Health Check and/or Pre-Upgrade Health Check 3.x/4.x \| Replacement of NVMe NVRAM disks may incorrectly raise FIPS alerts |
| off_release_unsupported_drive_check | Detects if all drives installed are found in the system driveDB.json file | 000316788 PowerStore Health Checks to Identify a System with Drives not covered by DriveDB.json v0.6.6.0 |
| ppds_dsb_check | Detects outdated DSB information | 000224714 PowerStore: System Checks detects that the platform data service information is out of date. |
| indus_encryption_offset_check | Detects an invalid drive encryption band location on NVMe expansion enclosures. | 000220624 PowerStore: System Check detects NVMe expansion enclosure (ENS24) drive invalid encryption band location. |
| dp_dedupe_destage_leak_check | Detects unwanted destages that cause excessive drive wear. | 000220203 PowerStore: System Check detects unnecessary de-stages |
| kr_link_boot_option_check | Detect if the KR link PXE boot option is not enabled on both nodes of 500T appliances. | 000220804 PowerStore: System Check detects if the KR link PXE boot option is not enabled on both nodes for PowerStore 500T appliances. |
| iom_activation_check | Prevent NDU for IOM/SLIC without activation. | 000216558 PowerStore: Health check has detected that a NVMe Expansion Enclosure (ENS24) was added but not recognized. |
| dp_resiliency_mode_check | Detect ungraceful exit of resiliency mode (vDisk issue) | 000217840 PowerStore: System Checks detects that the appliance unnecessarily remains in resiliency mode. |
| sdnas_capacity_alert_check | Detects if the FS capacity alert is disabled after upgrade. | 000217839 PowerStore: Filesystem usage capacity alerts disabled after upgrade. |
| unfinished_ndu_check | Detects the existence of an unfinished upgrade. | 000213265 PowerStore: System health Check has detected remnants of failed NDU commits. |
| silent_drive_failure_check | Detects the issue described in KB 000216381 PowerStore: SSD Failed Without an Alert Being Displayed. | 000216659 PowerStore: Heath check detects the missing SSD issue. |
| target_port_group_id_check | Detects a target port group issue affecting mapping of an NVMeoF volume | 000216953 PowerStore: System health check detects a target port group issue that may affect mapping of an NVMeoF volume. |
| indus_drive_paths_check | Detects unstable paths to ENS24 NVMe expansion enclosure | 000212444 PowerStore: System Health Check detects unstable paths to the ENS24 NVMe expansion enclosure. |
| dimm_sn_check | Detects inconsistencies in DIMM serial numbers | 00207658 PowerStore: System Health Checks detects inconsistent DIMM Serial Numbers. |
| recovery_partition_image_check | Detects an incorrect filename in the recovery partition | 000200075 PowerStore: System Health Checks for incorrect filename in recovery partition. |
| duplicate_fw_entry_check | Detects duplicate component firmware entries in a node's resume (registry). | 000203390 PowerStore: System Health Checks detects duplicate firmware entries. |
| cpu_ierr_check | Checks for CPU internal error | 000196192 PowerStore: System Health Checks detects an issue in the CPU IERR Check. |
| InitiatorConnectivityCheck | Detects nonredundant initiators | 000196194 PowerStore: System Health Checks Detected Nonredundant Initiators. |
| icd_network_check | Detects missing connectivity to ToR | 000196193 PowerStore: System Health Checks detected an ICD Network Connectivity issue. |
| dimm_correctable_error_check | Detects DIMM Correctable Errors (CE) count (5k threshold) | 000199245 PowerStore: System Health Checks detects excessive DIMM Correctable Errors (CE) count. |
| active_system_alert_check | Detects active Major and Critical alerts. | 000192609 PowerStore: Active alerts were detected by health checks. |
| cyc_node_space_check | Detects node's /cyc_node directory has insufficient space. | 000198173 PowerStore: System Health Checks detects lack of space in /cyc_node. |
| time_skew_check | Detects unsupported large time skew | 000196199 PowerStore: System Health Checks detects high time skew on Nodes and BMC. |
| db_tmpfiles_check | Detects database temporary files larger than expected | 000196198 PowerStore: System Health Checks detects large database temporary files. |
| bbu_sensor_check | Detects failure in various BBU health checks | 000196197 PowerStore: System Health Checks detects invalid battery status. |
| component_sn_check | Detects inconsistent serial number for BBU or PSU | 000196196 PowerStore: System Health Checks detects that component Serial Numbers are not consistent: fru_items_sn_check |
| fsck_leftover_check | Detects if there exists the fsck generated file cyc-sys-mode-override.txt. | 000201738 PowerStore: System Health Checks detects recovery file. |
| component_stale_fw_check | Detects if the firmware is up to date and if it is compatible with the Dell X.509 signature | 000201500 PowerStore: System Health Checks detects if a firmware upgrade is required. |
| transit_connection_check | Detects if transit connection objects exist in the Data Path | 000226767 PowerStore: System Check detects orphan transit connection objects |
| symmd_on_disk_check | Detects if the latest System Manager Metadata on Disk (SYMMD) data is saved on disk | 000228110 PowerStore: System Health Checks detects an out-of-date System Manager Metadata on Disk (SYMMD) data on disks |
| | Detects if any node has slab allocation over the skbuff_fclone_cache limit | 000261124 PowerStore: System Health Checks detects if any node has slab allocation over the skbuff_fclone_cache limit |
Health Check package for 2.1.x
The table below lists the health checks that are in the PowerStore-health_check-2.1.1.2-2069723-retail.tgz.bin Health Check thin package. The package is compatible with PowerStoreOS versions 2.1.x. It is not compatible with versions 3.x or 4.x.
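Because each package only works with its own PowerStoreOS release series, it can help to confirm the series before downloading. The following is a minimal sketch, assuming the four series listed in this article's Table of Contents; the function name `package_series` and the `SERIES` table are hypothetical helpers for illustration and are not part of any PowerStore tool.

```python
import re

# Package series taken from this article's Table of Contents.
# Each key is a (major, minor) pair; a minor of None matches any minor version.
SERIES = [
    ((4, 2), "4.2.x"),
    ((4, 0), "4.0.x and 4.1.x"),
    ((4, 1), "4.0.x and 4.1.x"),
    ((3, None), "3.x"),
    ((2, 1), "2.1.x"),
]


def package_series(os_version: str):
    """Return the health check package series for a PowerStoreOS version.

    Returns None when no package series in this article covers the version.
    """
    match = re.match(r"^(\d+)\.(\d+)", os_version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {os_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    for (s_major, s_minor), label in SERIES:
        if major == s_major and (s_minor is None or minor == s_minor):
            return label
    return None


print(package_series("2.1.1.2"))  # 2.1.x
print(package_series("3.6.0.0"))  # 3.x
```

A version outside the listed series, such as 2.0.x, returns `None`, which signals that none of the packages described here applies.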
This package contains System Checks for general health monitoring and for pre-upgrade health checks. Check the system health periodically and before performing maintenance operations. The pre-upgrade health checks must be run before performing an NDU.
Distribution:
Posted in Drivers & Downloads: (Requires login to the Dell support site to view the document)
The package must be downloaded from this site.
How to Install:
After the package is uploaded, it must be installed (PowerStore Manager: Upgrades > Upgrade).
How to run:
- System health checks are run from the PowerStore Manager UI (Monitoring > System Checks > Run System Check).
- Pre-upgrade health checks are run from the PowerStore Manager UI (Monitoring > System Checks > Upgrade Extension).
- Alternatively, the installed health check can be run using the service script svc_health_check.
| Test name | Description | KB for failure |
|---|---|---|
| mtc_drive_counter_check | Detects an MTC NVRAM drive issue | 000212587 PowerStore: System Health Check detects an MTC NVRAM drive issue. |
| drive_flags_check | Detects offline and failed drives including those that do not raise an alert | 000207485 PowerStore: System Health Check detects an offline or failed SSD. |
| bbu_sensor_check | Detects failure in various BBU health checks | 000196197 PowerStore: System Health Checks detects invalid battery status. |
| kms_lockbox_file_check | Detects an issue with the dare lockbox | 000196653 PowerStore: Health check detects an issue with the lockbox. |
| os_package_name_check | Detects incorrect filename in recovery partition | 000200075 PowerStore: Health check detects an issue with the filename, recovery partition, or PowerStoreOS package version. |
| duplicate_fw_entry_check | Detects duplicate component firmware entries in a node's resume (registry). | 000203390 PowerStore: System Health Checks detects duplicate firmware entries. |
| fsck_leftover_check | Detects unexpected recovery files | 000201738 PowerStore: System Health Checks detects recovery file. |
| recovery_partition_image_check | Detects incorrect filename in recovery partition | 000200075 PowerStore: System Health Checks for incorrect filename in recovery partition. |
| symmetric_icm_connection | Detects missing ICM connection | 000203115 PowerStore: Heath Check package check for asymmetric ICM connections fails |
| cpu_ierr_check | Checks for CPU internal error | 000196192 PowerStore: System Health Checks detects an issue in the CPU IERR Check. |
| InitiatorConnectivityCheck | Detects nonredundant initiators | 000196194 PowerStore: System Health Checks Detected Nonredundant Initiators. |
| icd_network_check | Detects missing connectivity to ToR | 000196193 PowerStore: System Health Checks detected an ICD Network Connectivity issue. |
| symmd_fw_upgrade_flag_check | Detects a PSU in an invalid state | 000199922 PowerStore: System Health Checks detects an incorrect PSU State. |
| dimm_correctable_error_check | Detects DIMM CE count (5k threshold) | 000199245 PowerStore: System Health Checks detects excessive DIMM Correctable Errors (CE) count. |
| active_system_alert_check | Detects active Major and Critical alerts | 000192609 PowerStore: Active alerts were detected by health checks. |
| cyc_node_space_check | Detects node's /cyc_node directory has insufficient space. | 000198173 PowerStore: System Health Checks detects lack of space in /cyc_node. |
| time_skew_check | Detects unsupported large time skew | 000196199 PowerStore: System Health Checks detects high time skew on Nodes and BMC. |
| db_tmpfiles_check | Detects database temporary files larger than expected | 000196198 PowerStore: System Health Checks detects large database temporary files. |
| bbu_ipmi_i2c_check | Detects failure in various BBU health checks | 000196197 PowerStore: System Health Checks detects invalid battery status. |
| fru_items_sn_check | Detects inconsistent serial number for BBU or PSU | 000196196 PowerStore: System Health Checks detects that component Serial Numbers are not consistent: fru_items_sn_check |
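
When a check fails, the table above is used to find the matching KB article. As a convenience, the lookup can be scripted. The following is a minimal sketch, assuming the table has been copied into a text string in the same three-column pipe-delimited layout; `parse_check_rows` is a hypothetical helper and is not part of svc_health_check or PowerStore Manager.

```python
import re


def parse_check_rows(table_text: str) -> dict:
    """Map each test name to (kb_number, kb_title) from pipe-delimited rows."""
    lookup = {}
    for line in table_text.splitlines():
        # Drop the outer pipes, then split into the three expected cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 3:
            continue
        name, _description, kb = cells
        # The KB cell starts with the article number, followed by the title.
        match = re.match(r"^(\d+)\s*(.*)$", kb)
        if not name or match is None:
            continue  # skips the header row, the separator row, and unnamed rows
        lookup[name] = (match.group(1), match.group(2))
    return lookup


sample = """\
| Test name | Description | KB for failure |
|---|---|---|
| time_skew_check | Detects unsupported large time skew | 000196199 PowerStore: System Health Checks detects high time skew on Nodes and BMC. |
| db_tmpfiles_check | Detects database temporary files larger than expected | 000196198 PowerStore: System Health Checks detects large database temporary files. |
"""

checks = parse_check_rows(sample)
print(checks["time_skew_check"][0])  # 000196199
```

Rows without a test name or without a leading KB number are skipped instead of guessed, so the lookup only contains entries that appear complete in the table.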