PowerFlex: Data Integrity Issues When Upgrading The OS Without Upgrading DasCache

Summary: Data integrity issues might occur when an OS upgrade is performed without upgrading the DasCache package. In this case, yum update was used to upgrade the OS on which the SDS and DasCache reside, but the DasCache package was not upgraded after the OS upgrade. ...


Instructions

Scenario

  • DasCache is configured properly (using /dev/disk/by-id).
  • The SDS was in maintenance mode before the yum command was used to upgrade the OS. After the OS upgrade, the SDS exited maintenance mode, and shortly afterward the application started to report data inconsistency (DI).
  • In this specific case, two SDS OSs were upgraded. However, a single SDS OS upgrade might trigger the issue as well.

Note: After the SDS OS upgrade, the DasCache service failed to start. For a reason that is still under investigation, the SDS service started successfully without DasCache. To protect the data, the SDS should instead have failed the disk devices, or the SDS service should have failed to start.

 

Symptoms

Before the OS upgrade, the SDS DasCache version was:

fiop-1.5.14.rel-R3_9_Win_Linux.41_3.10.0_327.el7.x86_64.x86_6


SDS entered maintenance mode to upgrade the OS:

6457 2021-04-28 09:19:09.196 MDM_CLI_CONF_COMMAND_RECEIVED INFO Command enter_maintenance_mode received, User: 'admin'. [10252559] SDS: ID: 82c410860000000d;

The SDS OS was upgraded from RH 7.2 to RH 7.6 by using yum update:
 

Apr 28 10:28:16 redhat-cust-1 yum[351251]: Updated: libgcc-4.8.5-36.el7.x86_64
Apr 28 10:28:16 redhat-cust-1 yum[351251]: Updated: redhat-release-server-7.6-4.el7.x86_64
Apr 28 10:28:16 redhat-cust-1 yum[351251]: Installed: 1:grub2-common-2.02-0.76.el7.noarch
Apr 28 10:28:16 redhat-cust-1 yum[351251]: Updated: setup-2.8.71-10.el7.noarch
Apr 28 10:28:17 redhat-cust-1 yum[351251]: Updated: filesystem-3.2-25.el7.x86_64
Apr 28 10:28:17 redhat-cust-1 yum[351251]: Updated: 32:bind-license-9.9.4-72.el7.noarch
Apr 28 10:28:18 redhat-cust-1 yum[351251]: Installed: 1:grub2-pc-modules-2.02-0.76.el7.noarch
Apr 28 10:28:19 redhat-cust-1 yum[351251]: Updated: tzdata-2018e-3.el7.noarch
Apr 28 10:28:19 redhat-cust-1 yum[351251]: Updated: kbd-misc-1.15.5-15.el7.noarch
Apr 28 10:28:19 redhat-cust-1 yum[351251]: Updated: 1:quota-nls-4.01-17.el7.noarch
Apr 28 10:28:19 redhat-cust-1 yum[351251]: Updated: 1:emacs-filesystem-24.3-22.el7.noarch
Apr 28 10:28:20 redhat-cust-1 yum[351251]: Updated: ncurses-base-5.9-14.20130511.el7_4.noarch
Apr 28 10:28:20 redhat-cust-1 yum[351251]: Updated: nss-softokn-freebl-3.36.0-5.el7_5.x86_64
Apr 28 10:28:24 redhat-cust-1 yum[351251]: Updated: glibc-common-2.17-260.el7.x86_64

The SDS server was rebooted, but the DasCache service failed to start:

Apr 28 10:47:04 [localhost] fio.init: Starting Fio devices: Failed
Apr 28 10:47:04 [localhost] systemd: fio.service: main process exited, code=exited, status=4/NOPERMISSION
Apr 28 10:47:04 [localhost] systemd: Failed to start Block Driver Interface to Flashsoft Cache.
Apr 28 10:47:04 [localhost] systemd: Unit fio.service entered failed state.
Apr 28 10:47:04 [localhost] systemd: fio.service failed.
[root@Node]# fscli -l
Starting fio service failed
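
The failure messages above can be detected automatically before the SDS is taken out of maintenance mode. The following is a minimal sketch, not a supported tool: it assumes the syslog lines look like the excerpt above and that the unit is named fio.service, as shown there. The function name dascache_failure_lines is hypothetical.

```python
import re

# Patterns taken from the systemd messages in the excerpt above.
# Assumption: other OS or DasCache versions may word these messages differently.
FAILURE_PATTERNS = (
    re.compile(r"systemd: fio\.service: main process exited, code=exited, status=\d+"),
    re.compile(r"systemd: Unit fio\.service entered failed state\."),
    re.compile(r"systemd: fio\.service failed\."),
)


def dascache_failure_lines(log_lines):
    """Return the log lines that indicate fio.service failed to start."""
    return [
        line
        for line in log_lines
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS)
    ]
```

If the returned list is not empty, the DasCache service should be treated as failed, and the SDS should stay in maintenance mode until the service starts cleanly.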

After the OS upgrade, the SDS DasCache version was unchanged because the DasCache package was not upgraded:
 

fiop-1.5.14.rel-R3_9_Win_Linux.41_3.10.0_327.el7.x86_64.x86_6
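
The package name appears to embed the kernel that the DasCache build targets (3.10.0_327.el7, which matches the old kernel 3.10.0-327). The following is a minimal sketch of comparing that embedded kernel string with the running kernel as reported by uname -r. The naming convention is an assumption inferred from this single package name, and the function names are hypothetical.

```python
import re

# Assumed naming convention, inferred from the single package name above:
# ..._<kernel version>_<kernel release>.<dist>.<arch>...
_KERNEL_IN_PACKAGE = re.compile(r"_(\d+\.\d+\.\d+)_(\d+(?:\.\d+)*)\.(el\d+)\.")


def kernel_built_for(package_name):
    """Return the kernel the package name appears to target, or None if not found."""
    match = _KERNEL_IN_PACKAGE.search(package_name)
    if match is None:
        return None
    version, release, dist = match.groups()
    return f"{version}-{release}.{dist}"


def dascache_matches_kernel(package_name, running_kernel):
    """True when the running kernel string starts with the package's kernel string."""
    target = kernel_built_for(package_name)
    return target is not None and running_kernel.startswith(target + ".")
```

For the package name above, kernel_built_for returns 3.10.0-327.el7. A running kernel string that does not start with that value suggests that the DasCache package must be upgraded before the SDS exits maintenance mode.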

Note: For an unknown reason (still under investigation), the SDS service started successfully, although it should have failed the SDS or its disk devices. From this point on, a DI is expected once the SDS exits maintenance mode.

 

The SDS exited maintenance mode:
 

6507 2021-04-28 10:01:54.700 MDM_CLI_CONF_COMMAND_RECEIVED INFO Command exit_maintenance_mode received, User: 'admin'. [10303510] SDS: ID: 82c410860000000d;
6508 2021-04-28 10:01:54.740 CLI_COMMAND_SUCCEEDED INFO Command exit_maintenance_mode succeeded. [10303510] 
6509 2021-04-28 10:04:00.111 SDS_MAINTENANCE_MODE_ENDED INFO SDS 10.1.150.50-RedHat (ID 82c410860000000d) has exited maintenance mode. 

Shortly after the SDS exited maintenance mode, the application (in this case, VMware datastores and VMs) started to report DI:

2021-04-29T04:12:37.697Z cpu12:982259)WARNING: Res3: 4232: Volume 5e6bb636-01b03ca0-5350-246e96905870 ("DS_SQL_PD2PB_01") might be damaged on the disk. Resource cluster metadata corruption has been detected.

 

Impact

Data inconsistency (DI) and data loss (DL)
 

Root Cause

When DasCache is used with PowerFlex, the disk devices are exposed to the SDS through an additional layer. The SDS reads from and writes to DasCache, and the data is flushed to the disk devices later.

By design, when the DasCache service fails to start, the SDS fails the disk devices to protect the data on them. This prevents the SDS from accessing the disk devices directly.

In this case, the SDS OS was upgraded but the DasCache package was not. As a result, the SDS service started successfully and bypassed DasCache. Because DasCache had not flushed all of its data to the disk devices, the SDS read and wrote directly against disks that were missing the unflushed data, which eventually led to DI.

Note: The reason the SDS service could start successfully is still being investigated. 
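
The gap described above can be illustrated with a toy model. The following sketch is not DasCache code and makes no claim about DasCache internals. It assumes only the behavior stated in this article: writes land in the cache first and reach the disk devices when flushed. The class name WriteBackCache is hypothetical.

```python
class WriteBackCache:
    """Toy write-back cache: writes reach the disk only when flush() is called."""

    def __init__(self, disk):
        self.disk = disk   # backing store: block number -> data
        self.dirty = {}    # blocks written to the cache but not yet flushed

    def write(self, block, data):
        self.dirty[block] = data

    def read(self, block):
        # Reads through the cache see the newest data.
        return self.dirty.get(block, self.disk.get(block))

    def flush(self):
        self.disk.update(self.dirty)
        self.dirty.clear()


disk = {0: "old"}
cache = WriteBackCache(disk)
cache.write(0, "new")

read_via_cache = cache.read(0)  # newest data, served from the cache
read_bypassing = disk[0]        # direct disk access before a flush: stale data
```

In this model, a reader that bypasses the cache before flush() sees the old block contents. This is the same kind of mismatch that the SDS encounters when it accesses the disk devices directly while DasCache still holds unflushed data.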
 

Workaround

There is no workaround for the issue if the SDS has already exited maintenance mode while the DasCache service is in a failed state.

If the OS was upgraded and the SDS is still in maintenance mode, there are two options to avoid a DI:

    1. Boot from the old kernel (in this case, version 3.10.0-327).
    2. Upgrade DasCache to match the kernel version and restart the SDS service, as described in KB 000195110.
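
The decision above can be summarized as a small helper for a pre-check script. This is a minimal sketch under stated assumptions: the function name and its two inputs are hypothetical, and the case where DasCache is running normally is not covered by this article, so the empty result for that case is an assumption.

```python
def recommended_actions(in_maintenance_mode, dascache_running):
    """Map the SDS state after an OS upgrade to the options listed in this article."""
    if dascache_running:
        # Assumption: a running DasCache service means this scenario does not apply.
        return []
    if in_maintenance_mode:
        # DasCache failed, but the SDS has not served I/O yet: a DI can still be avoided.
        return [
            "Boot from the old kernel",
            "Upgrade DasCache to match the kernel version and restart the SDS service (KB 000195110)",
        ]
    # The SDS already exited maintenance mode with DasCache in a failed state.
    return ["No workaround available"]
```

The key point of the sketch is ordering: the check has to run while the SDS is still in maintenance mode, because no option remains after the SDS exits maintenance mode with DasCache failed.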


Impacted Versions

All PowerFlex versions
 

Fixed In Version

N/A - still under investigation 

Affected Products

PowerFlex rack

Article Properties
Article Number: 000195109
Article Type: How To
Last Modified: 03 Jul 2025
Version:  3