PowerFlex: A performance bundle for CloudIQ is not being created

Summary: The PowerFlex Gateway (GW) does not generate a performance bundle for CloudIQ, although the configuration, capacity, and alerts bundles are generated as expected.

Symptoms

When a system is configured to send characteristics and statistical data to CloudIQ, the GW generates four bundles: system configuration, system capacity, system alerts, and system performance statistics.

The creation of any of the four bundles cannot be stopped manually by default.

The GW is unable to collect volume statistics and generate the performance bundle because it misinterprets the REST API responses that it receives from the MDM.
See the Additional Information section for details on how the data is collected and the bundle is generated.

The following exception errors can be found in the GW scaleio-trace.log files:

2022-08-05 12:05:53,975 [AsyncHandler-21] ERROR c.e.scaleio.esrsmanager.EsrsManager - Alert hasn't been sent since ESRS reached limit of 200 per: 8 hours
com.emc.scaleio.esrsmanager.NotificationMessageLimitException: null   <<<
    at com.emc.scaleio.esrsmanager.ESRSConnector.sendConnectEmcMessage(ESRSConnector.java:360) ~[ams-1.0-SNAPSHOT.jar:na]
    at com.emc.scaleio.esrsmanager.ESRSConnector.sendConnectEmcMessage(ESRSConnector.java:308) ~[ams-1.0-SNAPSHOT.jar:na]
    at com.emc.scaleio.esrsmanager.EsrsManager.sendAlert(EsrsManager.java:566) [ams-1.0-SNAPSHOT.jar:na]
    at com.emc.scaleio.esrsmanager.EsrsManager.sendAlert(EsrsManager.java:598) [ams-1.0-SNAPSHOT.jar:na]
    at com.emc.scaleio.esrsmanager.BaseNotificationManager.busReceivedAlerts(BaseNotificationManager.java:103) [ams-1.0-SNAPSHOT.jar:na]
...
2022-08-05 12:05:53,975 [https-jsse-nio-443-exec-293] ERROR c.e.s.s.r.DeviceRepositoryImpl - Error in QueryPropertiesResponse for Device::-101652817808130048 property:PENDING_MOVING_OUT_FWD_REBUILD_JOBS has value type: UNDEFINED_PROP_TYPE
2022-08-05 12:05:53,975 [https-jsse-nio-443-exec-293] ERROR c.e.s.s.r.DeviceRepositoryImpl - Error in QueryPropertiesResponse for Device::-101652817808130048 property:NET_THIN_USER_DATA_CAPACITY_IN_KB has value type: UNDEFINED_PROP_TYPE
...
2022-08-05 12:05:53,982 [https-jsse-nio-443-exec-306] ERROR c.e.s.s.r.DeviceRepositoryImpl - Error in QueryPropertiesResponse for Device::-99119590261981183 property:PENDING_MOVING_OUT_FWD_REBUILD_JOBS has value type: UNDEFINED_PROP_TYPE
2022-08-05 12:05:53,982 [https-jsse-nio-443-exec-290] ERROR c.e.s.s.r.DeviceRepositoryImpl - Error in QueryPropertiesResponse for Device::-99401056648822783 property:RFCACHE_WRITES_SKIPPED_STUCK_IO has value type: UNDEFINED_PROP_TYPE
2022-08-05 12:05:53,982 [https-jsse-nio-443-exec-310] ERROR c.e.s.s.w.c.ScaleIOController - Got an exception in handleException
java.lang.IllegalStateException: Bad number: 3   <<<
    at com.emc.s3g.scaleio.domain.enums.ScsiReserveType.valueOf(ScsiReserveType.java:42) ~[ams-1.0-SNAPSHOT.jar:na]
    at com.emc.s3g.scaleio.repository.BaseRepository.updateStatistics(BaseRepository.java:1184) ~[repository-1.0-SNAPSHOT.jar:na]
    at com.emc.s3g.scaleio.repository.BaseRepository.getStatistics(BaseRepository.java:981) ~[repository-1.0-SNAPSHOT.jar:na]
    at com.emc.s3g.scaleio.web.controller.ScaleIOController.getStatistics(ScaleIOController.java:93) ~[classes/:na]
    at sun.reflect.GeneratedMethodAccessor731.invoke(Unknown Source) ~[na:na]
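
To check a trace log for this signature without reading it by eye, the following minimal Python sketch looks for the IllegalStateException line followed by a ScsiReserveType.valueOf stack frame, as in the excerpt above. The function name and the three-line lookahead are assumptions made for illustration. The log file path is taken from the command line because its location is not stated in this article.

```python
import re
import sys

# Exception line exactly as it appears in the excerpt above
BAD_NUMBER_RE = re.compile(r"java\.lang\.IllegalStateException: Bad number: (\d+)")


def find_scsi_reserve_failures(lines, lookahead=3):
    """Return the 'Bad number' values raised from ScsiReserveType.valueOf.

    lookahead is an assumed limit on how many lines after the exception
    line are searched for the ScsiReserveType stack frame.
    """
    lines = list(lines)
    hits = []
    for i, line in enumerate(lines):
        match = BAD_NUMBER_RE.search(line)
        if not match:
            continue
        following = lines[i + 1 : i + 1 + lookahead]
        if any("ScsiReserveType.valueOf" in frame for frame in following):
            hits.append(int(match.group(1)))
    return hits


if __name__ == "__main__":
    if len(sys.argv) == 2:
        with open(sys.argv[1], errors="replace") as log:
            values = find_scsi_reserve_failures(log)
        print(f"{len(values)} matching exception(s); values: {sorted(set(values))}")
    else:
        print("usage: python check_trace.py <path to scaleio-trace.log>")
```

The script name check_trace.py is hypothetical. A result containing the value 3 matches the symptom described in this article. An empty result only means that this particular log file does not contain the signature.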

Example of a working system:

[root@working_cloudiq ~]# ls -lrt /opt/emc/scaleio/gateway/temp
total 300
drwx------. 2 root root 25 Feb 28 2020 certificates
drwx------. 2 root root 6 Feb 28 2020 scaleio-install-logs
-rwx------. 1 root root 0 Feb 28 2020 216e5abe-29e9-4825-b095-d8900d5964d8_ScaleIO-config.json
-rwx------. 1 root root 0 Jan 12 2022 safeToDelete.tmp
-rwx------. 1 root root 521 Jan 12 2022 index.html
-rwx------. 1 root root 0 Mar 20 01:36 GATEWAY_RUN_USER.txt
-rw-r-----. 1 root root 95929 Jul 14 08:33 powerflex_1657787617941_ELMSIO1234568_config.zip
-rw-r-----. 1 root root 47245 Jul 15 07:34 powerflex_1657870447081_ELMSIO1234568_capacity.zip
-rw-r-----. 1 root root 95935 Jul 15 08:33 powerflex_1657874022010_ELMSIO1234568_config.zip
-rw-r-----. 1 root root 47330 Jul 15 08:34 powerflex_1657874048125_ELMSIO1234568_capacity.zip
-rw-r-----. 1 root root 2671 Jul 15 09:02 powerflex_1657875734080_ELMSIO1234568_alerts.zip
-rw-r-----. 1 root root 2671 Jul 15 09:02 powerflex_1657875734085_ELMSIO1017KPF3_performance.zip   <<<
-rw-r-----. 1 root root 2670 Jul 15 09:07 powerflex_1657876034745_ELMSIO1234568_alerts.zip
-rw-r-----. 1 root root 2670 Jul 15 09:07 powerflex_1657876034750_ELMSIO1017KPF3_performance.zip   <<<

Example of a non-working system - performance.zip file was not generated:

[root@not_working_cloudiq ~]# ls -lrt /opt/emc/scaleio/gateway/temp
total 300
drwx------. 2 root root 25 Feb 28 2020 certificates
drwx------. 2 root root 6 Feb 28 2020 scaleio-install-logs
-rwx------. 1 root root 0 Feb 28 2020 216e5abe-29e9-4825-b095-d8900d5964d8_ScaleIO-config.json
-rwx------. 1 root root 0 Jan 12 2022 safeToDelete.tmp
-rwx------. 1 root root 521 Jan 12 2022 index.html
-rwx------. 1 root root 0 Mar 20 01:36 GATEWAY_RUN_USER.txt
-rw-r-----. 1 root root 95929 Jul 14 08:33 powerflex_1657787617941_ELMSIO1234568_config.zip
-rw-r-----. 1 root root 47245 Jul 15 07:34 powerflex_1657870447081_ELMSIO1234568_capacity.zip
-rw-r-----. 1 root root 95935 Jul 15 08:33 powerflex_1657874022010_ELMSIO1234568_config.zip
-rw-r-----. 1 root root 47330 Jul 15 08:34 powerflex_1657874048125_ELMSIO1234568_capacity.zip
-rw-r-----. 1 root root 2671 Jul 15 09:02 powerflex_1657875734080_ELMSIO1234568_alerts.zip
-rw-r-----. 1 root root 2670 Jul 15 09:07 powerflex_1657876034745_ELMSIO1234568_alerts.zip
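
The comparison between the two listings can also be scripted. The following Python sketch is an illustration only: it assumes that bundle file names follow the pattern seen in the listings above (powerflex_<timestamp>_<serial>_<type>.zip) and reports which of the four bundle types has no file in the directory. The function and constant names are hypothetical.

```python
import re
from pathlib import Path

# Bundle types named in this article
BUNDLE_TYPES = ("config", "capacity", "alerts", "performance")

# Assumed naming pattern, taken from the directory listings above
BUNDLE_RE = re.compile(
    r"^powerflex_\d+_[^_]+_(config|capacity|alerts|performance)\.zip$"
)


def missing_bundle_types(filenames):
    """Return the bundle types for which no .zip file name is present."""
    seen = set()
    for name in filenames:
        match = BUNDLE_RE.match(name)
        if match:
            seen.add(match.group(1))
    return [t for t in BUNDLE_TYPES if t not in seen]


if __name__ == "__main__":
    temp_dir = Path("/opt/emc/scaleio/gateway/temp")
    # If the directory does not exist, every type is reported as missing
    names = [p.name for p in temp_dir.iterdir()] if temp_dir.is_dir() else []
    missing = missing_bundle_types(names)
    print("Missing bundle types:", ", ".join(missing) or "none")
```

On a system showing this issue, the expected report is that only the performance type is missing. Because the directory contents change over time, treat the result as a hint and confirm it against the log errors.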

Cause

PowerFlex supports SCSI-2 reservations and a subset of the SCSI-3 reservation commands. SCSI reservation commands (reset, reserve, release, read) are sent by the SDCs to the MDM, which then updates the SDSs.
When a SCSI-3 reservation has been placed on a volume, the REST API call that the GW makes to the MDM to read the volume statistics fails with the error shown above: Bad number: 3.

The GW misinterprets the SCSI reservation type returned by the MDM and fails the REST API call.
I/O and reservations on the PowerFlex side continue to work as expected.

How to validate SCSI reservation info in get_info?

$ awk 'BEGIN { printf "%-15s %-15s %s\n", "Volume_ID", "Volume_Name", "SCSI_Reservation"; printf "%-15s %-15s %s\n", "---------", "-----------", "----------------" }; /: ID:/ { volume_id = $2; volume_name = $3 } / SCSI-reserver-key:/ { scsi_reserv = $1; if (scsi_reserv == "scsi2-reserved:3"){ printf "%-15s %-15s %-15s %s\n", volume_id, volume_name, scsi_reserv, "<<< SCSI-3 !!!" } else{ printf "%-15s %-15s %s\n", volume_id, volume_name, scsi_reserv } }' getInfoDump/mdm/sdbg_out.txt | column -t
Volume_ID                Volume_Name          SCSI_Reservation
---------                -----------          ----------------
ID:0x2fad5f7f00000000    Name:vol1-sp1-PD1    scsi2-reserved:0
ID:0x2fad5fcb00000001    Name:vol2-sp1-PD1    scsi2-reserved:3  <<<  SCSI-3  !!!
ID:0x2fad5fcc00000002    Name:vol3-sp1-PD1    scsi2-reserved:3  <<<  SCSI-3  !!!
ID:0x2fa9dd3d00000003    Name:vol4-sp1-PD1    scsi2-reserved:0

How to validate SCSI reservation info on a live system?

$ cat > script
c mdm
dumpallscreens
disconnect
exit
^D
$ /opt/emc/scaleio/sds/diag/sdbg script > sdbg_out.txt
$ awk 'BEGIN { printf "%-15s %-15s %s\n", "Volume_ID", "Volume_Name", "SCSI_Reservation"; printf "%-15s %-15s %s\n", "---------", "-----------", "----------------" }; /: ID:/ { volume_id = $2; volume_name = $3 } / SCSI-reserver-key:/ { scsi_reserv = $1; if (scsi_reserv == "scsi2-reserved:3"){ printf "%-15s %-15s %-15s %s\n", volume_id, volume_name, scsi_reserv, "<<< SCSI-3 !!!" } else{ printf "%-15s %-15s %s\n", volume_id, volume_name, scsi_reserv } }' sdbg_out.txt | column -t

Volume_ID                Volume_Name          SCSI_Reservation
---------                -----------          ----------------
ID:0x2fae49da00000001    Name:vol1-sp1-PD1    scsi2-reserved:0
ID:0x2fad5fcb00000002    Name:vol2-sp1-PD1    scsi2-reserved:3  <<<  SCSI-3  !!!
ID:0x2fad5fcc00000003    Name:vol3-sp1-PD1    scsi2-reserved:3  <<<  SCSI-3  !!!
ID:0x2fa9dd3d00000004    Name:vol4-sp1-PD1    scsi2-reserved:0
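
When only the affected volumes are of interest, the same matching rules as the awk command can be expressed in Python. This is a minimal sketch that mirrors the awk logic (whitespace-separated fields, the ": ID:" and " SCSI-reserver-key:" line markers) and assumes nothing else about the dump format. The function name is hypothetical.

```python
import sys


def scsi3_reserved_volumes(lines):
    """Return (volume_id, volume_name) pairs flagged as scsi2-reserved:3."""
    volume_id = volume_name = None
    flagged = []
    for line in lines:
        fields = line.split()
        # Same rule as the awk /: ID:/ pattern: remember the current volume
        if ": ID:" in line and len(fields) >= 3:
            volume_id, volume_name = fields[1], fields[2]
        # Same rule as the awk / SCSI-reserver-key:/ pattern
        if " SCSI-reserver-key:" in line and fields:
            if fields[0] == "scsi2-reserved:3":
                flagged.append((volume_id, volume_name))
    return flagged


if __name__ == "__main__":
    if len(sys.argv) == 2:
        with open(sys.argv[1], errors="replace") as dump:
            for vol_id, vol_name in scsi3_reserved_volumes(dump):
                print(vol_id, vol_name)
    else:
        print("usage: python scsi3_volumes.py sdbg_out.txt")
```

The script name scsi3_volumes.py is hypothetical. An empty result means that no volume in the dump carries the value that the awk command flags as SCSI-3.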

Resolution

Because the SCSI reservation is set from the client and application side, the only workaround is to release the reservation from the volume. The issue itself is corrected in the versions listed under Fixed In Version.

Impacted Versions

PowerFlex v3.5
PowerFlex v3.6
PowerFlex v4.0

Fixed In Version

PowerFlex v3.5.1.9
PowerFlex v3.6.1
PowerFlex v4.0.1.1

Additional Information

The flow for gathering and creating the performance bundle file consists of two separate processes:

The first process is activated every 5 seconds. It sends a request for statistics to the MDM and stores the response in an accumulated manner.

The second process is activated every 5 minutes. It calculates the deltas and compresses the data into a .zip file inside the /opt/emc/scaleio/gateway/temp directory.
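
The two-process flow can be pictured with the following Python sketch. It is a conceptual illustration, not the GW implementation: the class name, the counter name in the usage note, the JSON payload, and the choice to compute deltas between consecutive samples are all assumptions. Only the two intervals and the accumulate-then-delta-then-zip sequence come from the description above.

```python
import io
import json
import zipfile

SAMPLE_INTERVAL_S = 5        # first process: request statistics every 5 seconds
BUNDLE_INTERVAL_S = 5 * 60   # second process: build a bundle every 5 minutes


class PerformanceCollector:
    """Hypothetical model of the accumulate-then-bundle flow."""

    def __init__(self):
        self._samples = []   # responses accumulated since the last bundle

    def add_sample(self, counters):
        """First process: store one statistics response (a dict of counters)."""
        self._samples.append(dict(counters))

    def build_bundle(self):
        """Second process: compute deltas and compress them into a zip."""
        deltas = [
            {key: cur[key] - prev[key] for key in cur if key in prev}
            for prev, cur in zip(self._samples, self._samples[1:])
        ]
        # Keep the newest sample as the baseline for the next bundle
        self._samples = self._samples[-1:]
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
            archive.writestr("performance.json", json.dumps(deltas))
        return deltas, buffer.getvalue()
```

For example, three samples with a hypothetical counter read_ops_total of 100, 130, and 190 yield the deltas 30 and 60. In this model, if every statistics request fails before add_sample is reached, there is nothing to bundle, which is consistent with the missing performance.zip file described in this article.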

Affected Products

PowerFlex Appliance, PowerFlex custom node, PowerFlex Software

Article Properties

Article Number: 000208018
Article Type: Solution
Last Modified: 13 Feb 2025
Version: 4