VPLEX: Backend IO bottlenecked following modification to BEPM parameters
Summary: This article describes a possible performance or Data Unavailability (DU) issue that can affect a VPLEX running GeoSynchrony 6.2.x after changes have been made to the VPLEX BEPM feature, and explains what to do if the issue is seen. ...
Symptoms
Impact:
A customer may notice a Back-End (BE) load imbalance between the VPLEX and any array connected to it, even when the paths are reported healthy from both the connectivity side and the array side. This may lead to performance degradation and to a Data Unavailability (DU) situation.
Cause:
Any parameter or setting change to the BEPM (Back-End Path Management) feature of the VPLEX creates a "pathmanagement.conf" file that is saved on each director. A backup copy for each director is also stored on the management server. This applies when modifying the BEPM settings (even when reverting them to the default settings) and when disabling or enabling the BEPM feature.
When this pathmanagement.conf file is present and a director's firmware restarts (for any reason), the system is vulnerable to a bug being tracked by VPLEX Engineering: any degraded Initiator-Target-LUN (ITL) that belongs to a non-degraded IT nexus, and any degraded Logical Unit, does not become undegraded when it should. This may result in a BE load imbalance and performance degradation, which may lead to a Data Unavailability (DU) situation.
If it exists, this file is located at /var/opt/zephyr/flashDir/pathmanagement.conf on each director.
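The file check described above can be sketched as follows. This is a minimal illustration, not a Dell-supplied tool: the helper name bepm_conf_present is hypothetical, and it assumes a Python interpreter is available wherever the check is run. Only the directory and file name come from this article.

```python
from pathlib import Path

# Directory and file name as given in this article.
DEFAULT_FLASH_DIR = "/var/opt/zephyr/flashDir"
CONF_NAME = "pathmanagement.conf"


def bepm_conf_present(flash_dir: str = DEFAULT_FLASH_DIR) -> bool:
    """Return True if pathmanagement.conf exists in flash_dir.

    Hypothetical helper for illustration only; it does nothing
    beyond testing whether the file is present.
    """
    return (Path(flash_dir) / CONF_NAME).is_file()


if __name__ == "__main__":
    state = "present" if bepm_conf_present() else "not present"
    print(f"{CONF_NAME} is {state} in {DEFAULT_FLASH_DIR}")
```

A result of "present" only indicates that BEPM settings were changed at some point. It does not by itself confirm that the issue has occurred.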
Dell Technologies Hardware that is impacted by this issue:
VPLEX VS2 (Metro-FC/IP and Local)
VPLEX VS6 (Metro-FC/IP and Local)
Dell Technologies Software that is impacted by this issue:
GeoSynchrony 6.2 through 6.2 Patch 4
Cause
The failure of the BEPM feature to un-degrade degraded ITLs is triggered by the following two factors:
- BEPM settings have been changed:
  - This was done by VPLEX Support when:
    - A VPLEX was being prepared for XtremIO NDUs.
    - There was a performance issue in which ITL paths were seen flapping between degraded and undegraded on some, but not all, directors, and BEPM parameter changes were made.
    - The BEPM option was disabled on some directors but not others while troubleshooting the issue.
  - Even if the BEPM settings have been reverted to their defaults, the system remains susceptible to performance degradation and possibly DU, for the reason stated in the second paragraph under "Cause:" in the Symptoms section of this article.
  - When the BEPM settings are changed, the "pathmanagement.conf" file is created in the /var/opt/zephyr/flashDir directory of each director, and a backup copy for each director is saved to the management server.
- The director firmware has restarted since the BEPM settings were changed, and the saved settings in the pathmanagement.conf file were read back in and applied during that restart:
  - The firmware restart can be for any reason: a firmware crash, a manual restart by a user, the firmware coming back up following a hardware replacement, or a VPLEX NDU.
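The two trigger factors above can be combined into a simple per-director check, sketched below. This is an illustrative model only: the DirectorState record, the function names, and the director names in the usage example are hypothetical, and the two boolean inputs are assumed to have been gathered manually (the file check on each director and the restart history).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DirectorState:
    """Hypothetical record of the two trigger factors for one director."""
    name: str
    conf_present: bool            # pathmanagement.conf exists on the director
    restarted_since_change: bool  # firmware restarted after the BEPM change


def is_exposed(d: DirectorState) -> bool:
    """A director is exposed only when both trigger factors are true."""
    return d.conf_present and d.restarted_since_change


def exposed_directors(directors: List[DirectorState]) -> List[str]:
    """Return the names of all directors where both factors apply."""
    return [d.name for d in directors if is_exposed(d)]


if __name__ == "__main__":
    # Illustrative values, not taken from a real system.
    sample = [
        DirectorState("director-1-1-A", True, True),
        DirectorState("director-1-1-B", True, False),
    ]
    print(exposed_directors(sample))
```

A director with the file present but no firmware restart since the change is not yet exposed, but it becomes exposed at its next restart.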
Resolution
If you have performed an XtremIO NDU or have had performance issues, and Support was engaged and made changes to the BEPM parameters, contact Dell VPLEX Support for assistance in resolving this issue and mention this article.