PowerFlex Extremely high network latency between SVMs when NSX is installed
Summary: SVMs, as well as regular VMs and VMkernel interfaces, might show high network latency (up to a couple of seconds) when NSX is installed.
Symptoms
Scenario
NSX is installed on the ESXi hosts taking part in the PowerFlex cluster. NSX Distributed Firewall rule statistics collection is running, and the NSX version is between 6.4.2 and 6.4.5.
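As a rough aid for checking whether a host matches this scenario, the version condition can be sketched in Python. This is a minimal illustration, not a Dell or VMware tool: the function names are hypothetical, the range is taken from the scenario text and treated as inclusive, and the version string format (dotted numbers, optionally followed by a build number) is an assumption.

```python
# Hypothetical helper: checks an NSX version string against the range
# named in the scenario (6.4.2 through 6.4.5, assumed inclusive).
AFFECTED_MIN = (6, 4, 2)
AFFECTED_MAX = (6, 4, 5)


def parse_version(text):
    """Turn '6.4.5', '6.4.5.13282012' or '6.4.5-13282012' into a tuple of ints."""
    # Treat a dash-separated build suffix the same as a dotted one (assumed formats).
    parts = text.strip().replace("-", ".").split(".")
    return tuple(int(part) for part in parts)


def in_affected_range(version_text):
    """Return True if the major.minor.patch part falls in the scenario's range."""
    version = parse_version(version_text)[:3]  # ignore any build number
    return AFFECTED_MIN <= version <= AFFECTED_MAX


if __name__ == "__main__":
    for candidate in ("6.4.1", "6.4.2", "6.4.5.13282012", "6.4.6"):
        print(candidate, in_affected_range(candidate))
```

A matching version only indicates that the scenario may apply. Whether Distributed Firewall rule statistics collection is actually causing the latency still has to be confirmed with VMware.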
Symptoms
Depending on the performance impact caused by NSX, symptoms can be as follows:
- High network latency between SVMs or ESXi hosts, reaching as high as a few seconds
- ESXi hosts losing access to PowerFlex storage
- Intermittent SDS disconnections and rebuilds
- DU (DATA_FAILED) if multiple SDSs disconnect simultaneously
Cause
Impact
The impact varies, from a performance hit or a temporary inability to access storage to a full data unavailability (DU) scenario.
Root Cause
In this particular case, the problem was caused by an NSX bug described in the VMware KB article "Intermittent latency when using NSX 6.4.2 or above and a large number of Distributed Firewall rules". NSX Distributed Firewall rule statistics collection introduces a significant delay in the vSphere network stack, affecting all VMs (including SVMs) and VMkernel interfaces.
Note that similar behavior can be caused by other NSX bugs or misconfiguration, so it is essential to review the NSX configuration with a VMware expert.
Resolution
Workaround
A potential temporary workaround is to add the PowerFlex SVMs to the NSX firewall exclusion list. Consult VMware networking experts before taking any action.
Impacted Versions
N/A - not a PowerFlex issue
Fixed In Version
N/A