14G PowerEdge-VxFlex Ready Node-SDC Disconnects
Summary: 14G PowerEdge (VxFlex Ready Node) Storage Data Client (SDC) Disconnects
Symptoms
Issue Description
Scenario
Storage Data Servers (SDS) close their sockets to the "disconnecting" SDC because of a keepalive (KA) timeout.
The SDC reconnects successfully.
Symptoms
Events log shows an SDC disconnecting from multiple SDS at the same moment:
2018-05-10 23:32:35.466 SDC_DISCONNECTED_FROM_SDS_IP WARNING SDC ID: 0aeec9b500000008 disconnected from IP 10.90.61.161 of SDS SDS_10.90.21.161; ID: 7a0c5c4200000005
2018-05-10 23:32:35.466 SDC_DISCONNECTED_FROM_SDS_IP WARNING SDC ID: 0aeec9b500000008 disconnected from IP 10.90.61.156 of SDS SDS_10.90.21.156; ID: 7a0c353200000006
2018-05-10 23:32:35.466 SDC_DISCONNECTED_FROM_SDS_IP WARNING SDC ID: 0aeec9b500000008 disconnected from IP 10.90.41.157 of SDS SDS_10.90.21.157; ID: 7a0c353500000009
2018-05-10 23:32:35.466 SDC_DISCONNECTED_FROM_SDS_IP WARNING SDC ID: 0aeec9b500000008 disconnected from IP 10.90.61.165 of SDS SDS_10.90.21.165; ID: 7a0c35380000000c
2018-05-10 23:32:35.466 SDC_DISCONNECTED_FROM_SDS_IP WARNING SDC ID: 0aeec9b500000008 disconnected from IP 10.90.61.164 of SDS SDS_10.90.21.164; ID: 7a0c35390000000d
The SDC shows disconnects, then reconnects successfully one second later:
2018-05-10T23:32:34.466Z cpu26:71534)WARNING: [4259101873] Disconnected from SDS with ID 7a0c5c4200000005
2018-05-10T23:32:34.466Z cpu41:71538)WARNING: [4259101873] Disconnected from SDS with ID 7a0c35380000000c
2018-05-10T23:32:34.466Z cpu26:71534)WARNING: [4259101873] Disconnected from SDS with ID 7a0c353500000009
2018-05-10T23:32:34.466Z cpu17:71540)WARNING: [4259101873] Disconnected from SDS with ID 7a0c353200000006
2018-05-10T23:32:34.466Z cpu26:71534)WARNING: [4259101873] Disconnected from SDS with ID 7a0c835300000012
2018-05-10T23:32:35.566Z cpu23:71432)WARNING: [4259102973] Connected to SDS with ID 7a0c835300000012
2018-05-10T23:32:35.566Z cpu26:71534)WARNING: [4259102973] Connected to SDS with ID 7a0c353200000006
2018-05-10T23:32:35.566Z cpu17:71540)WARNING: [4259102973] Connected to SDS with ID 7a0c5c4200000005
2018-05-10T23:32:35.566Z cpu26:71534)WARNING: [4259102973] Connected to SDS with ID 7a0c353500000009
2018-05-10T23:32:35.566Z cpu33:71433)WARNING: [4259102973] Connected to SDS with ID 7a0c35380000000c
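The events-log symptom (one SDC disconnecting from several SDS at the same timestamp) can be checked with a short script. The following is a minimal sketch in Python, not a Dell-provided tool. It assumes the events are available as plain text in the SDC_DISCONNECTED_FROM_SDS_IP format shown above; the function name find_simultaneous_disconnects and the min_sds threshold are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Matches the SDC_DISCONNECTED_FROM_SDS_IP event format shown in this article.
# Assumption: other releases may format these events differently.
EVENT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"SDC_DISCONNECTED_FROM_SDS_IP \w+ "
    r"SDC ID: (?P<sdc>[0-9a-f]+) disconnected from IP \S+ of SDS \S+ "
    r"ID: (?P<sds>[0-9a-f]+)"
)


def find_simultaneous_disconnects(text, min_sds=2):
    """Group disconnect events by (timestamp, SDC ID).

    Returns a dict mapping (timestamp, sdc_id) to a sorted list of SDS IDs,
    keeping only groups where the SDC lost at least `min_sds` distinct SDS
    at the same logged timestamp.
    """
    groups = defaultdict(set)
    # finditer scans the whole text, so it works whether the events are
    # one per line or run together on a single line.
    for match in EVENT_RE.finditer(text):
        groups[(match["ts"], match["sdc"])].add(match["sds"])
    return {
        key: sorted(sds_ids)
        for key, sds_ids in groups.items()
        if len(sds_ids) >= min_sds
    }
```

Passing the events excerpt above to find_simultaneous_disconnects returns a single group for that SDC and timestamp. A single SDC losing many SDS at once points at the SDC host rather than at any one SDS.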
Impact
No impact has been observed, but an SDC disconnect may affect volume access.
Cause
Root cause
Low-level I/O pauses delayed the SDC's responses to the SDS.
Per the BIOS release notes, the underlying issue is that the "BIOS takes a long time to handle correctable memory errors."
These events should complete in single-digit milliseconds, but they were taking hundreds of milliseconds.
Resolution
Workaround
This is resolved in Dell PowerEdge BIOS version 1.3.7 and later.
See the 14G Ready Node Firmware and Driver Matrix for current BIOS, FW, and driver recommendations.
Additional Information
Impacted versions
VxFlex Ready Node (14G) with a BIOS version earlier than 1.3.7
AMS 2.5 (initial support for 14G nodes) included BIOS 1.2.11.
Fixed in version
VxFlex Ready Node (14G) with BIOS 1.3.7 or later
AMS 2.6 includes BIOS 1.3.7.
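When checking nodes against the fixed version, compare BIOS versions numerically rather than as strings (as text, "1.10.0" sorts before "1.3.7"). The following is a minimal sketch in Python, assuming the BIOS version has already been collected as a dotted string such as "1.2.11". The helper names parse_bios_version and is_bios_affected are hypothetical and not part of any Dell tool.

```python
# First BIOS version containing the fix, per this article.
FIXED_BIOS = (1, 3, 7)


def parse_bios_version(version):
    """Convert a dotted version string such as '1.2.11' to a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def is_bios_affected(version):
    """Return True if the given BIOS version is earlier than 1.3.7."""
    # Tuple comparison is element-wise, so (1, 2, 11) < (1, 3, 7)
    # even though "11" > "7" in the last position.
    return parse_bios_version(version) < FIXED_BIOS
```

For example, is_bios_affected("1.2.11") is true for the BIOS shipped with AMS 2.5, and is_bios_affected("1.3.7") is false for the BIOS shipped with AMS 2.6.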