PowerFlex SDNAS SDC-sokkel ned med fejl 1006 Forbindelsen er afbrudt
Oversigt: Der er fejl ved genstart af SDC. IO får timeout fra PowerFlex Gateway.
Symptomer
Scenario
På grund af et netværksrelateret problem i opkaldsfunktionen i sdnas-klientprogrammet, hvor en lokal unormal sokkelforbindelse blev afbrudt, kunne SDNAS-instansen SDC ikke periodisk kommunikere med VxFlex Gateway.
Symptomer
Fra sdnas_active logfilen, hvor den forsøger at udføre en forespørgsel for diskenheden "29eba43900000678", og den HTTPS-kommando, som SDNAS bruger, modtager fejlkoden "1006 Forbindelse afbrudt":
2020/07/22-07:51:00.464672 7f2e39821701 ADAPTER: 6:[GvSh_3] ScaleIOStorageAdapter::createHeaderJson:317 JSON headers <{"URI":"\/api\/instances\/Volume::29eba43900000678","HTTP_TREQ":1,"REQHEADER":{"Content-Type":"application\/json"}}>
2020/07/22-07:51:00.464682 7f2e39821701 ADAPTER: 6:[GvSh_3] HTTPAdapterCommunicator::requestReply:495 Enter
2020/07/22-07:51:00.464709 7f2e39821701 ADAPTER: 6:[SVDM_A] HTTPAdapterCommunicator::requestReply:593 :Sending via HTTP/S Client!!
2020/07/22-07:51:00.464732 7f2e39821701 ADAPTER: 6:[SVDM_A] HTTPAdapterCommunicatorReq::writeHeader:77
2020/07/22-07:51:00.464746 7f2e39821701 ADAPTER: 6:[SVDM_A] HTTPAdapterCommunicatorReq::writeHeader:112 KEY = Content-Type VALUE = application/json
2020/07/22-07:51:00.464774 7f2e39821701 ADAPTER: 6:[SVDM_A] HTTPAdapterCommunicator::requestReply:618 :Getting Resp via HTTP/S Client!!
2020/07/22-07:51:00.464914 7f2e3723b701 SOCK_STREAM: 3:[core] T_DATA_REQ: (fd=0xffffffff, NS=0) not connected
2020/07/22-07:51:20.462439 7f2e39821701 ADAPTER: 3:[SVDM_A] HTTPAdapterCommunicator::requestReply:623 :NO RESPONSE!! HTTPSTATUS = 1006 Connection Disconnected
2020/07/22-07:51:20.462460 7f2e39821701 ADAPTER: 3:[GvSh_3] ScaleIOStorageAdapter::sendRequestEx:559 Unable to send request, status 3
2020/07/22-07:51:20.462466 7f2e39821701 ADAPTER: 6:[GvSh_3] ScaleIOStorageAdapter::sendRequestEx:576 Platform communications error/disconnect
2020/07/22-07:51:21.462960 7f2e39821701 ADAPTER: 6:[GvSh_3] ScaleIOStorageAdapter::startSession:351 Successfully retrieved "sacred" credentials from NodeCredentialManager
2020/07/22-07:51:21.462985 7f2e39821701 ADAPTER: 6:[GvSh_3] ScaleIOStorageAdapter::startSession:369 Use local adminInfo.sdnasAdminAddr for SIO server address
2020/07/22-07:51:21.462991 7f2e39821701 ADAPTER: 6:[GvSh_3] HTTPAdapterCommunicator::createConnection:188 Enter
2020/07/22-07:51:21.463232 7f2e39821701 ADAPTER: 6:[GvSh_3] HTTPAdapterCommunicator::createConnection:319 Exit
2020/07/22-07:51:21.463262 7f2e39821701 ADAPTER: 6:[GvSh_3] ScaleIOStorageAdapter::createHeaderJson:317 JSON headers <{"URI":"\/api\/login","HTTP_TREQ":1,"REQHEADER":{"Content-Type":"application\/json"}}>
2020/07/22-07:51:21.463276 7f2e39821701 ADAPTER: 6:[GvSh_3] HTTPAdapterCommunicator::requestReply:495 Enter
2020/07/22-07:51:21.463297 7f2e39821701 ADAPTER: 6:[SVDM_A] HTTPAdapterCommunicator::requestReply:593 :Sending via HTTP/S Client!!
2020/07/22-07:51:21.463311 7f2e39821701 ADAPTER: 6:[SVDM_A] HTTPAdapterCommunicatorReq::writeHeader:77
2020/07/22-07:51:21.463322 7f2e39821701 ADAPTER: 6:[SVDM_A] HTTPAdapterCommunicatorReq::writeHeader:112 KEY = Content-Type VALUE = application/json
2020/07/22-07:51:27.267311 7f2e45ba5700 UFS: 6:[core] reStartSync: fsid:162 dirtyCount:1 (max:22) interval:30007734 maxPending:1 nthread:16 ioLatency:1199
2020/07/22-07:51:41.468135 7f2e39821701 ADAPTER: 3:[SVDM_A] HTTPAdapterCommunicator::requestReply:603 :Unable to Send status: 1006 Connection Disconnected!!
2020/07/22-07:51:41.468156 7f2e39821701 ADAPTER: 3:[GvSh_3] ScaleIOStorageAdapter::startSession:389 Unable to send login request, status 3
2020/07/22-07:51:41.468165 7f2e39821701 ADAPTER: 6:[GvSh_3] ScaleIOStorageAdapter::startSession:425 Forcing communicator pool to be like-new due to setup error "SDNAS_RC_PLATFORM_COMMUNICATION_ERROR"
2020/07/22-07:51:41.468190 7f2e39821701 ADAPTER: 3:[GvSh_3] ScaleIOStorageAdapter::fetchVolumeProperties:730 Unexpected ScaleIO error -1
Når vi ser på VxFlex gateway api- og scaleio-trace-logfilerne, kan vi ikke se noget login- eller instansforsøg på det pågældende tidspunkt (4 timer bagud, da sdnas er i UTC). Logfilerne viser kun "132", der slutter IP i API'en i løbet af den tid, gatewayen har aldrig modtaget nogen API-kald fra nu af, der slutter på IP "136":
2020-07-22 03:48:07,919 [https-jsse-nio-443-exec-10] INFO audit - 172.25.154.132:/api/types/Volume/instances/action/queryIdByKey {name=5f17efb2b82c4295d40005056bd5aa7}
2020-07-22 03:48:07,948 [https-jsse-nio-443-exec-5] INFO audit - 172.25.154.132:/api/types/Volume/instances {volumeType=ThinProvisioned, name=5f17efb2b82c4295d40005056bd5aa7, storagePoolId=469c695200000000, volumeSizeInKb=6291456}
2020-07-22 03:48:07,965 [https-jsse-nio-443-exec-22] INFO audit - 172.25.154.132:/api/instances/Volume::29eba43300000080/action/addMappedSdc {guid=82AFDBA9-E76B-41C0-9F34-BF4392236F57, allowMultipleMappings=TRUE}
2020-07-22 03:48:07,973 [https-jsse-nio-443-exec-17] INFO audit - 172.25.154.132:/api/instances/Volume::29eba43300000080/action/addMappedSdc {guid=7EFE3A2F-59B3-43D3-B7A3-07CF35436F9F, allowMultipleMappings=TRUE}
2020-07-22 03:48:17,534 [https-jsse-nio-443-exec-13] INFO audit - 172.25.15.64:/api/instances/Volume::29eb929f000005d8
2020-07-22 03:48:17,634 [https-jsse-nio-443-exec-10] INFO audit - 172.25.15.64:/api/instances/Volume::29eb92a2000005f4
2020-07-22 03:48:17,784 [https-jsse-nio-443-exec-16] INFO audit - 172.25.15.64:/api/instances/Volume::29eb92a1000005ed
2020-07-22 03:48:26,924 [https-jsse-nio-443-exec-23] INFO audit - 172.25.154.132:/api/types/Volume/instances/action/queryIdByKey {name=5f17efc54a36ba4d4bf005056bd8714}
2020-07-22 03:48:26,949 [https-jsse-nio-443-exec-13] INFO audit - 172.25.154.132:/api/types/Volume/instances {volumeType=ThinProvisioned, name=5f17efc54a36ba4d4bf005056bd8714, storagePoolId=469c695200000000, volumeSizeInKb=52428800}
2020-07-22 03:48:26,971 [https-jsse-nio-443-exec-10] INFO audit - 172.25.154.132:/api/instances/Volume::29eba43400000669/action/addMappedSdc {guid=82AFDBA9-E76B-41C0-9F34-BF4392236F57, allowMultipleMappings=TRUE}
2020-07-22 03:48:26,980 [https-jsse-nio-443-exec-16] INFO audit - 172.25.154.132:/api/instances/Volume::29eba43400000669/action/addMappedSdc {guid=BE2535F2-AF91-4B70-8BAA-E52F4CF9C9B9, allowMultipleMappings=TRUE}
2020-07-22 03:48:27,200 [https-jsse-nio-443-exec-23] INFO audit - 172.25.19.22:/api/types/Volume/instances/action/queryIdByKey {name=5f17efc6d0f10999e45005056bd6acc}
2020-07-22 03:48:27,273 [https-jsse-nio-443-exec-13] INFO audit - 172.25.19.22:/api/instances/System::7e558c9e2d40490f/action/snapshotVolumes {snapshotDefs=[{volumeId=29eae306000001d2, snapshotName=5f17efc6d0f10999e45005056bd6acc}]}
2020-07-22 03:48:27,568 [https-jsse-nio-443-exec-10] INFO audit - 172.25.15.64:/api/instances/Volume::29eb929f000005d8
2020-07-22 03:48:27,667 [https-jsse-nio-443-exec-16] INFO audit - 172.25.15.64:/api/instances/Volume::29eb92a2000005f4
2020-07-22 03:48:27,818 [https-jsse-nio-443-exec-20] INFO audit - 172.25.15.64:/api/instances/Volume::29eb92a1000005ed
2020-07-22 03:48:35,154 [https-jsse-nio-443-exec-23] INFO audit - 172.25.154.132:/api/instances/Volume::29eba43400000669
2020-07-22 03:48:36,219 [https-jsse-nio-443-exec-19] INFO audit - 172.25.154.132:/api/types/Volume/instances/action/queryIdByKey {name=5f17efcfe6667f3a5d9005056bd8714}
2020-07-22 03:48:36,243 [https-jsse-nio-443-exec-10] INFO audit - 172.25.154.132:/api/types/Volume/instances {volumeType=ThinProvisioned, name=5f17efcfe6667f3a5d9005056bd8714, storagePoolId=469c695200000000, volumeSizeInKb=52428800}
2020-07-22 03:48:36,261 [https-jsse-nio-443-exec-16] INFO audit - 172.25.154.132:/api/instances/Volume::29eba43600000674/action/addMappedSdc {guid=BE2535F2-AF91-4B70-8BAA-E52F4CF9C9B9, allowMultipleMappings=TRUE}
2020-07-22 03:48:36,268 [https-jsse-nio-443-exec-20] INFO audit - 172.25.154.132:/api/instances/Volume::29eba43600000674/action/addMappedSdc {guid=F80A1A85-376A-4318-9018-5742364EF531, allowMultipleMappings=TRUE}
Påvirkning
REST API-login- og forespørgselsfunktionaliteten mislykkes fra SDNAS-forekomsterne.
Årsag
HTTP-fejlen "1006" sker lokalt med en unormalt lukket forbindelse. Det betyder, at forbindelsen aldrig forlod operativsystemet, hvor forbindelsen blev forsøgt. Dette forklarer, hvorfor VxFlex GW og MDM aldrig ser login- eller forespørgselsforsøgene. Se nedenfor fra rfc for HTTPS-sokkelkommunikation "rfc 6455"
1006 1006 is a reserved value and MUST NOT be set as a status code in a Close control frame by an endpoint. It is designated for use in applications expecting a status code to indicate that the connection was closed abnormally, e.g., without sending or receiving a Close control frame.
HTTP-fejlen sker inden for opkaldsfunktionen lokalt på selve SDC og kontakter aldrig GW.
Løsning
Løsning
Problemet blev løst ved at omgå en Layer-3-router inden for netværksstien. Dette gjorde det muligt for direkte Layer-2-kommunikation fra SDNAS-instans-SDC'erne at kommunikere direkte med VxFlex Gateway.
Påvirkede versioner
Alle VxFlexOS/PowerFlex-versioner
Fast i version
Ikke relevant – Ikke et VxFlex-problem