PowerFlex 4.x ESXi custom ISO download failure during Resource Group deployment
Summary: A custom ISO download fails during resource group deployment for IC. This can happen for hyperconverged infrastructure (HCI) or compute-only (CO) deployments.
Symptoms
A CO-based service deployment that uses a custom ESXi ISO fails with the message below because the resource group cannot connect to the downloaded ESXi ISO. The error is recorded in the log path /opt/Dell/ASM/deployments/<job id>. The failure has been observed on 15G servers installed with large-capacity BOSS M.2-based drives.
Error Message Below:
"msg": "HTTP RESPONSE for RCM OS location: {u'status': 404, u'content_length': u'153', u'failed': True, u'url': u'https://http-share.powerflex.svc:443/download/ESXi.iso', u'changed': False, u'elapsed': 0, u'content': u'<html>\\r\\n<head><title>404 Not Found</title></head>\\r\\n<body>\\r\\n<center><h1>404 Not Found</h1></center>\\r\\n<hr><center>nginx/1.16.1</center>\\r\\n</body>\\r\\n</html>\\r\\n', u'msg': u'Status code was 404 and not [200]: HTTP Error 404: Not Found', u'connection': u'close', u'content_type': u'text/html', u'date': u'Tue, 23 Aug 2022 12:41:30 GMT', u'redirected': False, u'server': u'nginx/1.16.1'}
\nHTTP RESPONSE for non-RCM OS location: {u'status': 200, u'cookies': {}, 'failed': False, u'url': u'https://http-share.powerflex.svc:443/download/ESXi/', u'changed': False, u'elapsed': 0, u'date': u'Tue, 23 Aug 2022 12:41:31 GMT', u'connection': u'close', u'content_type': u'text/html', u'msg': u'OK (unknown bytes)', u'redirected': False, u'server': u'nginx/1.16.1', u'cookies_string': u''}"
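To confirm that a failed job hit this condition, the deployment log directory can be searched for the 404 text from the message above. The following is a minimal sketch, not a supported tool: the function name find_iso_404 is hypothetical, and it assumes the logs under the job directory are plain text files readable by the logged-in user.

```shell
# Hypothetical helper: list the files under a deployment log directory
# that contain the 404 response returned for the ISO download.
find_iso_404() {
  log_dir="$1"
  # -r searches the directory recursively, -l prints only matching file names
  grep -rl "HTTP Error 404: Not Found" "$log_dir"
}

# Usage (replace <job id> with the ID of the failed deployment):
# find_iso_404 "/opt/Dell/ASM/deployments/<job id>"
```

If no file name is printed, the job directory does not contain this 404 text and the failure may have a different cause.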
Cause
Custom non-RCM ISO images are not deleted from the HTTP share when they are removed. This can happen with any ISO (SLES, ESXi, and so on). The following sequence reproduces the issue:
1. Upload a custom ESXi image with the name custom_esxi. A folder with that name appears in the HTTP share.
2. Remove the custom ESXi image named custom_esxi. The folder is still present in the HTTP share after the deletion.
3. Upload a SLES image with the ESXi image type and the name custom_esxi. The custom_esxi folder in the HTTP share now contains both SLES and ESXi files.
4. Remove the custom SLES image named custom_esxi. The custom_esxi folder still contains both SLES and ESXi files after the deletion.
5. If the name custom_esxi is reused for another image, deployments fail.
Resolution
Log in to any node of the Kubernetes cluster, find the deployer pod, and restart it. Once the pod has restarted, retrying the service succeeds and the deployment finishes.
kubectl get pods -n powerflex | grep deployer
Note the pod name (<pod name_ID>) from the output, then delete the pod so that it restarts:
kubectl delete pod <pod name_ID> -n powerflex
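The two commands above can be combined so that the pod name does not have to be copied by hand. This is a sketch under stated assumptions: the function names deployer_pod_names and restart_deployer are hypothetical, the pod name is assumed to contain the string "deployer" (as in the grep above), and the deleted pod is assumed to be recreated automatically by its controller. If more than one pod matches, all of them are deleted.

```shell
# Hypothetical helper: print the names of pods whose name contains "deployer".
# Reads `kubectl get pods` style output on stdin (pod name in the first column).
deployer_pod_names() {
  awk '$1 ~ /deployer/ { print $1 }'
}

# Hypothetical helper: delete every matching pod in the given namespace
# (default: powerflex) so that it is restarted.
restart_deployer() {
  ns="${1:-powerflex}"
  kubectl get pods -n "$ns" --no-headers | deployer_pod_names |
  while read -r pod; do
    kubectl delete pod "$pod" -n "$ns"
  done
}

# Usage:
# restart_deployer
```

After the pod is running again, retry the service as described above.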