Dell Automation Platform upgrade from 1.0 to 1.2 fails due to high memory utilization on single-node orchestrator.

Summary: This article explains how to resolve Dell Automation Platform 1.0 to 1.2 upgrade failures on a single-node orchestrator caused by high memory utilization. The issue can occur when Fusion services have high replica counts and the node is under resource pressure. To resolve it, check memory usage, scale down Metrics Server if installed, scale down Fusion deployments if required, and retry the upgrade. If Metrics Server is not installed, skip that step and continue with the rest of the procedure.

Instructions

Procedure to Verify and Reduce Memory Usage Before Upgrade

Before performing the upgrade, verify the cluster memory utilization. If memory usage is below 60%, the upgrade can proceed normally. If memory usage is above 70%, reduce memory consumption before starting the upgrade.

Step 1: Verify cluster memory utilization

Run the following command to check node memory usage:

kubectl top nodes

Run the following command to check allocated node resources:

kubectl describe nodes | grep -A 12 "Allocated resources"
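
The threshold check above can be scripted. The following is a minimal sketch, assuming the standard `kubectl top nodes` column layout (MEMORY% in the fifth column). The function names `max_mem_pct` and `classify_mem` are hypothetical, and the "review" result for the 60-70% range is an assumption, since the article does not state what to do in that range.

```shell
# Sketch only: helper names are hypothetical, thresholds are taken from the article.

# Reads `kubectl top nodes` output on stdin and prints the highest MEMORY%
# value as an integer (0 if no node rows are present).
max_mem_pct() {
  awk 'NR > 1 { v = $5; sub(/%/, "", v); if (v + 0 > max) max = v + 0 }
       END { print max + 0 }'
}

# Maps a memory percentage to the action described in the article.
classify_mem() {
  pct=$1
  if [ "$pct" -lt 60 ]; then
    echo "proceed"
  elif [ "$pct" -gt 70 ]; then
    echo "reduce"
  else
    echo "review"   # assumption: 60-70% is not covered by the article
  fi
}
```

On a live cluster this could be used as `classify_mem "$(kubectl top nodes | max_mem_pct)"`.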

Step 2: Check and scale down Metrics Server if installed

Note: Metrics Server may not be installed in all environments because it was not previously required. If Metrics Server is not present, skip this step and continue with the next step.

Run the following command to check whether Metrics Server is running:

kubectl -n kube-system get deploy metrics-server

If Metrics Server is present, run the following command to scale it down:

kubectl -n kube-system scale deploy metrics-server --replicas=0

Run the following command to verify that Metrics Server is scaled down:

kubectl -n kube-system get deploy metrics-server
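
The check and the scale-down in this step can be combined so that the scale command runs only when the deployment exists. This is a sketch, not part of the official procedure. The function name `scale_down_metrics_server` is hypothetical, and the kubectl invocations mirror the ones in this step, using the `deploy` short name.

```shell
# Sketch only: the function name is hypothetical.
# Scales Metrics Server to zero replicas if the deployment exists in kube-system,
# otherwise prints a message and does nothing.
scale_down_metrics_server() {
  if kubectl -n kube-system get deploy metrics-server >/dev/null 2>&1; then
    kubectl -n kube-system scale deploy metrics-server --replicas=0
  else
    echo "metrics-server not found in kube-system; skipping"
  fi
}
```

Calling `scale_down_metrics_server` with no arguments performs the check and, if applicable, the scale-down.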

Step 3: Wait for resource stabilization

If Metrics Server was scaled down, wait approximately 5 minutes for cluster resources and metrics to stabilize. If Metrics Server was not installed, skip this wait.

Step 4: Check Fusion deployments

Run the following command to check Fusion deployments and replica counts:

kubectl get deploy -A | grep -i fusion

Step 5: Scale down Fusion deployments if memory usage is still high

Run the following command to find Fusion deployments with more than one desired replica and scale each of them down to one replica. The desired replica count is the number after the slash in the READY column:

kubectl get deploy -A | awk 'NR>1 && tolower($2) ~ /fusion/ {split($3, r, "/"); if (r[2]+0 > 1) print $1, $2}' | while read -r ns dep; do echo "Scaling $ns/$dep to 1 replica"; kubectl -n "$ns" scale deploy "$dep" --replicas=1; done

Run the following command to verify Fusion deployments after scaling:

kubectl get deploy -A | grep -i fusion
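
Before scaling, it may help to preview which deployments qualify and to keep a record of their desired replica counts. The sketch below assumes the default `kubectl get deploy -A` column layout (NAMESPACE, NAME, READY). The function name `fusion_scale_candidates` is hypothetical.

```shell
# Sketch only: the function name is hypothetical.
# Reads `kubectl get deploy -A` output on stdin and prints
# "namespace name desired" for each Fusion deployment whose desired replica
# count (the number after the slash in READY, e.g. 1/3 -> 3) is greater than 1.
fusion_scale_candidates() {
  awk 'NR > 1 && tolower($2) ~ /fusion/ {
         split($3, ready, "/")
         if (ready[2] + 0 > 1) print $1, $2, ready[2]
       }'
}
```

For example, `kubectl get deploy -A | fusion_scale_candidates > fusion-replicas-before.txt` would save the list to a file (the file name is only an illustration).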

Step 6: Recheck cluster memory utilization after Fusion scale down

Wait a few minutes after scaling down Fusion deployments for cluster resources to stabilize.

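Run the following command to check node memory usage again. This command depends on the Metrics API, so it is expected to return an error if Metrics Server was scaled down in Step 2; in that case, rely on the allocated-resources output from the second command: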

kubectl top nodes

Run the following command to check allocated node resources again:

kubectl describe nodes | grep -A 12 "Allocated resources"

Proceed with the upgrade only after memory utilization is reduced to an acceptable level and the cluster is stable.
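
The memory figure in the "Allocated resources" table can be extracted for a quick before-and-after comparison. This is a sketch that assumes the usual `kubectl describe nodes` table layout, where the memory row reads like `memory  1140Mi (14%)  340Mi (4%)`. The function name `allocated_mem_request_pct` is hypothetical. Note that this value reflects memory requests reserved by the scheduler, not live usage.

```shell
# Sketch only: the function name is hypothetical.
# Reads `kubectl describe nodes` output on stdin and prints the memory
# request percentage from each "Allocated resources" table, one line per node.
allocated_mem_request_pct() {
  awk '/Allocated resources:/ { in_table = 1; next }
       in_table && $1 == "memory" {
         pct = $3
         gsub(/[()%]/, "", pct)   # "(14%)" -> "14"
         print pct
         in_table = 0
       }'
}
```

On a live cluster this could be used as `kubectl describe nodes | allocated_mem_request_pct`.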

Step 7: Retry the upgrade

After memory utilization is reduced and the cluster is stable, retry the Dell Automation Platform upgrade.

Step 8: Verify the upgrade

After the upgrade completes, run the following commands to verify Fusion pods and deployments:

kubectl get pods -A | grep -i fusion
kubectl get deploy -A | grep -i fusion
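
To spot problems in the pod listing without reading every row, the output can be filtered for Fusion pods that are not healthy. The sketch below assumes the default `kubectl get pods -A` column layout (NAMESPACE, NAME, READY, STATUS) and treats a pod as healthy when it is Completed, or Running with all containers ready. That definition and the function name `fusion_pods_not_ready` are assumptions, not part of the official procedure.

```shell
# Sketch only: the function name and the health criteria are assumptions.
# Reads `kubectl get pods -A` output on stdin, prints Fusion pods that are
# neither Completed nor Running with all containers ready, and returns a
# non-zero status if any such pod is found.
fusion_pods_not_ready() {
  awk 'NR > 1 && tolower($2) ~ /fusion/ {
         split($3, ready, "/")
         ok = ($4 == "Completed") || ($4 == "Running" && ready[1] == ready[2])
         if (!ok) { print $1 "/" $2, $3, $4; bad = 1 }
       }
       END { exit bad + 0 }'
}
```

On a live cluster, `kubectl get pods -A | fusion_pods_not_ready` prints nothing and returns success when every Fusion pod meets these criteria.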

Additional Information

Note:
  • The scaling is temporary. During the upgrade process, deployment configurations are reapplied from Helm charts or registry manifests. These configurations restore the intended replica counts automatically, so manual restoration is not required after the upgrade.
  • Because the Fusion replica count was reduced to 2 starting with Dell Automation Platform 1.2, this replica-count-related memory issue is not expected when upgrading from Dell Automation Platform 1.2 or later versions.

    EE-Ticket DAPEE-235
    Defect    DAP07A-2316, DAP07A-2300

Affected Products

NativeEdge

Products

NativeEdge Solutions

Article Properties

Article Number: 000464006
Article Type: How To
Last Modified: 14 May 2026
Version: 1