PowerFlex Manager: SO node upgrade fails in the update_clc_node_agent function
Summary: PowerFlex Manager (PFxM) fails to upgrade a Storage Only (SO) node during the update_clc_node_agent function, which aborts the upgrade operation when it attempts to place the SDS into protected maintenance mode (PMM).
Symptoms
Scenario
- Environment: High-availability (HA) CloudLink Center (CLC) appliances
- Issue: The SO resource group lists only one of the two expected CLC VMs.
- Symptom: The SO node upgrade fails, reporting that the node is not in PMM.
This scenario is the trigger point for the upgrade failure. Below is an example of how a healthy stack should appear when the update_clc_node_agent function runs:
Healthy stack example:
Log location: Job-afe400aa-d7fe-4897-9a04-fe08b924c4ae-0-1/deployment.logs
DEBUG [2024-12-16T11:20:36.199529] 13742: service_deployment.rb:5348:in `process_firmware_update': Processing firmware update after selecting resources
DEBUG [2024-12-16T11:20:36.200310] 13742: service_deployment.rb:5353:in `block in process_firmware_update': Processing firmware update on rackserver-xxxxxxx
INFO [2024-12-16T11:20:36.201536] 13742: service_deployment.rb:5363:in `block in process_firmware_update': Updating CLC Agent update on vmcl01-esxi08.dell.lab
DEBUG [2024-12-16T11:20:36.201933] 13742: service_deployment.rb:5365:in `block in process_firmware_update': Updating CLC Agent version on node svm-vmcl01-esxi08
DEBUG [2024-12-16T11:20:36.202379] 13742: type/base.rb:412:in `delegate': service_deployment.rb:5366:in `block in process_firmware_update' calling delegated method update_clc_node_agent on #
DEBUG [2024-12-16T11:20:36.204979] 13742: type/base.rb:412:in `delegate': cloudlinkcenter.rb:205:in `clc_agent_info' calling delegated method os_connect_ip on #
DEBUG [2024-12-16T11:20:38.054169] 13742: type/base.rb:412:in `delegate': cloudlinkcenter.rb:742:in `update_clc_node_agent' calling delegated method os_connect_ip on #
DEBUG [2024-12-16T11:20:38.760221] 13742: provider/cloudlink/cloudlinkcenter.rb:747:in `update_clc_node_agent': clc-10.10.30.20: CLC Server and agent are running on same version 7.1 (build 140)
INFO [2024-12-16T11:20:38.760840] 13742: service_deployment.rb:5367:in `block in process_firmware_update': Competed CLC agent update on vmcl01-esxi08.dell.lab
In contrast, the unhealthy stack shows the following error: NoMethodError: undefined method `[]' for nil:NilClass
Log location: Job-afe400aa-d7fe-4897-9a04-fe08b924c4ae-0-1/deployment.logs
DEBUG [2024-12-19T13:35:48.462150] 19552: service_deployment.rb:5348:in `process_firmware_update': Processing firmware update after selecting resources
DEBUG [2024-12-19T13:35:48.462349] 19552: service_deployment.rb:5353:in `block in process_firmware_update': Processing firmware update on rackserver-xxxxxxx
INFO [2024-12-19T13:35:48.463044] 19552: service_deployment.rb:5363:in `block in process_firmware_update': Updating CLC Agent update on PFSON04
DEBUG [2024-12-19T13:35:48.463276] 19552: service_deployment.rb:5365:in `block in process_firmware_update': Updating CLC Agent version on node PFSON04
DEBUG [2024-12-19T13:35:48.463622] 19552: type/base.rb:412:in `delegate': service_deployment.rb:5366:in `block in process_firmware_update' calling delegated method update_clc_node_agent on #
DEBUG [2024-12-19T13:35:48.466045] 19552: type/base.rb:412:in `delegate': cloudlinkcenter.rb:205:in `clc_agent_info' calling delegated method os_connect_ip on #
DEBUG [2024-12-19T13:35:51.089302] 19552: type/base.rb:412:in `delegate': cloudlinkcenter.rb:742:in `update_clc_node_agent' calling delegated method os_connect_ip on #
ERROR [2024-12-19T13:35:51.093230] 19552: service_deployment.rb:5535:in `process_firmware_update': Encountered an error during firmware update: NoMethodError: undefined method `[]' for nil:NilClass
In addition, the upgrade job logs capture the exact moment the task fails:
Log location: Job-afe400aa-d7fe-4897-9a04-fe08b924c4ae-0-1/deployment.logs
DEBUG [2024-12-19T13:37:23.210005] 19552: service_deployment.rb:6485:in `finalize_firmware_update': Update complete: false, in protected maintenance mode false
ERROR [2024-12-19T13:37:23.210184] 19552: service_deployment.rb:6491:in `finalize_firmware_update': Failed to update the server!
INFO [2024-12-19T13:37:23.210321] 19552: service_deployment.rb:6496:in `finalize_firmware_update': Firmware update status: Error
ERROR [2024-12-19T13:37:23.216294] 19552: service_deployment.rb:622:in `process': Firmware update failed for Job-afe400aa-d7fe-4897-9a04-fe08b924c4ae-0-2
ERROR [2024-12-19T13:37:23.216535] 19552: service_deployment.rb:623:in `process': ["/opt/asm-deployer/lib/asm/service_deployment.rb:6500:in `finalize_firmware_update'", "/opt/asm-deployer/lib/asm/service_deployment.rb:5549:in `process_firmware_update'", "/opt/asm-deployer/lib/asm/service_deployment.rb:479:in `process'", "/opt/asm-deployer/lib/asm.rb:228:in `block in process_deployment'"]
INFO [2024-12-19T13:37:23.216961] 19552: service_deployment.rb:625:in `process': Status: Error
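When reviewing a large deployment.logs file, it can help to split the text into individual entries and pull out the finalize_firmware_update status automatically. The following is a minimal sketch, not a Dell-provided tool. It assumes the entry layout seen in the excerpts above (level, bracketed timestamp, process ID, message). The WARN level and the names parse_entries and finalize_status are assumptions introduced for illustration.

```python
from __future__ import annotations

import re
from typing import NamedTuple, Optional, Tuple

# Entry header inferred from the excerpts: "LEVEL [timestamp] pid: ".
# WARN is an assumed level; only DEBUG, INFO and ERROR appear in the excerpts.
ENTRY_RE = re.compile(
    r"(DEBUG|INFO|WARN|ERROR) \[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)\] (\d+): "
)

# Message emitted by finalize_firmware_update in the excerpt above.
FINALIZE_RE = re.compile(
    r"Update complete: (true|false), in protected maintenance mode (true|false)"
)


class LogEntry(NamedTuple):
    level: str
    timestamp: str
    pid: str
    message: str


def parse_entries(text: str) -> list[LogEntry]:
    """Split raw log text into entries, even when several share one line."""
    headers = list(ENTRY_RE.finditer(text))
    entries = []
    for i, header in enumerate(headers):
        # The message runs until the next entry header (or end of text).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        message = text[header.end():end].strip()
        entries.append(LogEntry(header.group(1), header.group(2), header.group(3), message))
    return entries


def finalize_status(entries: list[LogEntry]) -> Optional[Tuple[bool, bool]]:
    """Return (update_complete, in_pmm) from the first matching entry, or None."""
    for entry in entries:
        match = FINALIZE_RE.search(entry.message)
        if match:
            return match.group(1) == "true", match.group(2) == "true"
    return None
```

For the failed job shown above, finalize_status would report that the update did not complete and that the node was not in protected maintenance mode, which matches the symptom described in this article.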
Impact
SO nodes cannot be upgraded.
Cause
The logs indicate that PowerFlex Manager cannot proceed with the update_clc_node_agent task because it cannot identify the correct primary among the two CLC appliances. This is shown in deployment.logs on the following error line:
ERROR [2024-12-19T13:35:51.093230] 19552: service_deployment.rb:5535:in `process_firmware_update': Encountered an error during firmware update: NoMethodError: undefined method `[]' for nil:NilClass
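To check whether a given deployment.logs file shows this failure signature, one option is to look for a NoMethodError on nil:NilClass logged immediately after a delegated call made from update_clc_node_agent. The sketch below is illustrative only. The function name has_clc_nil_failure and the WARN level are assumptions, and the check relies on the ordering seen in the excerpts in this article rather than on any documented log format.

```python
import re

# Zero-width split point before each entry header such as "ERROR [2024-12-19T...".
# WARN is an assumed level; it does not appear in the excerpts.
ENTRY_START = re.compile(r"(?=(?:DEBUG|INFO|WARN|ERROR) \[\d{4}-\d{2}-\d{2}T)")


def has_clc_nil_failure(text: str) -> bool:
    """Return True if an ERROR entry with NoMethodError on nil:NilClass
    directly follows an entry that mentions update_clc_node_agent."""
    entries = [part.strip() for part in ENTRY_START.split(text) if part.strip()]
    # Walk consecutive pairs of entries: (previous, current).
    for previous, current in zip(entries, entries[1:]):
        if (
            current.startswith("ERROR")
            and "NoMethodError" in current
            and "nil:NilClass" in current
            and "update_clc_node_agent" in previous
        ):
            return True
    return False
```

A typical use would be to read the file contents (for example with pathlib.Path(...).read_text()) and pass the text to has_clc_nil_failure. A True result suggests the condition described here but does not replace reviewing the service's CLC inventory.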
Resolution
- Run an Update Service Details action
  - Start the Update Service Details action on the affected service.
- Verify the inventory summary in the wizard
  - During the process, the wizard should display an inventory summary indicating that one CLC appliance is being removed and another is being added.
  - This confirms that the CLC currently listed is not the primary and that the appliance being added is the correct primary.
- Complete the Update Service Details process
  - Complete the Update Service Details action as directed by the wizard.
- Retry the upgrade
  - Run the upgrade again. It should now proceed without errors.
Affected version
PowerFlex Manager 3.x