PowerStore: Mapping an NVMeoF volume may lead to service disruption on multi-appliance clusters

Summary: Mapping NVMeoF volumes on a multi-appliance cluster may lead to service disruption for the appliance on which the volume is created.

Symptoms

Mapping NVMeoF volumes on a multi-appliance cluster may lead to service disruption for the appliance on which the volume is created. The issue can occur only on the second and subsequent appliances in the cluster; it does not occur on the first appliance.

 

Environment:

  • Multi-appliance cluster
  • Hosts are connected over NVMe/FC or NVMe/TCP.
  • Either (a) multiple add-appliance operations failed or (b) multiple remove-appliance operations were performed.

 

Symptoms:

  • The node may unexpectedly reboot.
  • If both nodes reboot, a service disruption may occur.

 

Cause

  • NVMeoF (NVMe/FC or NVMe/TCP) provides a basic mechanism called Asymmetric Namespace Access (ANA).
    ANA applies to appliances where a volume's access characteristics may differ between NVMe controllers.
    Example: Volume-1 on Node-A may be optimized while Volume-1 on Node-B is non-optimized.
  • The concept is similar to ALUA with Target Port Groups (TPG):
    Each node is assigned a unique TPG ID that distinguishes the state of each node (which one is optimized and which one is non-optimized).
  • With NVMe-oF on PowerStore, each appliance has several ANA groups:
    • ANA Group #1 - Used for volume migration between appliances (the group ID is 1 across the cluster)
    • ANA Group #X - Used to describe volumes where Node-A is optimized and Node-B is non-optimized
    • ANA Group #Y - Used to describe volumes where Node-A is non-optimized and Node-B is optimized
    • ANA Group #Z (Future Use) - Used to describe volumes where Node-A and Node-B are optimized (Active/Active)
  • When adding an appliance, Control-Path uses a special sequence number to determine the Target Port Group (TPG) ID to create.
    This sequence only ever increments, even when the add-appliance operation fails, so it can grow quite large if the operation fails several times.
  • Due to a software issue, the maximum ANA Group ID is limited, while Control-Path places no limit on the IDs it generates.
  • When mapping a volume to an NVMe host, the volume is classified into the correct ANA group; the ANA group is derived from the TPG ID of the node that owns the volume.
  • If the derived ANA Group ID exceeds the limit, the mapping operation may cause a software module failure, which in turn may cause a node reboot.
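
The failure chain described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not PowerStore code: the class and function names, the limit of 16, the "two TPG IDs consumed per add attempt" behavior, and the "ANA Group ID = TPG ID + 1" derivation are all hypothetical values chosen for illustration. The article does not state the real limit or the real derivation.

```python
# Toy model of the ID exhaustion described in this article.
# All names and numbers are hypothetical; the real values are not published here.

MAX_ANA_GROUP_ID = 16  # assumed limit on the ANA Group ID


class AnaGroupIdOverflow(Exception):
    """Raised when a derived ANA Group ID exceeds the assumed limit."""


class ControlPath:
    """Models the sequence number that only ever increments."""

    def __init__(self):
        self._sequence = 0

    def _next_tpg_id(self):
        self._sequence += 1
        return self._sequence

    def add_appliance(self, succeeds):
        # Assumption: one TPG ID per node (Node-A, Node-B) is consumed on
        # every attempt, and a failed attempt does not give the IDs back.
        tpg_ids = (self._next_tpg_id(), self._next_tpg_id())
        return tpg_ids if succeeds else None


def ana_group_for(tpg_id):
    """Derive the ANA Group ID for a node from its TPG ID (assumed rule)."""
    # Assumption: group 1 is reserved for migration, so offset by one.
    ana_group_id = tpg_id + 1
    if ana_group_id > MAX_ANA_GROUP_ID:
        raise AnaGroupIdOverflow(
            f"ANA Group ID {ana_group_id} exceeds limit {MAX_ANA_GROUP_ID}"
        )
    return ana_group_id


if __name__ == "__main__":
    cp = ControlPath()
    first = cp.add_appliance(succeeds=True)
    print("appliance 1 ANA groups:", [ana_group_for(t) for t in first])

    for _ in range(7):  # repeated failed add-appliance attempts
        cp.add_appliance(succeeds=False)

    second = cp.add_appliance(succeeds=True)
    try:
        ana_group_for(second[0])
    except AnaGroupIdOverflow as err:
        print("mapping on appliance 2 would fail:", err)
```

In this model the first appliance always receives small IDs, which is consistent with the statement that the issue does not occur on the first appliance. Only appliances added after several failed attempts receive TPG IDs large enough to push the derived ANA Group ID past the limit.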

 

Resolution

This issue is fixed in PowerStoreOS 4.0.0.

 

Workaround

  • Escalate to Global Services for assistance and, after recovery, plan to upgrade to PowerStoreOS 4.0.0. Reference this KB article for expedited attention.

 

Affected Products

PowerStore
Article Properties
Article Number: 000216639
Article Type: Solution
Last Modified: 28 May 2024
Version: 3