PowerStore: Mapping an NVMeoF volume may lead to service disruption on multi-appliance clusters

Summary: Mapping NVMeoF volumes on a multi-appliance cluster may lead to service disruption for the appliance on which the volume is created.


Symptoms

Mapping NVMeoF volumes on a multi-appliance cluster may lead to service disruption for the appliance on which the volume is created. This may occur only on the second and subsequent appliances (appliance #2 and above); it does not occur on the first appliance.

 

Environment:

  • Multi-appliance cluster
  • Hosts are connected over NVMe/FC or NVMe/TCP.
  • Either (a) multiple add-appliance operations failed or (b) multiple remove-appliance operations were performed.

 

Symptoms:

  • The node may unexpectedly reboot.
  • If both nodes reboot, a service disruption may occur.

 

Cause

  • NVMeoF (NVMe/FC or NVMe/TCP) provides a basic mechanism, asymmetric namespace access (ANA), for appliances where volume access characteristics may differ between NVMe controllers.
    Example: Volume-1 on Node-A may be optimized while Volume-1 on Node-B is non-optimized.
  • The concept is similar to ALUA with Target Port Groups (TPG):
    Each node is assigned a unique TPG ID to distinguish between the states of the nodes (which is optimized and which is non-optimized).
  • With NVMeoF on PowerStore, each appliance has several ANA groups:
    • ANA Group #1 - Used for volume migration between appliances (the group ID is 1 across the cluster)
    • ANA Group #X - Used to describe volumes where Node-A is optimized and Node-B is non-optimized
    • ANA Group #Y - Used to describe volumes where Node-A is non-optimized and Node-B is optimized
    • ANA Group #Z (Future Use) - Used to describe volumes where Node-A and Node-B are optimized (Active/Active)
  • When adding an appliance, Control-Path uses a special sequence number to determine the target port group ID to create.
    This sequence only increments, even when the add-appliance operation fails. The sequence number can become quite large if the add-appliance operation fails several times.
  • Due to a software issue, there is a limit on the maximum ANA Group ID, while the Control-Path sequence has no limit.
  • When mapping a volume to an NVMe host, the volume is classified into the correct ANA group; the ANA group is derived from the TPG ID of the node that owns the volume.
  • The mapping operation may lead to a software module failure, which in turn may lead to a node reboot.
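
The interaction between the sequence number and the ANA Group ID limit can be illustrated with a small toy model. This is only a sketch under stated assumptions, not PowerStore code: the class ControlPathModel, the function ana_group_in_range, the limit value of 16, the starting sequence value of 2, and the rule that the ANA Group ID equals the TPG ID are all hypothetical and chosen purely for illustration. The article does not state the real limit or the real derivation.

```python
import itertools

# Hypothetical ceiling; the article does not state the actual maximum ANA Group ID.
MAX_ANA_GROUP_ID = 16


class ControlPathModel:
    """Toy model of a sequence that only increments, even on failed adds."""

    def __init__(self):
        # Assumption: start at 2 because ANA Group #1 is used for migration.
        self._sequence = itertools.count(start=2)
        self.tpg_ids = {}  # (appliance, node) -> TPG ID

    def add_appliance(self, name, succeeds=True):
        # IDs are consumed before the outcome is known, so failures leave gaps.
        node_a, node_b = next(self._sequence), next(self._sequence)
        if succeeds:
            self.tpg_ids[(name, "A")] = node_a
            self.tpg_ids[(name, "B")] = node_b
        return succeeds


def ana_group_in_range(tpg_id):
    """Assumption: the ANA Group ID equals the owning node's TPG ID."""
    return tpg_id <= MAX_ANA_GROUP_ID


model = ControlPathModel()
model.add_appliance("appliance-1")
for _ in range(10):
    # Ten failed attempts still advance the sequence.
    model.add_appliance("appliance-2", succeeds=False)
model.add_appliance("appliance-2")

for (appliance, node), tpg_id in model.tpg_ids.items():
    print(appliance, node, tpg_id, ana_group_in_range(tpg_id))
```

In this model the first appliance always receives the lowest IDs and stays within range, while an appliance added after repeated failures receives IDs beyond the assumed limit. This mirrors why the issue is reported only on appliance #2 and above.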

 

Resolution

This issue is fixed in PowerStoreOS 4.0.0.

 

Workaround

  • Escalate to Global Services for assistance. After recovery, plan to upgrade to PowerStoreOS 4.0.0. Reference this KB article for expedited attention.

 

Affected Products

PowerStore
Article Properties
Article Number: 000216639
Article Type: Solution
Last Modified: 28 May 2024
Version: 3