PowerFlex: SDS Devices Missing Data After Reboot in AWS or Azure Cloud Environment

Summary: SDS devices are missing data after a reboot.

Symptoms

After a hard OS reboot of one SDS, the SDS devices on that host no longer contain data or device signatures.

The PowerFlex system is running inside a cloud environment (AWS or Azure).

 - The SDS node undergoes a hard reboot (intentional or not).

 - When the node comes back up, the SDS devices are present, but all of them move to an error state when the SDS rejoins the PowerFlex system.

 - Devices from the rebooted SDS remain in an error state that cannot be cleared.

 - The devices are missing their SDS device signatures (from the SDS trc log):

2023/02/27 06:42:26.477168 7f669822fdb0:mosAsyncIO_OpenFileEx:00465: Opened file /dev/nvme2n1 (fd 18), maxInflight 8, maxIoSize 1310720, ptr 0x7f66a8003bf0
2023/02/27 06:42:26.477313 7f669822fdb0:phyDevMap_ThreadedReadDevId:00732: ERROR: Read device ID of /dev/nvme2n1 took 10 milli rc=INVALID_DEVICE_HEADER_SIGNATURE  
 


 - If CloudLink is running, the devices are reported as "unencrypted" and "data_raw" (from svmd.log):

{"mpoint":"/dev/nvme0n1","status":"unencrypted","label":"","type":"data_raw","size":1788, "drives" : ["/dev/nvme0n1"]},"9486948577258625":
{"mpoint":"/dev/nvme1n1","status":"unencrypted","label":"","type":"data_raw","size":1788, "drives" : ["/dev/nvme1n1"]},"9486948577268625":
{"mpoint":"/dev/nvme2n1","status":"unencrypted","label":"","type":"data_raw","size":1788, "drives" : ["/dev/nvme2n1"]},"9486948577278625":
{"mpoint":"/dev/nvme3n1","status":"unencrypted","label":"","type":"data_raw","size":1788, "drives" : ["/dev/nvme3n1"]},"9486948577288625":
{"mpoint":"/dev/nvme4n1","status":"unencrypted","label":"","type":"data_raw","size":1788, "drives" : ["/dev/nvme4n1"]},"9486948577298625":
{"mpoint":"/dev/nvme5n1","status":"unencrypted","label":"","type":"data_raw","size":1788, "drives" : ["/dev/nvme5n1"]},"9491767425":
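
Both log signatures above can be checked for mechanically. The following is a minimal Python sketch, not a Dell-provided tool: the function names and regular expressions are illustrative assumptions written only against the sample lines in this article, and real logs may differ in format. It takes lines from the SDS trc log and from svmd.log and returns the device paths that show each symptom.

```python
import re

# Matches the failed device-ID read shown in the SDS trc log excerpt.
# Assumption: the message layout is the same as in the sample line.
SIGNATURE_ERROR = re.compile(
    r"Read device ID of (?P<device>\S+) took .* rc=INVALID_DEVICE_HEADER_SIGNATURE"
)

# Matches one device entry in the svmd.log excerpt.
# Assumption: "mpoint", "status", and "type" appear in this order, as in the sample.
SVMD_ENTRY = re.compile(
    r'"mpoint":"(?P<device>[^"]+)","status":"(?P<status>[^"]+)".*?"type":"(?P<type>[^"]+)"'
)


def devices_missing_signature(trc_lines):
    """Return device paths with an invalid header signature, in first-seen order."""
    found = []
    for line in trc_lines:
        match = SIGNATURE_ERROR.search(line)
        if match and match.group("device") not in found:
            found.append(match.group("device"))
    return found


def unencrypted_raw_devices(svmd_lines):
    """Return device paths reported as both "unencrypted" and "data_raw"."""
    found = []
    for line in svmd_lines:
        for match in SVMD_ENTRY.finditer(line):
            is_raw = (
                match.group("status") == "unencrypted"
                and match.group("type") == "data_raw"
            )
            if is_raw and match.group("device") not in found:
                found.append(match.group("device"))
    return found
```

For example, `devices_missing_signature(open(trc_path))` returns the affected device paths, where `trc_path` is a placeholder for the location of the trc log on the SDS.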
 

Impact

If only one node reboots, data becomes degraded and a rebuild follows.

If more than one node reboots at or near the same time, data loss is possible.

Cause

When PowerFlex runs inside a cloud environment such as AWS or Azure and NVMe devices are chosen as the underlying storage, those devices are ephemeral: their data is lost on a hard reboot or when the instance is powered down. The AWS documentation states:

 - NVMe SSD drives are ephemeral storage. When instances are powered down, the NVMe devices are wiped, in accordance with AWS design.
 - NVMe SSD drives are for extreme performance. Data does not persist following either a planned or unplanned shutdown. A backup solution is recommended.
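
Since the outcome depends on whether a device is ephemeral, it can help to check what kind of NVMe device a host exposes before relying on it. The sketch below is illustrative only and is not part of this article's guidance. It assumes a Linux AWS instance where the model string under `/sys/block/<device>/device/model` distinguishes instance store volumes from EBS volumes. The two model strings and all function names are assumptions to confirm on your own instances, and Azure is not covered.

```python
from pathlib import Path

# Assumed model strings reported by AWS NVMe devices; verify on your instances.
INSTANCE_STORE_MODEL = "Amazon EC2 NVMe Instance Storage"
EBS_MODEL = "Amazon Elastic Block Store"


def classify_model(model):
    """Map an NVMe model string to 'ephemeral', 'persistent', or 'unknown'."""
    model = model.strip()  # sysfs pads the model string with trailing spaces
    if model == INSTANCE_STORE_MODEL:
        return "ephemeral"
    if model == EBS_MODEL:
        return "persistent"
    return "unknown"


def classify_nvme_devices(sys_block="/sys/block"):
    """Return {device_name: classification} for each nvme* block device found."""
    result = {}
    for dev in sorted(Path(sys_block).glob("nvme*")):
        model_file = dev / "device" / "model"
        if model_file.is_file():
            result[dev.name] = classify_model(model_file.read_text())
    return result
```

Calling `classify_nvme_devices()` on the SDS host returns a dictionary keyed by device name. Any device classified as `ephemeral` would be expected to show the behavior described in this article.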

Resolution

As the documentation states, this behavior is working as designed. Make sure that a backup solution is in place.

 

Impacted Versions

PowerFlex 3.6.x
PowerFlex 4.x

Affected Products

PowerFlex rack, ScaleIO
Article Properties
Article Number: 000211253
Article Type: Solution
Last Modified: 06 Jan 2026
Version: 1