VPlex: Logging volume showing critical failure due to disk being removed from the backend storage array
Summary: This article describes the issue where a disk is removed from the back-end storage array while it is still in use on the VPlex as a logging volume, and provides workaround steps to resolve the issue.
Symptoms
1. A logging volume is showing as critical-failure in the health check output.
VPlexcli:/> health-check
Product Version: 5.3.0.03.00.04
Product Type: Metro
WAN Connectivity Type: FC
Hardware Type: VS2
Cluster Size: 1 engines
Clusters:
---------
Cluster Cluster Oper Health Connected Expelled Local-com
Name ID State State
--------- ------- ----- ------------- --------- -------- ---------
cluster-1 1 ok ok True False ok
cluster-2 2 ok minor-failure True False ok
cluster-2 Transition/Health Indications:
105 unhealthy Devices or storage-volumes
storage-volume unreachable
Meta Data:
----------
Cluster Volume Volume Oper Health Active
Name Name Type State State
--------- -------------------------------------- -------------- ----- ---------------- ------
cluster-1 VPLEX_DC1_meta meta-volume ok ok True
cluster-1 logging_vplex_dc2_log logging-volume ok ok -
cluster-1 VPLEX_DC1_meta_backup_2016Jun15_235911 meta-volume ok ok False
cluster-1 VPLEX_DC1_meta_backup_2016Jun14_235911 meta-volume ok ok False
cluster-1 LV_CLUSTER1_LOG1 logging-volume ok ok -
cluster-2 logging_volume_vplex logging-volume error critical-failure -
cluster-2 VPLEX_DC2_META meta-volume ok ok True
cluster-2 LV_CLUSTER2_LOG1 logging-volume ok ok -
cluster-2 VPLEX_DC2_META_backup_2016Jun15_235907 meta-volume ok ok False
cluster-2 VPLEX_DC2_META_backup_2016Jun14_235909 meta-volume ok ok False
Storage:
--------
Cluster Total Unhealthy Total Unhealthy Total Unhealthy No Not visible With
Name Storage Storage Virtual Virtual Dist Dist Dual from Unsupported
Volumes Volumes Volumes Volumes Devs Devs Paths All Dirs # of Paths
--------- ------- --------- ------- --------- ----- --------- ----- ----------- -----------
cluster-1 59 0 52 51 51 51 0 0 0
cluster-2 57 1 51 51 51 51 0 0 0
2. Checking the use hierarchy of this logging volume shows that the storage it is built on has been removed.
VPlexcli:/clusters/cluster-2/system-volumes/vplex_dc2_log_vol/components> show-use-hierarchy /clusters/cluster-2/system-volumes/vplex_dc2_log_vol
logging-volume: vplex_dc2_log_vol (10G, raid-0, critical-failure, cluster-2)
extent: extent_vplex_DC2_LOG_bad_1 (10G, critical-failure)
storage-volume: vplex_DC2_LOG_bad (10G, critical-failure)
------> NO Storage array information
Cause
A back-end disk (storage volume) was removed from the storage array while it was still in use on the VPlex as a logging volume, leaving the logging volume with no underlying storage.
Resolution
Follow the workaround steps below to resolve this issue:
1. Check the logging volume context to see if the distributed devices are still set to this logging volume.
VPlexcli:/clusters/cluster-2/system-volumes> ll
Name Volume Type Operational Health State Active Ready Geometry Component Block Block Capacity Slots
-------------------------------------- -------------- Status ---------------- ------ ----- -------- Count Count Size -------- -----
-------------------------------------- -------------- ----------- ---------------- ------ ----- -------- --------- -------- ----- -------- -----
LV_CLUSTER2_LOG1_vol logging-volume ok ok - - raid-1 1 2621440 4K 10G -
VPLEX_DC2_META meta-volume ok ok true true raid-1 2 22019840 4K 84G 32000
VPLEX_DC2_META_backup_2016Jun14_235909 meta-volume ok ok false true raid-1 1 22019840 4K 84G 64000
VPLEX_DC2_META_backup_2016Jun15_235907 meta-volume ok ok false true raid-1 1 22019840 4K 84G 64000
vplex_dc2_log_vol logging-volume error critical-failure - - raid-0 1 2621440 4K 10G -
VPlexcli:/clusters/cluster-2/system-volumes> cd vplex_dc2_log_vol/segments
VPlexcli:/clusters/cluster-2/system-volumes/vplex_dc2_log_vol/segments> ll
Name Starting Block Use
---------------------------------------------------- Block Count ------------------------------------------
---------------------------------------------------- -------- ------- ------------------------------------------
allocated-device_DD1 0 8192 allocated for device_DD1
allocated-device_DD2 0 1600 allocated for device_DD2
allocated-device_DD3 0 800 allocated for device_DD3
2. Create a new logging-volume and ensure the hierarchy is healthy (an example create command is sketched after the hierarchy output below)
VPlexcli:/clusters/cluster-2/system-volumes/LV_CLUSTER2_LOG1_vol/components> show-use-hierarchy /clusters/cluster-2/system-volumes/LV_CLUSTER2_LOG1_vol
logging-volume: LV_CLUSTER2_LOG1_vol (10G, raid-1, cluster-2)
extent: extent_CLUSTER2_LOG1 (10G)
storage-volume: CLUSTER2_LOG1 (10G)
logical-unit: VPD83T3:60060160690037000f5263e23732e611
storage-array:<ARRAY NAME>
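The create command itself is not captured in this transcript. A minimal sketch of creating the replacement logging volume, assuming a healthy storage volume has already been claimed and an extent carved from it; the name, geometry, and extent below are taken from this example's hierarchy output and should be adjusted to the environment:
VPlexcli:/clusters/cluster-2/system-volumes> logging-volume create --name LV_CLUSTER2_LOG1 --geometry raid-1 --extents extent_CLUSTER2_LOG1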
3. Check that the new logging volume is showing free space
VPlexcli:/clusters/cluster-2/system-volumes/LV_CLUSTER2_LOG1_vol/segments> ll
Name Starting Block Block Count Use
------ -------------- ----------- ----
free-0 0 2621440 free
4. Navigate to the distributed-devices context
VPlexcli:/> cd distributed-storage/distributed-devices/
5. Set the new logging-volume as the logging-volume for the cluster
VPlexcli:/distributed-storage/distributed-devices> set-log --logging-volumes LV_CLUSTER2_LOG1_vol --distributed-devices *
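If only specific distributed devices need to be re-pointed, they can be listed by name instead of using the wildcard. A sketch using device_DD1 from the segment listing in step 1 (an example name, adjust to the environment):
VPlexcli:/distributed-storage/distributed-devices> set-log --distributed-devices device_DD1 --logging-volumes LV_CLUSTER2_LOG1_vol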
6. Confirm the new logging volume now contains the distributed-device information
VPlexcli:/clusters/cluster-2/system-volumes/LV_CLUSTER2_LOG1_vol/segments> ll
Name Starting Block Use
---------------------------------------------------- Block Count ------------------------------------------
---------------------------------------------------- -------- ------- ------------------------------------------
allocated-device_DD1 0 8192 allocated for device_DD1
allocated-device_DD2 0 1600 allocated for device_DD2
allocated-device_DD3 0 800 allocated for device_DD3
7. Confirm the old logging volume is now showing only free space
VPlexcli:/clusters/cluster-2/system-volumes/vplex_dc2_log_vol/segments> ll
Name Starting Block Block Count Use
------ -------------- ----------- ----
free-0 0 24578 free
8. Navigate to the system-volumes context and confirm there is a healthy logging volume
VPlexcli:/clusters/cluster-2/system-volumes> ll
Name Volume Type Operational Health State Active Ready Geometry Component Block Block Capacity Slots
-------------------------------------- -------------- Status ---------------- ------ ----- -------- Count Count Size -------- -----
-------------------------------------- -------------- ----------- ---------------- ------ ----- -------- --------- -------- ----- -------- -----
LV_CLUSTER2_LOG1_vol logging-volume ok ok - - raid-1 1 2621440 4K 10G -
VPLEX_DC2_META meta-volume ok ok true true raid-1 2 22019840 4K 84G 32000
VPLEX_DC2_META_backup_2016Jun14_235909 meta-volume ok ok false true raid-1 1 22019840 4K 84G 64000
VPLEX_DC2_META_backup_2016Jun15_235907 meta-volume ok ok false true raid-1 1 22019840 4K 84G 64000
vplex_dc2_log_vol logging-volume error critical-failure - - raid-0 1 2621440 4K 10G -
9. Destroy the old logging volume that is showing critical-failure
VPlexcli:/clusters/cluster-2/system-volumes> logging-volume destroy --logging-volume vplex_dc2_log_vol
10. Destroy the extent the logging-volume was created on
VPlexcli:/clusters/cluster-2/storage-elements/extents> extent destroy --extents extent_vplex_DC2_LOG_bad_1
WARNING: The following items will be destroyed:
Context
-----------------------------------------------------------------------
/clusters/cluster-2/storage-elements/extents/extent_vplex_DC2_LOG_bad_1
Do you wish to proceed? (Yes/No) Yes
Destroyed 1 out of 1 targeted extents.
11. Navigate to the storage-volume context and unclaim the storage-volume the logging-volume was created on
VPlexcli:/clusters/cluster-2/storage-elements/storage-volumes/vplex_DC2_LOG_bad> unclaim
Unclaimed 1 of 1 storage-volumes.
12. Confirm the health check now returns clean
VPlexcli:/> health-check
Product Version: 5.3.0.03.00.04
Product Type: Metro
WAN Connectivity Type: FC
Hardware Type: VS2
Cluster Size: 1 engines
Clusters:
---------
Cluster Cluster Oper Health Connected Expelled Local-com
Name ID State State
--------- ------- ----- ------ --------- -------- ---------
cluster-1 1 ok ok True False ok
cluster-2 2 ok ok True False ok
Meta Data:
----------
Cluster Volume Volume Oper Health Active
Name Name Type State State
--------- -------------------------------------- -------------- ----- ------ ------
cluster-1 VPLEX_DC1_meta meta-volume ok ok True
cluster-1 logging_vplex_dc2_log logging-volume ok ok -
cluster-1 VPLEX_DC1_meta_backup_2016Jun15_235911 meta-volume ok ok False
cluster-1 VPLEX_DC1_meta_backup_2016Jun14_235911 meta-volume ok ok False
cluster-1 LV_CLUSTER1_LOG1 logging-volume ok ok -
cluster-2 VPLEX_DC2_META meta-volume ok ok True
cluster-2 LV_CLUSTER2_LOG1 logging-volume ok ok -
cluster-2 VPLEX_DC2_META_backup_2016Jun15_235907 meta-volume ok ok False
cluster-2 VPLEX_DC2_META_backup_2016Jun14_235909 meta-volume ok ok False
Storage:
--------
Cluster Total Unhealthy Total Unhealthy Total Unhealthy No Not visible With
Name Storage Storage Virtual Virtual Dist Dist Dual from Unsupported
Volumes Volumes Volumes Volumes Devs Devs Paths All Dirs # of Paths
--------- ------- --------- ------- --------- ----- --------- ----- ----------- -----------
cluster-1 59 0 52 0 51 0 0 0 0
cluster-2 56 0 51 0 51 0 0 0 0
13. If the storage array is still reporting a degraded connectivity-status, a stale logical unit may remain for the removed disk:
a. Check the storage-array context for the degraded connectivity-status
VPlexcli:/> ll **/storage-arrays/*
/clusters/cluster-2/storage-elements/storage-arrays/<ARRAY NAME>:
Attributes:
Name Value
------------------- ----------------------------------------------------------
auto-switch true
connectivity-status degraded
controllers [<ARRAY NAME>.SPA, <ARRAY NAME>.SPB]
logical-unit-count 57
ports [0x5006016108602147, 0x5006016408602147,
0x5006016908602147, 0x5006016c08602147]
b. Check the logical unit context for logical units showing in error
VPlexcli:/clusters/cluster-2/storage-elements/storage-arrays/<ARRAY NAME>/logical-units> ll
Name Connectivity Active/AAO Passive/AAN Visibility LUNs ALUA Support
---------------------------------------- Status Controllers Controllers ---------- ------------------ -----------------
---------------------------------------- ------------ ------------------ ------------------ ---------- ------------------ -----------------
VPD83T3:6006016076003700b743afe458dbe311 error <ARRAY NAME>.SPA <ARRAY NAME>.SPB none implicit-explicit
c. Navigate to the context of the logical-unit showing in error and confirm there is no underlying storage associated with it.
VPlexcli:/clusters/cluster-2/storage-elements/storage-arrays/<ARRAY NAME>/logical-units/VPD83T3:6006016076003700b743afe458dbe311> ll
Name Value
---------------------- --------------------
active-aao-controller [<ARRAY NAME>.SPA]
active-aao-visibility []
alua-support implicit-explicit
connectivity-status error
luns [] <-----------No Underlying Storage
passive-aan-controller [<ARRAY NAME>.SPB]
passive-aan-visibility []
storage-volume -
visibility none
d. Forget this logical-unit
VPlexcli:/clusters/cluster-2/storage-elements/storage-arrays/<ARRAY NAME>/logical-units/VPD83T3:6006016076003700b743afe458dbe311> forget
1 of 1 logical-units were forgotten.
e. Confirm the storage-array connectivity status is now showing as ok
VPlexcli:/> ll **/storage-arrays
/clusters/cluster-2/storage-elements/storage-arrays:
Name Connectivity Auto Ports Logical
--------------------------- Status Switch ------------------- Unit
--------------------------- ------------ ------ ------------------- Count
--------------------------- ------------ ------ ------------------- -------
<ARRAY NAME> ok true 0x5006016108602147, 56
0x5006016408602147,
0x5006016908602147,
0x5006016c08602147