Unsolved

November 16th, 2015 10:00

Can't seem to drop failed pool

I have an older CX4 with no support.

We finally lost enough drives out of our pool to make it fail.

Now, I tried to drop the pool, and it's managed to "unbind" several of the disks, but has been going for several hours now and I see no progress in the delete operation.

I have already split the array into more than 1 pool, so that once a pool fails I have been able to shuffle the luns around to other pools, then drop and recreate the pool as a smaller pool, and then move the luns back.

All this shuffling of the luns is probably driving a higher failure rate, but I don't know of a better approach other than reverting to a bunch of smaller raid groups instead of pools.

Anyhow, back to the main issue, how can I get this "drop" to finish? It appears to be hung.

4.5K Posts

November 17th, 2015 08:00

When you say "drop", I'm assuming you mean you're trying to destroy the Pool.

To destroy a Pool, you must first destroy the LUNs in the Pool, which also means that the LUNs cannot be part of a Storage Group or part of a Mirror. But if you have a faulted Pool (the pool is off-line due to multiple disks failing), you will not be able to destroy the LUNs, which means you cannot destroy the Pool. You'll have to repair the Pool first.

If this array is out of contract, I'd recommend that you not use Pools, only Raid Groups - disk failures will only affect the raid group and not the whole Pool.

glen

222 Posts

November 17th, 2015 10:00

The Pool has 0 luns, is empty, but will not drop/destroy/delete.

I can post a pix if desired, but the Navisphere GUI is showing "Deleting" as the status for over 24 hours.

Of the set of disks, some are State=Enabled and others are State=Unbound, but none have changed in the past 24 hours.

It appears the "delete" operation is hung, and the array will not let me perform any operation on the pool; it says to wait for the array to complete, which it's not doing.

And yes, I am strongly thinking of moving away from pools to smaller raid-groups, probably 12+2 R6.

222 Posts

November 17th, 2015 11:00

[Attached screenshots: Nl9Z5Bg.png, vdQR1AV.png]

222 Posts

November 17th, 2015 12:00

[Attached screenshot: 071ZzsT.png]

4.5K Posts

November 18th, 2015 11:00

The error message in the screen cap shows that there was some type of operation running on the disks in the Pool. Run the following CLI command against the array; it should indicate what is occurring on the disks:

navicli -h spa storagepool -list


Below is an example of a Pool that had a disk fail and the new disk was still rebuilding the private LUNs. A LUN in a private state cannot be deleted until after it has finished rebuilding.


Current Operation:  Destroying

Current Operation State:  Failed

Current Operation Status:  The unbind operation has failed; the LUN may be private (0x40008053)    

Current Operation Percent Completed:  40

If you run the "getlun -state" command it should show if there are any LUNs that are rebuilding.
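If you want to check those status fields from a script rather than by eye, the sketch below pulls the "Current Operation" lines out of saved output. It uses only the field names from the sample above; the function names are made up for illustration, and real output contains many other fields that this ignores. It is a rough sketch, not a supported tool.

```python
# Sample text copied from the example output above.
SAMPLE = """\
Current Operation:  Destroying
Current Operation State:  Failed
Current Operation Status:  The unbind operation has failed; the LUN may be private (0x40008053)
Current Operation Percent Completed:  40
"""


def parse_pool_operation(text):
    """Collect the 'Current Operation...' fields into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; the value may contain other punctuation.
        key, sep, value = line.partition(":")
        if sep and key.strip().startswith("Current Operation"):
            fields[key.strip()] = value.strip()
    return fields


def operation_failed(fields):
    """True when the reported operation state is 'Failed' (hypothetical helper)."""
    return fields.get("Current Operation State") == "Failed"


op = parse_pool_operation(SAMPLE)
print(op["Current Operation"], "/", op["Current Operation State"])
print("Percent completed:", op["Current Operation Percent Completed"])
print("Failed:", operation_failed(op))
```

Saving the command output to a file on a schedule and running this against each copy would show whether the percent-completed value is moving at all.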

glen

222 Posts

November 20th, 2015 04:00

I did get the pool dropped after rebooting the SPs. It took over 24 hours to drop though.

I only had one HS for the pool, and 2 failed drives.

I've settled on using Raid6 due to the large drive sizes and large number of drives in the pool.

If I go with raid groups and metaluns instead of pools, I could configure the raid group as a 12+2 with one HS per tray. That way, one tray would be a single raid group.

The array is mostly filled with 1 TB and 2 TB SATA drives, so because they are slow and large, I feel Raid6 is my only option, as the possibility of a second drive failure during a rebuild is high.
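For the 12+2 plus one HS per tray layout mentioned above, the arithmetic works out roughly as in this sketch. It assumes a tray is exactly filled by 12 data + 2 parity + 1 hot spare (15 slots) and uses nominal drive sizes, ignoring formatting overhead and any reserved space, so treat the numbers as approximate. The function name is made up for illustration.

```python
def raid6_tray_layout(data_disks, parity_disks, hot_spares, drive_tb):
    """Nominal capacity for one tray holding a single RAID 6 group plus spares."""
    slots = data_disks + parity_disks + hot_spares
    usable_tb = data_disks * drive_tb  # only data disks contribute capacity
    return {
        "slots": slots,
        "usable_tb": usable_tb,
        "efficiency": data_disks / slots,  # usable share of all slots in the tray
    }


for size_tb in (1, 2):
    layout = raid6_tray_layout(data_disks=12, parity_disks=2, hot_spares=1, drive_tb=size_tb)
    print(f"{size_tb} TB drives: {layout['usable_tb']} TB usable "
          f"across {layout['slots']} slots ({layout['efficiency']:.0%})")
```

With these assumptions, each tray gives up three slots (two parity, one spare) and can lose two drives in the group before data is at risk.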
