December 14th, 2012 08:00

User LUNs on vault drives: impacts?

We've had an interesting fault which is still causing no end of pain.

Storage array: AX4-5F
Environment: VMware
Vault disks (1TB SATA): contain user LUNs.

We were doing an NDU to upgrade FLARE to 711.
The upgrade started on 12 Dec at 17:00 and completed at 19:08.
During the upgrade a vault disk went faulty and the write (WR) cache was disabled. There are user LUNs on the vault drives, so the hot spare was invoked. It is now rebuilding the user LUNs (3x500GB); it has already been running for 48 hours and still has 1x500GB left to do. The whole time, the disabled WR cache has been causing all the VMs no end of pain. We tried re-enabling the WR cache but the array won't let us, so we continue to suffer.
Additionally, there is of course the equalisation to complete as well before the WR cache becomes enabled.

1. Is there any WR cache 'force' command available ?
2. Would anyone like to hazard a guess how long this is likely to take ?
3. Is there anything else at all that we can do ?

We're just looking for some constructive input here, please. We are aware that this is the penalty we're paying for having user LUNs on the vault drives, but restating that won't help.

thanks.

1.4K Posts

December 16th, 2012 17:00

1. Is there any WR cache 'force' command available ?

No.

2. Would anyone like to hazard a guess how long this is likely to take ?

Apologies for the delayed response, but I reckon it should have completed by now. For future reference, see EMC CLARiiON Best Practices for Performance and Availability.

From EMC CLARiiON Best Practices for Performance and Availability:

[Attached image: Rebuild.JPG]

3. Is there anything else at all that we can do ?

I would try changing the rebuild priority to complete the process faster. If changing the rebuild priority impacts performance negatively, I would change it back to normal.

224 Posts

December 16th, 2012 22:00

Adding to ankit's point.

In CX-Series arrays, the first five drives in the base disk enclosure are used for several internal tasks.

We recommend binding a RAID group on the vault drives (such as a five-drive RAID 5) and a test LUN, which should not be in any Storage Group. A typical test LUN size is 10 GB, but the LUN can be of any capacity, up to and including all the available space on the vault drives.

Very heavy host I/O on these four drives results in increased response times for Navisphere commands and could interfere with this staging. Very heavy I/O could ultimately cause an NDU to time out (in which case FLARE stays at the revision level that it started from).

In short, the lab recommends not binding user LUNs on the vault drives. You can find more on this in Primus emc79630.

Regards,

Sheron Godfred.

44 Posts

December 17th, 2012 00:00

Thanks for your attempts, guys, but was it not clearly stated that this is an AX4, NOT a CX? The rebuild priority is therefore NOT an option: it is set to 'High' and cannot be changed.

Just for your info, it has not yet completed. It has now been going on for over 4 days. Everything is working as it should, just very slowly. It is currently on the second (of five) LUNs in the RAID group; it has completed the rebuild of that LUN and is now doing the equalization. One 500GB LUN took over 24 hours on the equalization alone.
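For anyone trying to guess how long this takes, here is a rough back-of-the-envelope sketch in Python based on the one figure above (500GB equalized in a little over 24 hours). The function names are made up for illustration, it assumes decimal units (1 GB = 1000 MB) and a constant rate, and since the equalization actually took more than 24 hours the computed rate is an upper bound, not a measured value.

```python
def effective_rate_mb_s(size_gb: float, hours: float) -> float:
    """Average throughput in MB/s, assuming decimal units (1 GB = 1000 MB)."""
    return size_gb * 1000 / (hours * 3600)


def estimate_hours(remaining_gb: float, rate_mb_s: float) -> float:
    """Hours needed to process remaining_gb at a constant rate_mb_s (an assumption)."""
    return remaining_gb * 1000 / rate_mb_s / 3600


# Observed in this thread: one 500GB LUN took over 24 hours to equalize.
rate = effective_rate_mb_s(500, 24)
print(f"Effective rate: at most about {rate:.1f} MB/s")

# Hypothetical input: 1000 GB still to be processed at the same rate.
print(f"Estimate for 1000 GB: at least {estimate_hours(1000, rate):.0f} hours")
```

At that sort of rate, every extra 500GB on the vault drives adds roughly another day per pass (rebuild, then equalization) while the write cache stays disabled, which is why LUN size matters so much here.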

My advice, regardless of what anyone tells you:

(a) do NOT use large luns as vaults.

(b) do not put any user luns on the vaults (even ones requiring low IOPs).

(c) disable 'HA vault cache' and just replace any faulty vault asap.

The pain at our site with this (on the VMs) has been really bad, and senior management will not forget easily.

1.4K Posts

December 17th, 2012 04:00

(a) do NOT use large luns as vaults.

(b) do not put any user luns on the vaults (even ones requiring low IOPs).

(c) disable 'HA vault cache' and just replace any faulty vault asap.

These are mentioned in the Best Practices but thank you for sharing!

247 Posts

December 19th, 2012 04:00

Andy, an AX4 is basically a CX3. We've got some CX3s in our company as well and ran into the same issue a couple of months ago. I agree with your advice: don't put any LUNs on the CX3 vault drives.

FYI, the CX4 generation and later will not disable write cache when a vault drive fails. So you could still store some data on those drives if you really wanted to...
