August 2nd, 2010 21:00

RAID stripe expansion - stuck at 0%

I have a CX-500 with FLARE 2.19.500.5.014 and NaviSphere 6.19.0.4.15

I have one DAE filled with 146GB FC disks using RAID5.  This is my original working LUN.

I have a second DAE filled with 146GB FC disks using RAID 5. This is the LUN I want to stripe expand into the original.

In NaviSphere, I went to the RAID Group, chose the original LUN, chose Expand, followed the Wizard, chose Stripe Expand, and the Wizard correctly found my second DAE so I selected that to add to my first LUN.

I believe I meet all the criteria to do a stripe expand, as the disks are the same size, the RAID is the same, the two LUNs are the same size, and they both sit on my SP-B. BTW, neither of these two DAEs holds the vault disks, which are on a separate enclosure.

The problem is that 4 days after starting the job, the stripe expansion shows 0% completed. Any ideas what I've done wrong?  Thanks.

59 Posts

August 2nd, 2010 21:00

Hi,

You probably have not done anything wrong, but given the age of your flare code it could be a flare bug.

If you have navicli installed can you run the following -

navicli -h metalun -list -state

This will give you the state of the metalun without the GUI interfering.

Please post the output here...

9 Posts

August 2nd, 2010 23:00

Hi Jim,

Thanks for the fast reply.  I have navicli installed, but when we typed in your suggested instruction

it returns "Invalid command line parameters"

I guess my version of the CLI is too old. I looked at the list of available commands and there is no "metalun" command.

Although I do understand it would be useful to see if the MetaLUN state = 'expanding'.

Since this CX500 is an older machine, we don't have an active maintenance contract for it, so getting a newer version of FLARE is impossible,

and I don't know if it's even possible to get a newer version of the navicli. That leaves me kind of stuck.

2.2K Posts

August 3rd, 2010 08:00

To find out what commands are supported with the version of navicli you are running use this:

navicli -h metalun help

This should return a list of supported metalun commands. Look under the -list section.

21 Posts

August 3rd, 2010 09:00

At times when Navisphere Manager does not report the correct figure, try using CLI commands; use the metalun -list command.

274.2K Posts

August 3rd, 2010 14:00

This is a FLARE issue, and you should upgrade FLARE 2.19 to patch level 45. Since you are unable to upgrade FLARE, as you stated in your previous message, recheck any LUN(s) in the RAID group to make sure none are trespassed. If any LUNs are trespassed, attempt to manually trespass them back to their default owner. You can do this by right-clicking on the LUN and selecting Trespass. Sometimes expansions are stuck because LUNs are trespassed or trespassing back and forth. Check the host failover settings to see if trespasses can be minimized.

Check the Event logs on each SP to see if any trespasses occurred on any LUN(s) in that RAID group right before the expansion started. Right-click on each SP and select View Events. To check for trespasses in Navisphere, go to Tools > Trespass; this will show you all LUNs that are trespassed on a particular storage system. Verify there are no faults currently on the storage system.

In addition, an expansion will kick off a defragmentation, which is part of the expansion, so it's possible the LUN(s) have not completed the defragmentation yet. See KB solutions emc134339 and emc178703.

Hope this helps...

9 Posts

August 4th, 2010 02:00

Thanks for everyone's input. I really appreciate the help. Unfortunately, I'm kind of a newbie when it comes to storage so I'm digesting your replies, and I have further questions.

First, my version of navicli does not have the metalun command at all, so it's not just an issue of not having all the parameters for the command.

For my primary LUN, the default SP was SP-A, but the current SP shows it to be SP-B.  For the new LUN that I want to expand into the first, both default and current are SP-B.

Now, judging from your feedback, this may be evidence of a Trespass issue. However, when I go to the dropdown menu Tools and click on the show Trespassed LUNs, my primary storage LUN is NOT listed there.

Say I wanted to switch the default SP of my primary LUN to SP-B just in case, so that there can be no issue of trespass. I can't currently do it because the option is greyed out.  Since I am stuck in the middle of the stripe expansion, the LUN shows up as a Private LUN, and I'm not able to change any of its parameters. Unfortunately, I don't know how to stop the expansion...even though it's stuck at 0%...it's still showing as if it's an ongoing process.

The event logs did reveal that the expansion stopped 1 second after it was initiated 5 days ago (sorry I didn't see that).  Unfortunately, being a newbie, I can't interpret the error codes yet.  This will take more research on my part.  If you have a hint, it's appreciated.

Thanks for the help. Here's what the log said.

184.
Date:08/01/2010
Time:12:13:54 AM
Event Code:0x6a0
Description:Disk soft media error
Subsystem:CK200052801065
Device:Enclosure 1 Disk 10
SP:SPB
Host:CX500SPB
Source:N/A
Category:N/A
Log:Storage Array
Sense Key:0x0
Ext Code1:0x0
Ext Code2:0x22
Type:Information


185.
Date:07/31/2010
Time:10:00:14 PM
Event Code:0x71370005
Description:MetaLUN Stripe Expansion Halted:  -1472744168.   00 00 04 00 02 00 58 00 d3 04 00 00 05 00 37 61 05 00 37 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 71 37 00 05
Subsystem:CK200052801065
Device:N/A
SP:N/A
Host:CX500SPB
Source:Aggregate
Category:NT System Log
Log:NT System Log
Sense Key:N/A
Ext Code1:N/A
Ext Code2:N/A
Type:Information


186.
Date:07/31/2010
Time:10:00:14 PM
Event Code:0x71370005
Description:MetaLUN Stripe Expansion Halted:  3.   00 00 04 00 02 00 58 00 d3 04 00 00 05 00 37 61 05 00 37 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 71 37 00 05
Subsystem:CK200052801065
Device:N/A
SP:N/A
Host:CX500SPB
Source:Aggregate
Category:NT System Log
Log:NT System Log
Sense Key:N/A
Ext Code1:N/A
Ext Code2:N/A
Type:Information
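
For anyone sifting through a longer export of these logs, here is a minimal Python sketch for pulling out the "Expansion Halted" entries. It assumes the log is pasted as plain text in the same shape as above (a numbered header line such as "185." followed by Key:Value lines). The function names parse_events and halted_expansions are made up for illustration and are not part of Navisphere or navicli, and the sample text is trimmed from the entries posted above.

```python
import re

def parse_events(text):
    """Split a pasted event log into dicts, one per numbered entry."""
    events = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if re.fullmatch(r"\d+\.", line):       # entry header such as "185."
            current = {"Index": int(line[:-1])}
            events.append(current)
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)    # split on the first colon only
            current[key.strip()] = value.strip()
    return events

def halted_expansions(events):
    """Keep only entries whose description mentions a halted expansion."""
    return [e for e in events if "Expansion Halted" in e.get("Description", "")]

# Sample trimmed from the log entries posted in this thread.
sample = """
184.
Date:08/01/2010
Time:12:13:54 AM
Event Code:0x6a0
Description:Disk soft media error
Device:Enclosure 1 Disk 10

185.
Date:07/31/2010
Time:10:00:14 PM
Event Code:0x71370005
Description:MetaLUN Stripe Expansion Halted:  -1472744168.
Device:N/A
"""

events = parse_events(sample)
halted = halted_expansions(events)
for e in halted:
    print(e["Index"], e["Date"], e["Time"], e["Event Code"])
```

Splitting on the first colon only keeps values such as "12:13:54 AM" intact.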

24 Posts

August 4th, 2010 13:00

Sorry for the typos there but the full navicli command is:

navicli -h trespass lun

24 Posts

August 4th, 2010 13:00

Hi POL-SG,

I don't see in your post that you tried to trespass the LUN you are trying to expand.  Have you trespassed the LUN by either 1.  Right-clicking the LUN and selecting Trespass or 2.  Issuing navicli -h lun

If you have not, then trespass the LUN in question and post the results.  The expansion should start, and if it does not, then a ticket may be needed for deeper investigation, as there are other possible issues with the LUN.

4.5K Posts

August 4th, 2010 14:00

In Navisphere look in the SPB tree - you should see the original LUN listed there with the rings around the icon of the LUN - this is the metaLUN - in all versions of Navisphere there is no metaLUN create command - it is the "Expand".

Sometimes the expand will get stuck if there is a Background verify running on one of the LUNs that you're trying to expand. Find the LUN under the SPB tree and right click on the LUN and select Trespass - sometimes that will unstick the expand function.

glen

9 Posts

August 4th, 2010 20:00

I had tried to trespass the individual LUN that was my primary LUN, but the message came back "cannot trespass private LUN". I imagine this is because it believes it's still in the middle of the expansion.

I had not tried to trespass the meta-lun (out of fear and ignorance that I may screw something up). However, upon your encouragement, I trespassed the meta-lun this morning and voila...the expansion started!  Progress immediately jumped to 34%, and it has been stuck at that level for the last 1.5 hours.  This time there are no error messages in the event log, so I'm just going to let it run for a few more hours and see if it progresses further.  Will update you later today.

Thanks again for the great input.

59 Posts

August 4th, 2010 20:00

Pol, just so you can understand the metalun structure...

Let's say you start with lun 10, and you want to expand this lun.

You create another lun, lun 20, and depending on the type of expansion (striped or concatenated) you add lun 20 to lun 10.

When this is done flare will renumber lun 20 to the highest available lun number and make this lun a private lun. It is also trespassed to the same SP as the "head" lun.

Lun10 will still remain as lun10 (even though it actually also gets renumbered) and is known as the "head" of the metalun.

Anything that needs to be done for this metalun is to be done via the head lun...ie, trespass, snapshot, mirror etc....

Any trespasses on the metalun will trespass all of the components of the metalun to follow the head lun.
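
The head/component relationship described above can be sketched as a small toy model in Python. This is only an illustration of the behaviour as explained in this thread, not how FLARE implements it: the class and function names are hypothetical, the value 2047 stands in for "the highest available lun number" and is an assumption, and the error text is the message quoted earlier in the thread.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Lun:
    number: int
    default_owner: str
    current_owner: str
    private: bool = False

@dataclass
class MetaLun:
    head: Lun
    components: List[Lun] = field(default_factory=list)

    def add_component(self, lun: Lun, highest_free_number: int) -> None:
        """Renumber the added LUN, make it private, align it with the head's SP."""
        lun.number = highest_free_number
        lun.private = True
        lun.current_owner = self.head.current_owner
        self.components.append(lun)

    def trespass(self, new_owner: str) -> None:
        """Trespassing the metaLUN moves the head and every component with it."""
        self.head.current_owner = new_owner
        for component in self.components:
            component.current_owner = new_owner

def trespass_lun(lun: Lun, new_owner: str) -> None:
    """Operating directly on a private component is refused in this model."""
    if lun.private:
        raise PermissionError("cannot trespass private LUN")
    lun.current_owner = new_owner

# lun 10 is the head; lun 20 is added and becomes a private component.
meta = MetaLun(head=Lun(number=10, default_owner="SPA", current_owner="SPB"))
lun20 = Lun(number=20, default_owner="SPB", current_owner="SPB")
meta.add_component(lun20, highest_free_number=2047)  # 2047 is illustrative only

meta.trespass("SPA")
print(lun20.number, lun20.private, lun20.current_owner)
```

The point of the model is that every operation goes through the head: the component follows the head on a trespass and rejects a direct one.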

Anyway hope this is not as clear as mud...

Jim

9 Posts

August 5th, 2010 03:00

Yes it makes sense.  I'm happy to report the expansion is still continuing. After another 7 hours it's about 36% complete now. There are no error messages in the log. If everything continues as expected, I'm happy to close this discussion as 'answered' with many thanks to all of you. It's been a great learning experience.

4.5K Posts

August 5th, 2010 10:00

Remember that all operations you perform on the metaLUN must be done by using the metaLUN "head", that is, the LUN with the ring on the icon. This icon indicates that this is the metaLUN head. This is the one that you want to use when changing parameters. The only time you would select the individual component LUNs is when you need to change the LUN cache settings; then you would select each of the component LUNs to make the change.

This picture is from a later flare release, on release 19 you would find the metaLUNs under the owning SP.

metaLUN.bmp

glen

9 Posts

August 5th, 2010 20:00

Understood, Glen. The expansion is continuing. It's now at 42%.  At this rate it will finish in another 7 days. I'm trying to lower the application workload on the SAN to allow it to use more resources for the expansion. [Ignore the 'F' on the screen below. I have a failed disk on a separate DAE.] I hope to close this case as 'answered' after the weekend.  I'm really looking forward to seeing what kind of read/write performance I get after the expansion is complete. Thanks again.

Expansion.gif
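
The "another 7 days" figure can be checked with a rough linear extrapolation. The Python sketch below assumes the expansion progresses at a constant rate, which it may not, and uses the post timestamps in this thread (about 36% on August 5th at 03:00, 42% at 20:00) as approximate measurement times. The function name estimate_remaining_hours is made up for illustration.

```python
from datetime import datetime

def estimate_remaining_hours(t0, pct0, t1, pct1):
    """Linear extrapolation from two progress readings to 100%."""
    elapsed_hours = (t1 - t0).total_seconds() / 3600
    rate = (pct1 - pct0) / elapsed_hours      # percent per hour
    if rate <= 0:
        raise ValueError("no forward progress between the two readings")
    return (100 - pct1) / rate

# Readings taken from the post times in this thread (approximate).
t0 = datetime(2010, 8, 5, 3, 0)    # about 36% complete
t1 = datetime(2010, 8, 5, 20, 0)   # 42% complete

hours = estimate_remaining_hours(t0, 36, t1, 42)
print(f"{hours:.0f} hours, about {hours / 24:.1f} days")
```

Six percentage points in 17 hours leaves the remaining 58 points at roughly 164 hours, a little under 7 days, which is in line with the estimate above.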
