March 13th, 2016 03:00

Replace hard

Hi dear friends

I have a VNX 5400 with 25 x 600 GB 10K disks, and a storage pool with RAID 5 and 20 disks.

About a week ago, disk 22 on the DPE showed an alert. When I checked, my hot spare (disk 24) had replaced it automatically and disk 22 showed a Removed status. I didn't have a disk in stock, so I ejected the faulty disk. Yesterday I purchased a new disk and seated it in bay 22. Disk 22 showed as unbound and everything was OK. But I want disk 22 back in my pool and disk 24 to be my hot spare, so I clicked on disk 24 and clicked "Copy to Hot Spare".

Disk 22 and disk 24 went into a rebuild process. Now I have checked my pool, but disk 22 isn't in it, and when I look at Hardware > Disks > Disk 22 > Properties, it shows that it is in RAID group 58! All my disks are in the storage pool; I don't have any RAID groups. What should I do? Please help me.

Capture.PNG.png

March 13th, 2016 04:00

The RAID group being shown may be the internal/private one for the pool it is a member of. Where are you seeing this?

I would also check from naviseccli to get more verbose info:

naviseccli -h ipspa -getdisk 0_0_22
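
To compare both drives side by side, the same query can be built for disk 22 and disk 24. This is only a minimal sketch: it prints the command lines rather than running them, `getdisk_command` is a hypothetical helper name, and `ipspa` is the placeholder from the command above that stands for the SP A address.

```python
import shlex


def getdisk_command(sp_address, bus, enclosure, slot):
    """Build the argv list for `naviseccli -h <sp> -getdisk <bus>_<enclosure>_<slot>`."""
    disk_id = f"{bus}_{enclosure}_{slot}"
    return ["naviseccli", "-h", sp_address, "-getdisk", disk_id]


# "ipspa" is a placeholder; substitute the real address of SP A.
for slot in (22, 24):
    print(shlex.join(getdisk_command("ipspa", 0, 0, slot)))
```

The printed lines can be pasted into a shell on a host that has naviseccli installed.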

Check your Pool properties to verify the disk is a member.

Also, with permanent sparing, there's no real reason to 'fail back' the hot spare drive; it's not necessary.

8 Posts

March 13th, 2016 05:00

Thanks, dear Brett@s

My pool's disk members are shown below. Disk 22 isn't a member! Should I expand my pool after I seat disk 22 and then copy to hot spare, or not?

I didn't expand the pool after seating the new disk. Is that wrong?

Capture2.PNG.png

Hardware > Disks > Disk 22 > Properties:

Capture3.PNG.png

March 13th, 2016 16:00

What are the properties on disk 24 now?

Also, I would get some SP collects done and log an SR.

8 Posts

March 14th, 2016 01:00

24.PNG.png

Disk 24 screenshot

65 Posts

March 14th, 2016 06:00

Hello,

If I understand correctly:

- Disk 0.0.22 failed, and 0.0.24 took its place in the pool.
- You replaced 0.0.22, and it showed as unbound, ready to act as a spare.
- You copied 0.0.24 to a spare.
- Now you see 0.0.22 in RG 58 and 0.0.24 as an unbound spare.

Correct?

- Are you sure you have no RAID Group 58 created under Storage > RAID Groups?

- Can you post more screenshots of the RAID Groups tab, refresh Unisphere, and show the pool's used drives again?

As mentioned before, it'd be easier to check SP Collects, which you should open an SR for. Also, for your reference, you could have used either the copytodisk CLI command or the drive mobility feature. Both options are explained in KB https://support.emc.com/kb/305286 ("How does next-generation VNX (MCx/05.33) choose hot spares?").

Adham

8 Posts

March 14th, 2016 06:00

RAIDGroup.jpg

Absolutely right. As you can see, I have never had a RAID group.

8 Posts

March 14th, 2016 07:00

Dear Glen

I know the difference between VNX1 and VNX2. You are right: in VNX2, every unbound disk may be considered as a spare.

So how can I put disk 22 back in my pool? Where did I go wrong?

4.5K Posts

March 14th, 2016 07:00

The way the VNX2 (VNX 54xx) works for hot spares is different from VNX1. When the hot spare (0-0-24) replaced 0-0-22, disk 0-0-24 became part of the Pool. When you inserted the new 0-0-22, that disk then became a new, possible hot spare. All unused disks are potential hot spares.

From KB 306286

What if the customer wants to keep the disk configuration the same and doesn't want the drives to move due to the permanent sparing?

If the customer wants to keep RAID Groups within the same Bus, Enclosure, Disk (B.E.D.) location there are two options:

  1. Use the MCx Drive Mobility feature after the rebuild has completed to the spare.
  2. Use the copytodisk CLI command to copy the data back to the original disk slot.

This means that you should use the naviseccli command to move the data from 0-0-24 to the new 0-0-22.

As for the RAID group that the new 0-0-22 now shows, select the Properties for disk 0-0-21 and see what you get. In Pools, the disks are configured into private RAID groups.

glen

4.5K Posts

March 14th, 2016 11:00

I'm not completely sure, but the problem may have been using the GUI to do the second copy to hot spare (disk 24 -> 22), although that seems to have worked.

When a disk fails and copies to the hot spare, once it's completed, you can remove the failed disk, then take the hot spare and move it to the empty slot where you removed the failed disk (Drive mobility). Once it's powered up, take the new disk and put that in slot 24 for the new hot spare.

The KB implies that you must use the CLI command to copy the data from the hot spare back to the new disk. I'm not certain what happened with 0-0-22 after you ran the copy to disk command - in the Disk Properties the "Storage Pool Name" should be the name of the Pool. Why it's showing as raid group 58 is not clear to me.

Is disk 0-0-22 still showing RG 58 when you look at the properties of the disk 0-0-22? What is the Pool Properties/Disks showing?

I tried copying to a hot spare from one of the disks in a Pool (1-0-1 is the original, 1-0-0 is the hot spare). I can see it copying in the Pool Properties:

CopyToHotSpare.PNG.png

In the Disk Properties I can also see the disk (1-0-1) copying to the hot spare (1-0-0).

CopyToHotSpare_2.PNG.png

For each of the disks it looks the same (1-0-0 is the hot spare): they both show up in Pool 0.

CopyToHotSpare_3.PNG.png

CopyToHotSpare_4.PNG.png

glen

8 Posts

March 19th, 2016 02:00

Thanks, Glen

If I use drive mobility, it requires downtime, so it doesn't seem like a good way to replace a hard disk.

For troubleshooting I tested another thing. I right-clicked disk 22 and selected the "Copy to Hot Spare" option. Both of my disks (22 and 24) went into rebuild status, and after it finished I saw that disk 24 was in my pool and disk 22 was in unbound status.

I repeated the process: I right-clicked disk 24 and selected "Copy to Hot Spare" (disks 24 and 22 went into rebuild status), but after it finished disk 22 didn't go into the pool, and it shows as a member of RAID group 58. A really crazy issue!

Any ideas?

It is worth mentioning that all the disks in my pool are Hitachi and my new disk in bay 22 is Seagate. Could that cause this problem?

4.5K Posts

March 21st, 2016 09:00

The issue of the RG number is a bit strange; we'd need to look at the spcollects to see into the Pool for the private raid groups. I'd recommend you open a case with EMC from the support.emc.com site so they can look at the spcollects. Run new spcollects prior to opening the case and attach them to the case. It would be helpful to detail the dates/times of the different actions (copy to hot spare) if you have that information.

The new drive mobility should not require any downtime (as far as I know). Physically removing the drive and inserting it in a new slot should take less than two minutes, and during that time the raid group that contains the drive would go into a degraded state (if it's a parity raid type like RAID 5, the disks would be using parity for requests until the drive is in the new location). So there would still be access, but there would be a latency increase during the swap time.
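
To illustrate why access continues while the drive is out of its slot, here is a generic sketch of single-parity (XOR) reconstruction as used by RAID 5 in general. It is not the VNX implementation: the block contents are made up, `xor_blocks` is a hypothetical helper name, and real arrays work on much larger stripe elements with rotating parity.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data blocks of one stripe in a 4+1 layout; the contents are made up.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Simulate the drive that holds data[2] being out of its slot.
survivors = data[:2] + data[3:] + [parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[2])  # prints True
```

Every read of the missing block has to touch all the surviving drives, which is where the extra latency during the swap comes from.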

glen

8.6K Posts

March 21st, 2016 10:00

Actually, it's 5 minutes for drive mobility.

If you stay within 5 minutes there is no downtime. Even if you don't, the RG will just degrade; that doesn't mean downtime.

Please see the VNX MCx white paper for details.

8 Posts

March 21st, 2016 11:00

Dear Glen

I really appreciate your help. Would you please let me know how I can run SPCollect? And what do you mean by a private RAID group? I don't understand it.

Thanks

4.5K Posts

March 29th, 2016 08:00

In Unisphere, when you go to the System page, there's a section on the lower right side called "Diagnostic Files". Click "Generate Diagnostic Files ......." once for SPA and once for SPB. This starts the process of collecting the diagnostic files (spcollects) for each SP. Once the process is completed, you can then "Get Diagnostic Files --SPx", one for SPA and one for SPB. These two diagnostic files (spcollects) can then be used by support to look at your array for issues.

You can automate this process by installing the Unisphere Service Manager (USM) on a Windows workstation (Windows 7, for example) that will connect to your array (you start the program and then log in to the array). With USM you can generate the same spcollects for SPA and SPB into one zipped file that will be downloaded to a folder on your workstation (default is C:\EMC\repository). Once you have the spcollects, you can use USM to review these files.

You can find more detail with this KB article:

https://support.emc.com/kb/334497

A Pool is a logical construct. It is built using raid groups. When you first create a new Pool, you start by assigning disks to the Pool. These disks are ordered into private raid groups. For example, if you create a very simple Pool using 5 SAS disks, the disks are (by default) first used to create a single 4+1 raid 5 raid group. This raid group will be assigned a raid group number - the number is the last usable raid group number in the array - say 4000. This is called a private raid group and it is the foundation of the Pool. The Pool is a "file system" built on top of that raid group. When you look at the Properties of the disks in that Pool you would see that they are contained in that private raid group. You can NOT see these private raid groups from Unisphere, you need the spcollects to see this.
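
As a rough picture of that layering, the sketch below chunks a 20-disk RAID 5 pool into 4+1 groups. It assumes the pool is split into consecutive groups of five, which is only an illustration; the disk names, the function name, and the capacity figure are hypothetical, and the real private raid group membership can only be confirmed from the spcollects.

```python
def split_into_private_groups(disks, group_width=5):
    """Chunk a disk list into consecutive groups of `group_width` (4+1 RAID 5 = 5)."""
    if len(disks) % group_width:
        raise ValueError("disk count is not a multiple of the group width")
    return [disks[i:i + group_width] for i in range(0, len(disks), group_width)]


# Hypothetical 20-disk pool; the names are placeholders, not real slot IDs.
pool_disks = [f"disk_{n}" for n in range(20)]
groups = split_into_private_groups(pool_disks)

DISK_GB = 600  # nominal drive size, before any formatting overhead
for index, members in enumerate(groups):
    usable_gb = (len(members) - 1) * DISK_GB  # one disk's worth of space holds parity
    print(f"group {index}: {members[0]}..{members[-1]}, about {usable_gb} GB before overhead")
```

Under these assumptions a disk in the pool always belongs to exactly one such group, which is why its properties can report a raid group number even when no RAID groups were created by hand.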

glen
