Unsolved

June 25th, 2018 12:00

Need help keeping our AX4-5 units up and running just a bit longer...

The program that I began working for a few months ago has purchased a new HP storage solution, but they have to award another contract for some more parts.  Once that work is done, we will be decommissioning three EMC AX4 arrays.  For now, I have to keep the arrays running because the drive space is supporting the VMware environment.

Here's the current situation:

EMC #1 has active connections to 14 VMware Hosts.

It is listing Disks 5,6,8,9 and 10 as FAULTED, REMOVED or MISPLACED.

Standby Power Supplies A and B are FAULTED.

EMC #3 has active connections to the same 14 VMware Hosts as EMC #1.


EMC #2 has an active connection to only one server but the volume is not in use on the server.

It is listing Disk 7 as REMOVED.
Power/Cooling Module B is FAULTED.

Standby Power Supply A is FAULTED.

Back to EMC #1.  Here is the FAULTED Virtual Disk:

Virtual Disks:
Name Capacity State
EMC_C 2 TB Faulted,Offline
Disks:
Disk Capacity State
Enclosure 0 Disk 4 917 GB Normal
Enclosure 0 Disk 5 0.000 GB Removed
Enclosure 0 Disk 6 0.000 GB Removed
Enclosure 0 Disk 7 917 GB Normal
Enclosure 0 Disk 8 0.000 GB Faulted
Enclosure 0 Disk 9 0.000 GB Removed
Hot Spare Replacing Enclosure 0 Disk 8 in Disk Pool 2 - data has been reconstructed to the hot spare

So here's my proposal: Shut down EMC #2. It is not providing services, and it has good parts. Install the good SPS from #2 into #1. Then I can use 5 of #2's disks to replace the FAULTED/REMOVED drives.  If I understand RAID 5, there's no chance that any of the data on Virtual Disk EMC_C is intact, since only 2 of the disks are in NORMAL state. Do I still need to insert the "new" disks one at a time and let the RAID rebuild?

Also, with EMC #1's Disk 10 being listed as MISPLACED, might there be some benefit to swapping it into the REMOVED/FAULTED slots to see if it is recognized and made active again?

Thanks,
Steve

4.5K Posts

June 26th, 2018 08:00

What you want to do is pretty complicated. In all EMC arrays, the disks used to create a RAID group are locked into their physical locations - so if you used disks 5-10 to create a RAID group, each of those disks is formatted with information about its slot number and its position within the RAID group.

To use a disk it must first be zeroed out. This happens normally when you destroy a raid group - it removes the raid information from all the drives and unlocks them from their current position.

The first question you need to ask is what data can be deleted and what can't. You need to make a backup of the data that can't be lost.

Then you can look at destroying any disk groups you don't care about and removing those groups. That should free up those disks for use.

I think the "misplaced" is a disk that was taken from somewhere and then inserted into a different slot. If that's the case, that disk 10 is probably still holding the information from the original raid group it used to be in.

What you need to do is more complex than this forum is designed to help with. You'll either need to contact EMC support or the 3rd party that originally sold the AX.

glen

June 26th, 2018 10:00

I agree with your assessment, but there's no way we can get support.  If Contracting could take the steps to get us back under Dell maintenance, then they could also award the contract for our new equipment and I wouldn't be concerning myself with the EMCs at all.  However, in your response you answered most of my questions.

EMC #2 holds no valuable information. It has 2 virtual disks in RAID5 configurations. If I destroy both of them, then the data will be zeroed on those 11 good physical disks.  At that point, I can pull some of the zeroed disks, use them to replace the REMOVED/FAULTED disks in EMC #1's FAULTED Virtual Disk (EMC_C) and make a new Virtual Disk that isn't FAULTED.  After all, a 6-disk RAID5 array with 4 failed members can't regenerate its data anyway, right?
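The RAID 5 reasoning above can be illustrated with a minimal sketch. It assumes plain XOR parity over one stripe, which is the textbook RAID 5 model and not FLARE's actual on-disk layout. The helper names `xor_blocks` and `rebuild_stripe` and the sample data are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_stripe(stripe):
    """Return the stripe with a single missing block (None) rebuilt.

    Raises ValueError when more than one block is missing, because a single
    parity block provides only one equation per stripe.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError(f"{len(missing)} blocks missing; RAID 5 tolerates 1")
    if not missing:
        return list(stripe)
    survivors = [block for block in stripe if block is not None]
    rebuilt = list(stripe)
    # XOR of all surviving blocks (data and parity) yields the missing one.
    rebuilt[missing[0]] = xor_blocks(survivors)
    return rebuilt

# Hypothetical 6-member stripe: 5 data blocks plus 1 parity block.
data = [bytes([n] * 4) for n in (1, 2, 3, 4, 5)]
stripe = data + [xor_blocks(data)]
```

With one member set to `None`, `rebuild_stripe` returns the original stripe. With four members missing, as in the EMC_C pool, it raises `ValueError`, which matches the conclusion that the data cannot be regenerated.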

The MISPLACED disk #10 would have originated in the same chassis where it is currently located. Since that chassis has only one faulted disk pool, it stands to reason that its original slot was 5, 6, 8 or 9. If so, it should recognize its correct location once reinserted, correct?

Thanks,

Steve

4.5K Posts

June 27th, 2018 14:00

Correct - when the drive is inserted in its correct slot, the array will indicate that.

glen

June 29th, 2018 15:00

I just discovered today that on the "Attention Required" page, the system actually told me the expected Serial Number for Disk 6.  When I pulled the MISPLACED disk from Slot 10, it was the serial that belonged in the empty Slot 6, so I put it there, restoring Disk 6 to NORMAL.

What puzzles me is why I can't find any screen that lists the expected serial numbers for the other slots.  I found the disks that were pulled out of the array just sitting on a table, but I don't know which slots they came from. They're probably bad anyway, but I'd like to make sure.  Am I missing a menu item in Navisphere Express that will show me the expected serial numbers?

Also, I have destroyed all the disk pools and virtual disks on EMC 2, but when I slipped one of the disks into a slot on EMC 1, it came back with a status of MISPLACED.  Did I miss a step that would have zeroed the disks in EMC 2 before I shut it down?

Thanks,

Steve

July 2nd, 2018 12:00

I know no one has replied to my last message, but since my posts are moderated, I thought I'd go ahead and post the latest.

I found another thread in the community where it was discussed that zeroing occurs when creating a new disk pool, so I tried that:

  • Click "Disk Pools"
  • Click "Create Disk Pools"
  • Select "RAID 1/0"
  • Select an even number of disks (I've tried several combinations, but say Disk 5 and 6 for example)
  • Click "Apply"

At that point, the screen blanks and takes me back to the Navisphere Express Login page.  I've tried restarting the AX4 and that does not change the result.

I'd really like to make this work, but there's virtually no information outside this community, so I really appreciate Glen's input.

Steve

4.5K Posts

July 6th, 2018 08:00

What is the FLARE version running on the array? Sometimes disks are not recognized, depending on the OE (Operating Environment) version running on the array. The latest version is 02.23.050.5.712. You can check each disk's part number against the supported OE versions in the document below - look for the AX4 section.

https://support.emc.com/docu42949_All_VNX_CLARiiON_Celerra_Storage_Systems_Drive_and_FLARE_OE_Matrices.pdf?language=en_US

glen


July 6th, 2018 11:00

All of our arrays are running FLARE 02.23.050.5.711.
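As a quick check, the running version can be compared against the latest version quoted earlier in this thread. This is a minimal sketch that assumes FLARE version strings compare field by field as integers; the helper name `parse_flare` is hypothetical.

```python
def parse_flare(version):
    """Split a dotted FLARE version string into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))

running = "02.23.050.5.711"  # reported for all three arrays
latest = "02.23.050.5.712"   # quoted earlier in this thread

# Tuples compare element by element, so only the last field differs here.
behind = parse_flare(running) < parse_flare(latest)
print("array is behind the latest release" if behind else "array is current")
```

Under that assumption the arrays are one patch level behind the latest release.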

I have compared all of our Part Numbers with the drive and FLARE matrices you linked, and all are listed as fine with our FLARE version and Array Models.

EMC #1
Slot  Serial Number  Part Number  Vendor Name
0     Z1N4R08Y       5050063      ATA-ST
1     Z1N388GT       5050063      ATA-ST
2     Z1N4RL9E       5050063      ATA-ST
3     P6GH6BVP       5050669      ATA-HTCH
4     Z1N4RDMG       5050063      ATA-ST
5     Removed        Removed      Removed
6     9QJ7XPYG       5048831      ATA-ST
7     Z1N4KXXT       5050063      ATA-ST
8     Removed        Removed      Removed
9     Removed        Removed      Removed
10    Removed        Removed      Removed
11    N/A            5050063      ATA-ST

EMC #2
Slot  Serial Number  Part Number  Vendor Name
0     Z1N26P0H       5050063      ATA-ST
1     PBJ51XPE       5048805      ATA-HTCH
2     9QJ7YQRC       5048831      ATA-ST
3     Z1N350ZD       5050063      ATA-ST
4     PAKST4BE       5048805      ATA-HTCH
5     9QJ7XZLM       5048831      ATA-ST
6     9QJ7YQG6       5048831      ATA-ST
7     Z1N37XZW       5050063      ATA-ST
8     9QJ7ZK8L       5048831      ATA-ST
9     9QJ7Z71Y       5048831      ATA-ST
10    Z1N3XE4J       5050063      ATA-ST
11    Removed        Removed      Removed
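The inventory above can also be cross-checked programmatically. The sketch below lists the EMC #2 slots whose part number already appears somewhere in EMC #1. Treating a shared part number as a sign of a likely drop-in candidate is an assumption made for illustration, not something the drive matrix confirms, and the function name `matching_slots` is hypothetical.

```python
# Part numbers by slot, copied from the inventory above; None marks "Removed".
EMC1_PARTS = ["5050063", "5050063", "5050063", "5050669", "5050063", None,
              "5048831", "5050063", None, None, None, "5050063"]
EMC2_PARTS = ["5050063", "5048805", "5048831", "5050063", "5048805", "5048831",
              "5048831", "5050063", "5048831", "5048831", "5050063", None]

def matching_slots(donor_parts, target_parts):
    """Return donor slot numbers whose part number also appears in the target."""
    known = {part for part in target_parts if part is not None}
    return [slot for slot, part in enumerate(donor_parts) if part in known]

print(matching_slots(EMC2_PARTS, EMC1_PARTS))
```

This prints `[0, 2, 3, 5, 6, 7, 8, 9, 10]`. Slots 1 and 4 of EMC #2 hold part number 5048805, which is not present in EMC #1.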

4.5K Posts

July 10th, 2018 07:00

Have you tried to insert one of the disks from #2 in slot 0-0-8 to see what happens? It shows in your first post that a hot spare is replacing 0-0-8 - if you put a disk in that slot it might start to copy the data back from the hot spare to the new disk.

I think the problem is that with that many faulted disks, you may have to re-initialize the array. Since all of the data is already gone, this might be the best action.

I'll see if I can find any documentation about re-initializing the array.

glen

4.5K Posts

July 10th, 2018 07:00

Also, you might want to look at this site - go to the Legacy EMC Product Documentation and then to AX4-5 section

https://mydocuments.emc.com/#

glen

July 10th, 2018 12:00

Glen,

Thanks for that.  I was able to take EMC #2's Disk 10 and put it into EMC #1's Disk 8 position.  Disk 8 is now reporting as Normal.

BTW, I figured out why I was getting kicked out of EMC #2 whenever I tried to create arrays. Firefox and Navisphere are not 100% friendly. I switched to IE and was able to create a new array. After that, one of the new array disks went into EMC #1's Disk 5 slot happily. The other did not go into 9 without error, so I am attempting to replicate my success by creating another pool to rob from. As of now, only 9 and 10 on EMC #1 are faulted!

I appreciate your sticking with me through this process.  We'd be moving faster if the forum weren't requiring my posts to be moderated, but thank you for checking back in.

Steve

July 16th, 2018 10:00

I had a bit of an event where EMC #2 didn't react well to all of the drive pulling as I tried to get two last drives zeroed and moved to EMC #1. I left it sitting over the weekend and at least got to where I could log in again. Here's what #2 looks like:

    Disk 0   Normal
    Disk 1   Empty
    Disk 2   Normal
    Disk 3   Normal
    Disk 4   Empty
    Disk 5   Empty
    Disk 6   Empty
    Disk 7   Normal
    Disk 8   Spare ready
    Disk 9   Spare ready
    Disk 10  Spare ready
    Disk 11  Spare ready

No disk pools or virtual disks exist on EMC #2 at this point, but I can't get any of them to go over to EMC #1's Disk 9 or 10.  We're so close, though.  If you've got any other ideas, I'd love to hear them.

4.5K Posts

July 19th, 2018 14:00

Disks 0-3 are the operating system disks - if you want the array to work, you can't pull these drives. I'm not sure how to get these four to become normal disks once they've been used as system disks. The way they work is that disks 0 and 2 control SPA and disks 1 and 3 are for SPB. Since disk 1 is missing, if you have that disk (it should have a label indicating it's an OS disk), you might try putting it in an empty slot on #2 and see if you can create a single virtual disk from it - that might zero out the whole disk (part of the disk is used by the OS and is normally not accessible, and the rest of the disk can be used as user space). This might not work, as I've never heard of anyone trying that. As long as you don't remove 0 and 2, SPA will stay up. If you can get the whole space to be used, then destroy the virtual disk and see if it will be recognized on #1.
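The constraint on disks 0-3 can be applied to the EMC #2 status listing posted earlier in the thread. This is a minimal sketch: the states are copied from that listing, while the names `SYSTEM_SLOTS`, `EMC2_STATE` and `donor_candidates` are hypothetical.

```python
SYSTEM_SLOTS = range(0, 4)  # disks 0-3 hold the array's operating system

# Slot states on EMC #2, copied from the listing posted on July 16th.
EMC2_STATE = {
    0: "Normal", 1: "Empty", 2: "Normal", 3: "Normal",
    4: "Empty", 5: "Empty", 6: "Empty", 7: "Normal",
    8: "Spare ready", 9: "Spare ready", 10: "Spare ready", 11: "Spare ready",
}

def donor_candidates(states):
    """Return populated slots that are safe to pull (not system disks)."""
    return sorted(slot for slot, state in states.items()
                  if slot not in SYSTEM_SLOTS and state != "Empty")

print(donor_candidates(EMC2_STATE))
```

This prints `[7, 8, 9, 10, 11]`, the only populated slots on EMC #2 that are outside the system disk range.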

glen

July 20th, 2018 07:00

I understand.  I've been trying to avoid messing with Disks 0-3. The Empty Disk #1 is toast. It goes orange whenever it's inserted.

The fact that 8-11 are all in "Spare Ready" state gets me because I can't seem to get them to zero, and, as spares, they aren't likely to have any data on them at all.

4.5K Posts

July 20th, 2018 11:00

Try using 7-11 to make a virtual disk (you may get an error about no spare). When you create a virtual disk, it runs the disk zeroing process - let this run for about an hour to be sure it finishes. When the virtual disk is ready, destroy it. That may leave the disks zeroed.

glen

July 24th, 2018 12:00

It wouldn't allow me to create it without a spare, so I created a RAID 1/0 disk pool on EMC #2 using disks 8-11 with 7 as the spare.  I then created a virtual disk of 100GB and let it finish.  I then deleted the virtual disk and tried swapping the drives.  The disks were listed as MISPLACED when I moved any of them into the EMC #1 slots.  I then deleted the disk pool on EMC #2 and tried again with the same results.

No joy.
