1 Rookie
17 Posts
Steps to follow to replace a Failed Drive in PS6100
Hi,
I am new to these devices. In my environment we have 2 Dell EqualLogic devices, and we have failed drives on both. I believe they are out of support.
Slot Type     Model            Size        RPM   Status    Errors SMART Health
---- -------- ---------------- ----------- ----- --------- ------ ------------
1    SAS(HDD) Unavailable      0MB         10000 failed    0      unavailable
2    SAS(HDD) Unavailable      0MB         10000 failed    0      unavailable
7    SAS(HDD) Unavailable      0MB         10000 failed    0      unavailable
Is it straightforward to replace these? That is, order new drives, pull the failed ones, and replace them with the new ones?
If not, do I need to follow any other steps from the CLI or GUI before and after the replacement?
One more query:
I can also see it has some failed components, but it's not showing what exactly failed. Is there a way to find out what failed here?
Health Status Details
Critical conditions::
Critical hardware component failure.
Warning conditions::
More spare drives are expected.
Any help would be much appreciated.
Thanks in advance.
_______________________________________________________________________________
dwilliam62
3 Apprentice
1.5K Posts
March 17th, 2020 05:00
Hello,
RAID50 is two RAID5 sets striped together, so each RAID5 set can only handle one drive failure. You should also have had two spare drives, which should have replaced the failed ones automatically. If not, or if you were running with the 'no spares' option, then another failure on either RAIDset will cause an outage of that entire array.
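The rule above can be sketched in a few lines of Python. This is only an illustration, assuming the two-set layout described in this post; the function name is hypothetical, and the sketch ignores spares and rebuilds in progress.

```python
def raid50_survives(failures_per_set):
    """Return True if every RAID5 set has lost at most one drive.

    failures_per_set: number of failed (not yet rebuilt) drives in each
    RAID5 set that makes up the RAID50 stripe.
    """
    return all(failed <= 1 for failed in failures_per_set)


# One failure in each set: degraded, but still online.
print(raid50_survives([1, 1]))
# Two failures in the same set: that set is lost, and the array with it.
print(raid50_survives([2, 0]))
```

So "how many failures can it handle" depends on where the failures land, not only on how many there are.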
Once you resolve the failed drive situation you might want to consider converting to RAID6.
Regards,
Don
dwilliam62
January 9th, 2020 08:00
Hello,
Yes. Basically, replace them and the rebuild will start. There are no commands to issue.
However, you can't use OEM or non-EQL drives in the 6100. Even other "Dell" drives, like those from a server or another SAN, will not work. EQL drives are branded specifically for the EQL storage arrays, so you will need a vendor that has drives for EQL arrays. I would replace them one at a time. Is this data backed up?
You might want to check the service data for that array; you might still be able to get it back under contract. If you go to support.dell.com and enter the service tag, you can see the ship date. You can get up to 7 years from that date.
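As a quick way to work out that cut-off, here is a small Python sketch. It assumes "7 years" means the same calendar day seven years after the ship date; the function name and the example date are hypothetical, so use the ship date shown for your own service tag.

```python
from datetime import date


def latest_contract_date(ship_date, years=7):
    """Return the calendar date `years` years after the ship date."""
    try:
        return ship_date.replace(year=ship_date.year + years)
    except ValueError:
        # Ship date was Feb 29 and the target year is not a leap year.
        return ship_date.replace(year=ship_date.year + years, day=28)


# Hypothetical ship date, for illustration only.
print(latest_contract_date(date(2014, 6, 30)))
```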
Regards,
Don
santhumax
January 9th, 2020 09:00
Hi Don,
Thanks a lot for your response. We have ordered drives from a vendor (techbuyer); they have the exact same drives (refurbished ones). I copied the model number from the CLI and provided it.
Slot Type     Model            Size        RPM   Status    Errors SMART Health
---- -------- ---------------- ----------- ----- --------- ------ ------------
0    SAS(HDD) ST600MM0006      558.91GB    10000 online    0      ok
1    SAS(HDD) ST9600205SS      558.91GB    10000 online    0      ok
2    SAS(HDD) Unavailable      0MB         10000 failed    0      unavailable
3    SAS(HDD) ST9600205SS      558.91GB    10000 online    0      ok
4    SAS(HDD) ST9600205SS      558.91GB    10000 online    0      ok
5    SAS(HDD) ST600MM0088      558.91GB    10000 online    0      ok
6    SAS(HDD) ST600MM0088      558.91GB    10000 online    0      ok
7    SAS(HDD) Unavailable      0MB         10000 failed    0      unavailable
@DON, can you also let me know how to check from the CLI which other components have failed in the device?
dwilliam62
January 9th, 2020 10:00
Hello,
The "failed component" is likely to be the failed drives. It's a general flag to indicate something has failed. The last line, "more spares expected", is the reason: it's looking for spare drives to start a rebuild.
Regards,
Don
santhumax
January 16th, 2020 03:00
Hi Don,
Thank you.
Is there a command in the CLI to view the disk rebuild status and time? I can't open a GUI session due to some errors.
Also, one of the systems has 2 failed drives, so I have to replace them one by one, right? Does this mean waiting until one drive has completely rebuilt before doing the next one?
Regards,
Santhosh Kumar
dwilliam62
January 16th, 2020 05:00
Hello,
At the CLI:
member select MEMBERNAME show (look for RaidPercentage)
_____________________________ Member Information ______________________________
Name: MEMBERNAME
Status: online
TotalSpace: 33.74TB
UsedSpace: 1TB
SnapSpace: 3.21GB
Description:
Def-Gateway: 100.85.224.1
Serial-Number:
Disks: 24
Spares: 1
Controllers: 2
CacheMode: write-back
Connections: 142
RaidStatus: ok
RaidPercentage: 0.000%
LostBlocks: false
HealthStatus: normal
LocateMember: disable
Controller-Safe: disabled
Version: V10.0.3 (R469188)
Delay-Data-Move: disable
ChassisType: DELLSBB4u24 3.5
Accelerated RAID Capable: no
Pool: default
Raid-policy: raid6
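If you capture that output to a text file or a script, the two fields of interest can be pulled out with a short Python sketch like the one below. It assumes the field names appear exactly as in the sample above (`RaidStatus:` and `RaidPercentage:`); the function name is hypothetical, and how you capture the CLI output is up to you.

```python
import re


def raid_progress(member_show_output):
    """Extract (RaidStatus, RaidPercentage) from 'member select ... show' text.

    Returns None for a field that is not present in the text.
    """
    status = re.search(r"RaidStatus:\s*(\S+)", member_show_output)
    percent = re.search(r"RaidPercentage:\s*([\d.]+)%", member_show_output)
    return (
        status.group(1) if status else None,
        float(percent.group(1)) if percent else None,
    )


# Two lines taken from the sample output above.
sample = """\
RaidStatus: ok RaidPercentage: 0.000%
LostBlocks: false HealthStatus: normal
"""
print(raid_progress(sample))
```

The search works whether the fields are printed side by side or one per line.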
Re: Drives. Since they have both failed but you are still online you can replace them both. It will put a greater load on the member while the rebuilds are running.
What RAID level is that member using?
Regards,
Don
dwilliam62
January 16th, 2020 05:00
Hello,
Since the drives are 10K RPM and 500GB the rebuild time should not be very long.
Don
santhumax
January 16th, 2020 07:00
Hi Don,
I just replaced the drive and ran the command you provided in the previous reply.
ITFC-EQL-U44-GRP1> member select ITFC-EQL-U44-N1 show
_____________________________ Member Information ______________________________
Name: ITFC-EQL-U44-N1
Status: online
TotalSpace: 9.42TB
UsedSpace: 7.58TB
SnapSpace: 121.29GB
Description:
Def-Gateway:
Serial-Number: CN-0WK7G2-70821-1A9-007N-A00
Disks: 24
Spares: 2
Controllers: 2
CacheMode: write-back
Connections: 24
RaidStatus: ok
RaidPercentage: 0.000%
LostBlocks: false
HealthStatus: normal
LocateMember: disable
Controller-Safe: disabled
Version: V8.1.4 (R425229)
Delay-Data-Move: unconfigured
ChassisType: DELLSBB2u24 2.5
Accelerated RAID Capable: no
Pool: default
Raid-policy: raid50
Product Family: PS6100
All-Disks-SED: no
SectorSize: 512
Language-Kit-Version: de, es, fr, ja, ko, zh
ExpandedSnapDataSize: N/A
CompressedSnapDataSize: N/A
CompressionSavings: N/A
Data-Reduction: no-capable-hardware
Raid-Rebuild-Delay-State: disabled
_______________________________________________________________________________
____________________________ Health Status Details ____________________________
Critical conditions::
None
Warning conditions::
None
_______________________________________________________________________________
____________________________ Operations InProgress ____________________________
ID StartTime Progress Operation Details
-- -------------------- -------- -----------------------------------------------
The RAID percentage is showing 0.000%!
Does that mean the rebuild has completed?
dwilliam62
January 16th, 2020 10:00
Hello,
Either that, or the rebuilds had already finished and you just replaced the two consumed spares.
member select ITFC-EQL-U44-N1 show disks
I suspect you will find two drives marked as SPARE now.
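To tally drive states from that listing without reading it row by row, a small Python sketch along these lines may help. It assumes the column layout shown earlier in this thread (Slot, Type, Model, Size, RPM, Status, Errors, SMART Health) with no spaces inside any value; the function name is hypothetical, and the sample rows are copied from the earlier listing.

```python
from collections import Counter


def disk_status_counts(show_disks_output):
    """Count drives per Status value in 'show disks' style output."""
    counts = Counter()
    for line in show_disks_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric slot and have 8 fields;
        # the header and the dashed separator line are skipped.
        if len(fields) == 8 and fields[0].isdigit():
            counts[fields[5]] += 1
    return counts


sample = """\
Slot Type Model Size RPM Status Errors SMART Health
---- -------- ---------------- ----------- ----- --------- ------ ------------
0 SAS(HDD) ST600MM0006 558.91GB 10000 online 0 ok
2 SAS(HDD) Unavailable 0MB 10000 failed 0 unavailable
"""
print(dict(disk_status_counts(sample)))
```

Spare drives would then show up as a count under whatever status text the array prints for them.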
Regards,
Don
santhumax
January 17th, 2020 06:00
Hi Don,
These drives failed a few days ago, maybe a week to 10 days.
Yes, as you said, I can see the replaced drives as spares.
Regards,
Santhosh Kumar
dwilliam62
January 17th, 2020 06:00
Hello Santhosh,
You are very welcome! I am glad that I could assist you.
Thank you for letting us know the issue is resolved.
Regards,
Don
santhumax
March 17th, 2020 03:00
Don,
Good morning,
I have a query.
We have an EqualLogic PS6100 with 24 disks (600GB SAS), configured as RAID50.
We created 2 volumes from these, one of 4TB and another of 5TB.
Can you let me know how many drive failures this system can handle?
As of now we have 2 failed drives (slot 18 and slot 20).
Regards,
Santhosh
santhumax
March 17th, 2020 06:00
Hi Don,
Yes, we had 2 spare drives.
Can I change the RAID level from RAID 50 to RAID 6 on the fly, to survive one more drive failure?
We are planning to replace these soon. We need to buy the drives, so I just wanted to check whether we have time in case one more drive fails.
Regards,
Santhosh
dwilliam62
March 17th, 2020 06:00
Hello,
I'm not 100% sure what state you are in. You had two spares, so both failed drives were replaced OK and the current RAID status is "OK" or green, correct?
If so, you should replace those spares ASAP in case you have another failure.
If not, and BOTH RAIDsets are in degraded mode, you can NOT convert to RAID6. You need to get 4x drives: two to replace the failed drives, and two so that spares are available again after the rebuilds complete. Then you can convert to RAID6.
Converting RAID levels puts strain on all the drives, so you could see additional failures. I suspect the firmware on the arrays and drives is probably old as well. What is the firmware version?
Regards,
Don