H200: RAID10 spans
Hi to all
I'm having a big issue with a PERC H200 and a 6-disk RAID10.
I've replaced one failed disk, but it seems that another one has failed too (and is not reported by OMSA), because I'm seeing a huge number of read errors:
# sg_logs /dev/sg4 -p3 | grep errors
Total errors corrected = 6912496
Total uncorrected errors = 56739
The resync is taking ages: in 24 hours only 0.17% was synced.
Now, how can I find out which disks are bound in pairs for each RAID1? The replaced disk is #6, the failed disk is #5. Should I assume that #5 and #6 are a single RAID1 pair?
If I replace #5 while #6 is still syncing, I'll lose all data, right?
OMSA is reporting everything as non-critical:
# omreport storage vdisk controller=0
Virtual Disk 0 on Controller PERC H200 Integrated (Embedded)
Controller PERC H200 Integrated (Embedded)
ID : 0
Status : Non-Critical
Name : Virtual Disk 0
State : Reconstructing
Hot Spare Policy violated : Not Assigned
Encrypted : Not Applicable
Progress : 0% complete
Layout : RAID-10
Size : 836.63 GB (898319253504 bytes)
T10 Protection Information Status : No
Associated Fluid Cache State : Not Applicable
Device Name : /dev/sda
Bus Protocol : SAS
Media : HDD
Read Policy : Not Applicable
Write Policy : Not Applicable
Cache Policy : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy : Disabled
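As a rough aid for reading the `sg_logs` counters quoted above, here is a minimal Python sketch that parses the two grepped lines and flags the drive when any read errors were left uncorrected. The function name `parse_error_counters` and the "any uncorrected error is suspect" rule are illustrative assumptions, not something OMSA or the controller applies; only the two counter values come from the post.

```python
import re

def parse_error_counters(text):
    """Return {label: count} for lines shaped like 'Label = 123'."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s*=\s*(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters

# The two lines pasted from the post (output of: sg_logs /dev/sg4 -p3 | grep errors)
SAMPLE = """\
Total errors corrected = 6912496
Total uncorrected errors = 56739
"""

counters = parse_error_counters(SAMPLE)
corrected = counters["Total errors corrected"]
uncorrected = counters["Total uncorrected errors"]

# Assumed rule of thumb: a non-zero uncorrected count means the drive
# returned reads it could not recover, so treat it as suspect.
suspect = uncorrected > 0
print(f"corrected={corrected} uncorrected={uncorrected} suspect={suspect}")
```

Running the same parse on every member disk would make it easier to compare drives side by side, since OMSA reports all of them as non-critical.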
DELL-Josh Cr
Moderator
March 15th, 2016 11:00
Hi,
That seems like a very slow rebuild. Has it made further progress? The spans should be 0:0 and 0:1, 0:2 and 0:3, 0:4 and 0:5, 0:6 and 0:7. Are you starting the numbering at 0 or 1? Even if the drives are in separate spans, you should let one rebuild finish before the next one starts.
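The pairing described above can be sketched in a few lines of Python. This assumes the layout stated in the reply (consecutive zero-based slots mirrored two by two); the helper names `mirror_partner` and `spans` are made up for illustration, and the real layout should still be confirmed from the controller's own report.

```python
def mirror_partner(slot):
    """Partner of a zero-based slot when slots are paired (0,1), (2,3), (4,5), ..."""
    return slot ^ 1  # flip the lowest bit: even -> next odd, odd -> previous even

def spans(num_disks):
    """List the assumed mirror pairs for an even number of disks."""
    return [(s, s + 1) for s in range(0, num_disks, 2)]

print(spans(6))           # [(0, 1), (2, 3), (4, 5)]
print(mirror_partner(4))  # 5

# Disks counted from 1 (#5 and #6) map to zero-based slots 4 and 5,
# which land in the same pair under this assumption.
print(mirror_partner(5 - 1) == 6 - 1)  # True
```

Under this assumption the answer to the numbering question matters: counted from 1, disks #5 and #6 share a span, while counted from 0 they would be in different spans.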
GandalfCorvotem
March 15th, 2016 13:00
No big updates:
Any ideas? I'm sure that it is trying to rebuild from a failed drive, because the read error count keeps incrementing on disk #4 (counting from #0).
# sas2ircu 0 STATUS
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.
Background command progress status for controller 0...
IR Volume 1
Volume ID : 79
Current operation : Synchronize
Volume status : Enabled
Volume state : Degraded
Volume wwid : 05051e850439df49
Physical disk I/Os : Not quiesced
Volume size (in sectors) : 584843264
Number of remaining sectors : 583474279
Percentage complete : 0.23%
SAS2IRCU: Command STATUS Completed Successfully.
SAS2IRCU: Utility Completed Successfully.
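The percentage reported by `sas2ircu` can be reproduced from the two sector counts in the output above, and the rate quoted earlier in the thread (0.17% in 24 hours) gives a naive time estimate. The sketch below is only a linear extrapolation under the assumption of a constant sync rate; the function names are illustrative. The thread later shows the rebuild finishing far sooner, so the estimate is a worst-case illustration rather than a prediction.

```python
def percent_complete(total_sectors, remaining_sectors):
    """Share of the volume already synchronized, in percent."""
    done = total_sectors - remaining_sectors
    return 100.0 * done / total_sectors

def days_to_finish(percent_done, hours_elapsed):
    """Linear extrapolation; assumes the sync rate stays constant."""
    rate_per_hour = percent_done / hours_elapsed
    return (100.0 - percent_done) / rate_per_hour / 24.0

# Sector counts taken from the sas2ircu STATUS output in the post.
pct = percent_complete(584843264, 583474279)
print(f"{pct:.2f}%")  # 0.23%

# Rate from the first post: 0.17% synced in 24 hours.
eta_days = days_to_finish(0.17, 24)
print(f"about {eta_days:.0f} days at that rate")
```

An estimate of well over a year for a volume under 1 TB is a strong hint that the source drive is struggling to read, which matches the growing read error count.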
DELL-Josh Cr
Moderator
March 15th, 2016 14:00
It does sound like drive 4 is bad as well, and the two drives are in the same span. Even if the rebuild completes, the data may be incomplete and cause other issues. Replacing both drives and then restoring from a backup is the safest choice.
GandalfCorvotem
March 17th, 2016 02:00
What is strange is that drive 4 is able to write, but is still getting tons of read failures.
Right now I'm backing up all files by moving them to another server, but it is very slow, probably due to a controller or disk issue.
What would happen if I replaced both drives at the same time? Would the RAID10 become totally unavailable?
GandalfCorvotem
March 17th, 2016 02:00
This is a XenServer node with 12 virtual machines. 10 are running fine with no issues at all; 2 are running badly (no read/write errors, but very slow, with high load waiting for I/O).
The fastest way would be to replace both drives with new ones (or replace just #4, as #5 has already been replaced with a newer one and is still rebuilding), but I think that the failure of 2 disks in the same span will result in total data loss.
DELL-Josh Cr
Moderator
March 17th, 2016 10:00
Yes, replacing both drives at the same time will cause the RAID 10 to fail. If you can tolerate the performance issues until the next maintenance window, you can wait, but swapping both drives and starting fresh is the best option.
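The reasoning behind this answer can be written down as a small check: a RAID 10 volume stays online only while every mirror pair still has at least one valid member. The Python sketch below assumes the consecutive-slot pairing discussed earlier in the thread; `raid10_survives` and `PAIRS` are hypothetical names. A member that is still rebuilding does not yet hold a complete copy, so it should be counted as missing.

```python
def raid10_survives(mirror_pairs, missing_slots):
    """True while every mirror pair keeps at least one member."""
    missing = set(missing_slots)
    return all(not (set(pair) <= missing) for pair in mirror_pairs)

# Assumed layout: consecutive zero-based slots are mirrored.
PAIRS = [(0, 1), (2, 3), (4, 5)]

print(raid10_survives(PAIRS, [5]))        # True: slot 4 still holds the data
print(raid10_survives(PAIRS, [4, 5]))     # False: the whole pair is gone
print(raid10_survives(PAIRS, [1, 3, 5]))  # True: one member left in each pair
```

This is why pulling drive 4 while drive 5 is still resyncing is dangerous: for the purposes of this check both slots in that pair count as missing.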
GandalfCorvotem
March 18th, 2016 05:00
The rebuild has now finished (very strange). The vdisk status is "Ok" but the controller is marked as Degraded:
What is happening here? Can I now safely replace the bad disk (it's 1:0:4, even though OMSA marks it as online and non-critical)?
# sas2ircu 0 STATUS
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.
Background command progress status for controller 0...
IR Volume 1
Volume ID : 79
Current operation : None
Volume status : Enabled
Volume state : Optimal
Volume wwid : 05051e850439df49
Physical disk I/Os : Not quiesced
SAS2IRCU: Command STATUS Completed Successfully.
SAS2IRCU: Utility Completed Successfully.
# omreport storage controller controller=0
Controller PERC H200 Integrated (Embedded)
Controllers
ID : 0
Status : Non-Critical
Name : PERC H200 Integrated
Slot ID : Embedded
State : Degraded
Firmware Version : 07.02.42.00
Latest Available Firmware Version : 07.03.06.00
Driver Version : 08.100.00.01
Minimum Required Driver Version : Not Applicable
Storport Driver Version : Not Applicable
Minimum Required Storport Driver Version : Not Applicable
Number of Connectors : 2
Rebuild Rate : Not Applicable
BGI Rate : Not Applicable
Check Consistency Rate : Not Applicable
Reconstruct Rate : Not Applicable
Alarm State : Not Applicable
Cluster Mode : Not Applicable
SCSI Initiator ID : Not Applicable
Cache Memory Size : Not Applicable
Patrol Read Mode : Not Applicable
Patrol Read State : Not Applicable
Patrol Read Rate : Not Applicable
Patrol Read Iterations : Not Applicable
Abort Check Consistency on Error : Not Applicable
Allow Revertible Hot Spare and Replace Member : Not Applicable
Load Balance : Not Applicable
Auto Replace Member on Predictive Failure : Not Applicable
Redundant Path view : Not Applicable
CacheCade Capable : Not Applicable
Persistent Hot Spare : Not Applicable
Encryption Capable : Not Applicable
Encryption Key Present : Not Applicable
Encryption Mode : Not Applicable
Preserved Cache : Not Applicable
T10 Protection Information Capable : No
Connectors
ID : 0
Status : Ok
Name : Connector 0
State : Ready
Connector Type : SAS Port RAID Mode
Termination : Not Applicable
SCSI Rate : Not Applicable
ID : 1
Status : Ok
Name : Connector 1
State : Ready
Connector Type : SAS Port RAID Mode
Termination : Not Applicable
SCSI Rate : Not Applicable
Virtual Disks
ID : 0
Status : Ok
Name : Virtual Disk 0
State : Ready
Hot Spare Policy violated : Not Assigned
Encrypted : Not Applicable
Layout : RAID-10
Size : 836.63 GB (898319253504 bytes)
T10 Protection Information Status : No
Associated Fluid Cache State : Not Applicable
Device Name : /dev/sda
Bus Protocol : SAS
Media : HDD
Read Policy : Not Applicable
Write Policy : Not Applicable
Cache Policy : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy : Disabled
Physical Disks
ID : 0:0:0
Status : Non-Critical
Name : Physical Disk 0:0:0
State : Online
Power Status : Not Applicable
Bus Protocol : SAS
Media : HDD
Part of Cache Pool : Not Applicable
Remaining Rated Write Endurance : Not Applicable
Failure Predicted : No
Revision : 000B
Driver Version : Not Applicable
Model Number : Not Applicable
T10 PI Capable : No
Certified : No
Encryption Capable : No
Encrypted : Not Applicable
Progress : Not Applicable
Mirror Set ID : 0
Capacity : 278.88 GB (299439751168 bytes)
Used RAID Disk Space : 278.88 GB (299439751168 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare : No
Vendor ID : SEAGATE
Product ID : ST3300657SS
Serial No. : 6SJ92DZP
Part Number : Not Available
Negotiated Speed : 6.00 Gbps
Capable Speed : Not Available
PCIe Maximum Link Width : Not Applicable
PCIe Negotiated Link Width : Not Applicable
Sector Size : 512B
Device Write Cache : Not Applicable
Manufacture Day : Not Available
Manufacture Week : Not Available
Manufacture Year : Not Available
SAS Address : 5000C50088FA287D
ID : 0:0:1
Status : Non-Critical
Name : Physical Disk 0:0:1
State : Online
Power Status : Not Applicable
Bus Protocol : SAS
Media : HDD
Part of Cache Pool : Not Applicable
Remaining Rated Write Endurance : Not Applicable
Failure Predicted : No
Revision : 000B
Driver Version : Not Applicable
Model Number : Not Applicable
T10 PI Capable : No
Certified : No
Encryption Capable : No
Encrypted : Not Applicable
Progress : Not Applicable
Mirror Set ID : 0
Capacity : 278.88 GB (299439751168 bytes)
Used RAID Disk Space : 278.88 GB (299439751168 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare : No
Vendor ID : SEAGATE
Product ID : ST3300657SS
Serial No. : 6SJ92F5S
Part Number : Not Available
Negotiated Speed : 6.00 Gbps
Capable Speed : Not Available
PCIe Maximum Link Width : Not Applicable
PCIe Negotiated Link Width : Not Applicable
Sector Size : 512B
Device Write Cache : Not Applicable
Manufacture Day : Not Available
Manufacture Week : Not Available
Manufacture Year : Not Available
SAS Address : 5000C50088FA27CD
ID : 0:0:2
Status : Non-Critical
Name : Physical Disk 0:0:2
State : Online
Power Status : Not Applicable
Bus Protocol : SAS
Media : HDD
Part of Cache Pool : Not Applicable
Remaining Rated Write Endurance : Not Applicable
Failure Predicted : No
Revision : A510
Driver Version : Not Applicable
Model Number : Not Applicable
T10 PI Capable : No
Certified : No
Encryption Capable : No
Encrypted : Not Applicable
Progress : Not Applicable
Mirror Set ID : 1
Capacity : 278.88 GB (299439751168 bytes)
Used RAID Disk Space : 278.88 GB (299439751168 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare : No
Vendor ID : HITACHI
Product ID : HUS156030VLS600
Serial No. : JXY2BVEN
Part Number : Not Available
Negotiated Speed : 6.00 Gbps
Capable Speed : Not Available
PCIe Maximum Link Width : Not Applicable
PCIe Negotiated Link Width : Not Applicable
Sector Size : 512B
Device Write Cache : Not Applicable
Manufacture Day : Not Available
Manufacture Week : Not Available
Manufacture Year : Not Available
SAS Address : 5000CCA018AD6EFD
ID : 0:0:3
Status : Non-Critical
Name : Physical Disk 0:0:3
State : Online
Power Status : Not Applicable
Bus Protocol : SAS
Media : HDD
Part of Cache Pool : Not Applicable
Remaining Rated Write Endurance : Not Applicable
Failure Predicted : No
Revision : A5D0
Driver Version : Not Applicable
Model Number : Not Applicable
T10 PI Capable : No
Certified : No
Encryption Capable : No
Encrypted : Not Applicable
Progress : Not Applicable
Mirror Set ID : 1
Capacity : 278.88 GB (299439751168 bytes)
Used RAID Disk Space : 278.88 GB (299439751168 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare : No
Vendor ID : HITACHI
Product ID : HUS156030VLS600
Serial No. : J0VH4X1N
Part Number : Not Available
Negotiated Speed : 6.00 Gbps
Capable Speed : Not Available
PCIe Maximum Link Width : Not Applicable
PCIe Negotiated Link Width : Not Applicable
Sector Size : 512B
Device Write Cache : Not Applicable
Manufacture Day : Not Available
Manufacture Week : Not Available
Manufacture Year : Not Available
SAS Address : 5000CCA01F1B8FE9
ID : 1:0:4
Status : Non-Critical
Name : Physical Disk 1:0:4
State : Online
Power Status : Not Applicable
Bus Protocol : SAS
Media : HDD
Part of Cache Pool : Not Applicable
Remaining Rated Write Endurance : Not Applicable
Failure Predicted : No
Revision : A5D0
Driver Version : Not Applicable
Model Number : Not Applicable
T10 PI Capable : No
Certified : No
Encryption Capable : No
Encrypted : Not Applicable
Progress : Not Applicable
Mirror Set ID : 2
Capacity : 278.88 GB (299439751168 bytes)
Used RAID Disk Space : 278.88 GB (299439751168 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare : No
Vendor ID : HITACHI
Product ID : HUS156030VLS600
Serial No. : J0VB92HN
Part Number : Not Available
Negotiated Speed : 6.00 Gbps
Capable Speed : Not Available
PCIe Maximum Link Width : Not Applicable
PCIe Negotiated Link Width : Not Applicable
Sector Size : 512B
Device Write Cache : Not Applicable
Manufacture Day : Not Available
Manufacture Week : Not Available
Manufacture Year : Not Available
SAS Address : 5000CCA01F1488AD
ID : 1:0:5
Status : Non-Critical
Name : Physical Disk 1:0:5
State : Online
Power Status : Not Applicable
Bus Protocol : SAS
Media : HDD
Part of Cache Pool : Not Applicable
Remaining Rated Write Endurance : Not Applicable
Failure Predicted : No
Revision : A5D0
Driver Version : Not Applicable
Model Number : Not Applicable
T10 PI Capable : No
Certified : No
Encryption Capable : No
Encrypted : Not Applicable
Progress : Not Applicable
Mirror Set ID : 2
Capacity : 278.88 GB (299439751168 bytes)
Used RAID Disk Space : 278.88 GB (299439751168 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare : No
Vendor ID : HITACHI
Product ID : HUS156030VLS600
Serial No. : LVWKKXYM
Part Number : Not Available
Negotiated Speed : 6.00 Gbps
Capable Speed : Not Available
PCIe Maximum Link Width : Not Applicable
PCIe Negotiated Link Width : Not Applicable
Sector Size : 512B
Device Write Cache : Not Applicable
Manufacture Day : Not Available
Manufacture Week : Not Available
Manufacture Year : Not Available
SAS Address : 5000CCA02A585489
Enclosure(s)
ID : 0:0
Status : Ok
Name : Backplane
State : Ready
Connector : 0
Target ID : Not Applicable
Configuration : Not Applicable
Firmware Version : 1.07
Downstream Firmware Version : Not Applicable
Service Tag : 18L02C4
Express Service Code : 2695786708
Asset Tag : Not Applicable
Asset Name : Not Applicable
Backplane Part Number : Not Applicable
Split Bus Part Number : Not Applicable
Enclosure Part Number : Not Applicable
SAS Address : 5882B0B053F66D00
Enclosure Alarm : Not Applicable
ID : 1:0
Status : Ok
Name : Backplane
State : Ready
Connector : 1
Target ID : Not Applicable
Configuration : Not Applicable
Firmware Version : 1.07
Downstream Firmware Version : Not Applicable
Service Tag : 18L02C4
Express Service Code : 2695786708
Asset Tag : Not Applicable
Asset Name : Not Applicable
Backplane Part Number : Not Applicable
Split Bus Part Number : Not Applicable
Enclosure Part Number : Not Applicable
SAS Address : 5882B0B053F66D00
Enclosure Alarm : Not Applicable
# sas2ircu 0 DISPLAY
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.
Read configuration has been initiated for controller 0
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
Controller type : SAS2008
BIOS version : 7.11.01.00
Firmware version : 7.15.04.00
Channel description : 1 Serial Attached SCSI
Initiator ID : 0
Maximum physical devices : 39
Concurrent commands supported : 2607
Slot : 0
Segment : 0
Bus : 3
Device : 0
Function : 0
RAID Support : Yes
------------------------------------------------------------------------
IR Volume information
------------------------------------------------------------------------
IR volume 1
Volume ID : 79
Status of volume : Okay (OKY)
Volume wwid : 05051e850439df49
RAID level : RAID10
Size (in MB) : 856704
Physical hard disks :
PHY[0] Enclosure#/Slot# : 1:0
PHY[1] Enclosure#/Slot# : 1:1
PHY[2] Enclosure#/Slot# : 1:2
PHY[3] Enclosure#/Slot# : 1:3
PHY[4] Enclosure#/Slot# : 1:4
PHY[5] Enclosure#/Slot# : 1:5
------------------------------------------------------------------------
Physical device information
------------------------------------------------------------------------
Initiator at ID #0
Device is a Hard disk
Enclosure # : 1
Slot # : 0
SAS Address : 5000c50-0-88fa-287d
State : Optimal (OPT)
Size (in MB)/(in sectors) : 286102/585937499
Manufacturer : SEAGATE
Model Number : ST3300657SS
Firmware Revision : 000B
Serial No : 6SJ92DZP0000N541122P
GUID : 5000c50088fa287f
Protocol : SAS
Drive Type : SAS_HDD
Device is a Hard disk
Enclosure # : 1
Slot # : 1
SAS Address : 5000c50-0-88fa-27cd
State : Optimal (OPT)
Size (in MB)/(in sectors) : 286102/585937499
Manufacturer : SEAGATE
Model Number : ST3300657SS
Firmware Revision : 000B
Serial No : 6SJ92F5S0000N541146H
GUID : 5000c50088fa27cf
Protocol : SAS
Drive Type : SAS_HDD
Device is a Hard disk
Enclosure # : 1
Slot # : 2
SAS Address : 5000cca-0-18ad-6efd
State : Optimal (OPT)
Size (in MB)/(in sectors) : 286168/586072367
Manufacturer : HITACHI
Model Number : HUS156030VLS600
Firmware Revision : A510
Serial No : JXY2BVEN
GUID : 5000cca018ad6efc
Protocol : SAS
Drive Type : SAS_HDD
Device is a Hard disk
Enclosure # : 1
Slot # : 3
SAS Address : 5000cca-0-1f1b-8fe9
State : Optimal (OPT)
Size (in MB)/(in sectors) : 286168/586072367
Manufacturer : HITACHI
Model Number : HUS156030VLS600
Firmware Revision : A5D0
Serial No : J0VH4X1N
GUID : 5000cca01f1b8fe8
Protocol : SAS
Drive Type : SAS_HDD
Device is a Hard disk
Enclosure # : 1
Slot # : 4
SAS Address : 5000cca-0-1f14-88ad
State : Optimal (OPT)
Size (in MB)/(in sectors) : 286168/586072367
Manufacturer : HITACHI
Model Number : HUS156030VLS600
Firmware Revision : A5D0
Serial No : J0VB92HN
GUID : 5000cca01f1488ac
Protocol : SAS
Drive Type : SAS_HDD
Device is a Hard disk
Enclosure # : 1
Slot # : 5
SAS Address : 5000cca-0-2a58-5489
State : Optimal (OPT)
Size (in MB)/(in sectors) : 286168/586072367
Manufacturer : HITACHI
Model Number : HUS156030VLS600
Firmware Revision : A5D0
Serial No : LVWKKXYM
GUID : 5000cca02a585488
Protocol : SAS
Drive Type : SAS_HDD
Device is a Enclosure services device
Enclosure # : 1
Slot # : 9
SAS Address : 5882b0b-0-53f6-6d00
State : Standby (SBY)
Manufacturer : DP
Model Number : BACKPLANE
Firmware Revision : 1.07
Serial No : 18L02C4
GUID : N/A
Protocol : SAS
Device Type : Enclosure services device
------------------------------------------------------------------------
Enclosure information
------------------------------------------------------------------------
Enclosure# : 1
Logical ID : 5782bcb0:53f66d00
Numslots : 9
StartSlot : 0
------------------------------------------------------------------------
SAS2IRCU: Command DISPLAY Completed Successfully.
SAS2IRCU: Utility Completed Successfully.
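The `Mirror Set ID` field in the `omreport` physical disk listing above answers the original question about which disks form each RAID1 pair. As a minimal sketch, the Python below groups disk IDs by that field. The sample text is trimmed to the `ID` and `Mirror Set ID` lines copied from the post; the function name `mirror_sets` is an assumption, and the parser has not been run against a complete live report.

```python
from collections import defaultdict

def mirror_sets(report_text):
    """Group physical disk IDs by their 'Mirror Set ID' field."""
    groups = defaultdict(list)
    current_disk = None
    for line in report_text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "ID":
            current_disk = value
        elif key == "Mirror Set ID" and current_disk is not None:
            groups[int(value)].append(current_disk)
    return dict(groups)

# ID / Mirror Set ID lines as they appear in the omreport output in the post.
SAMPLE = """\
ID : 0:0:0
Mirror Set ID : 0
ID : 0:0:1
Mirror Set ID : 0
ID : 0:0:2
Mirror Set ID : 1
ID : 0:0:3
Mirror Set ID : 1
ID : 1:0:4
Mirror Set ID : 2
ID : 1:0:5
Mirror Set ID : 2
"""

for set_id, disks in sorted(mirror_sets(SAMPLE).items()):
    print(set_id, disks)
```

For this server the grouping puts 1:0:4 and 1:0:5 together in mirror set 2, so the replaced disk and the suspect disk are indeed partners in the same span.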
GandalfCorvotem
March 18th, 2016 09:00
But why is OMSA seeing it as degraded?
DELL-Josh Cr
Moderator
March 18th, 2016 09:00
Yes, you can replace drive 4 now and it should rebuild without downtime.
DELL-Josh Cr
Moderator
March 18th, 2016 10:00
You are right
GandalfCorvotem
March 18th, 2016 10:00
I don't think it is possible to turn off the cache on the PERC H200 (I don't think it has one).
DELL-Josh Cr
Moderator
March 18th, 2016 10:00
It is also possible that the PERC itself is failing and causing the issues. I would let the consistency check finish, replace drive 4, and if you still have issues after that, replace the PERC. You could also try turning off the cache on the PERC and see if that stops the read errors as a troubleshooting step.
DELL-Josh Cr
Moderator
March 18th, 2016 10:00
The drive is in a predictive failure state; it just has not dropped offline. Offlining the drive and then replacing it is the best option.
GandalfCorvotem
March 18th, 2016 10:00
That's strange, as all drives report "Failure Predicted : No".
Now the controller is running a consistency check (that I can't abort). I have to wait for it to finish before offlining the drive.
I've also seen that both failed drives are on backplane 1:0. #4 was unable to read yesterday. Now even #5 (the new drive) is unable to read some sectors.