
Max sensible size for RAID6 using 10TB drives?

August 9th, 2017 18:00

I have an MD1400 with (12) 10TB near-line drives on an H830. In the past, using 3TB or 4TB drives on an H810, I would typically make each MD1200 tray one virtual disk, RAID6, with a dedicated spare. But with 10TB drives that's 9 x 10TB, or approximately 90TB.

Is it wise to make such a large RAID6? I realize the H830+MD1400 is 12Gb SAS and I'm configured for redundant path, but I'm not sure about the rebuild time or the probability of an unrecoverable bit error.

But I have only one tray, and if I configure two virtual disks that would be four drives for parity and one for a spare, leaving only 7 data drives across the two disks. Ugh.
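To spell out that trade-off, here's a quick back-of-the-envelope sketch (raw capacity only; it ignores formatting and filesystem overhead):

```python
# Rough capacity comparison for a 12-bay tray of 10TB drives.
# Option A: one RAID6 virtual disk (11 members) + 1 dedicated hot spare.
# Option B: two RAID6 virtual disks sharing the tray + 1 dedicated hot spare.

DRIVE_TB = 10
BAYS = 12
SPARES = 1
RAID6_PARITY = 2  # RAID6 costs two drives' worth of parity per virtual disk

# Option A: 12 - 1 spare - 2 parity = 9 data drives
single_vd_tb = (BAYS - SPARES - RAID6_PARITY) * DRIVE_TB

# Option B: 12 - 1 spare - (2 VDs x 2 parity) = 7 data drives total
dual_vd_tb = (BAYS - SPARES - 2 * RAID6_PARITY) * DRIVE_TB

print(f"One RAID6 VD + spare : {single_vd_tb} TB usable")  # 90 TB
print(f"Two RAID6 VDs + spare: {dual_vd_tb} TB usable")    # 70 TB
```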

Please advise.

4 Operator • 1.8K Posts • August 9th, 2017 20:00

"Is this wise to make such a large RAID6?"

'Tis a question of wisdom or insanity.

Perhaps you could close shop for a couple of weeks during rebuilds? You'd better hope the patrol reads are able to complete every couple of weeks. Seriously, even with RAID6 plus spares I would not consider a 90TB array; high-end data center arrays are made for this much data and can handle the potentially large number of disk errors/failures, not RAID systems that mere mortals can afford. Just my opinion.

1 Rookie • 31 Posts • August 11th, 2017 13:00

As a test I configured a 32TB RAID6 disk on an H810/MD1200 system with 4TB drives.  Background initialization took 9 hours.

I configured an 83TB disk on an H830/MD1400 system with 10TB drives.  Background initialization took 17 hours.

Both controllers are configured for redundant path.

The MD1400 is 12Gb SAS vs. 6Gb SAS on the MD1200.

So, naively, you might think the larger disk is 2.6 times the size of the smaller, so it would take 2.6 x 9 hours = 23.4 hours to initialize. But since the MD1400 is twice the speed, that would be 11.7 hours. The actual figure of 17 hours falls in between.
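For what it's worth, here is that scaling estimate written out (it assumes initialization time scales linearly with array size and with link speed, which is only a rough approximation):

```python
# Naive scaling of background-initialization time from the 32TB/9h baseline.
baseline_tb, baseline_hours = 32, 9   # H810/MD1200, 4TB drives
new_tb = 83                           # H830/MD1400, 10TB drives

size_scaled = baseline_hours * new_tb / baseline_tb  # ~23.3 h if size were all that mattered
link_scaled = size_scaled / 2                        # ~11.7 h if 12Gb SAS were truly 2x faster

print(f"Scaled by size only:        {size_scaled:.1f} h")
print(f"Scaled by size and 2x link: {link_scaled:.1f} h")
# Observed: 17 h, between the two estimates, which suggests the drives
# themselves (not just the SAS link) are part of the bottleneck.
```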

I'm going to fail one of the drives next and see what the rebuild time is. I think I'm correct in saying the rebuild is independent of the filesystem and of whether the disk is empty, right?

4 Operator • 1.8K Posts • August 11th, 2017 14:00

Wow, I am impressed; the newest controllers are much faster at initialization. Still, any large array invokes fear in me, as the odds of a failure are much higher. Please post your rebuild times.

"I think I'm correct in saying the rebuild is independent of the filesystem and whether there the disk is empty, right?"

Since parity is created on writes, I assume an empty array would have far less overhead than an array with data, with rebuild times being greater the more data there is.

 


1 Rookie • 31 Posts • August 11th, 2017 18:00

I think the rebuild time with my H810s and 4TB drives is less than 24 hours. I just took one of the 10TB drives offline and triggered a rebuild. I will see how long it takes.

These are Hitachi drives; the UBE rate is 1 in 10^15.

My Seagate 4TB drives are spec'ed as 1 sector read error per 10^15? I've never seen the spec stated this way. One sector per 10^15 is a huge amount more reliable than one bit in 10^15.

4 Operator • 1.8K Posts • August 12th, 2017 08:00

"One sector per 10^15 is a huge amount more reliable than one bit in 10^15."

But if 1 bit bites the dust, since formatted bits belong to a sector, the sector is marked bad, so it's basically the same thing; bits are not corrected or failed, sectors are. And it is unusual for only 1 bit to fail; usually there are multiples in one area of a disk, except when it's due to natural background radiation.

1 Rookie • 31 Posts • August 12th, 2017 16:00

The way I would compute the probabilities, for example for reading a full 1TB drive:

1TB = approx. 8 x 10^12 bits = 1.95 x 10^9 sectors (512-byte sectors)

A URE rate of 1 in 10^15 bits means 8 x 10^12 / 1 x 10^15 = 0.008, or 0.8%

A URE rate of 1 in 10^15 sectors means 1.95 x 10^9 / 1 x 10^15 = 0.00000195, or 0.000195%

That's why I said it's a huge difference: basically a factor of 4,096, the number of bits in a 512-byte sector.
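Here's the same comparison as a small script, assuming the spec means one unrecoverable error per 10^15 bits read versus one per 10^15 sectors read:

```python
# URE probability when reading one full 1TB (decimal) drive, 512-byte sectors.
TB_BYTES = 10**12
bits_read = TB_BYTES * 8          # 8 x 10^12 bits
sectors_read = TB_BYTES / 512     # ~1.95 x 10^9 sectors

p_bit_spec = bits_read / 1e15         # 0.008   -> 0.8%
p_sector_spec = sectors_read / 1e15   # 1.95e-6 -> 0.000195%

print(f"Spec of 1 per 10^15 bits:    {p_bit_spec:.3%}")
print(f"Spec of 1 per 10^15 sectors: {p_sector_spec:.6%}")
print(f"Ratio: {p_bit_spec / p_sector_spec:.0f}x")  # 4096 = bits per 512-byte sector
```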

The rebuild took 16 hours or so.

4 Operator • 1.8K Posts • August 13th, 2017 12:00

Not debating the math, but I am debating Seagate's claim, as they have a rather high failure rate in general, especially the 4TB drives (see the Backblaze drive reports). Of course, we know Seagate would never, ever fudge a claim.

As to the rebuild, that's pretty quick; I wonder what a 50% full array would take.

1 Rookie • 31 Posts • August 16th, 2017 11:00

So I haven't tried to fill the array and test recovery times. I'm just going to split the tray into two RAID6 disks.

I did ask Dell if they had any guidance. I kind of hoped they had a lab somewhere where they test configurations, but I was told no. Their storage group used to do a lot of benchmarking and write white papers. Maybe not anymore.

So I think this is the end of the line for me and these RAID configurations.  On to Gluster (of course I imagine since Dell bought EMC they would be happy to discuss Isilon :-)).

Moderator • 3.1K Posts • June 7th, 2018 02:00

Hi avhjr, I would not recommend a virtual drive totaling around 90TB, as rebuilding takes a very long time. I don't have the exact rebuild time, but I have previously tried reconstructing a RAID5 VD on 4TB drives into a RAID6 VD, and it took about 1 day to reach 50%. Your situation and mine are not quite the same, but the RAID parity calculation during a rebuild across 12 drives would take even longer. Hope this clarifies your doubts.

65 Posts • January 15th, 2019 21:00

So this thread is kind of old, but I've just ordered new MD1400 trays with 12TB drives. This time I'm thinking of creating RAID6 virtual drives with 9 data drives, 2 parity drives, and 1 dedicated spare, which would be 108TB. I haven't had any issues over the last year with my 10TB drives. However, Dell has moved away from 4Kn drives and the new 12TB drives are 512e; I don't know if this affects rebuild times.
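Roughly sketching that layout, and reusing the 1-in-10^15-bit URE figure quoted earlier in this thread (ballpark numbers only, not vendor datasheet values):

```python
# Proposed layout: 12 bays of 12TB drives as 9 data + 2 parity (RAID6) + 1 spare.
DRIVE_TB = 12
data_drives, parity_drives, spares = 9, 2, 1

usable_tb = data_drives * DRIVE_TB
print(f"Usable capacity: {usable_tb} TB")  # 108 TB

# Rebuilding one failed member means reading the 10 surviving members end to end.
bits_read = (data_drives + parity_drives - 1) * DRIVE_TB * 1e12 * 8
expected_ures = bits_read / 1e15  # with a 1-per-10^15-bit URE spec
print(f"Expected UREs during a full rebuild: ~{expected_ures:.2f}")  # ~0.96
# RAID6 keeps a second parity stripe to absorb a URE hit during the rebuild,
# which is the usual argument for RAID6 over RAID5 at this scale.
```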

I can run a rebuild test to get times, but if the drive has just been initialized, I don't know whether the rebuild time on a blank virtual drive differs from the rebuild time on a mostly full virtual drive where every sector isn't all zeros.

Of course, with the PERC detecting drives that are about to fail, recovery is much faster if you've set the "Auto Replace Member on Predictive Failure" parameter. Then the controller might not have to rebuild at all; it first tries to simply copy the failing drive to the spare, which is just one drive reading and one drive writing. I find this happens more often than a failure that requires a full rebuild.

I kind of wish someone from LSI/Broadcom read this forum to weigh in.

Moderator • 3.1K Posts • January 15th, 2019 23:00

Hi @avhays, the 512e format may affect rebuild duration when the amount of data in your RAID array is large. The rebuild time of a freshly initialized array differs from that of a data-populated array, since a populated array carries real data plus its parity. When a member of the array is replaced, its contents are reassembled on the new drive using the RAID algorithm and the parity data. Unlike RAID 1, where mirrored data is simply copied, a RAID 5 or RAID 6 rebuild does not copy the failed drive's data into the new drive; it reconstructs it from the surviving members.
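As a toy illustration of that last point, a parity rebuild recomputes the missing member rather than copying it; here is a minimal XOR-based sketch (real RAID6 adds a second, Reed-Solomon style parity on top of this):

```python
from functools import reduce

# Toy stripe: three data "drives" plus one XOR parity "drive".
d0 = bytes([1, 2, 3, 4])
d1 = bytes([5, 6, 7, 8])
d2 = bytes([9, 10, 11, 12])
parity = bytes(a ^ b ^ c for a, b, c in zip(d0, d1, d2))

# Pretend d1 fails: rebuild it from the surviving members plus parity.
survivors = [d0, d2, parity]
rebuilt = bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*survivors))

assert rebuilt == d1  # the "new drive" gets reconstructed data, not a copy
print("Rebuilt member:", list(rebuilt))
```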