jimkunysz
3 Argentum

virtual pool question

We recently purchased a CX4-960 and I've been reading up on using virtual pools. I have a bunch of questions that I sent to our TC, but here's one I'd like to get the forum's thoughts on.

The Best Practices guides say "more pools with a smaller number of storage devices will have better availability than fewer pools with a greater number of storage devices" and "rebuild times increase for larger raid groups".

The first statement seems to contradict the reason for having pools: from an administrative point of view, fewer pools with more devices would seem better. But does a larger pool also increase rebuild times?

I understand that pools are groupings of smaller RAID groups (i.e. 4+1 for RAID 5), but the data is potentially dispersed across many RAID groups within a pool.

In addition, I read this as well: "after a thin lun trespasses, a thin lun's private information remains under control of the original owning SP. this results in both SPs being used in handling the i/o's. this can adversely affect performance".

What reliability, availability, or performance is lost or gained by using large pools?

Thanks.

Jim

jps00
3 Zinc

Re: virtual pool question

Statistically, the smaller the number of dependent components in a device the more reliable it is.  The pool storage object can be considered a device.

Availability in CLARiiON storage systems is always at the RAID group level. Data for all the LUNs in the pool is likely spread homogeneously across the pool's RAID groups. If, for example, you have a double fault in one of the RAID-5 groups that make up the pool, some portion of every LUN in the pool is unrecoverable. The availability of the pool is the availability of a single RAID group. With a large pool containing a large number of LUNs or a lot of the storage system's capacity, a double fault in a single RAID group would adversely affect a lot of user data. With more, smaller pools, a RAID group failure would affect fewer LUNs or a smaller amount of user data.
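To put rough numbers on the argument above, here is a minimal sketch. It assumes RAID groups fail independently and that each has the same chance of a double fault over some period; the value of `P_GROUP` and the pool sizes are made up purely for illustration and are not EMC figures.

```python
def loss_probability(num_groups: int, p_group: float) -> float:
    """Chance that at least one of num_groups independent RAID groups
    suffers a double fault, which makes the whole pool unrecoverable."""
    return 1.0 - (1.0 - p_group) ** num_groups


# Hypothetical per-RAID-group double-fault probability (illustration only).
P_GROUP = 0.001

# Same 20 RAID groups, laid out two ways.
one_large_pool = loss_probability(20, P_GROUP)   # 1 pool of 20 groups
one_small_pool = loss_probability(5, P_GROUP)    # each of 4 pools of 5 groups

# Expected fraction of all LUNs hit, assuming LUNs are split evenly
# across the pools: a failed small pool takes out only 1/4 of the LUNs.
expected_hit_large = one_large_pool * 1.0
expected_hit_small = 4 * one_small_pool * 0.25

print(f"large pool loss chance:  {one_large_pool:.4%}")
print(f"small pool loss chance:  {one_small_pool:.4%}")
print(f"expected LUN fraction hit, large layout: {expected_hit_large:.4%}")
print(f"expected LUN fraction hit, small layout: {expected_hit_small:.4%}")
```

Under these assumptions, splitting the groups into smaller pools does not make a RAID group failure any less likely; it limits how many LUNs are affected when one happens.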

Note that the risk of losing a RAID group can be mitigated by configuring your pool's RAID groups as RAID level 6. Also, provisioning your pools exclusively with Fibre Channel or SAS drives increases their availability.

In addition, I read this as well: "after a thin lun trespasses, a thin lun's private information remains under control of the original owning SP. this results in both SPs being used in handling the i/o's. this can adversely affect performance".

I don't understand your question here.

If you have questions about Best Practices, or any other EMC-authored document, the quickest way to get an answer is to use Powerlink. To the right of the Powerlink reference is the 'Feedback to Author' link. This will send email directly to the author, who will respond to your question shortly.

jimkunysz
3 Argentum

Re: virtual pool question

I'm trying to see the benefit of using pools versus RAID groups. At this point, the only benefit I see is support for FAST.

Adding disks to an existing pool doesn't immediately increase IOPS for an existing LUN, as it won't restripe automatically to the new disks.

Also, I now need to pay much closer attention to trespassed LUNs because if they belong to a pool, I'm now effectively doubling the SP utilization required to service that LUN.

Unfortunately, our storage is 'project based'. That is, the owner of a new project/application purchases disk to support their project. This could be as small as 1 disk or as large as a full DAE (or more). It's difficult to build pools correctly with this type of purchasing model, but unless I create pools, I'm limited in what FAST provides me.
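The sizing problem can be sketched quickly. This assumes a RAID 5 pool is built only from complete 4+1 groups, as described earlier in the thread; the function name and the purchase sizes are hypothetical, and the array's real allocation rules may differ.

```python
def plan_raid5_groups(disk_count: int, group_width: int = 5) -> tuple[int, int]:
    """Return (complete 4+1 groups, leftover disks) for a purchase.

    group_width=5 models one RAID 5 4+1 group; this is an assumption
    for illustration, not a documented pool allocation rule.
    """
    if disk_count < 0:
        raise ValueError("disk_count must be non-negative")
    return divmod(disk_count, group_width)


# Hypothetical project purchases: 1 disk, 7 disks, and 15 disks.
for purchased in (1, 7, 15):
    groups, leftover = plan_raid5_groups(purchased)
    print(f"{purchased:>2} disks -> {groups} full 4+1 group(s), {leftover} left over")
```

Any purchase that isn't a multiple of five leaves disks that can't form a full group on their own.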

Jim

jps00
3 Zinc

Re: virtual pool question

Best Practices for FLARE Revision 30.0 will be available shortly. It has a more in-depth description of pools that may answer some of your questions.

Also, I now need to pay much closer attention to trespassed luns because if they belong in a pool, I'm now effectively doubling the SP utilization required to service that lun

A trespassed pool LUN does not double the percentage of total SP resources used to service the LUN. What it does use is a portion of the communications channel between the SPs. The additional bandwidth used may affect overall system performance by increasing the time it takes for the SPs to communicate with each other. On a lightly loaded system this is not a problem.