August 7th, 2012 08:00

Ask the Expert: Performance Calculations on Clariion/VNX

Performance calculations on the CLARiiON/VNX with RRR & Jon Klaus

 

Welcome to the EMC Support Community Ask the Expert conversation. This is an opportunity to learn about performance calculations on the CLARiiON/VNX systems and the various considerations that must be taken into account.

 

This discussion begins on Monday, August 13th. Get ready by bookmarking this page or signing up for email notifications.

 

Your hosts:

 


 

Rob Koper has been working in the IT industry since 1994 and for Open Line Consultancy since 2004. He started with the CLARiiON CX300 and DMX-2 and has worked with all newer arrays since, up to current technologies like the VNX 5700 and the larger DMX-4 and VMAX 20K systems. He's mainly involved in managing and migrating data to storage arrays over large Cisco and Brocade SANs that span multiple sites spread widely across the Netherlands. Since 2007 he's been an active member on ECN and the Support Forums, and he currently holds Proven Professional certifications such as Implementation Engineer for VNX, CLARiiON (Expert) and Symmetrix, as well as Technology Architect for CLARiiON and Symmetrix.

 


Jon Klaus has been working at Open Line since 2008 as a project consultant on various storage and server virtualization projects. To prepare for these projects, an intensive one-year barrage of courses on CLARiiON and Celerra earned him the EMCTAe and EMCIEe certifications on CLARiiON and EMCIE + EMCTA status on Celerra.

Currently Jon is contracted by a large multinational and is part of a team responsible for running and maintaining several (EMC) storage and backup systems throughout Europe. Among his day-to-day activities are performance troubleshooting, storage migrations and designing a new architecture for the European storage and backup environment.

 

This event ran from the 13th until the 31st of August.

Here is a summary document of the highlights of that discussion as set out by the experts: Ask The Expert: Performance Calculations on Clariion/VNX wrap up

 

 

The discussion itself follows below.

247 Posts

August 23rd, 2012 02:00

So Rob has mentioned it before... at some point you'll arrive at the office, open Unisphere Analyzer and see the following.

QueueFull.jpg

Yikes, now what!?

First of all, check your configuration. In this case it was quite obvious where the problem lay: out of the four FC front-end ports per SP, only two were connected. The other ports were used for migration purposes when the system was initially installed. After the migration finished, they were never reconnected...

But even if you have all the ports connected, there are some smart things to keep in mind.

  • If you are replicating, try to keep the MirrorView port free from host traffic. This prevents host I/O from interfering with your replication I/O, keeping your synchronous mirroring fast and thus preventing host write I/O slowdowns.
  • How would you spread your servers across the available FE ports: attach them to all the available ports? Not a good idea!
    • Documentation on Powerlink clearly states that too many paths per host (i.e. more than four) can result in lengthy failovers. So unless you have extremely high bandwidth requirements, limit yourself to a maximum of four paths per host.
    • Remember from your spec sheets or training that a CLARiiON model supports a limited number of initiators? If you have 8 paths instead of 4, you have twice as many active initiators. Depending on your environment, you may end up with a system that has GBs to spare but can't add another server!
    • Make yourself a spreadsheet to keep track of which server is zoned to which port, and start staggering them. Something like the layout below may do (there's also a small scripted sketch of the same idea right after this list)...

    

          Fabrics.JPG

(For the careful reader: this array will never replicate using MirrorView, so that's why we're using ports A7 & B7 for host I/O.)

  • And of course, if careful planning still gets you Queue Full errors, do remember Rob's post about HBA queue settings!
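To make the staggering idea a bit more concrete, here's a minimal Python sketch; the port names, host names and two-fabric cabling are assumptions for illustration, not the exact layout from the screenshot above.

```python
# A minimal sketch of round-robin staggering across front-end ports.
# Port names, host names and the two-fabric cabling are assumptions for
# illustration; each host ends up with four paths (one SP A and one SP B
# port per fabric), in line with the "max four paths" guideline above.
from itertools import cycle

# Assumed host I/O ports, grouped per fabric as (SP A port, SP B port) pairs.
# A7/B7 are included because this array will never use MirrorView.
FABRIC1_PORTS = [("A4", "B6"), ("A6", "B4")]
FABRIC2_PORTS = [("A5", "B7"), ("A7", "B5")]

def stagger_hosts(hosts):
    """Return {host: [four FE ports]} by rotating through the port pairs."""
    f1, f2 = cycle(FABRIC1_PORTS), cycle(FABRIC2_PORTS)
    plan = {}
    for host in hosts:
        a1, b1 = next(f1)   # HBA0, zoned on fabric 1
        a2, b2 = next(f2)   # HBA1, zoned on fabric 2
        plan[host] = [a1, b1, a2, b2]
    return plan

for host, ports in stagger_hosts(["esx01", "esx02", "esx03", "esx04"]).items():
    print(host, "->", ", ".join(ports))
```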

5.7K Posts

August 23rd, 2012 03:00

Better safe than sorry! Starting with a decent design includes making a decent forecast of the number of hosts that will be attached to a VNX or CLARiiON and the number of LUNs they will get. Based on this you can calculate the queue depth setting to avoid QFULLs, which are bad for performance since HBAs and host OSes will throttle back the rate at which they generate I/Os. If you simply keep the number of outstanding I/Os from climbing too high, the feared QFULLs won't appear and performance stays predictable.
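As a rough illustration of that calculation, here's a minimal Python sketch. The 1600 outstanding I/Os per FC front-end port is an assumed figure, so check the spec sheet for your own model; the host and LUN counts are just examples.

```python
# A minimal sketch of the kind of calculation Rob describes: divide the
# front-end port's queue capacity over the hosts and LUNs sharing it.
# The 1600 outstanding I/Os per FC front-end port is an assumed figure;
# host and LUN counts are examples only.
def max_lun_queue_depth(port_queue_limit=1600, hosts_per_port=20, luns_per_host=4):
    """Largest per-LUN HBA queue depth that cannot overrun the FE port queue,
    assuming every host drives every LUN flat out at the same time."""
    return port_queue_limit // (hosts_per_port * luns_per_host)

print(max_lun_queue_depth())                    # 20
print(max_lun_queue_depth(hosts_per_port=40))   # 10 -> more hosts, lower queue depth
```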

1 Rookie • 20.4K Posts

August 23rd, 2012 04:00

Jon Klaus wrote:

  • How would you spread your servers across the available FE port: attach them to all the available ports? Not a good idea!
    • Documentation on powerlink clearly states that too many paths per host (i.e. more than four paths) can result in lengthy failovers. So unless you have extremely high bandwidth requirements, limit yourself to four paths per host max.

Hmm... can you provide links/references where it states that more than 4 paths to a LUN will lead to slow failover? I have never read anything like that, nor has it ever been mentioned in the Performance Workshops I've taken.

247 Posts

August 23rd, 2012 05:00

Certainly! I found this in the CLARiiON best practices for Performance and Availability document, R30. On page 25 it states:

PowerPath allows the host to connect to a LUN through more than one SP port. This is known as multipathing. PowerPath optimizes multipathed LUNs with load-balancing algorithms. It offers several load-balancing algorithms. Port load balancing equalizes the I/O workload over all available channels. We recommend the default algorithm, ClarOpt, which adjusts for number of bytes transferred and for the queue depth.

Hosts connected to CLARiiONs benefit from multipathing. Direct-attach multipathing requires at least two HBAs; SAN multipathing also requires at least two HBAs. Each HBA needs to be zoned to more than one SP port. The advantages of multipathing are:
- Failover from port to port on the same SP, maintaining an even system load and minimizing LUN trespassing
- Port load balancing across SP ports and host HBAs
- Higher bandwidth attach from host to storage system (assuming the host has as many HBAs as paths used)

While PowerPath offers load balancing across all available active paths, this comes at some cost:
- Some host CPU resources are used during both normal operations, as well as during failover.
- Every active and passive path from the host requires an initiator record; there are a finite number of initiators per system.
- Active paths increase time to fail over in some situations. (PowerPath tries several paths before trespassing a LUN from one SP to the other.)

Because of these factors, active paths should be limited, via zoning, to two storage system ports per HBA for each storage system SP to which the host is attached. The exception is in environments where bursts of I/O from other hosts sharing the storage system ports are unpredictable and severe. In this case, four storage system ports per HBA should be used.

The EMC PowerPath Version 5.5 Product Guide available on Powerlink provides additional details on
PowerPath configuration and usage.

Message was edited by: Jon Klaus - Correcting layout for some of the botched PDF c/p results...

5.7K Posts

August 23rd, 2012 05:00

I'm not sure on this one, but I think I heard a colleague of mine say (years ago) that VMware ESX can only handle 1024 storage-related objects. A LUN that uses 4 paths counts as 4 objects, so if you use 8 paths, each LUN costs you 8 of those 1024. I've seen ESX clusters with >100 LUNs. Any comment on this?

1 Rookie • 20.4K Posts

August 23rd, 2012 05:00

Thanks. Interesting that the exception they mention could apply to almost any environment. Honestly, I would rather have more paths (a reasonable number, of course) to give me more queues and accept slower failover times, as failover occurs very seldom.

212 Posts

August 23rd, 2012 07:00

RRR, you are correct: ESXi supports 1024 paths per server in total, so 4 paths to each LUN means a maximum of 256 LUNs.

I have attached a clip from the VNX Unified Storage Integration with VMware vSphere best practices document...

That document is part of the VNX TA Expert-level curriculum.

ESXI.JPG
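For completeness, a minimal sketch of that path budget; the 1024-path limit is the per-host figure quoted above, the rest is simple arithmetic.

```python
# A minimal sketch of the ESXi path budget discussed above: 1024 paths per
# host in total, so the paths-per-LUN choice directly caps the LUN count.
ESXI_MAX_PATHS = 1024

def max_luns(paths_per_lun):
    return ESXI_MAX_PATHS // paths_per_lun

print(max_luns(4))   # 256 LUNs with 4 paths each
print(max_luns(8))   # 128 LUNs with 8 paths each
```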

5.7K Posts

August 23rd, 2012 07:00

So as long as failovers aren't an issue and handling 8 paths instead of 4 is OK, then go for 8 paths, I guess, but I have never encountered a production environment where 8 was a requirement. If failing over with 8 paths causes problems, go for 4.

5.7K Posts

August 23rd, 2012 07:00

Check!

1 Rookie • 20.4K Posts

August 23rd, 2012 07:00

RRR wrote:

I'm not sure on this one, but I think I heard a colleague of mine say (years ago) that VMware ESX can only handle 1024 storage-related objects. A LUN that uses 4 paths counts as 4 objects, so if you use 8 paths, each LUN costs you 8 of those 1024. I've seen ESX clusters with >100 LUNs. Any comment on this?

I would have to revisit ESXi storage admin guide to look at limitations.

247 Posts

August 23rd, 2012 07:00

Guys, before we forget: Mark has arranged for us to expand this topic onto Twitter for tonight, 7-9PM CEST. Jump in if you like, hashtag is #EMCATE.

http://tweetchat.com/room/EMCATE

See you tonight?!

666 Posts

August 23rd, 2012 09:00

Indeed, we kick off in 10 minutes. Looking forward to it!

5.7K Posts

August 23rd, 2012 10:00

Better performance in the 14+2 compared to 6+2? If you look at it per GB, the larger group actually performs a little worse!

247 Posts

August 23rd, 2012 10:00

vipinvk111: @JonKlaus @50mu Can you explain the RAID10 write penalty? How is it 2? I would expect at least 1 read and 2 writes (as it's mirrored) #EMCATE - 7:23 PM Aug 23rd, 2012

For RAID10 or RAID1 writes, no readback is performed. If you change a block, it's written to the primary disk and the secondary disk: 2 write I/Os in total.

This is different from RAID5, where you need to read the old block and the old parity (that's 2 reads), then write the new block and the new parity (that's 2 writes): 4 I/Os in total.

RAID6 builds on RAID5: you read the old block and the old parity blocks (dual parity), so 3 reads in total. Then you write it all back: data + parity blocks, so 3 writes. 3+3=6!
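To show how those write penalties feed into a sizing estimate, here's a minimal Python sketch; the 5000 IOPS / 70% read workload is an invented example.

```python
# A minimal sketch turning the write penalties above into a back-end disk
# IOPS estimate. The 5000 IOPS / 70% read workload is an invented example.
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}

def backend_iops(frontend_iops, read_fraction, raid_type):
    """Back-end IOPS = reads + writes * write penalty."""
    reads = frontend_iops * read_fraction
    writes = frontend_iops * (1 - read_fraction)
    return reads + writes * WRITE_PENALTY[raid_type]

for raid in ("RAID10", "RAID5", "RAID6"):
    print(raid, round(backend_iops(5000, 0.70, raid)))
# RAID10 6500, RAID5 9500, RAID6 12500
```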

247 Posts

August 23rd, 2012 10:00

#EMCATE What makes the 14+2 RAID6 more efficient than the older 6+2 RAID6 for NL-SAS?

A good question from victor over on Twitter...

So for starters, the bigger the RAID group is, the more data disks you'll have compared to parity disks. In the new situation you'll have the net capacity of 14 drives to allocate to servers. With the older 6+2 groups you would have to create two groups and still only end up with the capacity of 12 disks, using 4 parity disks in total instead of 2. That's just the financial aspect...

Performance-wise, a bigger RAID group (obviously) contains more drives, which will yield more performance. However, watch out for the downside: rebuild times will be longer with larger RAID groups. Or, if you're rebuilding at the ASAP rate, the impact on your storage (bus, disks, SP for parity calculations) will be larger.

If you want the performance and a low rebuild impact, keep the RAID groups small and use striped metaLUNs to get you the number of disks (= IOPS) you need. But you'll pay for it with $$.
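And for the capacity side of the 14+2 versus 6+2 comparison, a minimal sketch assuming an arbitrary 2 TB NL-SAS drive size:

```python
# A minimal sketch of the capacity comparison: sixteen NL-SAS drives as one
# 14+2 RAID6 group versus two 6+2 groups. The 2 TB drive size is arbitrary.
DRIVE_TB = 2.0

def raid6_layout(data_drives, parity_drives, groups=1):
    total_drives = groups * (data_drives + parity_drives)
    usable_tb = groups * data_drives * DRIVE_TB
    return total_drives, usable_tb

for label, args in (("1 x 14+2", (14, 2, 1)), ("2 x 6+2", (6, 2, 2))):
    total, usable = raid6_layout(*args)
    print(f"{label}: {total} drives, {usable:.0f} TB usable, "
          f"{usable / (total * DRIVE_TB):.0%} of raw capacity")
# 1 x 14+2: 16 drives, 28 TB usable, 88% of raw capacity
# 2 x 6+2:  16 drives, 24 TB usable, 75% of raw capacity
```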
