
June 23rd, 2011 05:00

VNXe Networking Questions

WRT VNXe 3100 configuration, and link failover and such, here’s what I don’t quite understand:

The docs refer to the two storage processors on the system as SPs. This is terminology held over from CLARiiON, and it strongly implies to me that the SPs run active/active. But I also know that there is something analogous to DART under the covers, allocating iSCSI LUNs and NAS shares/exports.

So, how does this work?  Is this DART-like code running on just one SP at a time?  Both?  If both, then what?  What does that mean from a networking and storage access perspective?  All LUNs and NAS shares/exports available on all configured network ports?

I don't understand, and I haven't found a document that explains it.

Thanks,

Eric

474 Posts

June 23rd, 2011 13:00

While the VNXe does not run DART, it does act similarly to DART and shares a lot of functionality with Celerra/DART.

For network HA the VNXe acts a bit differently from Celerra though. Since Celerra used an Active/Passive model, you would build network redundancy within each datamover to handle network failures, and then duplicate the config on the passive datamover in the event of datamover failure. In Celerra configurations, this usually ended up employing the FailSafe Network and resulted in some ports being designated as passive which reduced overall bandwidth available to the datamover. VNXe is different due to its Active/Active model.

“For network high-availability features to work, the cable on each SP needs to have the same connectivity. If Port 0 on SPA is plugged in to Subnet X, Port 0 on SPB must also be plugged in to Subnet X. This is necessary for both server and network failover. If a VNXe server is configured to use a port that is not connected on the peer SP, an alert is generated. Unisphere does not verify if they are plugged in to the same subnet, but they should be, for proper failover. If you configure a server on a port that has no cable or connectivity, the traffic is routed over an SP interconnect path to the same port on the peer SP (just a single network connection to the entire system is not recommended).”

--Page 30, EMC VNXe Series Storage Systems, A Detailed Review

“Network paths – The VNXe supports network pass-through to provide network path redundancy. If a network path becomes unavailable due to a failed NIC, switch port, or bad cable, network traffic is re-routed through the peer SP using an inter-SP network, and all the network connections remain active.”

--Page 31, EMC VNXe Series Storage Systems, A Detailed Review

With this in mind, I personally would create an LACP group of all ports on SPA connected to one switch and an LACP group of all ports on SPB connected to a different switch. That way you get the benefit of ALL ports for traffic, network switch redundancy, and SP redundancy.
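As a rough sketch of what I mean (the port names and switch ports here are just examples, and the exact switch-side syntax depends on your switch; on a Cisco switch each pair would go into an 802.3ad port-channel with "channel-group <N> mode active" so it negotiates LACP with the SP):

LACP Group 1: SPA eth2 + SPA eth3 -------> switchA ports 1-2 (one port-channel)

LACP Group 2: SPB eth2 + SPB eth3 -------> switchB ports 1-2 (one port-channel)

Each LACP group stays entirely on one SP and one switch, so the two switches do not need to be stacked; switch-level redundancy comes from the SP-to-SP network failover described in the quotes above.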

Description of SP Active/Active failover…

“When an SP experiences hardware or software failure, reboots, or a user places it in Service Mode, a failover occurs. Storage servers that use the out-of-service SP fail over to the other SP if it is available, with minimal disruption between the VNXe system and connected hosts.

After the SP is available again, the storage servers fail back to the original SP. The failback policy can be configured through Unisphere.”

Richard J Anderson


124 Posts

June 23rd, 2011 11:00

So, if I'm to understand this correctly, you're saying that the SPs run active/active like those of the CLARiiON, but the iSCSI and NAS provisioning functionality is controlled by something analogous to DART?

This surprises me, because only 6 of the 12 network interfaces are showing up as available on the particular VNXe 3100 implementation that we're currently doing.  It's very Celerra-esque in how it is configured.

Is there a document that really does a good job of describing all of this? It is not clear to us at all, and we've been to all of the VNXe implementation training.

thanks,

Eric

474 Posts

June 23rd, 2011 11:00

VNXe is active/active in the sense that both SPs service requests at the same time. Any particular LUN or Share will be owned by one or the other SPs under normal circumstances. If one SP fails, then the other SP takes over all of the volumes/shares from the failed SP. This is all very similar to the way Clariion SPs handle LUN failover.

The interesting/cool thing about the VNXe’s hardware is that there is an internal network switch between the SPs, so if a client loses access to the owner SP through the network, but the SP is still online, it can send requests to the secondary SP which will forward the requests internally through the internal switch.

Richard J Anderson

124 Posts

June 23rd, 2011 15:00

Richard, thank you for your complete, and authoritative answer.

Just a suggestion...  There's much to be interpreted from all of the information presented in that paper.  Somehow, a higher-level description, sort of like a section entitled "VNXe Networking for Dummies" if you will, needs to be  put into that paper.  Your narrative is a very good start on that.

Thanks,

Eric

8 Posts

June 29th, 2011 08:00

Richard

I have a 3300 that I am implementing now but would like to utilize the 10gig ports and have some form of I/O load balancing. 

What I was thinking was utilizing both 10 GbE interfaces on SPA on a specific network and connecting them to a dedicated switch, then configuring both 10 GbE interfaces on SPB on a separate network and connecting them to a separate dedicated switch. My VMware hosts would have dedicated vmkernels/pNICs on both networks for failover and redundancy. I would like to utilize both paths for some form of load balancing, but I am not quite clear on how to achieve this with the VNXe.

Is that advisable?

8 Posts

September 9th, 2011 08:00

huberw


I worked with EMC support on configuring multipathing with my VNXe 3300. I am using two Cisco 4948 switches and four 10 GbE ports on the VNXe.


It's configured like so:

SPA

eth10 192.168.55.5 -------> 4948A

eth11 192.168.66.5 -------> 4948B

SPB

eth10 192.168.55.6 -------> 4948A

eth11 192.168.66.6 -------> 4948B


I then have all datastores in VMware configured for round-robin multipathing. However, I am seeing a fair number of missing/broken/dead paths between all my hosts and the VNXe 3300. Waiting to hear back from EMC as to why.
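For reference, setting the policy and checking paths from the ESXi shell looks something like the following (the naa ID is just a placeholder for one of my LUNs, and this is the ESXi 5.x syntax; on ESX 4.1 the equivalents live under "esxcli nmp" and "esxcli corestorage"):

esxcli storage nmp device set --device naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR

esxcli storage core path list

The second command is where the dead/broken path states show up per path.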


However, even without the transparency, it's been working quite well for me at this point. It's almost too basic, though; I want to see more information, but it's just not given to me by EMC and Unisphere.


14 Posts

September 9th, 2011 08:00

Richard,

I'm reading through this thread and appreciate all of the detailed info.

I agree with Eric though, EMC needs to be more transparent about how this is supposed to be configured. What is the best practice for configuring multipathing on a VNXe? Some people say it is to split the SP ports into different subnets: Port 0 of each SP is connected to physical switch 0 and is in subnet X, and Port 1 of each SP is connected to physical switch 1 and is in subnet Y. You mentioned configuring a LACP group with all ports on each SP and connecting them to the same physical switch. Which is the correct way (I'm asking specifically for a VMware use case)?

In the VNX techbook for VMware on page 50 it describes connecting the ports on each SP to different switches like I described above, but using the same subnet for all ports. I know CLARiiON arrays pre-FLARE30 had a bug that required using different subnets, which is how I got used to doing things. Has that changed with VNX and VNXe?

Looking for some guidance here...

8 Posts

September 9th, 2011 09:00

Total assumption on my part, but I would tend to believe it's ALUA (whether it's referred to as ALUA, I don't know), because EMC had me configure the datastores for round robin, and since EMC owns VMware, I would hope they would be the ones to know.

14 Posts

September 9th, 2011 09:00

Hi Drew,

This is another thing I have a question on.

It has been stated that the VNXe is active/active. Even though each LUN is "owned" by a specific SP, there is inter-SP connectivity inside the array that allows IO to be passed from the "non-owning" SP to the owning SP. This sounds an awful lot like ALUA, which is found in the CX and the VNX devices, but I've never heard of anyone calling this ALUA on a VNXe. Is the VNXe ALUA or not?

If it is ALUA, then round robin would be an appropriate PSP for VMware. If it ISN'T ALUA, then I would think that MRU would be the appropriate policy.

The VMware HCL lists the VNXe3100 and 3300 as being supported with MRU as the PSP. However there is a disclaimer right above it that says some vendors might support RR, but you need to contact them for info.

http://partnerweb.vmware.com/comp_guide2/detail.php?deviceCategory=san&productid=19523&vcl=true

So... is the VNXe ALUA or not, and what is the recommended multipathing policy for the array with VMware?

474 Posts

September 10th, 2011 10:00

It may help everyone here to understand that VNXe networking is NOT the same as VNX networking. VNX follows the same design guidelines as Clariion/Celerra.

With FC/iSCSI block on Clariion and VNX, ALUA is available to redirect IO to the alternate (non-owning) SP. Those redirected IOs are processed by that secondary SP and then passed to the owning SP, so the IO is actually being handled by both SPs in certain ways. This is why ALUA typically directs IOs to the owning SP if it can.

With NAS on Celerra and VNX, the front-end datamovers are active/passive: each active datamover owns some set of filesystems/shares/iSCSI LUNs, etc., and there is a passive (not fully booted) datamover that takes on the complete identity of a failed datamover when needed.

VNXe is not the same as Clariion, Celerra, or VNX; it is actually a newer design. It is Active/Active in that both controllers are actively processing IO simultaneously. However, there is still a notion of ownership like Clariion, in that the filesystems/LUNs are owned by one of the two storage processors. If IO is directed to a port on the non-owning controller, there is actually an internal Ethernet switch of sorts that forwards the traffic. It is not ALUA. That said, I would not recommend directing IO to the non-owning controller under normal operating scenarios, as it may affect performance.

While you could follow the same network guidelines as VNX NAS for a VNXe, the fact that VNXe forwards IO between controllers means you have more options than a VNX as far as leveraging LACP and still getting switch-level redundancy. It also depends on whether you are using NAS or iSCSI. With NAS you could easily connect all of the ports on one controller to the same switch and put them in a single LACP group, then connect the second controller to a second switch/second LACP group, and use the internal switch/forwarding to handle switch failures. With iSCSI you may want to have two "fabrics", in which case you would want to mesh the controller connectivity into both switches using two ports (or two LACP groups) per controller.

In any case, it’s pretty flexible.

Richard J Anderson

14 Posts

September 10th, 2011 11:00

Richard (Storagesavvy - love the blog BTW),

Thanks for taking the time to write up the detailed explanation of the networking on VNX, CLARiiON, Celerra, and VNXe.

I still have a few questions (probably more than before - sorry) if you wouldn't mind addressing them.

The explanation of ALUA on CLARiiON and on VNX sounds the same as the new technology in the VNXe. What is the difference between the VNXe "ethernet switch" that forwards IO to the owning SP and ALUA in the CLARiiON and VNX? It sounds like I am missing something there, or maybe I'm not understanding active/active...?

I guess I have to ask: if it is not recommended to forward IO to the non-owning controller, then what is the point of active/active controllers? Does this mean that using round robin as the path selection policy in VMware is not recommended for VNXe? What is the recommended configuration for path selection (besides PowerPath/VE) for VMware?

Suppose I have a VNXe 3100 with dual SPs, using block protocols only for VMware storage. I would configure it as follows:

SPA port 0 - ethernet switch 1 - subnet X

SPA port 1 - ethernet switch 2 - subnet Y

SPB port 0 - ethernet switch 1 - subnet X

SPB port 1 - ethernet switch 2 - subnet Y

(this takes me back to the CLARiiON AX and CX pre-flare30 days).

ESX is configured with a vSwitch that has 2 pNIC uplinks, each one bound specifically to a vmkernel port group in the correct subnet. The software initiator in ESX is bound to both vmkernel interfaces using esxcli to ensure both will be used. Does that sound like the optimal configuration? What multipathing policy should I use? I would guess that MRU is the right choice if avoiding the non-owning SP is the best practice.
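For the binding step, the commands I'm thinking of are roughly the following (vmhba33, vmk1, and vmk2 are placeholders for the software iSCSI adapter and the two vmkernel ports; this is the ESXi 5.x syntax, while 4.1 uses "esxcli swiscsi nic add -n vmkN -d vmhbaNN"):

esxcli iscsi networkportal add --adapter vmhba33 --nic vmk1

esxcli iscsi networkportal add --adapter vmhba33 --nic vmk2

esxcli iscsi networkportal list

That should force both uplinks to show up as separate iSCSI paths.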

If this were a 3300 I would do the same thing but instead of only 1 link per physical switch I would have 2x LACP groups per SP, each connected to different physical switches.

What about the number of iSCSI servers? Is it recommended to create an iSCSI server per port or per number of trunks, so in the case of the 3100 I would have 4x iSCSI servers on the VNXe (and VMware's software initiator would be configured to look at all 4 iSCSI servers).

Last question, I promise: why would the configuration be different with NAS vs. iSCSI? Wouldn't both configurations work for either protocol?

Sorry for all of the questions, you've been extremely helpful thus far! Looking forward to your response!

474 Posts

September 10th, 2011 15:00

Okay, I had to dig through some really technical documentation designed for VNXe developers to get a better handle on this…

The primary difference between ALUA on VNX/CX and the network redirection on VNXe is the layer at which the redirection occurs. VNXe uses a similar hardware design to the CX/VNX Block storage processor and has a PCIe-based CMI channel between the SPs.

With ALUA, the IO is received on the non-owning SP, and is redirected to the owning SP at the SCSI layer. SCSI acknowledgements are made by the SP that received the IO request (regardless of which is the owner). The backend IO is handled by the owning SP no matter which SP received the request. In essence, the FLARE operating environment on both SPs has to deal with the IO. There is a performance impact to sending IO down the non-optimal path.

With VNXe, the actual TCP/IP packet (layer 3) is forwarded through the CMI channel to the owning SP and then processed up the OSI stack by the owning SP to strip out the SCSI commands.

First, a bit of EMC NAS Background - In a traditional Celerra (or VNX File) network implementation, you would use the FSN (FailSafe Network) feature to create a passive set of Ethernet ports on the same controller as the active ports in order to gain switch level redundancy. The active ports connect to switchA and the passive ports connect to switchB. The passive ports never process any network traffic/IO unless the active ports are completely down (cable or switch failure). In that event, the Celerra/VNXFile code moves the IP addresses to the passive ports and starts processing network traffic via the second switch. This is great for network redundancy but requires you to dedicate ports for HA, reducing overall network bandwidth available under normal circumstances.

VNXe implements the FSN feature (on by default) in a different way. Since the VNXe is active/active (whereas Celerra/VNX File is not), VNXe automatically sets up the matching Ethernet ports of both SPs into an FSN failover group. This is done way down low in the OS stack, as a virtual NIC in the core OS. The IP traffic rides over the PCIe CMI channel. So if network port Eth2 on SPA fails, the VNXe will move the IP address(es) that were on that port over to the Eth2 port on SPB. The actual CIFS/NFS/iSCSI services behind that IP address are still owned by SPA in that case. When network connectivity is restored to the Eth2 port on SPA, the VNXe will move the IP address back to the original location/port. The same behavior exists in the reverse direction (SPA and SPB redirect IP traffic for each other as needed). No matter which SP is receiving the IP traffic, each SP always owns a specific set of LUNs (determined at LUN creation time) and processes IO for those LUNs. Active/Active in this case denotes that both SPs actively manage workload, not that both process workloads for the same LUN.

In order to ensure proper network HA (as well as SP failover HA), the partner port on both SPs must be connected to the same VLAN. So if you have an iSCSI environment with two subnets, make sure that Eth2 on both SPs connects to subnetA, and Eth3 on both SPs connects to subnetB. Since a normal VNXe 3100 has 2 ports per SP, there is no need for LACP or other link aggregation.

MRU is right for the VNXe but RR wouldn’t cause any harm since the alternate paths to the same LUN are always on the same SP. The VNXe does not present any passive paths for a LUN from the non-owning SP.
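If you want to sanity-check that from a host, something like the following will do it (ESXi 5.x syntax; on 4.1 it's "esxcli nmp device list"):

esxcli storage nmp device list

esxcli storage core path list

The first shows the PSP in use and the working paths for each LUN, and the second shows the target behind each path, so you can confirm that all of the paths for a given LUN land on the iSCSI server owned by the same SP.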

In other VNXe documentation I found, it’s indicated that you would use a single iSCSI server but associate multiple IP addresses to that iSCSI server. I’m not sure if you can have multiple iSCSI servers accessing the same LUN anyway.

Your description of your network setup sounds fine and technically works for iSCSI or NAS. However if I was building a NAS-only environment (no iSCSI) I would favor having more bandwidth per CIFS/NFS export while being as simple as possible from a client connectivity perspective. So I would put the two interfaces of each SP into aggregation groups and use a single IP address per SP or CIFS server, etc. That way your users/clients can leverage the full 2gbps bandwidth no matter which export or CIFS server they were accessing. For switch redundancy in this NAS-only case, I would just connect each SP to a different switch. If one switch fails, the VNXe will move the IP addresses for the unavailable SP over to the alternate SP and continue processing IO.

Hope that helps!

Richard J Anderson

14 Posts

September 11th, 2011 18:00

Richard,

This is exactly what I was looking for! Thank you so much for going beyond the call of duty and digging up these answers for me, and on a weekend no less! I'll be sure to ping you on Twitter if I have any other questions that pop up.

Thanks again!!

124 Posts

September 12th, 2011 07:00

Richard,

Thank you for this awesome response. It is very helpful in finally and completely understanding VNXe networking.

Your last paragraph has me still just a little perplexed, however.

I think that what I've read from your research is that, by default, SPA-1 and SPB-1 are FSN'ed to one another, and SPA-2 and SPB-2 are FSN'ed to one another. Also, for a NAS-only implementation, you're recommending that SPA-1 and SPA-2 be in one LACP group, and SPB-1 and SPB-2 be in a different LACP group.

Is that correct?

If so, this would mean that the Aggregation Groups would be spanning the redundant switches, and thus, the switches would need to be "stacked."  If the switches weren't stacked, the Aggregation Groups wouldn't work.

Correct?

Just a suggestion for EMC:  Your research, with accompanying pictures to add clarity, really needs to be incorporated into the white paper.

474 Posts

September 12th, 2011 09:00

For the NAS-only implementation, the LACP group usage would require the ports of a single SP to be connected to the same switch (or same stack). You gain switch redundancy through the FSN. Physical connectivity looks like this:

NAS Server A - VLANA - LACP Group 1 -> SPA port 2 -> switchA port 1

NAS Server A - VLANA - LACP Group 1 -> SPA port 3 -> switchA port 2

NAS Server B - VLANA - LACP Group 2 -> SPB port 2 -> switchB port 1

NAS Server B - VLANA - LACP Group 2 -> SPB port 3 -> switchB port 2

In the above example, you get 2 Gb/s of aggregate throughput to any NAS share on either SP. The clients will load balance across the two 1 Gb ports per LACP group. If a switch dies, the opposite switch will carry the load for all clients, reducing the total bandwidth of the VNXe from 4 Gb/s to 2 Gb/s.

For the iSCSI implementation it looks different:

iSCSI Server A - VLANA -> SPA port 2 -> switchA port 1 - VLANA

iSCSI Server A - VLANB -> SPA port 3 -> switchB port 2 - VLANB

iSCSI Server B - VLANA -> SPB port 2 -> switchB port 1 - VLANA

iSCSI Server B - VLANB -> SPB port 3 -> switchA port 2 - VLANB

In this example, you rely on VMware NMP or PowerPath to handle path failure and/or load balancing.

Best practices would suggest that NAS and iSCSI traffic be separated anyway, so you could implement both with the addition of more Ethernet ports to the VNXe.

----

I agree about the pictures and more detail needed.  Maybe we can get someone on that.
