February 9th, 2012 12:00

Management Networks and SAN Snapshotting

Hi all!

Long time lurker, first time poster. I must say that I'm very happy with the purchase of our EqualLogic PS4000 series SANs. I've been poring over both EqualLogic's and VMware's Best Practice Guides, and I have a couple of questions that I'm hoping someone can answer. I suspect the answers might help other folks who have searched for the same things and come up short. I feel like I have quite a few of the pieces but I need some help putting them together. I hope you read this, Don!

I'm using my SAN solely to host VMware vSphere VMs. I'm using Veeam Backup & Replication 6 to back up these VMs to a NAS box. It's been working great! Veeam is running on a physical server connected to the SAN with dual gigabit NICs. I'm using the 'SAN Mode' backup (their name for an off-host backup) to snapshot the VMs and dump them on my NAS box.

...now for the EqualLogic questions!

  1. I understand I can use my EqualLogic SAN to snapshot a LUN. I also understand that I can set up replica LUNs, and I can replicate LUNs between two EqualLogic boxes. The LUNs I have carved out are solely for VMFS. Considering I'm using Veeam with the SAN Mode backup option - is there any reason why I'd want to reserve space for snapshots on my EqualLogic? Right now I have the default 20% reserve and for my situation it seems like it's wasted space? Am I missing something here? Note that I get de-duplication and compression with Veeam!
  2. My PS4000 has a dedicated 100Mbit management port that I'm not able to use for iSCSI traffic. My plan was to VLAN off a management network for it on my switches that I'm solely using for my VM hosts and the SAN. Then I found out that my dedicated management port would need to be on a separate subnet from my other interfaces. Now I'm thinking about not even using that dedicated 100Mbit management port and just doing management through the iSCSI group IP address. To me it seems like routing that management traffic is not worthwhile. Are there negatives associated with managing the EqualLogic through the iSCSI group IP address that I'm missing?
  3. My EqualLogic is on a physically separated network segment with no route to the Internet or my production network. I like this idea - but then I cannot use NTP or SMTP. Of course I've manually set the time correctly on my EqualLogic but I'm interested in how accurate that time needs to be. Is it only used for time & date stamps on SAN based snapshots and logs? Regarding SMTP - since I'm not able to reach my normal SMTP server, can I set up SAN HQ to collect any alerts from the SAN and e-mail me that way? Are there any alerts that wouldn't come through? What are other folks doing to address this?
  4. Adding multiple volumes does not increase SAN performance in any way, right? Is the main reason I'd set up multiple smaller volumes versus a larger one (2TB - 512B to make VMware happy) just for the ability to set up different RAID types? I understand this question gets more complex with multiple SANs but for this question I'm talking about a single SAN.

I apologize for being so verbose. I blame it on all the forum lurking I've been doing where people don't post enough information.

February 10th, 2012 07:00

Thank you so much for your input, Don. I really appreciate it. Your answer regarding multiple volumes makes complete sense to me. Since I'm using multipathing (via VMware round robin) I'm assuming I need to balance my number of iSCSI connections to the SAN with the performance benefits offered by multiple volumes. I suspect this won't be as much of an issue with current firmware versus older (I know the iSCSI connection number limit was doubled in recent firmware).

I already have SAN HQ 2.20 installed so I can use that to look at my current queue depths. I also did some poking around the forums and I see there is some LUN design talk as it relates to VMware. I understand there is no one-size-fits-all solution so I'll tackle all that next.

If I see in SAN HQ 2.20 that my queue depth never hits 32, do you imagine I would see less performance benefit from chopping my LUNs up into smaller pieces? I imagine more simultaneous iSCSI connections is better; I just want to make sure that the pain of redesigning and copying all my VMs into additional smaller LUNs is going to be worth it.

Should I limit my redesign scope to focus on servers that use the most IOPS? It'd basically be my single Exchange 2010 box, two separate SQL servers, and my file server.
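For anyone weighing the same trade-off between volume count and iSCSI connection count, here is a rough sketch of the arithmetic. It assumes that each vmkernel port bound to the software iSCSI adapter logs in to each volume once, which is typical without MEM but worth checking on your own hosts. The host, port, and volume counts below are made-up examples, and the connection limit for your firmware has to be looked up separately.

```shell
#!/bin/sh
# Rough estimate of iSCSI connections to the group.
# Assumption: connections = hosts x bound vmkernel ports per host x volumes.
# Other initiators (for example a backup server) add to this total.

# usage: iscsi_connections HOSTS VMK_PORTS_PER_HOST VOLUMES
iscsi_connections() {
    echo $(( $1 * $2 * $3 ))
}

# usage: within_limit CONNECTIONS LIMIT  (succeeds if CONNECTIONS <= LIMIT)
within_limit() {
    [ "$1" -le "$2" ]
}

# Hypothetical example: 3 hosts, 2 vmkernel ports each, 8 volumes
total=$(iscsi_connections 3 2 8)
echo "estimated connections: $total"
```

Doubling the number of volumes doubles the estimate, so it is worth running the numbers before splitting LUNs.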

February 11th, 2012 16:00

Thanks for all the great info, excellent post. I'm using vSphere 5 Standard. I did some testing with the IOPS value when I first got my SANs, and setting it to 3 didn't improve my results with Iometer (using settings to take advantage of the 1MB block size), so I didn't keep the change. I understand now why I didn't see improvement with Iometer - I knew the results wouldn't be 'real world' but I didn't know how else to test. I'm glad the consensus is to set it to 3.

Now I've set my IOPS value to 3, disabled LRO, and turned delayed ACK off on all my hosts. The LRO thing was new to me so thanks for mentioning it! I know there is a lot of information out there about LUN design and SQL/Exchange best practices so I'll read up on that stuff next.

Thanks again for taking the time to reply.

4 Posts

February 12th, 2012 05:00

Those would actually create a support case via the EqualLogic website. Alerts from SAN HQ do not.

203 Posts

February 14th, 2012 19:00

Hey Don, can you confirm that this is still advisable in vSphere 5.0 when using Round Robin? (I dug around some Dell / EQL docs, but they didn't mention it. For some reason I thought this was no longer an issue.) You covered the LRO command line change, but what was the string needed for setting the IOPS to 3 (under vSphere 5)? What have you found to be the best way to monitor the balance of the RR?

203 Posts

February 15th, 2012 06:00

Wow... I overlooked this somehow.  Thanks Don.  Having previously set each datastore to RR, I did execute the following on one of my datastores:

# esxcli storage nmp psp roundrobin deviceconfig set -d naa.6090a09860213c7843eb34e9380180de -I 3 -t iops

Looking at the details again via esxcli storage nmp device list, I see that "iops=3" confirms the change, but I also see that the string "policy=rr" was changed to "policy=iops". Is this expected?

Also, is it advisable that this only be performed while in maintenance mode, or can it be changed while in production?
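Since the command above has to be repeated per datastore, a small sketch like the following could generate it for every EqualLogic device on a host. It only prints the commands so they can be reviewed first. The `naa.6090a` prefix is an assumption based on the device ID quoted above, and the helper names (`eql_devices`, `rr_commands`, `has_iops`) are made up for this example. The check in `has_iops` assumes the device config is shown as a comma-separated list inside braces.

```shell
#!/bin/sh
# Keep only device IDs with the given NAA prefix. Device IDs are the
# unindented lines of `esxcli storage nmp device list` output (read on stdin).
eql_devices() {
    grep "^$1"
}

# Print (do not run) one Round Robin tuning command per device ID on stdin.
# usage: rr_commands IOPS
rr_commands() {
    while read -r dev; do
        echo "esxcli storage nmp psp roundrobin deviceconfig set -d $dev -I $1 -t iops"
    done
}

# Succeeds if the device config text on stdin contains iops=VALUE.
# usage: has_iops VALUE
has_iops() {
    grep -q "[,{]iops=$1[,}]"
}

# On a host, review the generated commands before running them:
#   esxcli storage nmp device list | eql_devices naa.6090a | rr_commands 3
```

Piping the reviewed output to `sh` would apply the change. Running `esxcli storage nmp device list` again afterwards shows whether each device reports `iops=3`.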

203 Posts

February 15th, 2012 07:00

Perfect.  The test I had done was on a system in maintenance mode.  Just wanted to verify.  Thanks.

February 15th, 2012 15:00

Thanks for the link to rinetd, Don. I plan on forwarding NTP and SMTP traffic from my isolated iSCSI network onto my production network via my physical backup server. Seems like a better idea than routing that traffic... why make things more complicated than they need to be, right?
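As a sketch of what the SMTP half of that forwarding could look like: a rinetd rule is a single line of the form `bindaddress bindport connectaddress connectport`. Both addresses below are hypothetical. One caveat to verify against the installed version: NTP uses UDP port 123, and classic rinetd releases forward only TCP, so the NTP side may need a different approach.

```shell
#!/bin/sh
# Print one rinetd.conf forwarding rule:
#   bindaddress bindport connectaddress connectport
rinetd_rule() {
    printf '%s %s %s %s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical addresses:
#   10.10.5.20   = backup server's NIC on the isolated iSCSI network
#   192.168.1.25 = SMTP server on the production network
rinetd_rule 10.10.5.20 25 192.168.1.25 25
```

With a rule like this in place, the array's SMTP server setting would point at the backup server's iSCSI-side address rather than at the real mail server.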

203 Posts

February 15th, 2012 19:00

Interesting observation. I took a host into maintenance mode and, on the datastores it was connected to (already set to RR), applied "esxcli storage nmp psp roundrobin deviceconfig set -d naa.[idnumber] -I 3 -t iops". I verified all of the changes, took the host out of maintenance mode, threw a couple of VMs on it, and ran Iometer on one of them. Then I went into vCenter's real-time network monitoring and isolated just the two NICs associated with my iSCSI vSwitch. I still only see one of them being used. Weird...

I know I'd probably solve all of this by just installing and using the MEM, but that looks fairly involved from what I can tell.

203 Posts

February 15th, 2012 20:00

On each of the vmkernel ports, in the NIC Teaming tab, where I have one "active" and one "unused", none of the "Policy Exceptions" is ticked, but Failback does show a grayed-out "yes". For the vmkernel ports themselves, "Override switch failover order" is ticked so that one NIC can be set as active and one as unused.

I'll have to experiment on one of my other hosts for the MEM.  I see the latest one just came out for 5.0, so that might give me a little inspiration (along with currently only utilizing one NIC).

203 Posts

February 15th, 2012 20:00

Gotcha.  I did that (ticked the checkbox, setting it to no), rescanned the volumes for good measure, and ran another test.  It's still pulling off of just one vnic.

Yeah, I'm all vSphere 5.0, so maybe I'll give that a try sometime soon.  It would be nice to figure this one out though.  Hmm...

203 Posts

February 15th, 2012 21:00

Good idea.  Thanks for the good info Don.  Much appreciated.  Hopefully I can thank you in person at the DSF this year.

203 Posts

February 16th, 2012 11:00

Well, that's a shame.  ...Oh, as for the issue: tech support led me down the right road to the cause, which was the lack of vmkernel port bindings under the iSCSI software adapter.  Fixed, and everything is fine now.
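For anyone who hits the same single-NIC symptom, here is a minimal sketch of the port binding step from the command line on ESXi 5. The adapter name `vmhba33` and the ports `vmk1` and `vmk2` are placeholders, so substitute the names from your own host. The function only prints the commands so they can be checked before running.

```shell
#!/bin/sh
# Print (do not run) the commands that bind vmkernel ports to the
# software iSCSI adapter, followed by a command that lists the bindings.
# usage: bind_commands ADAPTER VMK...
bind_commands() {
    adapter=$1
    shift
    for vmk in "$@"; do
        echo "esxcli iscsi networkportal add --adapter=$adapter --nic=$vmk"
    done
    echo "esxcli iscsi networkportal list --adapter=$adapter"
}

# Placeholder names; check your own adapter and vmkernel port names first.
bind_commands vmhba33 vmk1 vmk2
```

After binding, a rescan of the adapter should show one path per bound vmkernel port for each volume, which is what Round Robin needs in order to spread I/O across both NICs.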
