August 9th, 2012 18:00

Just getting started with an Isilon and VMware and file transfers are slow

I have been a user of VMware with Dell EqualLogics. An Isilon was purchased and hooked up to a Cisco UCS blade server. We are renting rack space in two datacenters with 10-gigabit connections between them, and the Isilon is plugged into a 10-gigabit backbone, even though our Isilon (and I'm not that familiar with everything) has four 1-gigabit connections that are supposed to load balance?

The biggest issue is speed. I can transfer a 3 GB ISO image to a local drive or to a server in the other datacenter and get near-gigabit speed, about 100 MB per second. That I can live with. However, if I transfer staff files that are much, much smaller in size, say 100 GB of data containing about 145,000 files, the speed is about 10-20 MB/s, or as little as 10% of a standard gigabit connection.

That happens if I transfer a file from the share on the Isilon to the drive of an ESXi host whose datastore is also on the Isilon.

On backup jobs using dedupe from the 2nd datacenter to the 1st datacenter (the one with the Isilon), I get about 200 MB per minute. On a backup job from a building that has only a 100-megabit link to the 2nd datacenter (the one with the EqualLogics), I can saturate the 100-megabit connection.

Is there something not configured right on the Isilon, or is it just the nature of the beast?

thanks


August 9th, 2012 19:00

What kind of Isilon nodes do you have?


August 10th, 2012 09:00

Load-balancing across multiple NICs works with SyncIQ, but if you're uploading data using standard NFS (including vSphere datastores) or CIFS protocols, you're restricted by the protocol architecture to a single NIC on the storage cluster.

Having said that, what you're describing sounds pretty slow to me too. Besides dynamox's question, what protocol are you using to transfer the data, and are you mounting to a 1Gb or 10Gb interface on the target node?

James Walkenhorst

Virtualization Solutions Architect

Isilon Storage Division

james.walkenhorst@emc.com


August 10th, 2012 09:00

James,

Did you mean SmartConnect, not SyncIQ?


August 10th, 2012 10:00

What I meant was that SyncIQ will transfer data between the source and target clusters using as many interfaces simultaneously as you specify, because it isn't based on either NFS or CIFS/SMB for data transfers.

With NFS datastores, though, SmartConnect will balance new client connections according to whichever policy you specify, and it will rebalance connections in the event of a NIC/path failure (by rebalancing IP addresses) but it won't do a round-robin-style distribution of data streams for existing connections.

iSCSI datastores can be configured to do that in vSphere, but the core NFS architecture doesn't allow for multipath connectivity. If you map an NFS datastore using a SmartConnect zone name, you're still mounting an ESXi host to a specific IP address, which in turn maps to a specific node interface on the Isilon cluster. Balancing a single NFS data stream across multiple physical paths requires pNFS, which isn't currently available in either OneFS or vSphere.
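For anyone who wants to see that pinning behavior from a client, here is a minimal sketch in Python (the zone name is hypothetical): each fresh DNS lookup of a SmartConnect zone name returns a single node IP chosen by the balancing policy, and an NFS mount stays on whichever address it resolved at mount time.

```python
import socket

# Hypothetical SmartConnect zone name; substitute your own.
ZONE = "isilon.example.com"

# Each fresh lookup returns one node IP selected by SmartConnect's policy.
# An NFS datastore mounted via the zone name is still pinned to the single
# address it resolved at mount time; existing connections are not striped
# across NICs.
for _ in range(4):
    print(socket.gethostbyname(ZONE))
```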

I do think there might be something misconfigured in the connection between the ESXi host in one data center and the Isilon storage cluster in the other. I just can't tell what it is from the information given in the original post.

Sorry for the confusion. Hope this clears things up a bit...

jaw

August 10th, 2012 11:00

I hate to answer a question with a question, but the following part of your post has me scratching my head...

"That happend if I tranfer a file from the share on the Isilon to the drive of an ESXi host whose datastore is also on the Isilon."

Is the ESXi host in the same datacenter as the Isilon storage cluster?  Also, is the vSphere Client connection to the ESXi host also made from the same datacenter, or at least from the same side of the WAN?

If not, a copy operation using the vSphere Browse Datastore functions is going to pull all the data across the WAN to the vSphere Client side from the Isilon share and then send it all back across the WAN again to the ESXi datastore on the Isilon NFS. It also sounds like you are processing the namespace for 145,000 files over CIFS/SMB and then having ESXi process the same namespace again over NFS. Finally, I have not double-checked, but I believe vSphere 5 Client connections are SSL-encrypted by default. If this is true, we also have the encapsulation and encryption overhead to account for.

This does not address all of your concerns, but if we can confirm the test parameters we may be able to sort more of this out.


August 10th, 2012 11:00

Are you saying that SyncIQ could be overloading node interfaces and causing performance issues for NFS clients? I'm not following how SyncIQ affects NFS performance.

Thanks

August 10th, 2012 12:00

We are still missing confirmation of the user interface used for the Windows-share-to-ESXi-datastore transfers and the ESXi-to-ESXi transfers. CLI versus vSphere Client can make a big difference as to where the transfer traffic actually goes.

Also, has anyone checked for a consistent Maximum Transmission Unit (MTU) size end to end in the same datacenter and end to end across datacenters? Isilon will support jumbo frames (9000-byte MTU), depending upon the OneFS version you are running. Isilon will also support Link Aggregation Control Protocol (LACP) in a way that can be compatible with Cisco switches. I do not recommend activating both LACP and jumbo frames unless you are on OneFS 6.5.5.x or later. THX
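As a rough first check that doesn't require touching the switches, a minimal sketch like the following (Python, Linux-only socket options, hypothetical node address) tries to send an unfragmentable UDP datagram sized for a 9000-byte MTU; if the first hop can't carry it, the send fails immediately.

```python
import socket

# Linux-only socket options; fall back to the numeric values if this
# Python build doesn't expose the constants.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)

def fits_unfragmented(host, payload_bytes, port=9):
    """Send a UDP datagram with the Don't Fragment bit set.

    payload_bytes plus 28 bytes of IP/UDP headers must fit the MTU, so a
    payload of 8972 probes for a 9000-byte jumbo frame. A too-small MTU on
    the first hop raises EMSGSIZE right away; a smaller MTU farther along
    the path only surfaces via ICMP on later sends, so treat this as a
    first-hop check and confirm the full path with ping and the DF bit.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    try:
        s.sendto(b"\x00" * payload_bytes, (host, port))
        return True
    except OSError:  # EMSGSIZE: datagram exceeds the local/first-hop MTU
        return False
    finally:
        s.close()

# Hypothetical Isilon node address; substitute a real one.
print("9000-byte MTU on first hop:", fits_unfragmented("10.0.0.50", 8972))
```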


August 10th, 2012 12:00

The one test that I did was from the shared folder on the Isilon to the datastore of an ESXi host also on the same Isilon, all hooked up to the same Cisco switch.

I should also mention that I get the same slowness when transferring files from a physical drive on an ESXi host (I am on ESXi 5) to another ESXi host's physical drive at the other datacenter, totally bypassing the Isilon and the Dell EqualLogic at the other datacenter, so maybe the issue is the networking.

There is a 40-gigabit connection between the datacenters, and inside each one everything is connected by 10 gig except the Isilon, which has four 1-gig connections, and the EqualLogic on the other end, which has two 1-gig connections. The EqualLogic does multipathing and evenly distributes the load between both. So far the EqualLogic with its iSCSI connections is far superior to the NFS-type share.

EMC set up the Isilon, but we are renting rack space from a provider, so maybe there is some slowness in the network connections somehow.


August 10th, 2012 13:00

Someone has changed my password on me so I cannot get into the unit, but the version on the login page is v6.5.5.4. The network fellow who set it up didn't seem interested in link aggregation, which I asked him about, or in setting the MTUs to 9000. I am used to that on the EqualLogics, with iSCSI MTUs set to 9000 both on the switch and in ESXi, as well as disabling storm control and port spanning. We are renting these spaces, so we have to go through the provider to make these changes.

The way I was told to set up shares for the staff was directly on the Isilon NFS shares. Are there ever any issues with that as opposed to building a server and sharing out that way?

And thanks for the help so far.


August 10th, 2012 13:00

macplano,

My question would be... What is your transfer speed like for a single large file?

Independent of the network storage type, large contiguous files typically transfer faster than many small files. As a large file is sent across the network, only that one file is being sent. As small files are sent, additional pieces of information like permissions, attributes, etc. (metadata) are sent as well. Those additional bits of info have to be received, written, and so on. With a single file, this only happens once.

Additionally, the copy process has to keep track of which files have been copied and which haven't. A good analogy would be moving the contents of a house (and having to keep up with every item) vs. moving the entire house. Many things vs. a single thing.

As a result, when it comes to comparing 1 TB in one large file vs. 1 TB in thousands of files, the similarity ends at both being 1 TB.
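To put a number on that overhead, a sketch along these lines (Python; the sizes are arbitrary, and you would point the destination at the NFS or SMB mount you want to test rather than local disk) copies the same total bytes once as a single file and once as thousands of small files:

```python
import os
import shutil
import tempfile
import time

TOTAL = 100 * 1024 * 1024  # 100 MiB in each source tree
SMALL = 4096               # 4 KiB per small file

base = tempfile.mkdtemp()
big_dir = os.path.join(base, "big")
small_dir = os.path.join(base, "small")
os.makedirs(big_dir)
os.makedirs(small_dir)

# One large file vs. the same bytes split across 25,600 small files.
with open(os.path.join(big_dir, "one.bin"), "wb") as f:
    f.write(os.urandom(TOTAL))
for i in range(TOTAL // SMALL):
    with open(os.path.join(small_dir, f"f{i:06d}.bin"), "wb") as f:
        f.write(b"\x00" * SMALL)

def timed_copy(src, dst):
    start = time.monotonic()
    shutil.copytree(src, dst)  # per-file opens and attribute writes dominate
    return time.monotonic() - start

dest = base  # placeholder; point this at your NFS/SMB mount
print("one large file :", timed_copy(big_dir, os.path.join(dest, "d1")), "s")
print("many small ones:", timed_copy(small_dir, os.path.join(dest, "d2")), "s")
```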

Regarding the front-end connectivity of your Isilon cluster, four 1Gb connections are not the same as one 4Gb connection. With NFS, you aren't going to be able to leverage the aggregate of the connections in a single connection, but rather the maximum of one of the connections. I have to recommend Scott Lowe's content on NFS in a vSphere environment here: http://blog.scottlowe.org/2012/07/03/vsphere-on-nfs-design-considerations-presentation/ for a better understanding of how NFS works with vSphere.

Your Isilon cluster will also present iSCSI storage, provided you are licensed for it. If you are, you could set up an iSCSI datastore and see if it behaves in a similar fashion to your other arrays.

I don't see it listed here, but are you using some form of intelligent load balancing with your existing iSCSI array? Out of the box, vSphere does not perform load balancing across paths for better throughput. Are you using PowerPath/VE in conjunction with your existing iSCSI array?

I would start by copying something like a large ISO file (Windows 7 or something) to/from the Isilon cluster for a comparison.

Let us know how it goes!

Thanks,

Jase


August 10th, 2012 13:00

Yes, a 3 GB file transfers at near-gigabit speed, about 100+ megabytes per second. And that happens whether it goes from an NFS share off the Isilon to a local ESXi host or to the remote datacenter. So it could be all those little files. What I am going to do is test transferring that many files inside our own building and see what kind of speed we get.

I have been left in the dark so to speak on this install but I am tasked with making it work correctly.

I will post more as I find out things.

Gary


August 12th, 2012 08:00

Use iperf to test raw pipe speed and take the protocol out of the picture.
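If iperf isn't available on both ends, a bare TCP blast gives a comparable raw number. A minimal sketch (Python; port and duration are arbitrary): run sink() on one host and blast("<that host>") on the other; a healthy 1GbE path should report somewhere near 110 MB/s.

```python
import socket
import time

CHUNK = 1 << 20  # 1 MiB per write

def sink(port=5001):
    """Receiving end: accept one connection and discard everything sent."""
    with socket.create_server(("", port)) as srv:
        conn, _ = srv.accept()
        with conn:
            while conn.recv(CHUNK):
                pass

def blast(host, seconds=10, port=5001):
    """Sending end: push zeros as fast as TCP allows, then report MB/s."""
    payload = b"\x00" * CHUNK
    sent, start = 0, time.monotonic()
    with socket.create_connection((host, port)) as s:
        while time.monotonic() - start < seconds:
            s.sendall(payload)
            sent += len(payload)
    elapsed = time.monotonic() - start
    print(f"{sent / elapsed / 1e6:.1f} MB/s over {elapsed:.1f} s")
```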


October 2nd, 2012 16:00

We use NFS for connecting with Mac OS X and CIFS for connecting with Windows 7 machines.

In the case cited, both the client putting data on the Isilon and the render blades loading data from the Isilon connect via CIFS.

-Jay-


October 2nd, 2012 16:00

I'm a new IT person at a CGI facility and new to interacting with the Isilon as well, so I'm finding these threads useful.

We have our own challenges with getting good and consistent performance with our Isilon in a mixed Mac and Windows shop. I'll start by sharing that our shop is not on the most current Isilon software (we're on 5.5.7.9).

Not my call, but we attempt to run in such a way that all files are owned by "nobody" so there aren't as many access restrictions during busy production. For reasons I don't fully understand, and that have little to do with the Isilon, many of our files are not always owned by "nobody" as intended.

Probably deserving of a separate thread, but I've been able to correlate a high incidence of sporadically failed texture reads during concurrent renders with the ownership or group attributes of the texture files. The files exist, have RW access, and can be opened and verified, but at a certain level of concurrent access from the Isilon during renders, I get failures of the form "can not open file (incorrect arguments)".

Strange as it sounds, when I examine the attributes of those files on the Isilon, I find that their owner:group are both numbers, not resolved to a string of any sort. As soon as I chown'd them to "nobody:admin" or something ordinary like that, the concurrent texture reads seemed to sail through without errors. My current thinking is that unresolved ownership may cause extra overhead that resolved ownership does not, and at some point that extra overhead can't keep up with the concurrency and begins to fail. I'm a noob with the Isilon, but found this interesting.
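For what it's worth, a quick way to find those files before a render trips over them is to walk the tree and flag anything whose UID or GID doesn't resolve to a name. A minimal sketch (Python, run on a host that shares the cluster's identity mappings; the path is hypothetical):

```python
import grp
import os
import pwd

def unresolved_owners(root):
    """Yield files whose owner or group is a bare number with no name
    mapping -- the same ones that show numeric owner:group in ls -l."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat so broken symlinks don't raise
            try:
                pwd.getpwuid(st.st_uid)
                grp.getgrgid(st.st_gid)
            except KeyError:
                yield path, st.st_uid, st.st_gid

# Hypothetical mount point; substitute your texture share.
for path, uid, gid in unresolved_owners("/mnt/isilon/textures"):
    print(f"{path}: uid={uid} gid={gid}")
```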

-Jay-


October 2nd, 2012 16:00

We have Macs and PCs and everything connects via CIFS. We have a major, major problem when the Macs sync their Documents folders with the Isilon. Here are some of the scenarios:

1. A user had 6 GB of data local on the Mac. The files synced up about a month ago, and about 2 weeks later the process deleted them off both the Isilon and the Mac itself.

2. A lot of users end up with their folders "hidden" on the Isilon, so I have to "unhide" them.

3. We had 4 cases this week where 80% of a user's files disappeared.

4. Many times the Mac users can't even connect to their home share on the Isilon. We had no issues when they connected to a Windows server that was not connected in any way to the Isilon.

5. On a restore, all the permissions are shot to pieces and the backup job log tells me access denied on all of them, but they seem fine to the user.

We have turned off the syncing for now.

I wanted to put all the shares on a Windows 2008 server and share them out that way, but they wanted them just shared out as NFS. That means I can't use an agent for backups and we can't use SMB for connecting. If I had my choice I would put them on a Windows server, use ExtremeZ-IP (which presents Windows shares as if they were AFP shares), and be done with it.

6. We have all our major VMs on the Isilon, but we have no issues with that.
