Unsolved

1 Rookie

 • 

28 Posts

June 28th, 2017 12:00

Unable to use Isilon NFS share over 538G

We are migrating an NFS share from NetApp CDOT to Isilon. The issue is that the application using the Isilon NFS share gets an error when the quota is over 538G in size. The application worked fine when using NetApp with a 1TB volume/NFS share.

NetApp:  8.3.2P1

Isilon:  8.0.0.4

Application:  Lexmark ImageNow

   (Note on the application: it is 32-bit, and we DID find the Isilon option --return-32bit-file-ids=true. That did help with reads, but we are still getting an error on writes.)

Error from application:

11:17:25.512007(f3cf2b40)       CINODBCConnection::BeginTransaction(87GY8Y) Beginning Transaction.

11:17:25.549089(f3cf2b40)       info (cat=54HT8F id=54MW4T loc=359QY5 ref=CN63H5): A File System failure has occurred.: CINFileSystem::DiskSpace error: : : Value too large for defined data type 75

11:17:25.550487(f3cf2b40)       info (cat=54HT8M id=54MWDM loc=35WRMJ ref=CN63H5): No more OSM tree for OSM set (osm_01).: osm_01 doesn't have any OSM tree to span

11:17:25.550657(f3cf2b40)       info (cat=54HT8F id=54MW4X loc=3HHVQ9 ref=CN63H5): Disk is below the minimum required free space.: Failed to store stream contents to OSM. osm_01.00048 has not enough space, and  has no more tree to span.

11:17:25.552118(f3cf2b40)       CINODBCConnection::RollbackTransaction(87GY8Y) Transaction

                Elapsed Time:  0.039970

                Database Time: 0.029309 (73.327495%)

11:17:25.552318(f3cf2b40)

11:17:25.552473(f3cf2b40)       CINOutputProtocol::WriteString: {FAIL}

11:17:25.552624(f3cf2b40)       CINOutputProtocol::WriteString: {Disk is below the minimum required free space. (3HHVQ9:CN63H5)}

We have opened a case with the application vendor, Lexmark ImageNow, but we are not getting anywhere.

We have opened an SR with EMC, but that has come to a standstill as well. We did packet captures, but they did not reveal anything useful.

The application works fine when the quota on the Isilon is set to 538G or less.

We have tried different quota settings as well as the --block-size setting for 'isi nfs exports...'

I am hoping that someone has come across this issue or something similar to it.

252 Posts

June 29th, 2017 08:00

Hi brichtab,

I don't have an answer for you, but I brought up this issue with one of the top NFS guys in support. It has piqued his interest and he would like to take a deeper look. Can you provide an SR number? He has some theories but would like to dig into the information that has already been gathered.

Thanks.

33 Posts

June 29th, 2017 08:00

It's a problem with 32-bit system calls like statvfs or stat returning EOVERFLOW and the application treating that as an error. We have seen it before on older closed-source apps. There is no easy fix. Under Linux I was able to hack something, but it wasn't pretty.
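One rough way to see which reported values would not fit in a 32-bit field is to read them from a 64-bit client. The sketch below uses Python's standard `os.statvfs`; the helper names `fields_over_32bit` and `read_statvfs` and the example path are made up for illustration, and it does not reproduce what the application itself does internally.

```python
import os

UINT32_MAX = 2**32 - 1
INT32_MAX = 2**31 - 1

# statvfs counters that a 32-bit caller without large-file support
# would have to fit into 32-bit variables.
FIELDS = ("f_blocks", "f_bfree", "f_bavail", "f_files", "f_ffree", "f_favail")


def fields_over_32bit(values, limit=UINT32_MAX):
    """Return the names of the counters whose value exceeds `limit`."""
    return [name for name in FIELDS if values.get(name, 0) > limit]


def read_statvfs(path):
    """Read the statvfs counters for `path` into a plain dict."""
    st = os.statvfs(path)
    return {name: getattr(st, name) for name in FIELDS}


if __name__ == "__main__":
    path = "."  # hypothetical: replace with the NFS mount point
    values = read_statvfs(path)
    for name in FIELDS:
        print(f"{name:10s} {values[name]}")
    print("over unsigned 32-bit:", fields_over_32bit(values))
    print("over signed 32-bit:  ", fields_over_32bit(values, INT32_MAX))
```

Running this against the export at different quota sizes would show whether a block counter or an inode counter is the one crossing a 32-bit boundary.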

1 Rookie

 • 

28 Posts

July 7th, 2017 06:00

sjones5, this is the SR 06984549.

Thanks!

1 Rookie

 • 

28 Posts

July 27th, 2017 07:00

What is the block size that Isilon uses?

This is the latest response from the vendor:

The Linux binaries for Application Server are all 32-bit executables. Because of this, you are going to be limited on the disk space that you can use. The largest size of an unsigned long integer (in 32bits) is 4294967295. This is about 4 Gigabytes. However, with a block size of 1024, which is pretty standard for a default Unix install, that brings it up to about 4 TB max. If you are using a different block size with Isilon, your limit might be different.

To resolve this issue, you will need to either modify the block size or reduce the disk space on the Application Server.
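The vendor's arithmetic can be checked with a short Python sketch. It assumes, as the vendor's reply does, that capacity is reported as an unsigned 32-bit block count multiplied by a block size; the function names are illustrative only.

```python
UINT32_MAX = 2**32 - 1  # 4294967295, the value quoted by the vendor


def max_reportable_bytes(block_size, max_count=UINT32_MAX):
    """Largest capacity expressible as `max_count` blocks of `block_size` bytes."""
    return max_count * block_size


def to_tib(num_bytes):
    """Convert bytes to tebibytes."""
    return num_bytes / 2**40


if __name__ == "__main__":
    for block_size in (512, 1024, 4096, 8192):
        limit = max_reportable_bytes(block_size)
        print(f"block size {block_size:5d} B -> about {to_tib(limit):.2f} TiB")
```

With a block size of 1024 this gives the "about 4 TB" figure from the vendor; larger reporting units raise the ceiling and smaller ones lower it.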

450 Posts

July 27th, 2017 12:00

Isilon uses 8KB blocks. Being a 32-bit application doesn't necessarily mean that the application cannot address data with 64-bit file IDs; that's more likely an application limitation. I've seen that most frequently on applications that are written in Java (old versions) or that run on older versions of Solaris. There is, however, an option that's configurable per NFS export (I think at the CLI only) to return just 32-bit file IDs, which may be beneficial to you here. It is more or less just a stop-gap until the application vendor enhances the application.

~Chris

6 Operator

 • 

1.2K Posts

July 31st, 2017 12:00

Couple of things here: file-id, block sizes and (drums rolling) inode counts.

Where the 32-bit file-id issue exists for older apps, it would not be expected to depend on the volume size (quota limit). So that is most likely not the problem here.

Block sizes confusingly appear on different levels:

- disk media: traditional 512B, "modern" 4K

- filesystem layout/allocation, OneFS fixed at 8KB

- transfer size at application/protocol level

  (local and networked filesystems can "suggest" a value, can be configurable, often 128K or 512K for NFS)

- filesystem capacity reporting in multiples of a choosable unit of measure

  (right what brichtab quoted two posts above)

The latter can help expressing huge capacities with 32-bit values,

just by choosing a large "unit blocksize" on the server.

With OneFS, if a 32-bit application still chokes on the capacity inquiries,

this is most likely because of the inode count returned by the same system call (statfs(2)).

The thing is that OneFS does NOT report any useful inode counts at all,

but instead repeats the capacity (total/used/avail) now in multiples of 512B! (the minimal inode size)

It might make some sense for total/available inodes as all free capacity

can theoretically be used for inodes in OneFS (unlike other filesystems which

preallocate a fixed space for inodes). It makes almost no sense for estimating

the used inode count, though.

In other words, because they are based on a 512B unit measure, the OneFS inode counts

always tend to come out HUGE and can easily overflow 32-bit variables,

and there's nothing one can do about it, other than setting capacity quotas.

But if you need a certain capacity, it comes with this strictly computed number

of inodes. Bummer for 32-bit.
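To make the overflow concrete, here is a minimal Python sketch. It assumes the inode count is simply the capacity divided by a 512-byte unit, as described above; the function names and the sample capacities are illustrative only.

```python
UNIT = 512  # bytes per reported "inode", per the description above
GIB = 2**30


def inode_count_for(capacity_bytes, unit=UNIT):
    """Inode count reported if capacity is expressed in `unit`-byte units."""
    return capacity_bytes // unit


def overflows(count, bits=32, signed=False):
    """True if `count` does not fit in an integer of the given width."""
    limit = 2**(bits - 1) - 1 if signed else 2**bits - 1
    return count > limit


if __name__ == "__main__":
    for cap_gib in (500, 538, 1024, 2048, 4096):
        count = inode_count_for(cap_gib * GIB)
        print(f"{cap_gib:5d} GiB -> {count:>12d} inodes, "
              f"signed overflow: {overflows(count, signed=True)}, "
              f"unsigned overflow: {overflows(count)}")
```

Under these assumptions the count first exceeds a signed 32-bit value at 1 TiB and an unsigned one at 2 TiB, so the sketch illustrates the mechanism rather than explaining the exact 538G threshold seen in this thread.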

fwiw

-- Peter

1 Rookie

 • 

28 Posts

August 1st, 2017 13:00

Thank you all for the input on this.

Chris K, we did enable the "32-bit-file-ids" option and that got us halfway. After doing so the application was able to perform reads when the NFS share was any size, but still not writes. Those were still capped at a 538G quota.

So in the end it sounds like this is just not going to work on Isilon. (At least that is the application vendor's last word.)

So it works on NetApp (any size volume) because of the 4KB block size and you can independently adjust inodes for a volume?

6 Operator

 • 

1.2K Posts

August 2nd, 2017 02:00

You can see the inode counts with "df -i"; that should reveal significant differences.

-- Peter

450 Posts

August 2nd, 2017 07:00

Moving the data to NetApp may not solve your problems either; this comes down to the app vendor enhancing their application to deal with larger filesystems.  If you're a big-enough customer of theirs I'd personally push them on it.  It's not necessarily anything wrong with how Isilon is doing things, instead just that the scale of a cluster is a different order of magnitude larger than what they're used to.  And as Peter pointed out, there's a lot of nuance to the exact figure that you'll see, but the point is that there are plenty of other applications out there that have zero problem with how this works on Isilon today.

One other possible option to consider.  Does the application support data access over SMB instead of NFSv3?  Could you therefore use a service account and access the same dataset over SMB, and would it still have the same issue?  It's a bit of an out-of-the-box solution, but maybe it'd work.  If you have a stage environment perhaps it's worth a shot.

~Chris

1 Rookie

 • 

28 Posts

August 2nd, 2017 08:00

Chris, not a bad idea about the SMB, I will look into that.

Also, the share is currently on NetApp CDOT and works fine with a 1T volume. We are migrating all our shares from NetApp to Isilon, so we are stuck with this last share to move.

Plan B is going back to using SAN disk (which they were using back before my time).

-Brian
