
June 28th, 2013 12:00

File Count Per Node Recommendations

Looking for recommendations on the number of files per node. I have numerous existing customers with massive challenges getting jobs to complete; in some cases, certain jobs have never successfully run. I have them all engaged with technical support, but the one trend I've noticed is that my predecessor didn't take into consideration the number of files that needed to be supported and instead focused on the capacity required. On a hunch, we've remedied one of these accounts by adding additional nodes to the cluster. Obviously, this isn't a fix I'd like to offer up often, so I'd like to know if there is any good estimate of files supported per node. Or rather, where should I look at leveraging smaller drives to manipulate the CPU/TiB ratio?
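To make that last question concrete, here's the kind of back-of-envelope math I'm doing. The node capacities and core counts below are made up for illustration, not real Isilon models:

# Back-of-envelope CPU-per-TiB comparison for two made-up node types.
# All capacities and core counts are hypothetical.

TARGET_TIB = 400.0   # usable capacity the customer needs

node_types = {
    # name: (usable TiB per node, CPU cores per node)
    "larger drives":  (36.0, 8),
    "smaller drives": (18.0, 8),   # same CPU, half the capacity per node
}

for name, (tib_per_node, cores) in node_types.items():
    nodes = -(-TARGET_TIB // tib_per_node)     # ceiling division
    total_cores = nodes * cores
    print("%s: %2d nodes, %3d cores, %.2f cores/TiB"
          % (name, nodes, total_cores, total_cores / TARGET_TIB))

Same usable capacity, but the smaller-drive config lands at roughly twice the cores per TiB, which is the lever I'm asking about.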

August 30th, 2013 15:00

There is no best ratio of files per node, or TiB per CPU. The common response to the question will be "it depends on the workflow and data layout". You may have billions of files in an archive, or transient data in scratch space, where Isilon operates just fine, or a much smaller data set that is problematic. You have correctly identified the real challenge in the Job Engine, which is serialized (runs one job at a time), and jobs are often paused by higher-priority jobs starting. The result is that very useful and necessary jobs do not complete in a timely fashion, sometimes not at all.

Scaling CPU can help by scaling Job Engine threads, and SSDs can assist with the metadata operations of the Job Engine; however, the solution (we hope) will be the improvements in the Job Engine itself. OneFS 7.1 will include Job Engine 2.0, which can run multiple concurrent jobs (3), thereby improving the efficiency and effectiveness of internal operations.
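To see why even a few concurrent slots matter, here is a toy scheduling model. The job names are real OneFS jobs, but the priorities, work units, arrival pattern and scheduling logic are all invented for illustration; this is not how the Job Engine actually schedules work:

# Toy model: serialized job engine vs. one with a few concurrent slots.

def simulate(slots, total_ticks=120):
    # [priority, name, remaining work units]; lower number = higher priority
    jobs = [[1, "FlexProtect", 30],
            [2, "SmartPools", 40],
            [5, "MultiScan", 25]]
    finished = {}
    for tick in range(total_ticks):
        # every 25 ticks another high-priority job arrives and preempts
        if tick and tick % 25 == 0:
            jobs.append([1, "FlexProtect-%d" % tick, 30])
        jobs.sort()                       # highest priority first
        for job in jobs[:slots]:          # only `slots` jobs make progress
            job[2] -= 1
        for job in [j for j in jobs if j[2] == 0]:
            finished[job[1]] = tick
            jobs.remove(job)
    return finished

print("serialized (1 slot): ", simulate(1))
print("concurrent (3 slots):", simulate(3))

With one slot, the low-priority MultiScan never completes within the window because a fresh high-priority job keeps preempting it; with three slots it finishes early despite the same preemptions.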


August 20th, 2013 19:00

Hi,

Are you referring to the number of files (inodes) that can be hosted on a node, or asking how many files can be actively open on a particular node at one time?

If it's the latter, then we would also need to know which version of OneFS you are running, as this limit is specific to each OneFS version.

Robert


December 12th, 2013 15:00

This OneFS Limits document can help with setting appropriate expectations and properly planning the cluster configuration:

https://support.emc.com/docu50238_Isilon-Guidelines-for-Large-Workloads-.pdf?language=en_US


December 13th, 2013 02:00

Fantastic document, but the recommendations for file and directory counts mainly address the end users' view. The Isilon admin also needs to understand the impact of the file count on the performance of the restripe jobs. You can be perfectly within the limits recommended in this paper, yet find yourself in trouble with the FlexProtect, AutoBalance and SmartPools jobs taking too long (weeks...). Additional advice on file count is needed to keep restripe times within acceptable limits on a given cluster. On the same cluster, the same total amount of data will lead to different restripe performance depending on whether it is spread across all 1 MB files or all 1 GB files, and it would be good to be able to predict that performance when planning a cluster.

Would be glad to see some guidelines from Isilon for this.

So far, in the cases from the community where clusters were stuck in restripes, the common denominator I observed was that the ratio of "total files in node pool" to "total number of disks in node pool" (all SATA, no SSD) far exceeded 1 million : 1.
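For illustration, here is the quick check I do when planning; the pool size and disk count below are made up:

# Quick planning check: file count and files-per-disk ratio for a
# hypothetical node pool. The sizes are illustrative, not guidance.

TIB = 2 ** 40                     # bytes per TiB

pool_data_tib = 200               # data stored in the node pool
disks_in_pool = 6 * 12            # e.g. 6 nodes with 12 SATA drives each

for avg_file_size in (2 ** 20, 2 ** 30):          # 1 MiB vs 1 GiB average
    files = pool_data_tib * TIB // avg_file_size
    print("avg %4d MiB -> %11d files -> %9.0f files per disk"
          % (avg_file_size / 2 ** 20, files, files / disks_in_pool))

# 1 MiB average: ~2.9 million files per disk (well past 1 million : 1)
# 1 GiB average: ~2,800 files per disk

Same 200 TiB of data, but only the small-file layout crosses the ratio where I have seen clusters get stuck.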

Cheers

-- Peter
