I'm looking for a way to charge back customers by node class (e.g. H500/A200).
In our example we have an A200/H500 cluster. The end customer has one directory, which is spread across the two node classes via SmartPools (policy based: 90 days since last accessed). Now I need a way to monitor how much space the customer is consuming on each node class within that one directory.
Splitting the directory and pinning each part to one node pool is not possible.
I think running isi get -D on all files via a script and grepping for the pool is too slow; too many files.
Any ideas? Thank you
Have you checked out File System Analytics in InsightIQ?
With OneFS 8.0 and later, the directory usage statistics
can be filtered by node pools and by tiers.
As a less efficient alternative, but one that is much easier to set up and whose results are much easier to process:
isi filepool apply -s -r /ifs/path/to/directory
The output is organised by file pool policies, which you need to map
to the node pools according to the actual policy configs.
Check out the byte counts in the lines 'File data placed on HDDs' (and '... on SSDs' if relevant).
A couple of notes, though:
- Just like the SmartPools and SmartPoolsTree jobs, this command will restripe/migrate files, so
it might not necessarily be faster than the isi get + grep approach; but simple as it is, it's worth a try.
- Run it right after a regular SmartPools or SmartPoolsTree job has finished, to keep the amount
of restriping/migrating as low as possible.
- Using the -n or -d options to suppress restriping/migrating sadly nullifies the 'File data placed...' statistics.
- Unfortunately, both the SmartPools and SmartPoolsTree jobs always omit the 'File data placed...' statistics,
even on regular (i.e. restriping) runs.
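To give an idea of how those statistics could be tallied, here is a minimal sketch that sums up the 'File data placed on ...' byte counts. The sample text is made up; the exact output layout of 'isi filepool apply -s -r' varies by OneFS version, so check the real output on your cluster and adjust the regex accordingly.

```python
import re

# Hypothetical excerpt of 'isi filepool apply -s -r' statistics output;
# illustrative only -- verify the real format on your OneFS version.
sample_output = """\
File data placed on HDDs: 1073741824 bytes
File data placed on SSDs: 52428800 bytes
"""

def sum_placed_bytes(text):
    """Sum the byte counts of all 'File data placed on ...' lines, per medium."""
    totals = {}
    for medium, count in re.findall(
            r"File data placed on (HDDs|SSDs):\s*(\d+)\s*bytes", text):
        totals[medium] = totals.get(medium, 0) + int(count)
    return totals

print(sum_placed_bytes(sample_output))
# {'HDDs': 1073741824, 'SSDs': 52428800}
```

Feed it the captured command output (e.g. via a pipe) instead of the sample string.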
Thank you for your answer.
With InsightIQ it is not possible to get this information, because "File System Analytics" cannot show a directory broken down by node pools (or I can't find it). Even with the iiq_export tool you get only the same data. I also reviewed the raw InsightIQ database; the needed information (the mapping between file/directory and node pool) is there, but only for the 1000 largest files.
Your approach with "isi filepool" sounds interesting, but how can you match node pools with file pool policies? We have, for example, an archive policy and a default file pool policy; where is the link between the "hardware" node class and the file pool policy? Do I then have to "pin" the file pool policy to a node pool?
Chris, 'Nodepools' in InsightIQ is not a breakout but a filter, so it takes multiple queries in sequence, one per node pool.
In the WebGUI the "Apply" button seems to be broken: I only see filters applied
when I change selections further down the page. And the pie chart and the per-directory stats next to it
never seem to get filtered.
Have a look at the usage histograms for logical or physical size to see the
effect of multiple filters applied (directory, node pools).
(I have no node pools on my virtual Isilon at the moment.)
Calculate the approximate usage by multiplying each bucket's file count by the average 'bucket' value,
and accumulate over all buckets;
probably not exact enough for chargebacks, though.
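The bucket arithmetic above can be sketched like this; the bucket averages and file counts are made-up illustration values, not real InsightIQ data:

```python
# Approximate usage from size-histogram buckets: multiply each bucket's
# file count by a representative (average) file size and sum over buckets.
# All numbers below are invented for illustration.
buckets = [
    # (avg_size_bytes, file_count)
    (8 * 1024, 120_000),        # small files
    (512 * 1024, 30_000),       # medium files
    (64 * 1024 * 1024, 1_500),  # large files
]

approx_bytes = sum(avg * count for avg, count in buckets)
print(f"approx usage: {approx_bytes / 1024**3:.1f} GiB")
# approx usage: 109.3 GiB
```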
As for the really exact 'isi filepool apply' route:
each file pool policy has either a node pool, a tier, or an implicit default target.
Extracting these from the SmartPools config programmatically would certainly require some effort.
For a quick start, I'd inspect the SmartPools config in the WebGUI, and extract by hand
the rule->pool mappings that can apply to the directories in question.
This little table can get hardcoded into a small proof-of-concept script
that parses the 'isi filepool apply' statistics. Does that make sense?
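Such a proof-of-concept could look like the sketch below: a hand-maintained policy-to-pool table, plus a roll-up of per-policy byte counts into per-node-pool totals. All policy and pool names, and the byte counts, are hypothetical; the map would come from inspecting your actual SmartPools config, and the byte counts from parsing the 'isi filepool apply' statistics (parsing omitted here).

```python
# Hand-maintained mapping from file pool policy to its target node pool
# (hypothetical names -- copy the real rule->pool mappings from the WebGUI).
POLICY_TO_POOL = {
    "archive_policy": "a200_pool",
    "default_policy": "h500_pool",
}

# Per-policy byte counts as they would be parsed from the
# 'isi filepool apply -s -r' statistics (values invented for illustration).
policy_bytes = {
    "archive_policy": 2_000_000_000,
    "default_policy": 5_000_000_000,
}

def bytes_per_pool(policy_bytes, policy_to_pool):
    """Roll up byte counts from file pool policies to their target node pools."""
    totals = {}
    for policy, nbytes in policy_bytes.items():
        pool = policy_to_pool.get(policy, "unknown")
        totals[pool] = totals.get(pool, 0) + nbytes
    return totals

print(bytes_per_pool(policy_bytes, POLICY_TO_POOL))
# {'a200_pool': 2000000000, 'h500_pool': 5000000000}
```

Policies that target a tier rather than a single node pool would need an extra tier-to-pools step, which this sketch deliberately leaves out.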