1 Rookie
•
121 Posts
0
1998
March 6th, 2015 18:00
space calculation
For 1 billion 1 KB files with N+3 protection on NL nodes, how much space will be consumed? Could you please let me know whether my calculation is correct?
My calculation: (1 x 8 KB x 4) + (512 B x 4) = 34 KB per file ----> 34 TB for 1 billion files


kipcranford
125 Posts
0
March 9th, 2015 08:00
A 1 KiB file will be effectively mirrored by OneFS (any file <= 128 KiB will be).
The OneFS block size is 8 KiB.
At N+3, you need 4 copies of the data to satisfy the protection policy.
So, the 1 KiB of raw data will consume a full 8 KiB OneFS block, and that block will be mirrored 3 additional times to satisfy your protection policy. That's four 8 KiB blocks, or 32 KiB total for the data, plus 2 KiB for the inodes (four 512 B inode copies), or 34 KiB for each file.
So yes, you have it correct...
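For anyone who wants to replay the arithmetic above, here is a rough Python sketch. It only encodes what is stated in this thread (8 KiB blocks, 512 B inodes, files <= 128 KiB mirrored, protection level + 1 copies); the function and constant names are made up for illustration, and it ignores any other metadata overhead, so treat it as an estimate rather than how OneFS actually accounts for space.

```python
KIB = 1024
BLOCK_SIZE = 8 * KIB          # OneFS block size, per the thread
INODE_SIZE = 512              # bytes per inode copy, per the thread
MIRROR_THRESHOLD = 128 * KIB  # files up to this size are effectively mirrored


def small_file_footprint(file_size, protection_level):
    """Estimated bytes consumed by one mirrored small file (hypothetical helper)."""
    if file_size > MIRROR_THRESHOLD:
        raise ValueError("this sketch only covers files <= 128 KiB")
    copies = protection_level + 1                # N+3 -> 4 copies
    data_blocks = -(-file_size // BLOCK_SIZE)    # ceiling division
    data_bytes = data_blocks * BLOCK_SIZE * copies
    inode_bytes = INODE_SIZE * copies
    return data_bytes + inode_bytes


per_file = small_file_footprint(1 * KIB, protection_level=3)
total = per_file * 1_000_000_000

print(per_file / KIB, "KiB per file")
print(total / 10**12, "TB (decimal)")
print(total / 2**40, "TiB (binary)")
```

Note that 34 KiB per file times 1 billion files is a bit under 35 TB in decimal units, and a bit under 32 TiB in binary units, so be clear which unit a capacity quote uses.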
Peter_Sero
6 Operator
•
1.2K Posts
0
March 7th, 2015 01:00
If this is a real project, better work directly with Isilon to get the overall usability (metadata performance!) taken care of...
It's usually not the capacity that matters most in many-small-files scenarios
my 2 cents
-- Peter
Peter_Sero
6 Operator
•
1.2K Posts
0
March 9th, 2015 08:00
We found in OneFS 6.5 that the metadata was increased in steps of 8KB (per copy) after the initial 512B. That can become quite painful when metadata is kept on SSD, and then lots of files are affected at the same time (deletion of many snapshot-protected small files!)
Is that unit of increment for metadata still 8KB in 7.1.1 and 7.2?
-- Peter
kipcranford
125 Posts
2
March 9th, 2015 08:00
> We found in OneFS 6.5 that the metadata was increased in steps of 8KB (per copy) after the initial 512B.
The inode itself is 512 B, and has a limited amount of space within it to store what we call "dynamic attributes" (you can see these if you dump the inode structure using 'isi get -DD'), of which there are several types. When this attribute space within the inode is full such that block addresses cannot be inlined, OneFS creates a pointer to an extension block. That extension block is a full 8 KiB, and is used only to store other block addresses. In this case, it's called a "single indirect" extension block; OneFS also supports double and triple indirect extension blocks, if they're needed.
Inodes, though, are always allocated in 512 B chunks.
> Is that unit of increment for metadata still 8KB in 7.1.1 and 7.2?
Yes, the increment is the same. However, the point at which OneFS starts creating indirects is a bit harder to predict, since it depends a bit on what's in the inode.
And yes, you're correct, all this metadata (the inode(s), the extension blocks, if any, the CRCs, etc.) gets stored on SSD. So it's not just the single 512 B mirror, it could be a lot more data. This has ramifications, for example, when trying to size the SSD capacity for one of our "SSD storage" strategies (e.g. "metadata-write", or all metadata on SSD), since you have to consider all this data. That's one reason why L3 cache can be an attractive alternative to an SSD storage paradigm, since it removes some of these sizing complexities...
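To get a feel for why the extension blocks matter for SSD sizing, here is a small Python sketch. It assumes each metadata copy holds one 512 B inode plus zero or more 8 KiB extension blocks (the "per copy" increment mentioned earlier in the thread). The helper name and the `extension_blocks` parameter are hypothetical, and CRCs and other metadata are deliberately left out, so this is a lower bound at best.

```python
INODE_SIZE = 512                  # bytes, per the thread
EXTENSION_BLOCK_SIZE = 8 * 1024   # bytes, per the thread


def metadata_bytes_per_file(mirrors, extension_blocks=0):
    """Rough metadata footprint of one file (hypothetical helper).

    mirrors:          number of metadata copies (e.g. 4 at +3)
    extension_blocks: assumed count of 8 KiB extension blocks per copy
    """
    per_copy = INODE_SIZE + extension_blocks * EXTENSION_BLOCK_SIZE
    return mirrors * per_copy


files = 1_000_000_000
inline_only = metadata_bytes_per_file(mirrors=4) * files
one_indirect = metadata_bytes_per_file(mirrors=4, extension_blocks=1) * files

print(inline_only / 10**12, "TB with inodes only")
print(one_indirect / 10**12, "TB with one extension block per copy")
```

Under these assumptions a single extension block per copy multiplies the per-file metadata by 17, which is the kind of jump that makes SSD sizing hard to predict.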
Peter_Sero
6 Operator
•
1.2K Posts
0
March 9th, 2015 09:00
Great explanation!
Sizing SSDs for metadata is tricky indeed, as we found during a mass removal with snapshots. The cluster sends out warnings at 75% SSD usage; and I must say, leaving another 25% as headroom is definitely a wise decision.
As long as you stay under 100% you know that your metadata is on SSD -- and this is what leaves me sceptical about L3 for metadata: you cannot (can you?) know how much has been dropped out of the cache already. If that turns out to be the other billion of your two billion files, then there will be a surprise waiting at the next FlexProtect, MultiScan or tree-walk style client access...
-- Peter
kipcranford
125 Posts
0
March 9th, 2015 10:00
> You cannot (can you?) know how much has been dropped out of the cache already
Correct, there is no way to really know this. And true, metadata for frigid data can indeed be bumped out of L3 cache (I wasn't really exploring performance considerations of L3 vs. SSD storage). This does highlight that L3 doesn't completely eliminate sizing considerations; there are still tradeoffs...
Peter_Sero
6 Operator
•
1.2K Posts
0
March 9th, 2015 10:00
Thanks again!
kipcranford
125 Posts
0
March 16th, 2015 08:00
> How are you referring that we need 4 copies to protect the data for N+3 policy?
The original question asked about 1 KiB files. In that case, the file data consumes one 8 KiB block in OneFS. Since this is <= 128 KiB, OneFS will effectively mirror this data in order to satisfy the defined protection policy. A protection policy of +3 means you want to survive 3 drive failures or 3 node failures without losing access to your data. The only way for OneFS to ensure this is to have 4 copies of the 8 KiB block available -- one 8 KiB block of the original file data + 3 mirror copies of that data.
sandyreddy
3 Posts
0
March 16th, 2015 08:00
> At N+3, you need 4 copies of the data to satisfy the protection policy.
How are you referring that we need 4 copies to protect the data for N+3 policy?