November 13th, 2016 23:00
OneFS capacity after smartfail
The OneFS 7.1 user guide states: "After all data migration is complete, OneFS logically removes the device from the cluster, the cluster logically changes its width to the new configuration"
Does it mean the usable capacity of OneFS will be decreased accordingly after a disk was smart-failed and removed from the node?
If no new disk is added back to the node afterwards, are the existing data in the cluster protected?
Provided that we keep the OneFS at most 70% full, can we keep the cluster running protected with a few disks removed via smart-fail?
Your advice and comments will be very much appreciated.
Thanks
Peter_Sero
November 14th, 2016 04:00
> Does it mean the usable capacity of OneFS will be decreased accordingly after a disk was smart-failed and removed from the node?
Yes, see isi status or df /ifs.
> If no new disk is added back to the node afterwards, are the existing data in the cluster protected?
Yes, and it's even better: you never leave full protection during a smartfail.
Smartfail means a graceful decommissioning of a drive.
It occurs on two occasions: a) deliberate removal or b) automatic pro-active removal.
In both cases, the content of the drive is still accessible and considered 'good'
while the data is migrated off it.
This is in contrast to an outage of a faulty drive, or a deliberate 'stopfail',
where the cluster is underprotected until the FlexProtect job finishes.
> Provided that we keep the OneFS at most 70% full, can we keep the cluster running protected with a few disks removed via smart-fail?
Absolutely. You will have to do the exact math for your cluster, but 70% would
usually leave headroom for 'a few drives' indeed, and in many cases even for
smartfailing a full node.
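A rough way to do that math, as a minimal sketch: the amount of stored data stays the same while raw capacity shrinks with every removed drive. The function names, the 36 drives per node, the 4 TB drive size and the 90% ceiling are illustrative assumptions, not OneFS values, and the model ignores protection overhead and any reserved space.

```python
def utilization_after_removal(total_drives, drive_tb, used_fraction, removed):
    """Fraction of raw capacity in use after `removed` drives are taken out.

    Simplified model (assumption): all drives are the same size and the
    amount of stored data does not change during the smartfail.
    """
    used_tb = total_drives * drive_tb * used_fraction
    remaining_tb = (total_drives - removed) * drive_tb
    return used_tb / remaining_tb


def max_removable_drives(total_drives, drive_tb, used_fraction, limit):
    """Largest number of drives that can go while staying at or below `limit`."""
    removed = 0
    while (removed + 1 < total_drives and
           utilization_after_removal(total_drives, drive_tb,
                                     used_fraction, removed + 1) <= limit):
        removed += 1
    return removed


# Hypothetical cluster: 6 nodes x 36 drives of 4 TB each, 70% full.
drives = 6 * 36
print(utilization_after_removal(drives, 4.0, 0.70, 3))    # a few drives gone
print(utilization_after_removal(drives, 4.0, 0.70, 36))   # one full node gone
print(max_removable_drives(drives, 4.0, 0.70, 0.90))      # 90% ceiling is an assumption
```

In this hypothetical layout, removing a whole node out of six moves the cluster from 70% to about 84% full, which is why a full-node smartfail can still fit. Compare the result with what isi status reports on the real cluster.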
You have just uncovered what largely makes the magic of OneFS.
And it all works identically with a virtual cluster aka simulator:
seeing is believing.
Cheers
-- Peter
juilian
November 14th, 2016 20:00
Thank you for your expert advice.
I have some follow-up questions.
Suppose we let any failed disks be removed from our 6-node OneFS cluster via FlexProtect/smartfail and do not add new disks afterwards. The number of disks may then differ from node to node. Will such an imbalance reduce stability, for example through a lower protection level or lower performance?
Thanks.
Peter_Sero
November 15th, 2016 04:00
Protection always comes first! Unless FlexProtect fails, files are protected.
When new files can't be protected, writes are denied.
At the same time, OneFS tries to keep all data balanced, and
this leads to 'interesting' effects if node capacities are unequal due to removed disks.
Drives in affected nodes become full early, and using up the last free
blocks leads to fragmentation which usually impacts performance.
Another effect, thinking of an extreme case, is wasting capacity when a single
node has more remaining disks than the other nodes. That single node's
excess capacity cannot be fully used up, because no other nodes
provide matching space for striping data according to the requested protection level.
Data writes would be denied, despite OneFS still reporting *less* than 100%
raw capacity used. Makes sense (without a picture)?
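In place of a picture, here is a toy model of that extreme case. It is only a sketch under a strong assumption: every stripe must place exactly one unit on each of `width` different nodes. This is not the actual OneFS layout algorithm, and `stripe_capacity` and the capacities used are hypothetical.

```python
def stripe_capacity(node_free, width):
    """Return (usable, stranded) capacity units for a toy striping model.

    Assumption: each stripe takes one unit from each of `width` distinct
    nodes. Stripes are placed greedily on the nodes with the most free space.
    """
    free = list(node_free)
    stripes = 0
    while True:
        free.sort(reverse=True)
        # Stop when fewer than `width` nodes still have a free unit.
        if len(free) < width or free[width - 1] < 1:
            break
        for i in range(width):
            free[i] -= 1
        stripes += 1
    usable = stripes * width
    stranded = sum(node_free) - usable
    return usable, stranded


# Six nodes; one node has twice the remaining capacity of the others.
usable, stranded = stripe_capacity([20, 10, 10, 10, 10, 10], width=6)
print(usable, stranded)
print(usable / (usable + stranded))  # raw fill level at which writes stop
```

In this model the five smaller nodes fill up first, the extra space on the larger node is stranded, and writes stop while raw usage is still below 100%.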
-- Peter