Re: Ask the Expert: Isilon Performance Analysis
We monitor the drives for failure and stall rates.
disi -I hwhealth ls # lists ECC and STALL occurrences.
If a drive fails often enough it will be smartfailed, at which point we no longer target it for new writes. If the drive is healthy and eligible for a write based on protection level and free space, we will write to it.
sysctl efs.lbm.drive_space # local to each node, reports the total blocks and used block space.
NOTE: When you replace a failed drive, it is important to allow MultiScan (Collect and AutoBalance) to run. It is ideal to have all drives
OneFS's disk scale-out model does a good job of leveraging all the disks in the system. If a drive has too many queued I/Os, we will still queue to it. When you are in this state, run isi statistics drive -nall --long --orderby=timeinq | head -14, then repeat with tail -14 in place of head -14; you should notice that your top 10 and bottom 10 drives have a uniform degree of queued I/O.
See isi statistics <sub-command> --help for the available options. The sub-commands support --csv, a comma-separated output format, which is the closest thing to a means of exporting data to Excel or another tool.
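If you would rather not eyeball the head and tail output, the top 10 versus bottom 10 comparison can be scripted against the --csv output. The following is only a rough Python sketch: the column names (Drive, TimeInQ), the assumption that the values are plain numbers, and the sample data are all made up for illustration, so check the header row of your own output and adjust.

```python
import csv
import io

# Hypothetical sample standing in for saved `isi statistics drive ... --csv`
# output. The column names and values are assumptions, not real cluster data.
SAMPLE = """Drive,TimeInQ
1:1,4.0
1:2,2.0
2:1,3.0
2:2,1.0
3:1,2.0
3:2,2.0
"""


def queue_spread(csv_text, column="TimeInQ", n=10):
    """Return (top_avg, bottom_avg) of `column` for the n busiest and
    n least busy drives found in the CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no drive rows found in CSV text")
    # Sort queue times from highest to lowest.
    values = sorted((float(row[column]) for row in rows), reverse=True)
    top, bottom = values[:n], values[-n:]
    return sum(top) / len(top), sum(bottom) / len(bottom)


if __name__ == "__main__":
    top_avg, bottom_avg = queue_spread(SAMPLE, n=2)
    print(f"top avg: {top_avg}  bottom avg: {bottom_avg}")
```

If the two averages are close, queued I/O is spread fairly evenly across the drives; a large gap points at a handful of drives carrying most of the queue. To use it on real data, save the --csv output to a file and pass the file contents to queue_spread in place of SAMPLE.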
The latest version of InsightIQ, the fully fledged performance and trend analysis tool, is another option to consider for reporting.