PowerScale-CloudPools lnn rename cookie reservation
Summary: Renaming nodes to previously used Logical Node Numbers (LNNs) can cause cookie key reservation files to be reused incorrectly. This increases the risk of multiple LINs/files writing to the same cloud data objects (CDOs). ...
Symptoms
The following log signatures can identify this issue:
In the idi.log:
IDI_VERIFY=bcm_verify_invalidate_on_valid_storage_layer:577 | COND=BAM Cache Manager invalidate verification | MSG=Cache invalidation is attempted over invalid storage layer: range lbn 512-767 found chunk LBNS: [512,767] snap HEAD revec: [0,0,35184372056064:8192#254 from snap HEAD, (sparse)#2 from snap HEAD ] | LINSNAP= 1:1111:1111
Review the isi_cpool_d.log for any of the errors below:
Range Error:
failed due to error code=12, msg=clapi error: CL_ABORTED_BY_CALLBACK; failed to transfer object range, invalid offset or output stream.
A CloudPools integrity error can be encountered during a cloud recall:
CloudPools data integrity error and CL_CHECKSUM_MISMATCH: failed to match the checksum: [error code: CBM_INTEGRITY_FAILURE]
Review /var/log/messages for the error below:
Failed Assertion:
isi_cpool_d: *** FAILED ASSERTION res == 0 @ /b/mnt/src/isilon/lib/isi_cpool_cbm2/src/ncoi.c:1025:
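The signatures above can be searched for with grep. The following is a minimal sketch, not a supported tool: the function name scan_lnn_signatures is hypothetical, and the log locations in the usage comment (other than /var/log/messages) are assumptions that may differ on your cluster. A match is only a hint that the issue may be present, because some of these strings can also appear for unrelated reasons.

```shell
#!/bin/sh
# Hypothetical helper: print every line in the given log files that contains
# one of the signature strings listed in the Symptoms section.
scan_lnn_signatures() {
    # -F: treat patterns as fixed strings
    # -H: always prefix matches with the file name
    grep -H -F \
        -e 'bcm_verify_invalidate_on_valid_storage_layer' \
        -e 'CL_ABORTED_BY_CALLBACK' \
        -e 'CL_CHECKSUM_MISMATCH' \
        -e 'CBM_INTEGRITY_FAILURE' \
        -e 'FAILED ASSERTION res == 0' \
        "$@" 2>/dev/null
}

# Example (the first two paths are assumptions):
# scan_lnn_signatures /var/log/idi.log /var/log/isi_cpool_d.log /var/log/messages
```

The function returns a non-zero status when nothing matches, so it can be used directly in an if statement.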
Cause
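This is the result of cpool_fd_store using the LNN in the filenames of cookie key reservation files. When a node takes over an LNN that another node used previously, it can reuse the reservation files that were created under that LNN.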
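To see which LNNs currently have reservation files, the file names can be listed and grouped by their trailing LNN. The sketch below assumes the naming implied by the removal commands in this article (cookie_res_*_<LNN> and ncoi_key_res_*_<LNN> under /ifs/.ifsvar/modules/cloud); the middle part of the name is not described here and is treated as opaque. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the trailing LNN of a reservation file name,
# assuming the name ends in _<LNN>.
res_file_lnn() {
    name=$(basename "$1")
    printf '%s\n' "${name##*_}"   # strip everything up to the last underscore
}

# Hypothetical helper: list reservation files as "<LNN><TAB><path>",
# sorted numerically by LNN. The default directory is taken from this article.
list_res_by_lnn() {
    dir=${1:-/ifs/.ifsvar/modules/cloud}
    for f in "$dir"/cookie_res_*_* "$dir"/ncoi_key_res_*_*; do
        [ -e "$f" ] || continue   # skip a glob pattern that matched nothing
        printf '%s\t%s\n' "$(res_file_lnn "$f")" "$f"
    done | sort -n
}

# Example:
# list_res_by_lnn
```

An LNN that appears in this listing but is no longer assigned to the node that created the files is the situation described above.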
Resolution
If CloudPools has never been licensed on the cluster but the health check receives the critical alert below, it is safe to ignore:
"Your CloudPools are susceptible to data integrity issues. If a PowerScale node's Logical Node Number (LNN) is changed and another node claims the vacated LNN, it could lead to cookie key reservation files being reused incorrectly and can increase the risk of multiple LINs/Files writing to the same Cloud Data Objects(CDOs)."
Renumbering Logical Node Numbers (LNNs) can impact CloudPools functionality. Perform the steps below to renumber LNNs properly.
To renumber LNNs, see: KB 000022252
If there is an active CloudPools job running on the cluster, the following steps must also be completed. Failing to do so creates a risk of data loss.
(Record the output of all commands run from this KB.)
The CloudPools daemon must be disabled, the LNN renumbered, and the CloudPools daemon enabled again.
1. Disable the CloudPools daemon.
#isi_for_array isi services -a isi_cpool_d disable
2. Verify the isi_cpool_d daemon has been stopped:
#isi_for_array ps -lwp `pgrep isi_cpool_d`
3. Renumber the LNNs:
# isi config
>>> lnnset [<old lnn> <new lnn>]
>>> isi_lcd_d restart
>>> commit
>>> exit
4. Remove the existing reservation files for both the Old and New LNNs.
#rm -fv /ifs/.ifsvar/modules/cloud/cookie_res_*_[OLD_LNN]
#rm -fv /ifs/.ifsvar/modules/cloud/cookie_res_*_[NEW_LNN]
#rm -fv /ifs/.ifsvar/modules/cloud/ncoi_key_res_*_[OLD_LNN]
#rm -fv /ifs/.ifsvar/modules/cloud/ncoi_key_res_*_[NEW_LNN]
5. Enable the isi_cpool_d daemon.
#isi_for_array isi services -a isi_cpool_d enable
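Because step 4 deletes files, it can help to list the matching reservation files first and record that output before removing anything. The sketch below is a hypothetical wrapper around the rm commands in step 4; the function name remove_res_files and the DRY_RUN variable are illustrative and not part of OneFS. It defaults to a dry run and only deletes when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper for step 4: list, and optionally remove, the reservation
# files for one LNN. Runs as a dry run unless DRY_RUN=0 is set.
remove_res_files() {
    lnn=$1
    dir=${2:-/ifs/.ifsvar/modules/cloud}
    # Accept only a non-empty, purely numeric LNN
    case $lnn in
        ''|*[!0-9]*)
            echo "usage: remove_res_files <lnn> [dir]" >&2
            return 2
            ;;
    esac
    for f in "$dir"/cookie_res_*_"$lnn" "$dir"/ncoi_key_res_*_"$lnn"; do
        [ -e "$f" ] || continue   # skip a glob pattern that matched nothing
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "would remove: $f"
        else
            rm -fv "$f"
        fi
    done
}

# Example: review the list for the old and new LNN, then remove.
# remove_res_files 3
# remove_res_files 7
# DRY_RUN=0 remove_res_files 3
# DRY_RUN=0 remove_res_files 7
```

The glob suffix is matched exactly, so an LNN of 1 does not match files that end in _11.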
If LNN renumbering has previously occurred, perform the commands below:
1. Stop the CloudPools daemon.
#isi_for_array isi services -a isi_cpool_d disable
2. Verify the isi_cpool_d daemon has been stopped:
#isi_for_array ps -lwp `pgrep isi_cpool_d`
3. Wait 10 seconds.
4. Enable the CloudPools daemon.
#isi_for_array isi services -a isi_cpool_d enable
Failing to do so induces the risk of data loss.
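The verification step can also be done with a polling check instead of a single ps call. The sketch below is a hypothetical helper, not part of OneFS: it uses pgrep to wait until no process with the given name is running. It checks only the node it runs on, so it would need to be run on every node, and it does not replace the isi_for_array verification command above.

```shell
#!/bin/sh
# Hypothetical helper: wait until no process named $1 is running on this node.
# $2 is the maximum number of one-second checks (default 30).
# Returns 0 once the process is gone, 1 if it is still running at the end.
wait_until_stopped() {
    name=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        # pgrep -x matches the process name exactly; non-zero means no match
        pgrep -x "$name" >/dev/null 2>&1 || return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Example, run after disabling the daemon:
# wait_until_stopped isi_cpool_d 60 || echo "isi_cpool_d is still running" >&2
```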
Additional Information