PowerScale: How to back up and delete SyncIQ reports that cause slow WebUI and CLI
Summary: This article provides detailed instructions for cleaning up SyncIQ reports to maintain optimal system performance.
Symptoms
By design, SyncIQ generates a report for every job execution, including failed jobs. OneFS retains both reports and subreports for each job instance. When the number of SyncIQ reports accumulates beyond the default limit of 2,000, system performance can be adversely affected. This condition may result in degraded responsiveness within the OneFS Web Administration Interface (WebUI) and slower execution of SyncIQ CLI commands.
Cause
The number of SyncIQ reports on the cluster exceeds the default maximum of 2,000.
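To confirm this condition before starting remediation, the report files can be counted directly on disk. The following is a minimal sketch, not an official tool. It reuses the reports path and the `report-*.gc` file pattern from the Resolution section; the function names are hypothetical. It also assumes that the on-disk file count approximates the number of reports OneFS counts against the limit, which this article does not confirm.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative and not part of OneFS.

# Print the number of SyncIQ report files under a reports directory.
# The default path is the one used in this article; pass another to override.
count_siq_reports() {
    dir="${1:-/ifs/.ifsvar/modules/tsm/sched/reports}"
    # tr strips the padding that some wc implementations add.
    find "$dir" -type f -name 'report-*.gc' | wc -l | tr -d ' '
}

# Succeed (exit 0) when the count exceeds the limit (default 2000).
siq_reports_over_limit() {
    dir="${1:-/ifs/.ifsvar/modules/tsm/sched/reports}"
    limit="${2:-2000}"
    [ "$(count_siq_reports "$dir")" -gt "$limit" ]
}

# Example usage on a cluster node:
#   count_siq_reports
#   siq_reports_over_limit && echo "Report count is above the default limit"
```

On a cluster with a very large number of reports, the count itself may take some time to complete.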
Resolution
Remediation Steps:
Cancel any running job and disable the SyncIQ service on the cluster:
- Cancel all running SyncIQ jobs:
isi sync jobs cancel --all
- Verify that the jobs are canceled:
isi sync jobs reports list
If the command is unresponsive, confirm that there are no entries in the following path:
ls -alh /ifs/.ifsvar/modules/tsm/sched/run
- Disable the SyncIQ service:
isi services -a isi_migrate disable
- Verify that all processes have stopped:
isi_for_array -sX ps -auwwx | grep migr | grep -v grep
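The check of the run directory in the steps above can also be scripted. This is a minimal sketch under stated assumptions: the function name is hypothetical, and it assumes that "no entries" means the directory contains no files or subdirectories at all.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative and not part of OneFS.

# Exit 0 when the directory exists and is empty,
# exit 1 when it still has entries,
# exit 2 when the path is not a directory.
siq_run_dir_empty() {
    dir="${1:-/ifs/.ifsvar/modules/tsm/sched/run}"
    [ -d "$dir" ] || return 2
    # ls -A lists every entry except "." and "..".
    [ -z "$(ls -A "$dir")" ]
}

# Example usage on a cluster node:
#   siq_run_dir_empty && echo "No SyncIQ jobs remain in the run directory"
```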
Run the following procedure to clear reports:
- If a backup is required, back up the reports directory (this may take time):
cp -r /ifs/.ifsvar/modules/tsm/sched/reports /ifs/data/Isilon_Support/synciqreports_backup
- Delete the report files from the reports directory:
find /ifs/.ifsvar/modules/tsm/sched/reports -type f -name "report-*.gc" -exec rm -f {} \;
Alternatively, delete all but the last three days of reports. For example:
find /ifs/.ifsvar/modules/tsm/sched/reports -type f -name "report-*.gc" -Btime +3d -exec rm -f {} \;
- After deletion, rebuild the reporting database:
python -c 'import isi.fs.siq;isi.fs.siq.SyncIQUtils().rebuild_reportdb();'
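Before running either delete command in the procedure above, it can be useful to preview which files an age-based delete would remove and to check that the backup looks complete. The following is a minimal sketch with hypothetical function names. As a deliberate substitution, the preview uses `-mtime` (modification time, in whole days) rather than the `-Btime +3d` shown in this article, so that it also runs where `find` lacks `-Btime`; the two tests can select slightly different files. The backup check compares only file counts, not file contents.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative and not part of OneFS.

# List report files older than N days (default 3) without deleting anything.
# Uses -mtime instead of the article's -Btime; see the note above.
preview_old_reports() {
    dir="$1"
    days="${2:-3}"
    find "$dir" -type f -name 'report-*.gc' -mtime +"$days" -print
}

# Succeed (exit 0) when the source and the backup hold the same
# number of report files. This is a count check only.
backup_count_matches() {
    src_count=$(find "$1" -type f -name 'report-*.gc' | wc -l | tr -d ' ')
    dst_count=$(find "$2" -type f -name 'report-*.gc' | wc -l | tr -d ' ')
    [ "$src_count" = "$dst_count" ]
}

# Example usage on a cluster node:
#   preview_old_reports /ifs/.ifsvar/modules/tsm/sched/reports 3 | wc -l
#   backup_count_matches /ifs/.ifsvar/modules/tsm/sched/reports \
#       /ifs/data/Isilon_Support/synciqreports_backup
```

Run the backup comparison only after the `cp -r` command has finished; otherwise the counts differ simply because the copy is still in progress.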
Restart the SyncIQ service and verify it is running:
- Reenable the sync service:
isi services -a isi_migrate enable
- Verify that the service is running on each node:
isi_for_array -sX ps -auwwx | grep migr | grep -v grep
- Retry WebUI and CLI commands to confirm that they are functional again.
For policies configured with Skip When Source Unmodified: Yes, using this procedure can cause improper reporting. The job_id for a skipped job does not increment, which causes reports to build up again. The workaround is to manually run a successful job.