Streaming Data Platform: How to resume PSearch Indexing when PSearch is unable to read events
Summary: How to resume PSearch Indexing when PSearch is unable to read events
Symptoms
Identify the issue
If PSearch has stopped reading events, follow these steps to identify the issue:
- Check logs of psearch shardworkers:
kubectl logs psearch-shardworker-0 -n $project
kubectl logs psearch-shardworker-1 -n $project
kubectl logs psearch-shardworker-2 -n $project
- Look for messages like the one below; they indicate that the psearch shardworker's cache is full (a log-filtering sketch follows the example message).
2023-07-13 04:34:46,672 4939478 [6c9ea23c-b72d-47d9-84b7-a8f1590a22d4-Lucene Merge Thread #897] INFO com.dell.pravegasearch.shardworker.engine.merge.PSearchMergeScheduler: [] - cannot acquire enough space, estimated 212524805 available 40222923 required 172301882 VS expunged 0, global expunged 0
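To check whether a particular shardworker has hit this condition without scrolling through the full log, the message can be filtered, for example:
kubectl logs psearch-shardworker-0 -n $project | grep "cannot acquire enough space"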
Cause
The psearch shardworker's cache has reached capacity (the merge scheduler cannot acquire enough space, as shown in the log message above), so the shardworker stops reading new events.
Resolution
Work around the issue
1. Restart the shardworkers that have reached their cache capacity (a quick way to verify the restart is sketched after the command).
kubectl delete po psearch-shardworker-x -n $project
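To confirm that the deleted pod has been recreated and is ready before continuing, a quick check (pod name as in the command above, -x is the shard number):
kubectl get po -n $project | grep psearch-shardworker
kubectl wait --for=condition=Ready pod/psearch-shardworker-x -n $project --timeout=300s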
2. Open the PSearch UI (credentials: admin/admin) -> Dev Tools and run the following query to get the latest indexed event (a command-line alternative is sketched after the query):
GET solong3/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "event_time": {
        "order": "desc"
      }
    }
  ],
  "size": 1
}
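If you prefer the command line, and assuming PSearch exposes an Elasticsearch-compatible REST endpoint (the URL below is a placeholder, and the -k/-u options are assumptions based on the UI credentials), the same query could be sent with curl:
curl -sk -u admin:admin -H 'Content-Type: application/json' \
  -X GET "https://<psearch-api-endpoint>/solong3/_search" \
  -d '{"query":{"match_all":{}},"sort":[{"event_time":{"order":"desc"}}],"size":1}'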
3. In the VM terminal, run date -d on the returned timestamp, prefixing the epoch value with @ as shown below. This is the timestamp of the most recent event indexed by psearch; compare it with the current system time to see how far indexing is lagging behind (a one-line lag calculation is sketched after the example).
date -d @1689147333.022
Wed 12 Jul 2023 07:35:33 AM UTC
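To express the lag as a number of seconds, subtract the returned epoch value from the current time; a minimal sketch using the example timestamp above:
# seconds between now and the last indexed event (epoch value from step 2)
echo $(( $(date +%s) - 1689147333 ))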
4. In Dev Tools, run the following command to set the index retention to 10 seconds (a verification query is sketched after the command):
PUT solong3/_settings
{
  "index.retention_time": "10"
}
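To confirm the new retention value was applied, the index settings can be read back in Dev Tools; this assumes PSearch supports the standard settings read request, which is not shown in the original steps:
GET solong3/_settings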
5. Indexing should now proceed faster than before; if it does not, restart the shardworker again. Wait a while and repeat steps 2 and 3 to check how the backlog is draining (a polling sketch follows this step). Once the backlog has been fully consumed and reads keep pace with writes, reset the index retention to its original value (for example, 3600 s):
PUT solong3/_settings
{ "index.retention_time": "3600" }
The effect of the workaround
Data indexed before step 5 completes can no longer be searched, because the reduced retention time expires it from the search index. The workaround affects only the searchability of data; it does not delete any user data.