ECS: Dial Home - DT Memory/Disk Cache Size Alerts
Summary: This dial-home alert was introduced in ECS 3.4 to report when the Directory Table (DT) memory or disk cache size exceeds its threshold.
Symptoms
Symptom:
Alerts for Blobsvc and SR in the ECS UI:
Blobsvc:
Page Table disk cache size for blob process is <a value larger than 10240> MB greater than the specified threshold of 10240 MB on node <node IP>.
Page Table memory cache size for blob process is <a value larger than 1024> MB greater than the specified threshold of 1024 MB on node <node IP>.
SR:
Page Table disk cache size for sr process is <a value larger than 10240> MB greater than the specified threshold of 10240 MB on node <node IP>.
Page Table memory cache size for sr process is <a value larger than 1024> MB greater than the specified threshold of 1024 MB on node <node IP>.
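When triaging many of these alerts, it can help to extract the process, cache type, reported size, threshold, and node from the alert text. The following is a minimal sketch that assumes the alert text matches the four message templates above exactly. The names ALERT_RE, CacheAlert, and parse_alert are hypothetical helpers for illustration and are not part of ECS.

```python
import re
from typing import NamedTuple, Optional

# Pattern built from the alert templates shown above (assumed to match exactly).
ALERT_RE = re.compile(
    r"Page Table (?P<cache>disk|memory) cache size for (?P<process>\w+) process is "
    r"(?P<value>\d+) MB greater than the specified threshold of "
    r"(?P<threshold>\d+) MB on node (?P<node>\S+?)\.?\s*$"
)


class CacheAlert(NamedTuple):
    process: str       # "blob" or "sr"
    cache: str         # "disk" or "memory"
    value_mb: int      # reported cache size in MB
    threshold_mb: int  # configured threshold in MB
    node: str          # node IP as it appears in the alert


def parse_alert(line: str) -> Optional[CacheAlert]:
    """Return the parsed alert, or None if the line is not a Page Table cache alert."""
    match = ALERT_RE.search(line.strip())
    if match is None:
        return None
    return CacheAlert(
        process=match["process"],
        cache=match["cache"],
        value_mb=int(match["value"]),
        threshold_mb=int(match["threshold"]),
        node=match["node"],
    )
```

Lines that do not match the templates return None, so the helper can be run over a mixed alert export without pre-filtering.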
Cause
Four DT-related sensors are added in ECS 3.4:
- Blobsvc Memory Cache Size
- Blobsvc Disk Cache Size
- SR Service Memory Cache Size
- SR Service Disk Cache Size
The new DT engine uses SSD as a memory extension. Blobsvc and the SR service are the main services that use this SSD swap-in and swap-out strategy.
Resolution
A workaround is available that raises these thresholds to a higher value.
This is a false alert and does not impact ECS operation.
Open a Service Request, reference this KB, and request assistance in applying the workaround.
Background:
Page Table has a swap-in and swap-out mechanism. If swap-out to disk cannot keep up with swap-in, the memory cache size exceeds the memory threshold (1 GB reserved, that is, 1024 MB). The 10 GB (10240 MB) disk threshold is too low for many use cases. The thresholds must be tuned to avoid frequent alerts in the UI and frequent connect home alerts.
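The effect of raising a threshold can be illustrated with a short sketch. It assumes the sensor fires only when the observed size is strictly greater than the threshold, as the alert wording suggests. The function name cache_alerts, the sample sizes, and the raised value of 20480 MB are hypothetical and are not the values applied by the workaround.

```python
# Default thresholds in MB, taken from the alert messages in this KB.
DEFAULT_THRESHOLDS_MB = {"memory": 1024, "disk": 10240}


def cache_alerts(observed_mb, thresholds_mb=DEFAULT_THRESHOLDS_MB):
    """Return the cache types whose observed size is greater than the threshold."""
    return sorted(
        kind for kind, size in observed_mb.items() if size > thresholds_mb[kind]
    )


# Hypothetical sizes: memory is within its limit, disk is above the 10240 MB default.
observed = {"memory": 900, "disk": 15360}

print(cache_alerts(observed))                                    # default thresholds
print(cache_alerts(observed, {"memory": 1024, "disk": 20480}))   # hypothetical raised disk threshold
```

With the default thresholds, only the disk cache is reported. With the hypothetical higher disk threshold, nothing is reported even though the cache size itself is unchanged.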