Coalescer CFIFO Thread Locking
Summary: Cluster deadlocking from cfifo thread contention
Symptoms
isi commands may become unresponsive, performance may degrade, and client connections may appear to hang because of lock contention.
Cause
In rare cases, the cluster may deadlock due to thread contention.
This can occur when multiple threads are performing a coalescer insert while another thread is performing a coalescer flush.
Resolution
This issue is resolved in versions 9.4.0.17 and later, the 9.5.0.3 hotfix, and 9.5.0.7.
To confirm that the issue is occurring on a live cluster, check for 'cfifo' thread locking with the following command:
# isi_for_array 'sysctl kern.proc.all_stacks | grep cfifo'
If a thread is shown waiting on cfifo for more than 100,000 ticks, the node must be panicked to release the lock.
Example:
Waiting on 0xfffff8142b2dd580 with msg "cfifo" for 32619857 ticks   <------ cfifo waiting for over 100k ticks
Stack: --------------------------------------------------
kernel:sched_switch+0xbcc
kernel:mi_switch+0x128
kernel:sleepq_wait+0x2b
kernel:_sleep+0x264
kernel:write_sleep+0x4e
kernel:coalescer_insert+0x1e26
kernel:coalescer_write+0x2bfe
kernel:bam_coal_write+0x64
kernel:_ifs_write_mbuf+0x6b
kernel:ifs_vnop_wrapunlocked_write_mbuf+0xdc
kernel:VOP_UNLOCKED_WRITE_MBUF_APV+0x93
isi_lwext.ko:lwextsvc_write+0x4ff
kernel:amd64_syscall+0x380
--------------------------------------------------
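On a cluster with many nodes the command output can be long, so it may help to filter it against the threshold automatically. The following is a minimal sketch, not a supported tool: the function name find_long_cfifo_waits is hypothetical, and the pattern assumes the wait line looks exactly like the example above, which may differ between releases.

```python
import re

# Pattern taken from the example line in this article (assumption: the
# wording of the wait line is the same on the release being checked).
WAIT_RE = re.compile(r'with msg "cfifo" for (\d+) ticks')

# The "100k ticks" threshold stated in this article.
TICK_THRESHOLD = 100_000


def find_long_cfifo_waits(text, threshold=TICK_THRESHOLD):
    """Return the tick counts of all cfifo waits longer than the threshold."""
    waits = []
    for line in text.splitlines():
        match = WAIT_RE.search(line)
        if match is None:
            continue  # not a cfifo wait line
        ticks = int(match.group(1))
        if ticks > threshold:
            waits.append(ticks)
    return waits
```

To use it, save the output of the command above to a file and pass the file contents to the function. A non-empty result means at least one thread has been waiting on cfifo longer than the threshold.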
If you suspect the cluster is experiencing this issue, open a case with support and provide a full log gather for review.