1 Rookie

106 Posts

November 3rd, 2014 09:00

The network interface firmware registers the link status and reports it to OneFS (standard behavior for all machines). The individual node should register the change immediately; you should be able to see it if you run 'ifconfig' on that node right after the link goes down.

The status change should also be logged in /var/log/messages.

The event is then handed off to CELOG for alert notification. The alert is created and then follows the configured rules for event notification. The 30-second figure makes me think of a timeout somewhere, so there may be some delay in looking up DNS records in the CELOG configuration before alerts can be sent out.
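One way to check the DNS suspicion is to time a name lookup directly. The following is a minimal sketch, assuming Python is available on a host that uses the same DNS servers as the cluster; the function name time_lookup and the default hostname are placeholders, and you would substitute whatever host your notification rules actually point at.

```python
import socket
import sys
import time


def time_lookup(hostname, resolver=socket.getaddrinfo, port=25):
    """Time a single name lookup.

    Returns (elapsed_seconds, error), where error is None on success.
    The resolver argument exists only so the function can be exercised
    without a network; by default it uses the system resolver.
    """
    start = time.monotonic()
    error = None
    try:
        resolver(hostname, port)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        error = exc
    return time.monotonic() - start, error


if __name__ == "__main__":
    # Placeholder default; pass the real mail or notification host as argv[1].
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    elapsed, error = time_lookup(host)
    status = "ok" if error is None else "failed: %s" % error
    print("%s: %.3f s (%s)" % (host, elapsed, status))
```

A lookup that takes many seconds or fails only after a long wait would support the timeout theory; a lookup that returns in milliseconds suggests the delay is elsewhere.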

If you are looking for faster turnaround on event notifications, I would run a test to establish the order of operations and pinpoint where the delay occurs and how long it is:

1. Tail the /var/log/messages file.

2. Unplug an Ethernet cable (preferably where there is no critical business impact, as this will cause IP movement and may interrupt connected clients!).

3. Run ifconfig on that node to verify that the link status change is registered immediately.

4. Note the exact moment of disconnection and verify that it appears in the tailed messages log.

5. Watch the isi status output for the alert notification, noting the time delay.
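Once you have noted the times for each step, the per-stage delay is simple arithmetic. Here is a small sketch for tabulating it; the labels and timestamps below are made-up placeholders to show the calculation, not measurements from a real cluster, and stage_delays is just an illustrative helper name.

```python
from datetime import datetime

# Timestamp format used for the hand-noted observations (an assumption;
# adjust it to however you record the times).
FMT = "%Y-%m-%d %H:%M:%S"


def stage_delays(observations):
    """Compute delays between ordered observations.

    observations: list of (label, timestamp_string) in the order observed.
    Returns a list of (label, seconds_since_first, seconds_since_previous).
    """
    parsed = [(label, datetime.strptime(ts, FMT)) for label, ts in observations]
    first = parsed[0][1]
    previous = first
    rows = []
    for label, when in parsed:
        rows.append((
            label,
            (when - first).total_seconds(),
            (when - previous).total_seconds(),
        ))
        previous = when
    return rows


if __name__ == "__main__":
    # Placeholder values only; replace with the times you actually noted.
    noted = [
        ("cable unplugged", "2014-11-03 10:00:00"),
        ("ifconfig shows link down", "2014-11-03 10:00:01"),
        ("entry in /var/log/messages", "2014-11-03 10:00:02"),
        ("alert visible in isi status", "2014-11-03 10:00:31"),
    ]
    for label, total, step in stage_delays(noted):
        print("%-30s +%5.1f s total, +%5.1f s since previous" % (label, total, step))
```

The "since previous" column is the one to look at: the stage with the largest value is where the delay is being introduced.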

4 Operator

2.8K Posts

November 7th, 2014 02:00

Hi Aya,

Three major processes (isi_celog_monitor, isi_celog_coalescer and isi_celog_notification) manage cluster-wide event logging. The isi_celog_monitor process on each node is responsible for querying and receiving events from isi_stats_d and the kernel, and then passes this information to the event coalescer.

If a link goes down on any node, isi_celog_monitor picks up the event first and then sends it to isi_celog_coalescer. The isi_celog_coalescer process then sends the event to the CELOG master. This procedure takes some time, but overall it is very fast. Only a few customers have complained about Isilon event notification. For more information about cluster events, see the internal document Isilon-Troubleshooting-Guide:-Administration---Cluster-Event-Log-(CELOG)

[Attached diagram: ce.jpg]

4 Operator

1.2K Posts

November 9th, 2014 18:00

Hi Jeffey

Great diagram! Can you reveal one piece of information from the internal doc, namely whether the coalescer has a time window within which events get merged? Such a window would add some delay to any single event, because the event is held until it is clear that no further events are arriving that might be coalesced with it.
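To make the concern concrete, here is a minimal simulation of a generic "quiet window" coalescer. This is purely an illustration of the idea in my question, not a description of how isi_celog_coalescer works; whether it has such a window at all is exactly what is unknown, and the 30-second value is a hypothetical parameter.

```python
def coalesce(event_times, window):
    """Group event timestamps using a quiet-window rule.

    A group is closed once `window` seconds pass without a new event,
    so each group is emitted at (time of its last event + window).
    Returns a list of (events_in_group, emit_time).
    """
    groups = []
    current = []
    for t in sorted(event_times):
        # A gap longer than the window closes the current group.
        if current and t - current[-1] > window:
            groups.append((current, current[-1] + window))
            current = []
        current.append(t)
    if current:
        groups.append((current, current[-1] + window))
    return groups


if __name__ == "__main__":
    WINDOW = 30.0  # hypothetical value, chosen only for illustration
    # A single isolated event is still held for the full window.
    print(coalesce([0.0], WINDOW))
    # Two close events are merged; the later, isolated one stands alone.
    print(coalesce([0.0, 10.0, 100.0], WINDOW))
```

Under this rule even a lone event cannot be reported sooner than one full window after it occurs, which is the kind of fixed delay I am asking about.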


Thanks


-- Peter

4 Operator

2.8K Posts

November 10th, 2014 00:00

Hi Peter,

This diagram was posted earlier by a local SME in the Chinese Forum, and he confirmed that it is not for internal use only, so I copied it from there. I suppose events are delayed somewhat, but I did not see a specific value for the delay in the document.

Community Manager

7.4K Posts

November 11th, 2014 19:00

Jeffy,

Thank you so much for the update.

By the way, if a customer wants to check every single event (before it is shown as a coalesced event), should they use isi_celog_monitor?

Is it possible to check every individual event on Isilon?

4 Operator

2.8K Posts

November 13th, 2014 01:00

Hi Aya,

To check all event activity, log in to a node through the CLI and run the isi events list command to view the local event log for that node.
