How to stop "cached" Isilon EMAIL alerts during and after upgrade?
So the past couple of times I have upgraded Isilon clusters, I have shut off the RMR relay (just removed it from the SNMP alerts), which stopped Isilon from flooding our mailboxes (and, more importantly, the mailboxes of our management team). That works great until you add the relay host back in. Even when I kill all the alerts, it appears Isilon backlogs all the emails, so when I add the relay back, BOOM -- a flood of emails.
The emails would be necessary if this were a PRODUCTION issue, but isn't there a maintenance mode or something when upgrading Isilon that says "Oh wait, someone is upgrading my software -- I shouldn't create any alerts!"?
And before you ask, yes, I checked the community and couldn't find anything in writing that says how we do that. I have another upgrade this Saturday and next Saturday and I really want to know how I can kill these alerts and NOT have any emails generated during and AFTER the upgrades.
Cheers!
Brian_Coulombe_
107 Posts
0
October 21st, 2015 07:00
We did that; however, when we turned it back on, their mailboxes got flooded with Isilon alerts from the upgrade. Do we need to cancel the alerts before we turn snmpd back on?
I'm not sure if it's the native alerts, but in this case both snmpd and the RMR relay were removed during the upgrade. I canceled all the alerts and connected back to the RMR relay and BLAM! A ton of alerts flooding mailboxes.
Gotta get this fixed before the next upgrade.
Peter_Sero
1.2K Posts
0
October 21st, 2015 07:00
isi services snmpd disable/enable
Or are OneFS "native" alerts (non SNMP) also an issue?
-- Peter
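A minimal sketch of wrapping an upgrade window with the snmpd disable/enable commands quoted above. The `run` helper and the `DRY_RUN` variable are hypothetical names added for illustration; only the two `isi services snmpd` commands come from this thread. By default nothing is executed, the commands are only printed. Note that later replies report a flood of alerts after re-enabling, so this alone may not be enough.

```shell
#!/bin/bash
# Sketch only: DRY_RUN and run() are illustrative helpers, not part of OneFS.
# With DRY_RUN=1 (the default) the commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# The two commands suggested in the post above.
snmpd_off() { run isi services snmpd disable; }
snmpd_on()  { run isi services snmpd enable; }

snmpd_off
# ... perform the upgrade here and deal with the pending events
#     before turning snmpd back on ...
snmpd_on
```

Setting `DRY_RUN=0` on a cluster node would execute the commands for real.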
Peter_Sero
1.2K Posts
0
October 21st, 2015 09:00
"Native" means notifications directly sent by OneFS via SMTP (Simple Mail Transfer Protocol).
One can mess around with these, too...
Can you confirm that you are using SNMP alerts aka "traps" (SNMP = Simple Network Management Protocol)?
Brian_Coulombe_
107 Posts
0
October 21st, 2015 09:00
Yes, SNMP traps. That is why I mentioned the RMR relay address. When I remove the RMR address, no alerts. When I add the RMR address back, even though I have suppressed all the alerts, they just seem to fire away from cache.
What we need -- and maybe this is a Service Enhancement request -- is to be able to put Isilon in "maintenance mode" so no alerts are generated during an upgrade. Even the GUI shows a user that the cluster is being upgraded, so why on earth would it send alerts related to reboots? SILLY!
Peter_Sero
1.2K Posts
0
October 21st, 2015 10:00
OK, try it out with virtual nodes: quiet & cancel all events, then re-enable snmpd, then re-add the RMR.
BTW, any chance that your RMR tool can be told to suppress traps from one device for a while...?
Finally, yes, talk to your account team about an enhancement request... you might find out something interesting.
-- Peter
Brian_Coulombe_
107 Posts
0
October 21st, 2015 11:00
I don't have a virtual infrastructure available to test on just yet. I tried removing the RMR relay address which worked. Then I cancelled all alerts, then added the RMR relay address back in and still got flooded with alerts. I can try to disable snmpd this time as well (I don't recall doing so last time). Definitely an EMC support thing and enhancement request!
Peter_Sero
1.2K Posts
0
October 22nd, 2015 01:00
The free VMware Player will do it, even on a laptop.
And keep in mind, cancelling and quieting OneFS events are two distinct actions, to be taken in that order.
-- Peter
Brian_Coulombe_
107 Posts
0
October 22nd, 2015 04:00
I'm fairly certain I hit "cancel all events" but there's always a chance I just quieted them. I'll double check that this weekend since I have ANOTHER upgrade......(and another one the week after).
carlilek
205 Posts
0
October 22nd, 2015 04:00
Here's a script to clear the celog and stop notifications. It'll reaaaaaally clear the celog, though!
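#!/bin/bash
# WARNING: this deletes the celog event databases on every node.
# Disable the celog services.
isi services -a celog_coalescer disable
isi services -a celog_monitor disable
isi services -a celog_notification disable
sleep 120
# Kill isi_mcp and any remaining celog processes on every node.
isi_for_array killall isi_mcp
isi_for_array pkill isi_celog_
sleep 60
# Remove the celog databases (node-local and under /ifs).
isi_for_array rm -rf /var/db/celog/*
isi_for_array rm -rf /var/db/celog_master/*
rm -rf /ifs/.ifsvar/db/celog/*
# Start isi_mcp again on every node.
isi_for_array isi_mcp
sleep 30
# Re-enable the celog services one at a time.
isi services -a celog_coalescer enable
sleep 30
isi services -a celog_monitor enable
sleep 30
isi services -a celog_notification enable
sleep 30
# Second enable pass, kept as in the original command list.
isi services -a celog_coalescer enable
isi services -a celog_monitor enable
isi services -a celog_notification enable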
This was compiled from a list of commands given to me by support when I was complaining about the same thing.
On this last upgrade to 7.2.1, I had to run it about 5 times before the alerts stopped. Yay.
carlilek
205 Posts
0
October 22nd, 2015 08:00
Cancel all events never works for me to stop the torrent of emails, especially after an upgrade.
Brian_Coulombe_
107 Posts
0
October 26th, 2015 05:00
Actually, cancelling all events worked this weekend. Of course, that clears out your local history BUT -- on the plus side, we didn't get bombarded with email alerts.
Anonymous
5 Practitioner
274.2K Posts
1
October 27th, 2015 11:00
OneFS doesn't have a maintenance mode today, but a future OneFS release will, for exactly this purpose. The way it will work is that you will be able to specify a time period for the maintenance mode. During that time OneFS will store all events that occur. After the maintenance mode is over, if there are any unresolved events, it will alert (email) on those unresolved events only. We are doing this so that if, for instance, a disk fails during maintenance mode and isn't fixed by the time the maintenance mode is over, the right alert still goes out.
You can use this maintenance period for upgrades, planned smartfails, hardware moves, etc.; basically, whenever you know the system will be experiencing "issues" and you don't need EMC support to try to help you with them.
Stay tuned!
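To make the described behavior concrete, here is a toy model in shell. It is not OneFS code: the event names, the two-column "id state" format, and the function name `unresolved_after_window` are all invented for illustration. It only shows the idea that events are stored during the window and only the ones still unresolved at the end produce an alert.

```shell
#!/bin/bash
# Toy model only; the event list and its format are made up.
# Each line is "<event-id> <state>", state being "resolved" or "unresolved"
# at the moment the maintenance window ends.
events='disk-fail-1 unresolved
node-reboot-2 resolved
node-reboot-3 resolved'

# Print the ids of the events that would still trigger an alert.
unresolved_after_window() {
    awk '$2 == "unresolved" { print $1 }'
}

printf '%s\n' "$events" | unresolved_after_window
```

In this made-up example the two reboot events were resolved during the window, so only the disk failure would be alerted on.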
Rothweiler
1 Message
0
April 13th, 2021 11:00
It looks like this arrived with OneFS 8
https://www.dell.com/support/kbdoc/en-us/000022784/onefs-8-0-how-to-place-the-isilon-cluster-into-maintenance-mode?lang=en
Example -
isi event settings modify --maintenance-start 2017-02-23T22:00:00 --maintenance-duration 2H
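A small sketch for building that command with your own start time and duration. The function name `maintenance_cmd` and the timestamp check are assumptions added here; the only thing taken from the example above is the command and its two options. The sketch prints the command rather than running it, so it can be reviewed first, and it does not validate the duration value since only the `2H` form appears in this thread.

```shell
#!/bin/bash
# Sketch only: maintenance_cmd is a hypothetical helper, not part of OneFS.
# It prints the command from the example above for a given start and duration.
maintenance_cmd() {
    start="$1"
    duration="$2"
    # Accept only timestamps shaped like 2017-02-23T22:00:00.
    case "$start" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]) ;;
        *) echo "bad start time: $start" >&2; return 1 ;;
    esac
    printf 'isi event settings modify --maintenance-start %s --maintenance-duration %s\n' \
        "$start" "$duration"
}

maintenance_cmd 2017-02-23T22:00:00 2H
```

Once the printed line looks right, it can be pasted into a session on the cluster.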