21 Posts
0
July 20th, 2012 18:00
vCenter events based alarm/rules
Hi,
I want to set up a rule/alarm to contact our team when one of our ESX hosts loses a network path or loses connectivity to a datastore.
We already see those events in vCenter, but we want to leverage vFoglight instead.
So how can I create a rule that grabs those specific events and generates an alarm?
Event examples from vCenter:
- Lost path redundancy to storage device naa.600508b1001c1b0c0f55182c01d5db00. Path vmhba0:C0:T0:L0 is down. Affected datastores: "cos*****". warning 2012/07/10 22:01:08
- Alarm 'Cannot connect to storage' on hw***** changed from Gray to Gray info 2012/07/16 10:54:32
- Path redundancy to storage device naa.6006016019312d009c8478678a04e111 degraded. Path vmhba1:C0:T0:L9 is down. Affected datastores:"H******".
- Alarm 'Network uplink redundancy lost' on ed**** changed from Green to Red info 2012/07/20 10:14:58
- Alarm 'Host hardware power status' on ph**** changed from Gray to Red info 2012/07/18 05:41:40
Let me know if you need more details,
Thank you
Alex
DELL-Thomas B
171 Posts
0
July 21st, 2012 22:00
I've been looking at something like this as an addition to my community cartridge for datastores. The basic way to do this is to roll through the events, see which ones are related to a datastore, and fire when the text and the rule scope match. I've not added the rule text yet, but the basic rule script looks like this, where the scope would be VMWDatastore:
ds = server.get("DataService") // Data Service
vCenters = #!VMWVirtualCenter#.getTopologyObjects() // Get all vCenters
for (vCenter in vCenters) // Loop through each vCenter's current events
{
    events = ds.retrieveLatestValue(vCenter, "events").value
    for (event in events) // Loop through each event
    {
        if (event.message.contains("Lost path redundancy to storage device") && (event.eventType == 'DatastoreEvent') && (event.datastore.name == scope.name))
        {
            return true // If text matches and it is a datastore event and the datastore matches the scope, return true
        }
    }
}
return false
sbc350
21 Posts
0
August 7th, 2012 19:00
Hi Thomas,
I created a Simple Rule with the condition you provided. But when we disable an HBA on our ESX host to simulate a lost path, we receive the "Lost path" events in vCenter and the vFoglight rule doesn't pick them up.
Rule details:
-Simple Rule
-Data Driven
-Rule Scope: VMWDatastore
-Under "Fire" , pasted your condition under "Condition"
-Severity Level and action are empty
-Everything else is default.
-Rule is enabled
I see the events in vCenter under Events and under each ESX host and datastore, but the rule doesn't seem to pick them up.
Thank you
Alex
DELL-Thomas B
171 Posts
0
August 8th, 2012 02:00
I don't have any HBAs to test with, but it might not be a Datastore type event. You could remove the event type part of the condition and see if the rule fires then.
sbc350
21 Posts
0
August 9th, 2012 16:00
Thank you for your quick reply.
I see the events, but they are coming from the main vCenter Events tab. Can we pull the events from each datastore's Events tab instead?
I think our issue is that, since our infrastructure is quite big, the main vCenter events get cleared fairly quickly, as tons of events happen every second/minute.
DELL-Thomas B
171 Posts
0
August 9th, 2012 16:00
We pull the events information from the API for the recent history, so it should have all of them in there even if they clear from the vCenter UI. We don't pull historical events; that can be a bit nasty of an API pull from VMware's perspective. If you can get the HBA to turn off again and run this again, it should give us the text we need to get this event to fire properly.
DELL-Thomas B
171 Posts
0
August 9th, 2012 16:00
That would be helpful. This script returns a list of all of the current events and the properties of each. From that it should be possible to figure out what is going on.
ds = server.get("DataService") // Data Service
def result = [] as List
vCenters = #!VMWVirtualCenter#.getTopologyObjects() // Get all vCenters
for (vCenter in vCenters) // Loop through each vCenter's current events
{
    events = ds.retrieveLatestValue(vCenter, "events").value
    for (event in events) // Loop through each event
    {
        result.add( [ "event":(event.properties) ] )
    }
}
return result
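If the full property dump is too noisy, a trimmed summary may be easier to scan. The sketch below is plain Groovy with maps standing in for the events, so it runs outside Foglight. `summarizeEvents` is a hypothetical helper name, and the sample datastore name and the second event's type are assumptions for illustration only. Inside a rule or the script console, the `events` list would come from `ds.retrieveLatestValue(vCenter, "events").value` as in the script above.

```groovy
// Hypothetical helper: reduce each event to the fields the rule condition tests.
// `events` is any list of objects exposing message, eventType and datastore;
// plain maps are used here as stand-ins for the Foglight event objects.
def summarizeEvents(List events) {
    events.collect { event ->
        [
            message: event.message,
            eventType: event.eventType,
            datastore: event.datastore?.name  // null when the event has no datastore
        ]
    }
}

// Stand-in data modeled on the events quoted in this thread.
def sampleEvents = [
    [message: 'Lost path redundancy to storage device naa.60060e8005bdf7000000bdf7000018bd. Path vmhba1:C0:T0:L111 is down. Affected datastores: PPD-XX-XXXX5-14.',
     eventType: 'EventEx',
     datastore: [name: 'PPD-XX-XXXX5-14']],  // name assumed from the message text
    [message: 'Lost uplink redundancy on virtual switch "vSwitch0". Physical NIC vmnic0 is down.',
     eventType: 'SomeOtherEvent',            // placeholder: the real type is not shown in the thread
     datastore: null]
]

def summary = summarizeEvents(sampleEvents)
summary.each { println it }
```

The point of the summary is to make the `eventType` value and the datastore name visible side by side, since those are the two fields the condition compares against.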
sbc350
21 Posts
0
August 9th, 2012 16:00
If I remove:
&& (event.eventType == 'DatastoreEvent') && (event.datastore.name == scope.name))
from the condition, it still doesn't catch the event, even though I see tons of events in vCenter under Events, or under Events for specific datastores.
Is there a way to insert some kind of debug output within the condition to see the event messages?
Thank you
Alex
sbc350
21 Posts
0
August 9th, 2012 18:00
I went with the simplest possible condition and it still doesn't fire.
Using your debug condition I was able to monitor the events. Here are 2 examples that the debug condition caught:
{event={namespace=foglight-5, handler=null, uniqueId=/anonymous/bdcdcfdc-9d17-4e0d, earliestDate=null, propertyValues={message=Lost uplink redundancy on virtual switch "vSwitch0". Physical NIC vmnic0 is down. Affected portgroups:"vmotion Network", "Management Network".
{event={namespace=foglight-5, handler=null, uniqueId=/anonymous/aeabed73-e81a-4413-8252-8c9dc179ecb5, earliestDate=null, propertyValues={message=Lost path redundancy to storage device naa.60060e8005bdf7000000bdf7000018bd. Path vmhba1:C0:T0:L111 is down. Affected datastores: PPD-XX-XXXX5-14., esxServer=foglight-5:VMWESXServer:2af748ef-03c6-4aa7-89a2-10f98f3624b9:1320 datasource=foglight-5:foglight-5, dvs=foglight-5:VMWDVS:ce71e4ef-8b2a-4467-b3f0-9140f9b17647:1 datasource=foglight-5:foglight-5, userName=, datacenter=foglight-5:VMWDatacenter:b7b7eb9a-bf08-490d-a7b0-111fb76e50b4:86 datasource=foglight-5:foglight-5, cluster=foglight-5:VMWCluster:ab9e5c1b-54d2-485a-92fd-fdbada3c5e49:7 datasource=foglight-5:foglight-5, eventTime=2012-08-09 15:25:46.442, eventType=EventEx, datastore=foglight-5:VMWDatastore:e3fb1efb-f68d-4b50-b0b5-50a754570afa:373 datasource=foglight-5:foglight-5, uniqueId=/anonymous/aeabed73-e81a-4413-8252-8c9dc179ecb5}, type=foglight-5:VMWEvent:3, class="class" com.quest.nitro.service.topology.provider.wcfsdo.DataObjectImpl, dataSourceType=foglight-5, container=null, dataSourceId=foglight-5, containmentProperty=null}}
Here is the condition using only one "if" clause instead of 3.
ds = server.get("DataService") // Data Service
vCenters = #!VMWVirtualCenter#.getTopologyObjects() // Get all vCenters
for (vCenter in vCenters) // Loop through each vCenter's current events
{
    events = ds.retrieveLatestValue(vCenter, "events").value
    for (event in events) // Loop through each event
    {
        if (event.message.contains("down"))
        {
            return true // If the message contains "down", return true
        }
    }
}
return false
DELL-Thomas B
171 Posts
0
August 10th, 2012 01:00
Are you on vFog 6.6.1 or prior? There is a bug in the alarm service that could be causing this to not fire. Try running this groovy script from the script console and see if it returns true. Also we will want to make sure the event condition still has the datastore name equals part, otherwise this will return true for every datastore!
DELL-Thomas B
171 Posts
0
August 10th, 2012 13:00
The only thing to still be careful about is that this rule will fire on every datastore, because the condition is true across the board. You need to add a filter so the event's datastore matches the name of the scope; otherwise the alarm will show up for every one.
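To illustrate the scoping problem, here is a minimal sketch in plain Groovy, using maps as stand-ins for the events so it runs outside Foglight. `conditionFires` is a hypothetical helper, and the second datastore name is made up. It evaluates the condition once per datastore, the way a rule scoped to VMWDatastore would, with and without the name filter.

```groovy
// Hypothetical stand-in for the rule condition, evaluated once per datastore.
// With scoped == false the datastore name is ignored, as in the simplified rule.
def conditionFires(List events, String scopeName, boolean scoped) {
    events.any { event ->
        def textMatches = event.message?.contains('Lost path redundancy to storage device')
        def scopeMatches = !scoped || event.datastore?.name == scopeName
        textMatches && scopeMatches
    }
}

// One event affecting a single datastore (name assumed from the message text).
def events = [
    [message: 'Lost path redundancy to storage device naa.60060e8005bdf7000000bdf7000018bd. Path vmhba1:C0:T0:L111 is down. Affected datastores: PPD-XX-XXXX5-14.',
     datastore: [name: 'PPD-XX-XXXX5-14']]
]
def datastores = ['PPD-XX-XXXX5-14', 'other-datastore']  // second name is made up

def unscoped = datastores.findAll { conditionFires(events, it, false) }
def scopedOnly = datastores.findAll { conditionFires(events, it, true) }
println "without scope filter: ${unscoped}"
println "with scope filter:    ${scopedOnly}"
```

Without the filter both datastores match, even though only one is affected; with it, only the affected datastore does.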
sbc350
21 Posts
0
August 10th, 2012 13:00
We are using 6.6.2.
I ran the script in the script console, but it returns false.
As I wrote earlier, I think our events get rolled up too fast for vFoglight to pick up the datastore events.
But we get a lot of DVS link-up/link-down events, so I modified your script to fire on DVS link-down. The script is below.
When I run this one manually it returns "True", but it doesn't fire an alarm. The alarm count is still at 0.
ds = server.get("DataService") // Data Service
vCenters = #!VMWVirtualCenter#.getTopologyObjects() // Get all vCenters
for (vCenter in vCenters) // Loop through each vCenter's current events
{
    events = ds.retrieveLatestValue(vCenter, "events").value
    for (event in events) // Loop through each event
    {
        if (event.message.contains("down") && (event.eventType == 'DvsPortLinkDownEvent'))
        {
            return true // If the text matches and it is a DVS port link-down event, return true
        }
    }
}
return false
sbc350
21 Posts
0
August 10th, 2012 13:00
OK, I feel stupid... but for the benefit of others: for whatever reason the alarm counter beside the rule still shows 0, but if I dig into the alarms dashboard I see the alarm being triggered.
So does anyone know why the alarm counter in rule management still shows zero even though the rule triggered multiple alarms in the alarm dashboard?
Thank you
DELL-Thomas B
171 Posts
0
August 10th, 2012 13:00
Not sure on that one, it should refresh based on the current alarm count for a given rule. Support may have a better answer (could be a known issue...)
sbc350
21 Posts
0
August 10th, 2012 19:00
OK, let me backtrack. The alarm works when I use the version of the script below, which is network related instead of datastore related. But the original script, which should trigger on the datastore event, still doesn't work.
ds = server.get("DataService") // Data Service
vCenters = #!VMWVirtualCenter#.getTopologyObjects() // Get all vCenters
for (vCenter in vCenters) // Loop through each vCenter's current events
{
    events = ds.retrieveLatestValue(vCenter, "events").value
    for (event in events) // Loop through each event
    {
        if (event.message.contains("down") && (event.eventType == 'DvsPortLinkDownEvent'))
        {
            return true // If the text matches and it is a DVS port link-down event, return true
        }
    }
}
return false
DELL-Thomas B
171 Posts
0
August 10th, 2012 20:00
Based on what the event actually contains, it would need a condition like the one below. I assumed the eventType might be a datastore type, but it shows up as EventEx, so that part had to be changed or removed altogether.
if(event.message.contains("Lost path redundancy to storage device") && (event.eventType == 'EventEx') && (event.datastore.name == scope.name))
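For anyone wanting to check the corrected condition away from a live system, the sketch below wraps the same three checks in a plain Groovy function and runs them against map stand-ins for the events. `lostPathFor` is a hypothetical name, the datastore name is assumed from the "Affected datastores" text in the dump, and the link-down message text is made up. The safe-navigation operator (`?.`) is an addition not in the original condition, so that events without a message or datastore are skipped rather than raising an error.

```groovy
// Hypothetical helper mirroring the corrected condition from the post above.
def lostPathFor(List events, String scopeName) {
    events.any { event ->
        event.message?.contains('Lost path redundancy to storage device') &&
            event.eventType == 'EventEx' &&
            event.datastore?.name == scopeName
    }
}

// Stand-in events: one modeled on the dump in this thread, one network event.
def recentEvents = [
    [message: 'Lost path redundancy to storage device naa.60060e8005bdf7000000bdf7000018bd. Path vmhba1:C0:T0:L111 is down. Affected datastores: PPD-XX-XXXX5-14.',
     eventType: 'EventEx',
     datastore: [name: 'PPD-XX-XXXX5-14']],
    [message: 'A port link is down.',  // made-up text for a DVS link-down event
     eventType: 'DvsPortLinkDownEvent',
     datastore: null]
]

def affected = lostPathFor(recentEvents, 'PPD-XX-XXXX5-14')
def unaffected = lostPathFor(recentEvents, 'other-datastore')
println "affected datastore fires:   ${affected}"
println "unaffected datastore fires: ${unaffected}"
```

In the rule itself, `scopeName` corresponds to `scope.name` and the event list to `ds.retrieveLatestValue(vCenter, "events").value`, as in the scripts earlier in the thread.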