Unsolved

April 30th, 2013 09:00

Optional Fix for False System is up/down alerts in OME 1.1 / 1.1.1

OME Team,

We are posting an optional patch today for OME 1.1/1.1.1 to resolve an issue related to false "System is up/down" alerts getting logged. This applies to users of OME 1.1/1.1.1 only.  Please review the readme carefully to understand the issue that has been addressed.

Click here to view the Readme and download the patch.

Thanks,
Raj Shresta

1 Message

July 24th, 2013 06:00

Was this patch bundled with the latest release, 1.2?

I am getting these false alarms quite frequently even on the latest version, which was only released a few weeks ago.

327 Posts

July 24th, 2013 08:00

Yes. The fix was included in OME 1.2, along with a few more bug fixes. For the devices that are generating false alarms, can you try these steps:

1) Check the device for which the alert is logged and get the device name/IP address.

2) From the OME system (using the OS command prompt), run this command for a few minutes:

ping -w 400 -n 1 IPaddress/Hostname -t

3) If any response packet (reply) takes longer than 400 ms (check the value in the "time" field for each packet), increase the ICMP timeout in the Discovery wizard for that range. Set the timeout to the longest time taken by any ping response packet plus 200 ms.

4) If any response timed out ("Request timed out"), increase the ICMP retries in the Discovery wizard for that range. Set the retries to the largest number of consecutive timed-out responses plus 1.
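The arithmetic in steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration, not part of OME: the function name `suggest_icmp_settings`, its parameters, and the sample address 10.0.0.5 are hypothetical, and it assumes the English-language Windows ping output format ("time=23ms", "Request timed out.").

```python
import re

# Matches the round-trip time in a Windows ping reply line,
# e.g. "time=23ms" or "time<1ms" (assumed English output format).
REPLY_TIME = re.compile(r"time[=<](\d+)\s*ms", re.IGNORECASE)


def suggest_icmp_settings(ping_output, threshold_ms=400, margin_ms=200):
    """Hypothetical helper: return (timeout_ms, retries) per steps 3 and 4.

    Each value is None when the corresponding step does not apply,
    i.e. no reply exceeded threshold_ms or no request timed out.
    """
    max_time = 0      # slowest reply seen, in ms
    longest_run = 0   # longest streak of consecutive timeouts
    current_run = 0
    for line in ping_output.splitlines():
        match = REPLY_TIME.search(line)
        if match:
            max_time = max(max_time, int(match.group(1)))
            current_run = 0  # a reply ends the current timeout streak
        elif "request timed out" in line.lower():
            current_run += 1
            longest_run = max(longest_run, current_run)
    # Step 3: slowest reply plus a margin, only if a reply was too slow.
    timeout = max_time + margin_ms if max_time > threshold_ms else None
    # Step 4: consecutive timeouts plus one, only if any request timed out.
    retries = longest_run + 1 if longest_run > 0 else None
    return timeout, retries


# Made-up sample output for illustration only.
sample = """\
Reply from 10.0.0.5: bytes=32 time=120ms TTL=128
Request timed out.
Request timed out.
Reply from 10.0.0.5: bytes=32 time=450ms TTL=128
Reply from 10.0.0.5: bytes=32 time<1ms TTL=128
"""
print(suggest_icmp_settings(sample))  # prints (650, 3)
```

Here the slowest reply is 450 ms, so the suggested timeout is 650 ms, and two consecutive timeouts give 3 retries. The values still have to be entered by hand in the Discovery wizard for the affected range.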

Recommended settings (could still vary based on network bandwidth):

LAN environment: Set the ICMP timeout in the Discovery wizard to 1 second and the retries to 4 for all ranges.

WAN environment: Set the ICMP timeout in the Discovery wizard to 5 seconds and the retries to 6 for all ranges.

NOTE: Increasing the ICMP timeout and retries will slow down discovery. How much depends on:

Whether a discovery range has a lot of devices that are unreachable on the network

Whether the network is slow and takes multiple retries to ping a device

11 Posts

August 8th, 2013 08:00

Hello.

I have updated to OME 1.2 and maxed out the SNMP settings, and I still receive false alerts for items.

Is there another way to stop this?

Thanks,
Sean

 

5 Posts

February 24th, 2014 23:00

Hi,

I am seeing the very same issue: false "System is down" alerts from time to time. A "System is up" event gets logged about 4 seconds later. I am on OME 1.1.1 with the following patch applied.

 

Optional Fix for False System is up/down alerts in OME 1.1/1.1.1

http://en.community.dell.com/techcenter/extras/m/white_papers/20358671.aspx

Is there any way to stop this false alert? I wish to avoid upgrading to 1.2 as this version does not support ESXi5.0U3.

TK 

2.8K Posts

February 25th, 2014 13:00

Seems like the hotfix addressed this for most folks.

I wonder, what is your preference setting for the status poll?  Did you change it from the default of 60 min?

And are your devices all LAN or do you have any on WAN?

Yeah I don't see U3 in the 1.2 support matrix.  But do you see it in the 1.1 matrix?  Have you tested it in a quick VM install to sniff it out with 1.2?

Thx

Rob

5 Posts

March 4th, 2014 17:00

Rob,

The preference setting for the status poll was never touched, so it should be 60 min.

All servers are on the same network segment on the LAN.

I didn't realize U3 support was never there with 1.1 anyway.

I hear 1.3 is being released this week, so I am wondering whether this false "System is down" alert is addressed in that version.

TK

2.8K Posts

March 4th, 2014 18:00

Yeah, so I'm not sure there is a single cause for the false up/down.  But I know that some of these causes have been addressed in OME 1.3.  So good to give it a try.

Regards,

Rob
