I am running Windows Server 2012 R2 with Dell OME 2.1 installed. I have noticed that from time to time the application stops receiving traps. The application stays operational (I am able to log in), but I do not see new alerts on the Dashboard.
I have a big setup, around 500 discovered devices, so we cannot afford to miss any critical trap.
When the problem appears, all services are running, including the SNMP Trap service. The database (MS SQL) runs on a separate Windows 2012 R2 server. To resolve the problem I reboot the server. Previously that helped, but now it does not always work; sometimes I need to reboot 2-3 times.
Is there any dependency between the services? Maybe they are starting in the wrong order during boot and that is causing the problem?
DSM Essentials DA Service
DSM Essentials Host Service
DSM Essentials Network Monitor
DSM Essentials Task Manager
DSM SA Connection Service
I am not sure where to start troubleshooting... I would really appreciate some help
Hi and thanks for the post.
It would be great if you could open a ticket so we can get more info on this to see what's up (800-945-3355).
An alternative to rebooting might be to click the restart services button in the OME settings screen. But you should not even have to do that. Do you have any other software running on the OME server? It would also be good to know how many cores and how much RAM the OME box has.
Anything unusual like a high number of alerts per second for your environment?
That was quick!
The server is dedicated to Dell OME: 4 vCPU, 8 GB of memory. I am not experiencing any capacity problems on the server.
Yes, it is possible that the application stops responding to alerts because of a high number of traps... but I am not sure that is the real reason.
I also noticed that it sometimes happens right after a restart, which is why I updated my previous post with a question: is there a recommended order in which the services should be started?
Your cores and RAM are certainly sufficient.
I think OME should be able to handle 30-50 traps per second without trouble. But I doubt you are getting anything like that over a sustained period, as you would be a pretty stressed-out administrator.
The restart button in the settings page should restart the services in the required order. Even if you restart them in services.msc, they might have dependencies defined. I don't think it matters though.
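If you want to sanity-check the rate rather than guess, here is a rough Python sketch. It assumes you can get the arrival times of the alerts as epoch seconds from somewhere (an export, a packet capture, etc.); the function name and the sample numbers are made up for illustration and are not part of OME.

```python
from collections import Counter


def peak_traps_per_second(timestamps):
    """Return (second, count) for the busiest one-second bucket, or None."""
    if not timestamps:
        return None
    # Group arrivals by whole second (assumes non-negative epoch seconds).
    buckets = Counter(int(ts) for ts in timestamps)
    # Highest count wins; on a tie, the earliest second is reported.
    return max(buckets.items(), key=lambda item: (item[1], -item[0]))


# Hypothetical arrival times, only to show the call.
arrivals = [100.1, 100.5, 100.9, 101.2, 103.0, 103.4]
peak = peak_traps_per_second(arrivals)
```

If the peak stays well below the 30-50 per second mentioned above, trap volume is probably not the cause.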
As I said, if it does not resolve, best to open a ticket if you can so the support folks can take a look. They may need to look at a crash dump file to see what's up.
Are you running anything else on the OME server that might have an SNMP trap listener that hijacks the SNMP service from Windows?
I have had the same problem, and I have found one way to find out why traps are not being received by OME.
Port 162 should be free before the OME Windows services start. To check that, open CMD and run the following command: netstat -aon. You will see a line similar to the one below:
UDP - 0.0.0.0:162 - *:* - 4335
This means that port 162 is already bound by an application whose process ID is 4335. You can either kill this process from Task Manager or make sure that it does not start at the next reboot. Once you have confirmed this, restart the services or reboot the server. After that, try sending an SNMP trap from an iDRAC to your OME console.
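If you need to repeat this check often, the netstat step can be scripted. Below is a minimal Python sketch that picks the UDP port 162 entries out of netstat -aon output. The function name and the sample text are my own, and the sample follows the usual Windows column layout (Proto, Local Address, Foreign Address, State, PID), so adjust it if your output looks different.

```python
import subprocess


def find_udp_listeners(netstat_output, port=162):
    """Return (local_address, pid) pairs for UDP sockets bound to `port`."""
    listeners = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # UDP rows have no State column: Proto, Local, Foreign, PID.
        if len(fields) < 4 or fields[0].upper() != "UDP":
            continue
        local_address = fields[1]
        # rpartition also copes with IPv6 entries such as [::]:162.
        local_port = local_address.rpartition(":")[2]
        if local_port == str(port) and fields[-1].isdigit():
            listeners.append((local_address, int(fields[-1])))
    return listeners


# Illustrative sample only, not captured from a real server.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       888
  UDP    0.0.0.0:161            *:*                                    1712
  UDP    0.0.0.0:162            *:*                                    4335
  UDP    [::]:162               *:*                                    4335
"""

if __name__ == "__main__":
    # On the OME server itself, run the real command instead of the sample.
    result = subprocess.run(["netstat", "-aon"], capture_output=True, text=True, check=True)
    for address, pid in find_udp_listeners(result.stdout):
        print(f"UDP {address} is held by PID {pid}")
```

To see which program a PID belongs to, you can run tasklist /FI "PID eq 4335" in CMD. Something should own port 162 on a working trap receiver, so the question is whether the owner is the process you expect.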
Hope this helps you.