July 22nd, 2016 12:00

RACADM - iDRAC 8 - R630 - Remote Syslog Alerting - Active DIMM Errors are not logging - test messages are being sent to the remote syslog

Hello -

I've successfully set up a remote rsyslog server and configured the iDRAC on another machine, which has an active DIMM error, to send syslog messages to it.

I can send a test message from the iDRAC on the server in question, and the message appears in the /var/log/servername directory on the rsyslog server.


racadm>>eventfilters test -i MEM0702

racadm eventfilters test -i MEM0702
RAC1027: Successfully sent the alert for the specified event to the configured
destination.
Verify if the alerts were received by the configured destination. Otherwise,
reconfigure the destination and retry the operation.




rsyslog logfile output:

Jul 21 22:57:50 server01.domain.com Severity: Critical, Category: System Health, MessageID: MEM0702, Message: Correctable memory error rate exceeded for DIMM1.


[Screenshot: SEL log on the server in question with a bad DIMM]

I've let the system sit for about 16 hours now, configured to send out alerts, and I have verified that the server can indeed send messages successfully. Why am I not seeing this active error appear in the syslog?

Will the system only send out NEW alert occurrences that happen AFTER remote syslogging is configured? Here are some more screenshots of the configuration from the iDRAC web GUI:

Alert Config

Syslog Settings

Also, with active DIMM errors, why is the Server Health showing OK?

Server Health

1.  Why am I not seeing the system recognize that there are active DIMM errors and reporting it to the remote rsyslog server?

Thank you for your time :)

3 Posts

July 22nd, 2016 12:00

I'm seeing some information online suggesting that a racadm racreset is needed to apply changes. Is this true? It seems silly to have to reset the RAC to apply changes, but I'm throwing it out there just in case.

3 Posts

July 25th, 2016 15:00

Following up, over the weekend I saw some informational messages appear on their own into the remote rsyslog server directory for this server in question.  Those being:

Jul 23 03:00:15 server01 Severity: Informational, Category: Storage, MessageID: CTL37, Message: A Patrol Read operation started for Integrated RAID Controller 1.
Jul 23 03:51:00 server01 Severity: Informational, Category: Storage, MessageID: CTL38, Message: The Patrol Read operation completed for Integrated RAID Controller 1.

With that said, it appears that events that occur after configuring the remote syslog server are being forwarded. The DIMM errors, which are still present, are not showing up in this log. I'm wondering if clearing the events out of the SEL will allow these DIMM errors to trigger again and then be logged. This is quite frustrating.
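For anyone wanting to confirm which event types actually reach the remote log, here is a minimal Python sketch that tallies forwarded messages by MessageID prefix (MEM, CTL, and so on). The field layout is copied from the log lines quoted in this thread. The regular expression and the names `parse_line`, `count_by_prefix`, and `SAMPLE` are my own assumptions for illustration, not anything provided by iDRAC or rsyslog.

```python
import re
from collections import Counter

# Assumed layout, based only on the lines quoted in this thread:
# "<timestamp> <host> Severity: X, Category: Y, MessageID: Z, Message: text"
LINE_RE = re.compile(
    r"Severity: (?P<severity>[^,]+), "
    r"Category: (?P<category>[^,]+), "
    r"MessageID: (?P<message_id>[^,]+), "
    r"Message: (?P<message>.*)$"
)


def parse_line(line):
    """Return the iDRAC fields of one log line as a dict, or None if no match."""
    match = LINE_RE.search(line)
    return match.groupdict() if match else None


def count_by_prefix(lines):
    """Count forwarded messages by the alphabetic prefix of their MessageID."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields is None:
            continue  # not an iDRAC alert line in the assumed layout
        prefix = re.match(r"[A-Za-z]+", fields["message_id"])
        counts[prefix.group(0) if prefix else fields["message_id"]] += 1
    return counts


# The three lines quoted in this thread (the MEM0702 one is the test alert).
SAMPLE = [
    "Jul 21 22:57:50 server01.domain.com Severity: Critical, Category: System Health, "
    "MessageID: MEM0702, Message: Correctable memory error rate exceeded for DIMM1.",
    "Jul 23 03:00:15 server01 Severity: Informational, Category: Storage, "
    "MessageID: CTL37, Message: A Patrol Read operation started for Integrated RAID Controller 1.",
    "Jul 23 03:51:00 server01 Severity: Informational, Category: Storage, "
    "MessageID: CTL38, Message: The Patrol Read operation completed for Integrated RAID Controller 1.",
]

counts = count_by_prefix(SAMPLE)
```

In practice the lines would be read from the file under /var/log/servername instead of `SAMPLE`. A MEM count that never rises above the manually triggered test alert would suggest that the real DIMM events are not being forwarded.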

Thanks

2 Posts

February 8th, 2017 21:00

Hello,

1. I need help with SNMP configuration/setup for a Dell server and its iDRAC.

2. Can I use the PRTG monitoring tool for both of the above?

I would greatly appreciate it if anyone could help me.
