2 Intern

 • 

143 Posts

September 23rd, 2021 10:00

Trying to do some more troubleshooting. I see no packet loss or even latency when the "device did not respond" errors occur.

I'm beginning to wonder if my iDRAC module is going bad...

Moderator

 • 

5.4K Posts

September 23rd, 2021 19:00

Hi, have you checked the versions of all three of these?

ISM: Dell EMC iDRAC Service Module for Windows, v4.1.0.0

https://dell.to/2XN0Zf5

 

OMSA: Dell EMC OpenManage Server Administrator Managed Node for Windows, v10.2.0.0

https://dell.to/2XN0ZM7

 

iDRAC: iDRAC 2.81.81.81

 

https://dell.to/3lMO8kU
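
For anyone wanting to check these three quickly, a rough sketch of comparing installed versions against the ones listed above might look like this. The function names and the `installed` values are hypothetical placeholders, not readings from any real server. Dotted versions are compared as integer tuples because plain string comparison would rank "10.2.0.0" below "4.1.0.0".

```python
# Versions named in the post above.
RECOMMENDED = {
    "ISM": "4.1.0.0",
    "OMSA": "10.2.0.0",
    "iDRAC": "2.81.81.81",
}


def parse_version(text):
    """Turn a dotted version such as 'v2.81.81.81' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def outdated_components(installed, recommended=RECOMMENDED):
    """Return component names that are missing or older than recommended."""
    behind = []
    for name, wanted in recommended.items():
        have = installed.get(name)
        if have is None or parse_version(have) < parse_version(wanted):
            behind.append(name)
    return behind


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    installed = {"ISM": "4.0.1.0", "OMSA": "10.2.0.0", "iDRAC": "2.81.81.81"}
    print(outdated_components(installed))
```

An empty list means everything is at or above the listed versions.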

4 Operator

 • 

3K Posts

September 23rd, 2021 20:00

Can you check whether iDRAC is showing power information? iDRAC will not show power details for systems with a cabled power supply. Can you check whether your server has a cabled power supply or a redundant power supply?

Moderator

 • 

4.7K Posts

September 24th, 2021 08:00

Hello justin gray,

 

I see you have reset the DRAC. Did you use the command line: racadm racreset?
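
If the reset needs to be scripted, a minimal sketch of wrapping `racadm racreset` could look like the following. The local form is the command named above. The remote form with `-r`, `-u` and `-p` is an assumption here and should be checked against the RACADM guide for your iDRAC version. The host and credentials are hypothetical, and the helper names are made up for illustration.

```python
import shutil
import subprocess


def racreset_command(host=None, user=None, password=None):
    """Build the argument list for a local or remote 'racadm racreset'."""
    cmd = ["racadm"]
    if host is not None:
        # Remote RACADM options (assumed): -r <ip> -u <user> -p <password>
        cmd += ["-r", host, "-u", user, "-p", password]
    cmd.append("racreset")
    return cmd


def run_racreset(**kwargs):
    """Run the reset and return the completed process (output is captured)."""
    if shutil.which("racadm") is None:
        raise FileNotFoundError("racadm was not found on PATH")
    return subprocess.run(
        racreset_command(**kwargs), capture_output=True, text=True, check=False
    )


if __name__ == "__main__":
    # Only prints the command; call run_racreset() to actually reset the DRAC.
    print(" ".join(racreset_command()))
```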

 

Try a flea power drain and check results:

Drain flea power: shut down, disconnect the power and network cables, then hold in the power button for 20 seconds with the cords removed. After the flea power drain, the system has to sit for 3 minutes without any power plugged in so the DRAC can reset. Then plug in the NIC and power cables, but wait 2 minutes before powering on to give the DRAC time to initialize.
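
The flea power drain steps above can be written out as a timed checklist. This is only an illustrative sketch: the step wording is paraphrased from this post, the wait times are the ones stated here, and the function names are hypothetical.

```python
# (description, minimum wait in seconds), taken from the steps in this post.
STEPS = [
    ("Shut down the server and disconnect power and network cables", 0),
    ("Hold in the power button with the cords removed", 20),
    ("Leave the system unplugged so the DRAC can reset", 3 * 60),
    ("Reconnect the NIC and power cables, then wait before powering on", 2 * 60),
    ("Power on the server", 0),
]


def total_wait_seconds(steps=STEPS):
    """Sum the minimum waits across all steps."""
    return sum(seconds for _, seconds in steps)


def format_checklist(steps=STEPS):
    """Return numbered checklist lines, with the wait shown where there is one."""
    lines = []
    for number, (description, seconds) in enumerate(steps, start=1):
        suffix = f" ({seconds} s)" if seconds else ""
        lines.append(f"{number}. {description}{suffix}")
    return lines


if __name__ == "__main__":
    print("\n".join(format_checklist()))
    print(f"Minimum total wait: {total_wait_seconds()} s")
```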

2 Intern

 • 

143 Posts

September 24th, 2021 08:00

Yes, correct. Latest versions across the board.

2 Intern

 • 

143 Posts

September 24th, 2021 09:00

I did an iDRAC reset via OMEnt several times. Also did a soft reset holding in the iDRAC button on the front for 30 seconds (power connected). I had also powered it down and removed plugs for at least 15 minutes before powering back up (several times). Did not hold in the power button under those conditions, but will try that, as well.

2 Intern

 • 

143 Posts

September 24th, 2021 09:00

I'm not sure what you mean by "cabled power supply". It does not have a redundant power supply.

The iDRAC GUI only shows voltages in the form of 'Good', so maybe actual values just aren't available. Seems odd, though.

I'm still stumped by the device not responding issues when OMEnt tries to read power and temp values.

4 Operator

 • 

3K Posts

September 26th, 2021 22:00

If the server does not have a redundant power supply, then it has a cabled power supply, and iDRAC/OME will not show any power-related data (consumption, readings, etc.).

2 Intern

 • 

143 Posts

September 28th, 2021 09:00

I guess that would explain why no power readings are present.

However, I'm still faced with the random 'device did not respond' errors on both power and temp. It doesn't seem to happen on any devices except for this one:

[Screenshot: justingray_0-1632847640107.png]

 

2 Intern

 • 

143 Posts

September 28th, 2021 11:00

That was confirmed in my first and fourth post. Everything is the latest available version; BIOS, all firmware, iDRAC, OMSA and ISM.

Moderator

 • 

9.7K Posts

September 28th, 2021 11:00

I would start by making sure the server is up to date on BIOS, iDRAC, etc. Would you confirm what versions you are currently at?

 

Let us know.

 

 

Moderator

 • 

9.7K Posts

September 28th, 2021 13:00

Sorry, I didn't see that. Do you ever see that same CDEV error on the hourly status poll for that device?

2 Intern

 • 

143 Posts

October 5th, 2021 12:00

Sorry, not entirely sure where to see the results of the hourly status poll. I looked under all of the 'Monitor' options but see nothing related. Where might I locate this?

2 Intern

 • 

143 Posts

October 5th, 2021 13:00

Thanks for the clarification. In the list of devices, it always has a green check. There's nothing logged at all for it under the alerts pane.

Typically when I open the device page everything looks good for about a minute. Then the errors start with either one or both (power and/or temp). Then the error(s) clear and then come back at random intervals.

It presents like a connectivity issue, though the device page remains up and other stats are good. And I show no dropped packets whenever these errors pop up.
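
To line up ping results with the times the errors appear, one option is to timestamp the loss figure from each ping run. Below is a minimal sketch that only parses ping summary text. The regex assumes the usual "N% packet loss" (Linux) or "(N% loss)" (Windows) wording, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches "0% packet loss" (Linux style) and "(0% loss)" (Windows style).
LOSS_PATTERN = re.compile(r"(\d+(?:\.\d+)?)% (?:packet )?loss")


def loss_percent(ping_output):
    """Return the loss percentage found in ping output, or None if absent."""
    match = LOSS_PATTERN.search(ping_output)
    return float(match.group(1)) if match else None


def log_line(ping_output, when=None):
    """Format one timestamped log entry for a single ping run."""
    when = when or datetime.now()
    loss = loss_percent(ping_output)
    status = "unknown" if loss is None else f"{loss:g}% loss"
    return f"{when.isoformat(timespec='seconds')} {status}"


if __name__ == "__main__":
    sample = "4 packets transmitted, 4 received, 0% packet loss, time 3004ms"
    print(log_line(sample))
```

Comparing these entries with the times the 'device did not respond' errors show up would confirm whether the network is really clean at those moments.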

Moderator

 • 

4.7K Posts

October 5th, 2021 13:00

Hello Justin, 

 

I found an entry in the release notes that may explain this:

 

Page 13 https://dell.to/3a7hJ32

Issue 26

Description: For the YX3X servers, few of the Subsystem Health section details, available on the individual device's Overview page, such as the Storage, Temperature, and License details are displayed as 'No Data available,' even when their health status is 'OK.' [155425]
