Unsolved

April 1st, 2010 22:00

iDRAC Express reliability on larger networks

I recently added 45 R410s to our lab, expecting to manage them with the iDRAC Express.

Shortly after deployment the automation started failing due to IPMI failures. Even though the iDRACs are configured with static IPs, they simply stop responding to ARP requests, making it impossible to connect.
The only known work-around so far has been to power cycle the rack (removing power from the BMC).

racadm gettracelog had some suspicious messages:

Description: bonding: bond0: link status down for idle interface eth2, disabling it in 5000 ms.
Description: Link beat lost.
Description: ResetPhyChip Failed
Description: Wait for auto-negotiation complete...PhyID is 0
Description: NIC Driver: ResetP()**
Description: Open NCSI Main Device Fail!
Description: Executing '/etc/ifplugd/ifplugd.action eth0 down'.
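
With 45 machines it may help to filter the trace log automatically rather than read it by eye. The following is only a rough sketch: the pattern list is my assumption, built from the messages quoted above, and is not an official Dell list of error strings. The function name `suspicious_lines` is made up for illustration.

```python
import re

# Substrings taken from the trace log lines quoted in this post.
# Which messages actually matter is an assumption, not a Dell catalogue.
SUSPECT_PATTERNS = [
    r"link status down",
    r"Link beat lost",
    r"ResetPhyChip Failed",
    r"PhyID is 0",
    r"Open NCSI Main Device Fail",
    r"ifplugd\.action \S+ down",
]
SUSPECT_RE = re.compile("|".join(SUSPECT_PATTERNS), re.IGNORECASE)


def suspicious_lines(trace_text):
    """Return the text after 'Description:' for every line matching a suspect pattern."""
    hits = []
    for line in trace_text.splitlines():
        line = line.strip()
        if not line.startswith("Description:"):
            continue  # skip anything that is not a Description line
        message = line[len("Description:"):].strip()
        if SUSPECT_RE.search(message):
            hits.append(message)
    return hits


# Small sample built from lines in the post.
sample = """\
Description: Link beat lost.
Description: ResetPhyChip Failed
Description: Open NCSI Main Device Fail!
"""
for message in suspicious_lines(sample):
    print(message)
```

In practice the saved output of racadm gettracelog from each iDRAC would be passed in as `trace_text`.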

Dell support suggested that my network is simply 'too busy' for the shared Ethernet configuration,
even though the host OS does *not* use the onboard gigabit NIC; it uses a 10G card.

'''
The iDRAC Express shares an Ethernet port with the system's NIC and with as much broadcast traffic that the customer has reported being on this subnet it is very likely that the iDRAC simply cannot respond to the sheer amount of requests.
'''

24hr packet count:
eth1 Link encap:Ethernet HWaddr 00:26:B9:58:E1:CC
RX packets:3375239 errors:89750 dropped:0 overruns:0 frame:0
TX packets:240077 errors:0 dropped:0 overruns:0 carrier:0
eth2 Link encap:Ethernet HWaddr 00:26:B9:58:E1:CC
RX packets:3375134 errors:0 dropped:0 overruns:0 frame:0
TX packets:8435 errors:0 dropped:0 overruns:0 carrier:0
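
To put those counters in perspective, here is a quick sketch that turns them into average rates. It assumes the counters started at zero and cover exactly 24 hours, as stated above; the function names are made up for illustration. The averages work out to roughly 39 received packets per second on each interface and an RX error ratio of about 2.7% on eth1. Averages hide bursts, so this says nothing about peak broadcast load.

```python
import re

SECONDS_PER_DAY = 24 * 60 * 60

# Counters copied from the 24hr ifconfig output in this post.
IFCONFIG_TEXT = """\
eth1 Link encap:Ethernet HWaddr 00:26:B9:58:E1:CC
RX packets:3375239 errors:89750 dropped:0 overruns:0 frame:0
TX packets:240077 errors:0 dropped:0 overruns:0 carrier:0
eth2 Link encap:Ethernet HWaddr 00:26:B9:58:E1:CC
RX packets:3375134 errors:0 dropped:0 overruns:0 frame:0
TX packets:8435 errors:0 dropped:0 overruns:0 carrier:0
"""


def parse_counters(text):
    """Map interface name -> {'RX'|'TX': {'packets': int, 'errors': int}}."""
    stats = {}
    iface = None
    for line in text.splitlines():
        line = line.strip()
        head = re.match(r"(\S+)\s+Link encap:", line)
        if head:
            iface = head.group(1)
            stats[iface] = {}
            continue
        counters = re.match(r"(RX|TX) packets:(\d+) errors:(\d+)", line)
        if counters and iface is not None:
            stats[iface][counters.group(1)] = {
                "packets": int(counters.group(2)),
                "errors": int(counters.group(3)),
            }
    return stats


def rx_rate(stats, iface, seconds=SECONDS_PER_DAY):
    """Average received packets per second over the sampling window."""
    return stats[iface]["RX"]["packets"] / seconds


def rx_error_ratio(stats, iface):
    """Fraction of received packets counted as errors."""
    rx = stats[iface]["RX"]
    return rx["errors"] / rx["packets"] if rx["packets"] else 0.0


stats = parse_counters(IFCONFIG_TEXT)
for name in sorted(stats):
    print(f"{name}: {rx_rate(stats, name):.1f} pkt/s RX, "
          f"{rx_error_ratio(stats, name):.2%} RX errors")
```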

Is this a common problem? Are there any other known work-arounds?

July 20th, 2010 10:00

Did you ever come up with a solution for this? Segmenting and ACLs? Any resolution from Dell? I'm seeing the exact same thing here.

The DRAC basically appears to halt all network traffic. Setting changes made through OM appear to take effect, but switching the card to DHCP also produced no traffic. This is happening on all systems I have tested so far. Judging from the ARP table on my firewall, the DRAC does occasionally respond to ARP requests, but the failure to pass traffic is nearly complete, and in my case it takes no more than about an hour to trigger.
