
Updating the PSUs firmware causes network to crash (or at least server switches)

Some really strange stuff here while updating a PE R720! I tried updating the PSUs along with the other hardware from within the Firmware Update in the LCC. It seemed to not have any issues with all the other firmware updates until it hit the PSUs. Then, oddly enough, when I attempted to update the PSUs my phone started ringing with users who couldn't connect to any of my other servers!!! I couldn't find the problem and it started working again in about a minute or two. I just considered it some kind of "network hiccup" and left it at that. BUT I noticed that the PSUs failed to fully update, so I attempted the update again ... BAM! Network down, users calling, and PSUs failed to completely update to the newest version again... Coincidence ... maybe...

SO, I had another server to do. I did all the other updates, waited until a slow period during the day, started some pings, and then updated the PSUs ... BAM, all pings failing and network is down. I ran to the server room to see if the server GbE switches (2x PC6248) had crashed, but they weren't obviously rebooting. I then dug into the switch logs and it looks like it forces my second switch to become the STP root and proceeds to freak out:

Spanning Tree Topology Change: 0, Unit: 1
2/0/22 is transitioned from the Learning state to the Forwarding state in instance 0
Unit 2 elected as the new STP root
Instance 0 has elected a new STP root: 8000:d067:e583:55c0
Spanning Tree Topology Change: 0, Unit: 1
Instance 0 has elected a new STP root: 8000:a4ba:db56:b4bb
Spanning Tree Topology Change: 0, Unit: 1
Spanning Tree Topology Change: 0, Unit: 1
Unit 2 elected as the new STP root
Instance 0 has elected a new STP root: 8000:d067:e583:55c0
Instance 0 has elected a new STP root: 8000:a4ba:db56:b4bb
Unit 2 elected as the new STP root
Instance 0 has elected a new STP root: 8000:d067:e583:55c0
Instance 0 has elected a new STP root: 8000:a4ba:db56:b4bb
Unit 2 elected as the new STP root
Instance 0 has elected a new STP root: 8000:d067:e583:55c0
Instance 0 has elected a new STP root: 8000:a4ba:db56:b4bb
Unit 2 elected as the new STP root
Instance 0 has elected a new STP root: 8000:d067:e583:55c0
Instance 0 has elected a new STP root: 8000:a4ba:db56:b4bb
Link Down: 1/0/12
Link on 1/0/12 is failed
Link Down: 1/0/44
Link on 1/0/44 is failed
1/0/12 is transitioned from the Forwarding state to the Blocking state in instance 0
1/0/44 is transitioned from the Forwarding state to the Blocking state in instance 0
Link Down: 2/0/10
Link on 2/0/10 is failed
2/0/10 is transitioned from the Forwarding state to the Blocking state in instance 0
Link Down: 2/0/44
Link on 2/0/44 is failed
Link Down: 2/0/22
Link on 2/0/22 is failed
2/0/44 is transitioned from the Forwarding state to the Blocking state in instance 0
2/0/22 is transitioned from the Forwarding state to the Blocking state in instance 0
Link Up: 1/0/12
Link Up: 1/0/44
1/0/12 is transitioned from the Forwarding state to the Blocking state in instance 0
1/0/44 is transitioned from the Forwarding state to the Blocking state in instance 0
Link Up: 2/0/10
2/0/10 is transitioned from the Forwarding state to the Blocking state in instance 0
Link Up: 2/0/22
2/0/22 is transitioned from the Forwarding state to the Blocking state in instance 0
Link Up: 2/0/44
2/0/44 is transitioned from the Forwarding state to the Blocking state in instance 0
1/0/12 is transitioned from the Learning state to the Forwarding state in instance 0
1/0/44 is transitioned from the Learning state to the Forwarding state in instance 0
2/0/10 is transitioned from the Learning state to the Forwarding state in instance 0
Spanning Tree Topology Change: 0, Unit: 1
2/0/22 is transitioned from the Learning state to the Forwarding state in instance 0
2/0/44 is transitioned from the Learning state to the Forwarding state in instance 0
Spanning Tree Topology Change: 0, Unit: 1
Spanning Tree Topology Change: 0, Unit: 1
Spanning Tree Topology Change: 0, Unit: 1
Spanning Tree Topology Change: 0, Unit: 1
Spanning Tree Topology Change: 0, Unit: 1
Spanning Tree Topology Change: 0, Unit: 1
Spanning Tree Topology Change: 0, Unit: 1
Spanning Tree Topology Change: 0, Unit: 1

Then all is well again...
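For anyone wanting to boil a log like the one above down to numbers, here is a minimal Python sketch. It assumes the log has been saved as plain text with one message per line, exactly as pasted here; the function name `summarize` and the regular expressions are my own and not part of any Dell tool.

```python
import re
from collections import Counter

# Patterns matched against the message formats seen in the log excerpt above.
ROOT_RE = re.compile(r"Instance (\d+) has elected a new STP root: ([0-9a-f:]+)")
LINK_RE = re.compile(r"Link (Down|Up): (\d+/\d+/\d+)")


def summarize(log_text):
    """Count STP root elections, root changes, and link events in a switch log."""
    roots = []               # bridge IDs in the order they were elected
    link_events = Counter()  # (port, "Down" | "Up") -> count
    for line in log_text.splitlines():
        line = line.strip()
        m = ROOT_RE.search(line)
        if m:
            roots.append(m.group(2))
            continue
        m = LINK_RE.search(line)
        if m:
            link_events[(m.group(2), m.group(1))] += 1
    # A root "flap" is any election whose winner differs from the previous one.
    flaps = sum(1 for prev, cur in zip(roots, roots[1:]) if prev != cur)
    return {
        "root_elections": Counter(roots),
        "root_flaps": flaps,
        "link_events": link_events,
    }


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
Unit 2 elected as the new STP root
Instance 0 has elected a new STP root: 8000:d067:e583:55c0
Instance 0 has elected a new STP root: 8000:a4ba:db56:b4bb
Unit 2 elected as the new STP root
Instance 0 has elected a new STP root: 8000:d067:e583:55c0
Link Down: 1/0/12
Link on 1/0/12 is failed
Link Up: 1/0/12
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Counting by hand in the full excerpt, the root bounces between the same two bridge IDs ten times before five ports (1/0/12, 1/0/44, 2/0/10, 2/0/22, 2/0/44) drop and come back, which is the pattern a summary like this is meant to surface.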

Whaaaaa? I have no idea why updating the PSU firmware on the server would cause this. I can remove power from the server completely and it doesn't cause this problem. Could it cause a surge on the PDU that the switch and server are plugged into?

Also strange is that the PSUs didn't update to the latest version available from the Dell FTP server. The LCC log shows "Version change from previous version 69.45.99 -> Current version 69.45.9B" ... so does that mean I have to go in steps instead of directly to 6B.47.9B? ... HOWEVER, that I can somewhat understand ... the network madness I cannot.
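To make the ordering of those three version strings explicit, here is a small Python sketch. It assumes each dot-separated field is a hexadecimal byte, which is only a guess based on the "9B" and "6B" fields; Dell's actual versioning scheme may differ, and the helper names are hypothetical.

```python
def parse_version(version):
    """Split a dotted version and read each field as hex (an assumption)."""
    return tuple(int(field, 16) for field in version.split("."))


def is_newer(candidate, installed):
    """Return True if candidate sorts after installed, field by field."""
    return parse_version(candidate) > parse_version(installed)


# The three versions mentioned in the LCC log and the update catalog.
VERSIONS = ["6B.47.9B", "69.45.99", "69.45.9B"]

if __name__ == "__main__":
    for v in sorted(VERSIONS, key=parse_version):
        print(v, parse_version(v))
```

Under that assumption the order is 69.45.99, then 69.45.9B, then 6B.47.9B, so 6B.47.9B would be a genuinely newer release rather than a relabeled copy of 69.45.9B.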

Any ideas?  Is there some ghost in the PSU firmware? (It is close to Halloween) OOohhh Spooky ... yea actually when it takes out my network ... I'm pretty spooked!!!!!

Moderator

RE: Updating the PSUs firmware causes network to crash (or at least server switches)

SDFJKSFDJNKSFDSFDBJNK,

That is odd. When you ran the Lifecycle controller update, did the other devices update successfully, or was it just the PSU firmware that failed to update? What I would suggest is trying to run the individual update for the PSU alone. I am not certain of your OS, but if you let me know I can get you the proper link. 

If you happen to be on Server 2008, then this is the update you would need - http://dell.to/1uLqBgz

Run it from within the OS and let me know how it goes.

Chris Hawk

Dell | Social Outreach Services - Enterprise


RE: Updating the PSUs firmware causes network to crash (or at least server switches)

I ran the updates from the Lifecycle Controller UI (no OS needed) and used the Dell FTP site to download them. Almost too easy ... almost. The first time I pulled them all down at once and it seemed to hiccup at the BIOS, but I ran it again and it went fine until it started on the PSU firmware. Then, see above. On the second server I also tried doing everything except the PSU first and then, after all other updates had completed, I tried just the PSU firmware and got the same network issues as before. These are VMware servers, so I'm not sure there is an update package to do that ... but I'll check.


RE: Updating the PSUs firmware causes network to crash (or at least server switches)

Nope, no VMware package, and I can't seem to find a location to just download the PSU firmware ROM file so I can put it on a USB disk and unplug my network before updating from the LCC. I guess I'll just have to pass on these or wait until I can be sure nobody will miss the network being down for 2 minutes or so.
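To put a real number on that "2 minutes or so", timestamped ping results can be collapsed into outage windows and lined up against the switch log. Below is a rough Python sketch. It assumes a Linux-style `ping` (`-c` for count, `-W` for timeout in seconds); on Windows the flags differ. The function names are my own, and the host passed to `monitor` is whatever server you want to watch.

```python
import subprocess
import time


def ping_once(host, timeout_s=1):
    """Return True if a single ICMP echo succeeds (Linux ping flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (start, end) outage windows.

    start is the first failed sample; end is the first successful sample
    after it, or the last failed sample if the host never recovered.
    """
    windows = []
    start = None
    last_fail = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last_fail = ts
        elif start is not None:
            windows.append((start, ts))
            start = None
    if start is not None:
        windows.append((start, last_fail))
    return windows


def monitor(host, duration_s, interval_s=1.0):
    """Ping host roughly once per interval and return the raw samples."""
    samples = []
    deadline = time.time() + duration_s
    while time.time() < deadline:
        samples.append((time.time(), ping_once(host)))
        time.sleep(interval_s)
    return samples
```

Starting `monitor` just before kicking off the PSU update and feeding its result to `outage_windows` gives start and end times for each drop, which can then be compared with the timestamps on the STP messages.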


RE: Updating the PSUs firmware causes network to crash (or at least server switches)

Ok, I just unplugged all my network cables, including the iDRAC NIC, and then plugged a small WiFi bridge that I found into integrated NIC port 1. I was finally able to position it so that I got a WiFi signal in the server room, and then I pulled down the update and it installed without taking down my server network OR my WiFi network. Still have to go in steps, it seems, from v69.45.99 -> v69.45.9B -> Uh ... looks like I can't go any higher than that, even though there is no "failure" in the log and in fact it shows that it "Successfully updated" it to ... the same version that it was (v69.45.9B). OK ... The latest firmware it shows is v6B.47.9B, so I think it isn't so much that I have to update in steps but that, for some reason, that is as high as I can go ... OR maybe something's wrong with the files or update catalog on the Dell FTP server and they have the old firmware version where the new version should be.

HOWEVER, this does show that some type of "I-Kill-STP" packet is going through one (or more) of the NIC ports (I'm using ports 1-4 on the integrated NIC, one more on a PCIe card, plus the iDRAC module) ONLY when updating the PSU firmware. Really crazy! I have two Flextronics 750W PSUs (PN: 6W2PW) in each of these servers. Hope Dell Engineering sees this so they can find the problem before the new R730s have the same issues; though I don't expect them to, since I don't have one yet ... unless Dell wants to send me one for "testing." ;-D
