October 1st, 2014 13:00

After driver update, OME still shows update as available

Hi,

I just upgraded to OME 2.0 and downloaded the latest catalog.  I'm trying to apply a PERC H710 Mini driver to an R620.  OME shows the current driver version as 6.802.19.0 with 6.802.21.00 as available.  I queue the update, the server automatically reboots, and the OME System Update Summary page shows the task as completing successfully.  I re-run discovery/inventory/status and OME still shows that same update as available under the Non-Compliant tab.

I checked the OMSA (v7.4.0 and Inventory Collector Agent v7.4.1) and iDRAC7 web interfaces, and they both report that the driver has been upgraded to 6.802.21.00.  It seems that part of OME isn't getting the message that the driver has been upgraded.  Attempting the process again returns the same results as above.

Please advise.  Thanks.

Ryan

2.8K Posts

October 1st, 2014 15:00

Hmmm, ok it usually takes 15 or 20 minutes for the inventory to update.  But it seems you may have waited this long.

What do you see when you click on the server in the device tree and look at the inventory details?  Are the versions shown there the same (i.e. still the old ones)?

Thanks,

Rob

delltechcenter.com/ome

101 Posts

October 3rd, 2014 15:00

Rob,


Does it have to be 15-20 minutes?  These days people want instant gratification.  Someone trying out OME for the first time is going to be confused when OME says the updates that were just installed need to be installed again.  Admins who don't trust updates enough to run them overnight will have a frustrating experience trying to apply updates in a more interactive manner.

615 Posts

October 3rd, 2014 15:00

A forced inventory for me on a single device usually takes 30-45 seconds.

To the original poster - Did you run inventory on just that device or your whole range?

If just that device and the inventory completed and you still see different data than the DRAC I would delete that device and rediscover the range it's in.

101 Posts

October 6th, 2014 06:00

Please consider this analogy.  Say Windows Update pops up on your PC and asks to install 5 updates.  You allow it to install the updates and then reboot when it asks.  After the restart, you open Windows Update and it shows the same 5 updates as needing to be installed.  This is just not a good user experience.

I've now accepted that after I deploy a batch of updates, I'll just have to wait until the next day to see if OME will stop offering them.  This has caused the task of updating our servers to grow from a weekend project to a several month project (over many weekends) because I want to test the updates on less critical servers first.

Waiting 20 minutes doesn't work for me. (I waited all day Saturday once.)  I've tried forced discovery/inventory with mixed results.

This whole thing contrasts with Dell's Server Update Utility.  I run SUU and it installs updates.  I run it again and it shows no updates needed.

Physiologically speaking, seeing a system move from the non-compliant tab to the compliant tab will cause the reward center of the brain to release a little bit of dopamine.  This game that OME plays has the opposite effect of causing frustration when the expected reward does not materialize.

17 Posts

October 6th, 2014 11:00

Well put, MK1024.

Rob, apologies for the delayed response.  For some reason I wasn't receiving email alerts that someone had responded to this post until this morning.

I was aware it could take several minutes for this information to update in OME, so I was sure to wait long enough.  Further tests yielded mixed results.  Sometimes OME would finally register the updated information after a few hours; in another case, I waited until the following morning before it showed those updates were no longer needed.  In no case did OME update promptly after I manually ran an inventory.

MK1024, I agree that waiting 15-20 minutes is too long, but as I understand it, the inventory collector agent that is installed on each server with OMSA does not run on demand.  This agent inventories the server (components, driver versions, etc.) and populates an XML file, which OME then retrieves and uses to update itself.  The inventory collector only runs on reboot (or if you manually restart all OMSA services), but this isn't explained in any documentation.  If OME checks in before the inventory collector agent has updated the XML file, OME will remain unaware of any updates.  It certainly seems to be a bit of a clunky mechanism and can easily make for a frustrating experience.


Cameron, I ran the inventory on just particular devices, but entire ranges are re-inventoried nightly on schedule.  I've seen the recommendation of deleting and re-adding devices discussed for various other issues on this forum, but for a production environment I don't feel this is a sustainable solution.  Deleting and re-adding devices each time we push out updates can't become yet another task to manage.  Rather, we just need the software to work as advertised.

It's worth noting that I did not experience this problem in any past versions of OME.  If I rebooted (or manually restarted OMSA services) and then kicked off a manual re-inventory of a specific device, OME would accurately report all updates and versions as soon as the re-inventory completed.

Rob, I realize it might be tough to troubleshoot further.  Could you maybe just relay this experience as an FYI to the OME engineers?  If it continues to hinder my workflow, I'll probably need to call in.  And thanks for your efforts.  You do a heck of a job managing these forums.

Ryan

2.8K Posts

October 6th, 2014 20:00

Ryan,

Thanks, I think you might have stated the OMSA/inventory explanation better than I did.  I'll send the thread to the team to try and dig a bit more.  Perhaps it is some combination of the OMSA timing thing and something else on our side.  Let's see what I can come up with.

If you haven't already mentioned it higher up in the thread, you might let me know which kinds of DUPs you see this with most frequently (if there is a pattern).

Thx!

Rob

2.8K Posts

October 6th, 2014 20:00

Thanks for the well thought out remarks.  I'll be sure to pass them along.

I can't recall if you do mostly in-band (OMSA) updates or out-of-band (WSMan).

The 15 or 20 minute waiting period is there because we need 'some time' to ensure the updates complete on the managed node.  Folks might kick off inventory manually before the 20 minutes are up with some success.  For in-band, there may be an OMSA issue where the managed node needs to restart the OMSA service to get the inventory updated.  This might happen for updates where a reboot is not required.  Perhaps SUU does not have this issue since it can control the OMSA service restart?  Just speculation on my part.

So a manual reboot of the MN or a restart of the OMSA service might be a workaround.  I'll try to find more info on this particular OMSA issue and confirm if I'm correct.

But I would be interested to know if a reboot or restart of the OMSA service does anything to improve the situation for you (not that that is a final solution...just for data gathering purposes).

Best,

Rob

101 Posts

October 7th, 2014 06:00

I do all in band (OMSA) updates.  Now that I better know the role that OMSA plays (thanks Ryan), I will test some different scenarios.

Maybe what's needed is for some functionality to be added to OMSA to better support OME.  If OME were able to cause an on-demand collection, that would give the OME engineers more to work with.

1 Message

October 8th, 2014 14:00

Hi Rob.

I am having the same problem: after applying BIOS and firmware updates to our PowerEdge servers through OME, a couple of days later OME asks to apply the same updates again.  There seems to be a bug in the tool.  Do you have an ETA for a fix, and confirmation that the problem is in the tool itself rather than in our environment?

Thanks

24 Posts

October 9th, 2014 07:00

After you update the inventory, you need to double-check whether the update has actually gone through.

In OME, check the Details of the device to see whether it shows the new or the old firmware/driver.  You can then check the Lifecycle Controller log to see whether the update actually happened, and whether it succeeded or failed.

This has happened to me several times.  In every instance, OME reported the inventory correctly; the problem was with the firmware.

Check this thread:

http://en.community.dell.com/techcenter/systems-management/f/4494/t/19599861

 

101 Posts

October 9th, 2014 07:00

Ideally, if an update fails to install, OME should report that the update failed.  It shouldn't be necessary to manually check updates if OME is showing success.  In my latest test, if I manually run the DUP on the system, it does show that the package version is the same as the installed version.

101 Posts

October 9th, 2014 07:00

I did some testing this morning.

1. Used OME to install Network_Firmware_NF92Y_WN32_7.10.18.EXE on a PowerEdge R300.  OME completes the install and reports success. The update required a reboot and OME successfully rebooted the system.

2. I wait 20 minutes and the update is still showing in OME as being needed. I wait an hour and OME is still showing the update as being needed.

3. I manually reboot the updated server. 

4. After the server comes back up, I run Perform Discover and Inventory Now from Manage > Discovery and Inventory.

5. I check the System Update tab after OME says the discovery and inventory task is 100% complete and the update is still showing as needed.

6. I run Refresh Inventory from the Devices tab.

7. I check the System Update tab after OME says the inventory task is 100% complete and the update is still showing as needed.

8. I restart DSM BMU SOL Proxy.  No change.

9. I restart DSM Essentials DA Service.  No change.

10. I restart DSM Essentials Host Service.  No change.

11. I restart DSM Essentials Network Monitor.  No change.

12. I restart DSM Essentials Task Manager.  No change.

13. I close my web browser and restart using the Essentials icon from the desktop.  The system is now listed as a compliant system.


The last time I ran into this problem, I restarted all the DSM Essentials services and the web browser, and that allowed OME to start seeing the correct status.  This time I was trying to see whether restarting any individual service resolved the problem.  Next time, I'll restart the web browser earlier in the process.
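For anyone who wants to script the wait-and-recheck part of the steps above instead of watching the console, here is a minimal Python sketch.  It is only an illustration: `fetch_version` is a hypothetical callable standing in for whatever you use to read the version a server currently reports (it is not an OME or OMSA API), and the timeout and interval values are arbitrary.

```python
import time


def wait_for_version(fetch_version, expected, timeout_s=1200, interval_s=30,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until fetch_version() returns `expected` or the timeout expires.

    fetch_version is a hypothetical, caller-supplied function that returns the
    version string currently reported for a component.
    Returns True if the expected version was seen, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        # Always check at least once, even with a zero timeout.
        if fetch_version() == expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Note that this only automates the waiting.  It would not have helped in the test above, where the status only changed after the browser was restarted.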

24 Posts

October 9th, 2014 10:00

There is a problem with some firmware/driver packages, for example the power supply firmware: the new package actually contains an old version.  When you update the server, the system is "successfully" updated with the old version, and OME does not treat this as an error.  When the inventory is refreshed, OME correctly reports the old version as the current version and shows that an update is needed.

For example:

Successfully updated [PSU-1] PWR SPLY,750WP,RDNT,FLX, 69.45.9B, A00.

Updating Flex AC750W Platinum PSU(PSU PN: N30P9), 6B.47.9B, A00

    

101 Posts

October 12th, 2014 09:00

This problem is exacerbated by OME's inability to always detect if an update failed.  f2000sa mentioned a case in which the DUP contains the wrong version.  I encountered an instance today in which I asked OME to install Network_Driver_GPJ8K_WN_18.2.2_17.8c.4.3.  Everything appeared to work well, with the OME log reporting:

[09/14/14 09:40:20]    Vendor Software Return Code: 0

[09/14/14 09:40:20]    Name of Exit Code: SUCCESS

[09/14/14 09:40:20]    Exit Code set to: 0 (0x0)

[09/14/14 09:40:20]    Result: SUCCESS

[09/14/14 09:40:20]    Name of Exit Code: SUCCESS

But, when I manually ran the DUP on the system, it was clear that it hadn't actually installed when OME tried it.  (It did install fine when I ran it manually.)  The lesson is that you need to stop trusting your DUPs.  I had previously recommended that you get with the OMSA team and ask them to provide some way for you to manually initiate an inventory collection.  I think this function is needed, not only to update the OME status, but to actually validate installation success.  At the end of an update task, you should kick off an inventory collection, then use it to validate all of the DUPs that reported success.  If you don't find the system reporting the new version, then you should fail the update and put a note in the log that the update appeared to succeed, but validation failed.
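To make the suggestion concrete, here is a minimal Python sketch of the validation step I have in mind.  Everything in it is hypothetical: the `UpdateResult` type, the `validate` function, and the idea that the inventory arrives as a simple component-to-version mapping are my assumptions for illustration, not anything OME or OMSA actually exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateResult:
    component: str        # name of the updated component (hypothetical key)
    target_version: str   # version the DUP was supposed to install
    exit_code: int        # exit code reported by the DUP (0 = success)


def normalize(version):
    """Loosely normalize a dotted version so '21.00' and '21.0' compare equal."""
    parts = version.strip().lower().split(".")
    return ".".join(p.lstrip("0") or "0" for p in parts)


def validate(results, inventory):
    """Cross-check reported update results against a fresh inventory.

    inventory maps component name -> version reported after the update.
    Returns a dict of component -> 'failed', 'verified', or 'validation_failed'.
    """
    statuses = {}
    for r in results:
        if r.exit_code != 0:
            # The DUP itself reported a failure.
            statuses[r.component] = "failed"
            continue
        installed = inventory.get(r.component)
        if installed is not None and normalize(installed) == normalize(r.target_version):
            statuses[r.component] = "verified"
        else:
            # The DUP claimed success, but the system does not report the new version.
            statuses[r.component] = "validation_failed"
    return statuses
```

With this kind of check, an update whose DUP returned 0 but whose version never changed would be logged as "validation_failed" rather than as a success.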


And I think Dell should consider dogfooding OME.  That way, this type of feedback will come from internal sources instead of your customers.  In my opinion, OME is currently unusable for installing updates and this leads me to conclude that no one within Dell is trying to use it to manage production systems. 
