I would engage EMC Support to help with this one. We just did a FLARE 24 to 26 upgrade on a new array we put on the floor with two AIX hosts attached, to test out the NDU requirements for going to 26. It was an incredibly painful process involving unmounting file systems, changing settings, and remounting. The worst part was that after we had the HEAT reports generated, a CLARiiON specialist told us the HEAT didn't actually highlight everything we had to do, and gave us more APARs, HBA settings, and various other settings to change.
When you run an EMCGrab or EMCReports from the host and send it in to EMC, they run it through a tool that generates a "HEAT" report. I can't remember what the acronym stands for off the top of my head, but it is supposed to tell you the state of the host and whether everything is configured to the required specifications. If there are missing patches, incorrect settings, etc., they should be highlighted in the HEAT report.
Amen to that. This really simplifies the process. Now if they could just make the HEAT report comprehensive I'd be ecstatic. It catches a lot of the issues, but the last time we had problems a manual review by a CLARiiON specialist found half a dozen things that hadn't been identified in the HEAT report.
Neither did mine. I think it became available a couple of weeks ago; I can't find the thread now, but somebody posted it in the Symmetrix forum. I was excited as hell. I don't have to beg my CE to run them for me or open tickets with support anymore.
Almost all the problems we have had were related to AIX. There has been the occasional Windows or Solaris host that had trouble because something was misconfigured on the host side, but those were always obvious errors when we investigated.
I agree. I use it mostly to check software versions for PowerPath and drivers. I did a CX600 upgrade two weeks ago and it did not catch an incorrect DMP configuration on 3 Solaris boxes, so all three lost access to their drives during the SP failover. So much for NDU. You are not alone.
I'm glad to hear you were able to get an NDU FLARE upgrade in with four AIX hosts up, but I have to wonder if we can still call it an NDU when you had to take a two hour outage before the upgrade to make sure the NDU would work. I'm thinking that we are just going to have to bite the bullet in the future, take all our AIX hosts down for FLARE upgrades, and just have the single outage.
tim.koopman
April 28th, 2008 11:00
I do not think anyone has answered your original question.
Shut your AIX system down. In NaviSphere, use the Tools > Failover Setup Wizard to change the failover mode from 1 to 3, then power your AIX system back up.
I do not know if an online process exists. You would have to open a support call with EMC to see whether one exists for changing the failover mode online.
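The offline change described above (shut down, flip the failover mode, power up) can be sketched as a dry-run script. Everything here is an assumption for illustration: the host name, SP address, and the exact `naviseccli storagegroup -sethost` syntax should be verified against the Navisphere CLI reference for your FLARE release. The script only prints the commands; it executes nothing against the array.

```shell
#!/bin/sh
# Dry-run sketch of an offline failover-mode change (mode 1 -> 3).
# Host and SP names are hypothetical; verify the naviseccli syntax
# against your Navisphere CLI reference before running anything for real.
AIX_HOST="aixhost01"     # hypothetical AIX host name
SP="spa.example.com"     # hypothetical storage processor address

plan_failover_change() {
    echo "# 1. On the AIX host: stop applications, unmount, shut down"
    echo "umount /data; varyoffvg datavg; shutdown -F"
    echo "# 2. From a management host: set failover mode 3 for this host's initiators"
    echo "naviseccli -h $SP storagegroup -sethost -host $AIX_HOST -failovermode 3 -o"
    echo "# 3. Power the AIX host back up and rediscover devices"
    echo "cfgmgr && powermt display dev=all"
}

plan_failover_change
```

Because it only echoes the steps, a script like this is safe to keep with the change record and walk through line by line during the outage window.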
Allen Ward
April 25th, 2008 06:00
I don't think our account team finds it as amusing as I do when I point out that the concept of NDU isn't really valid in the real world on CLARiiONs. And they especially don't like it when I point out that 2/3 of NDU is DU (otherwise known as Data Unavailable). Of course, it doesn't help that the last three major FLARE code upgrades we did caused outages on large numbers of hosts (even after EMC checked everything and said it would be OK).
ovivier
April 25th, 2008 08:00
Allen, could you please explain what "HEAT" is?
Does anyone have a better experience with NDUs involving AIX servers?
Allen Ward
April 25th, 2008 10:00
dynamox
April 25th, 2008 10:00
https://servicetools.emc.com/heat.php
AranH1
April 25th, 2008 10:00
Allen, were the NDU (I believe it stands for Non-Disruptive Upgrade) issues you had isolated to AIX hosts? I have never supported AIX on a SAN and am not familiar with the platform dependencies.
I am curious because I have rarely seen issues resulting from NDU FLARE upgrades (all Windows hosts), or even from upgrades from one platform to another (CX600 to CX700, for example). Of course that does not mean they don't happen, but it doesn't appear to be a common occurrence to me.
Allen Ward
April 25th, 2008 11:00
Allen Ward
April 25th, 2008 11:00
My CE doesn't even know about this!
SKT2
April 25th, 2008 11:00
https://forums.emc.com/forums/thread.jspa?threadID=66560&tstart=0
dynamox
April 25th, 2008 11:00
Allen Ward
April 25th, 2008 11:00
AIX has never gone well for us.
dynamox
April 25th, 2008 12:00
tim.koopman
April 25th, 2008 15:00
My environment contains multiple AIX systems, and I have had my share of bad FLARE code updates where I have had problems with AIX. The good news is that my last FLARE code update, from FLARE 19 to the latest 24 code, went well for the four systems I did not shut down.
Getting everything configured is not easy; I had to use E-Lab, the EMC support matrix for Veritas, and the AIX host connectivity guide. I did have to take a two hour outage to get my system updated to fit into the matrix those three documents created. Good luck; I just wanted you to know it is possible. In my case the system had only a single HBA and I was using PowerPath SE. I changed the failover mode to 3 while I was updating PowerPath to 5.0, part of the work I did in the two hour outage I referenced above.
So a planned outage was required to get my system to a supported configuration.
The four AIX systems that I did get to a supported configuration had no problems while the FLARE code update took place. I tested the configuration before the update by doing trespasses at the array and disabling initiators on the array. If you mess with the initiators on the array, I recommend rebooting your server before you put it back into production. In my tests I was not running production data to the disk; my production application was down, and I was running a large copy while I did the trespass tests.
I would change the failover mode from 1 to 3. An EMC tech note explains the difference between failover modes 1 and 3; I just do not remember the number.
All my systems were down when I changed the failover mode. I do NOT recommend changing the failover mode on a running system.
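The pre-upgrade test described above (trespass at the array, disable initiators, reboot before returning to production) can be sketched the same way as a dry-run script. The SP address, LUN number, and trespass subcommand syntax are assumptions, so check them against your Navisphere CLI and PowerPath documentation; again, the script only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of a pre-upgrade failover test on a NON-production host.
# SP address, LUN number, and the trespass syntax are hypothetical;
# verify them against your Navisphere CLI and PowerPath documentation.
SP_A="spa.example.com"   # hypothetical SP A address
TEST_LUN=12              # hypothetical LUN used for the test

plan_failover_test() {
    echo "# 1. Confirm every path is alive before the test"
    echo "powermt display dev=all"
    echo "# 2. Trespass the test LUN to the peer SP and watch host I/O"
    echo "naviseccli -h $SP_A trespass lun $TEST_LUN"
    echo "# 3. Return paths to their default owners"
    echo "powermt restore"
    echo "# 4. If initiators were disabled on the array, reboot before production"
    echo "shutdown -Fr"
}

plan_failover_test
```

Running the test against a scratch LUN while production is down, as described above, keeps a misbehaving failover from becoming a real outage.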
dynamox
April 25th, 2008 17:00
Are all of your AIX systems running Veritas? I do not have Veritas on mine, so I wonder if the process would be any different.
Allen Ward
April 27th, 2008 07:00
There is no guarantee that EMC isn't going to come back with more changes required to support the next FLARE upgrade... and they will probably require more AIX outages to ensure that AIX won't take an outage!!!