
January 13th, 2011 08:00

Removing LUN from Server - PowerPath for Linux - powermt issue

Hi,

I am sure a 'reboot' will be the fix for this, but I am trying to understand what the issue is or if there is some other way to correct it.

On a Linux server, I am removing a CLARiiON SAN-presented LUN. The Storage Admin removed the LUN from the storage group to unpresent it. My log showed the paths going dead. I ran 'powermt check' and answered 'y' to the 'device path is currently dead, do you want to remove it' messages. Three of the four paths associated with the EMC PowerPath pseudo name have been removed, but one remains, and I cannot remove it because PowerPath thinks it is 'in use'.

[root@bkvdw01da bin]# powermt display dev=emcpowerk
Pseudo name=emcpowerk
CLARiiON ID=APM00083101547 [bkvdwracda RAC cluster]
Logical device ID=6006016072C02000A6C87186920DDE11 [bkvdwracda - /exports on bkvdw01da - LUN375]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP A, current=SP A       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 lpfc                      sdan      SP A5     active  dead       0      1

[root@bkvdw01da bin]# powermt check
Warning: CLARiiON device path sdan is currently dead.
Do you want to remove it (y/n/a/q)? y
Cannot remove device that is in use: sdan

Kernel - Oracle Enterprise Linux (pretty much Red Hat) 5.2 - 2.6.18-92.el5

PowerPath Version - EMC powermt for PowerPath (c) Version 5.1 SP 2 (build 21)

This has occurred on three different servers. I have tried the powermt commands 'remove', 'remove force', and 'release', all of which complain that the device is in use. But what makes PowerPath think the device is 'in use' if the path is dead and the LUN is no longer presented?
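For what it's worth, the leftover native device is still visible to the kernel even though the array no longer presents it, so my next step is probably to poke at the SCSI layer directly (standard sysfs/userland checks on 2.6.18, nothing PowerPath-specific):

# cat /sys/block/sdan/device/state     <== what does the kernel think the path state is?
# fuser -v /dev/sdan                   <== is any process actually holding it open?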

Thanks!

1 Rookie • 20.4K Posts

January 14th, 2011 20:00

Unfortunately I can't provide a solution, only share your frustration. We always have these issues, even on the latest 5.x RHEL boxes. The other day we had to upgrade PowerPath from 5.1 to 5.5, and PowerPath kept complaining that devices were in use. How can they be in use if all the file systems were unmounted and the volume groups were exported? Stopping the PowerPath service did not help either. I literally had to disable the ports on the switch and bounce the box, and only then were we able to upgrade PowerPath. Linux, after so many years, is still weak when it comes to LVM and the disk subsystem. You very rarely see these issues on AIX/HP-UX/Solaris... real *nix boxes.

5 Practitioner • 274.2K Posts

January 14th, 2011 22:00

I had the same issue in the past. This happens when you remove devices on the CLARiiON without cleaning up on the host first.

What I did was simply present the LUN back (the LUN ID must be the same), remove it using the PowerPath commands, then clean up on the storage array.
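On the CLARiiON side that is just storage group operations. With naviseccli it would look something like this, where the SP address, group name, and HLU/ALU numbers are placeholders you would match to the original presentation:

naviseccli -h <SP-IP> storagegroup -addhlu -gname <group> -hlu <host-lun#> -alu <array-lun#>
(clean up on the host with powermt remove)
naviseccli -h <SP-IP> storagegroup -removehlu -gname <group> -hlu <host-lun#>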

This might help without rebooting ...

154 Posts

January 17th, 2011 13:00

Hello.  Have you contacted anyone from EMC's Customer Support about this issue?

Thanks, Brion

154 Posts

January 18th, 2011 11:00

Please keep me informed on your progress. I’d like to see this issue researched.

January 18th, 2011 11:00

Thank you for the responses. I have not opened a case with EMC, mostly because I suspect, based on some past experience, that the answer will be 'reboot'.

Today I made the same change on three more servers (another RAC cluster). Before unpresenting the LUNs from the CLARiiON, I issued the 'powermt remove' command, but the answer was -

#powermt remove force dev=emcpoweri
Cannot remove device that is in use: emcpoweri

Even though there was nothing mounted on the device.
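If LVM is what is pinning it (just a guess on my part), something like this should show it:

# pvs | grep emcpoweri            <== is the pseudo device still a PV?
# grep filter /etc/lvm/lvm.conf   <== is LVM allowed to scan the native sd paths too?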

Interestingly, the issue on these three nodes is consistent with what happened on the first three. One path remains on the EMC pseudo device in the display -

#  powermt display dev=emcpoweri
Pseudo name=emcpoweri
CLARiiON ID=APM00083101547 [bkvdwracp RAC cluster]
Logical device ID=6006016072C020003E7F9D2EC585DE11 [bkvdwracp - /exports on bkvdw03p - LUN407]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP A, current=SP A       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   0 lpfc                      sdh       SP A4     active  dead       0      1

I cannot reboot at this time, but it does not appear to be causing any issues in this state.

I have two more clusters to do this to. They are each only two nodes. Maybe I will find out more before then!

24 Posts

January 19th, 2011 12:00

Please review the steps in the "PowerPath for Linux Installation and Administration Guide" available from PowerLink in the following breadcrumb trail:

Home > Support > Technical Documentation and Advisories > Software ~ P-R ~ Documentation > PowerPath Family > PowerPath > Installation/Configuration

Specifically, skip to the section 'Dynamically removing a LUN'. There are a few steps listed that you didn't mention performing, and you may or may not be performing them in the recommended order. From the few times I've needed to do this, I've followed this specific process and don't recall seeing the behavior you are observing, but I would have to review my projects to see what version of PowerPath (and version of Red Hat) I was working on when I last performed it. Based on the process, as we all agree, a reboot will most likely clean it up as Linux rescans/rebuilds its pseudo devices and PowerPath rebuilds its configuration.
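From memory, the documented sequence is roughly the following; please verify against the guide for your exact PowerPath and kernel versions, since the native-device step in particular varies:

umount /exports                          <== stop all I/O and remove the LUN from LVM first
powermt remove dev=emcpowerk             <== remove the pseudo device from PowerPath
echo 1 > /sys/block/sdan/device/delete   <== delete each native path (repeat per sdX; RHEL 5 style)
(storage admin removes the LUN from the storage group)
powermt save                             <== persist the cleaned-up configuration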

Just a thought.

January 21st, 2011 06:00

Thank you for all the helpful posts. I have performed the same task on eight servers in three RAC clusters, all with the same result. I have tried unpresenting the LUNs before cleaning up PowerPath, and tried the 'powermt remove' before unpresenting... with no difference. PowerPath thinks one of the paths is still busy even though the filesystem is unmounted. In every case it removes all the dead paths for the device except one -

Pseudo name=emcpowerd
CLARiiON ID=APM00083101547 [bkvoltpda RAC cluster]
Logical device ID=6006016072C020007EBABDD310E0DE11 [bkvoltpda - /exports on bkvoltp01da - LUN504]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 lpfc                      sdaf      SP B5     active  dead       0      1

At least it is consistent! I expect these will finally clean up on a reboot. I have one more cluster (two servers) to do.

Thanks,

Michele

January 26th, 2011 13:00

The final time I did this, the result was consistent, and one dead path remains.

I am planning to remove LUNs from a standalone server next week and am curious to see if the result is still the same.

I will update this forum with my results.

Thanks.

1.3K Posts

January 30th, 2011 08:00

Since you mentioned a RAC cluster, are those LUNs part of an ASM disk group? If yes, the devices can't be released until you bounce ASM on the node.
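A quick way to rule that out, assuming 10g-style sysdba auth against the ASM instance (adjust the SID for your node):

# export ORACLE_SID=+ASM1
# echo "select path, header_status from v\$asm_disk;" | sqlplus -s '/ as sysdba'

Any emcpower device listed there would explain the hold.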

January 31st, 2011 06:00

Thanks for your response. No, these are not ASM disks, strictly LVM.

Although I am curious to see if perhaps the clusterware is the culprit. Tomorrow I will be removing a LUN from a standalone server... although with the same OS and version of PowerPath.

I will update with result.

1.3K Posts

January 31st, 2011 18:00

When you said you tried the remove while the device was unmounted, was the device still part of the VG at that time? Were you able to do a vgexport?

February 1st, 2011 06:00

Today I removed a LUN from a standalone Linux Server. I followed the same steps that I did on the previous RAC nodes -

umount the directory

lvremove, vgchange, vgremove

powermt remove dev=emcpowerc <== this time there was no 'device in use' error

powermt release

Un-presented LUN from SAN

powermt check  <== saw all four dead paths and removed them

powermt save

vi /etc/fstab to remove old entry
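All together, with placeholder VG/LV names standing in for mine:

# umount /exports
# lvremove /dev/vg_exports/lv_exports     <== LV/VG names here are examples
# vgchange -an vg_exports
# vgremove vg_exports
# powermt remove dev=emcpowerc
# powermt release
  (storage admin unpresents the LUN here)
# powermt check                           <== answer 'y' for each dead path
# powermt save
# vi /etc/fstab                           <== delete the old /exports entry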

This time it worked perfectly! So my suspicion is that the RAC clusterware is somehow holding the last path 'in use' and not allowing the dead path to be removed until, I assume, I reboot. For contrast, here is one of the RAC nodes again, where the remove still fails:

Pseudo name=emcpowere
CLARiiON ID=APM00083101547 [bkvoltpp RAC cluster]
Logical device ID=6006016072C02000E41689897DDFDE11 [bkvoltpp - /exports on bkvoltp01p - LUN454]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 lpfc                      sdaf      SP B5     active  dead       0      1

[root@bkvoltp01p ~]# powermt remove force dev=emcpowere
Cannot remove device that is in use: emcpowere
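One thing I have not tried on the stuck nodes, so treat it as untested: deleting the leftover native path out from under PowerPath via sysfs and re-running the check -

# echo 1 > /sys/block/sdaf/device/delete
# powermt check

If the kernel lets go of sdaf, PowerPath may follow.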

Thank you for all the responses.

February 1st, 2011 22:00

Michele, first, thank you for sharing your experiences throughout. Not sure if it will provide anything meaningful, but does lsof on the device show what process might be holding onto it? Run it on both of the dev files (using the output above):

lsof /dev/emcpowere

lsof /dev/sdaf
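fuser may be worth a try as well, though like lsof it only sees process-level opens, so a kernel-side hold would not show up in either:

fuser -v /dev/emcpowere /dev/sdaf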

February 2nd, 2011 11:00

Hi Chris,

I tried lsof and nothing was returned.

These LUNs were presented from the SAN to all three nodes in the RAC, but each was configured with LVM and mounted on one node only. I am not sure if that would make a difference...

Other than waiting to schedule a reboot of all the servers, I am not sure what else to do. It is not causing any issues I can see.
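The only other place I can think to look is device-mapper, in case an old dm mapping still references the path (the major,minor pair below is a placeholder; take the real one from the ls output):

# ls -l /dev/sdaf                 <== note the device's major,minor numbers
# dmsetup table | grep '66:240'   <== placeholder pair; does any dm map still use them?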

Thanks!

1.3K Posts

February 26th, 2011 07:00

Since you have mentioned RAC, is it a shared VG (LVM) setup you are following here?
