Todd_1215
1 Nickel

Unable to format LUN on Solaris 10

Hello - I hope someone here can assist me with this issue. I have been beating my head against the wall over this and have exhausted all possibilities.

I have 2 x Sun V480 servers, each with an Emulex LP10000-E card, connected to a Clariion CX3-10f. I'm using the Sun-branded Emulex driver provided with the OS. This server is a new install with no patches whatsoever. One server has full access to the LUN that's assigned to it; the other can see the LUN device, but I am not able to format it or mount it. I'll call the non-working server Server-B and the working server Server-A.

Server-A Config ( used fcinfo command to get results )
HBA Port WWN: 10000000c946878a
OS Device Name: /dev/cfg/c1
Manufacturer: Emulex
Model: LP10000
Firmware Version: 1.91a1 (T2D1.91A1)
FCode/BIOS Version: Boot:5.00a7 Fcode:1.41a4
Serial Number: VM51733465
Driver Name: emlxs
Driver Version: 2.31p (2008.12.11.10.30)
Type: L-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 20000000c946878a
Link Error Statistics:
Link Failure Count: 0
Loss of Sync Count: 10
Loss of Signal Count: 0
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 78894
Invalid CRC Count: 0

Server-B Config ( used fcinfo command to get results )
HBA Port WWN: 10000000c9468c76
OS Device Name: /dev/cfg/c1
Manufacturer: Emulex
Model: LP10000
Firmware Version: 1.91a1 (T2D1.91A1)
FCode/BIOS Version: Boot:5.00a7 Fcode:1.41a4
Serial Number: VM51734366
Driver Name: emlxs
Driver Version: 2.31p (2008.12.11.10.30)
Type: L-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 20000000c9468c76
Link Error Statistics:
Link Failure Count: 0
Loss of Sync Count: 6
Loss of Signal Count: 0
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 32
Invalid CRC Count: 0
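
To check quickly that the two HBAs really are configured the same way, the two fcinfo listings above can be diffed field by field. The following is only a minimal sketch: the helper names (parse_fields, diff_fields) are hypothetical, and the sample text is a subset of the output already posted above.

```python
def parse_fields(text):
    """Parse 'Key: Value' lines (fcinfo-style output) into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as
        # "Boot:5.00a7 Fcode:1.41a4" stay intact.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def diff_fields(a, b):
    """Return {key: (value_a, value_b)} for every field that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Subset of the fcinfo output posted above.
SERVER_A = """HBA Port WWN: 10000000c946878a
Driver Version: 2.31p (2008.12.11.10.30)
State: online
Current Speed: 2Gb
Loss of Sync Count: 10
Invalid Tx Word Count: 78894
"""

SERVER_B = """HBA Port WWN: 10000000c9468c76
Driver Version: 2.31p (2008.12.11.10.30)
State: online
Current Speed: 2Gb
Loss of Sync Count: 6
Invalid Tx Word Count: 32
"""

differences = diff_fields(parse_fields(SERVER_A), parse_fields(SERVER_B))
for key, (a, b) in differences.items():
    print(f"{key}: Server-A={a}  Server-B={b}")
```

With this sample, only the WWN and the link error counters differ; driver, state, and speed match on both hosts.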

Whenever I try to format the Server-B LUN I get lots of errors.

# format
Searching for disks...Jan 19 16:57:07 celhsol0200 scsi: WARNING: /pci@8,600000/lpfc@1/fp@0,0/ssd@w5006016941e0ca40,0 (ssd2):
Jan 19 16:57:07 celhsol0200     drive offline
Jan 19 16:57:07 celhsol0200 scsi: WARNING: /pci@8,600000/lpfc@1/fp@0,0/ssd@w5006016941e0ca40,0 (ssd2):
Jan 19 16:57:07 celhsol0200     drive offline


The device does not support mode page 3 or page 4,
or the reported geometry info is invalid.
WARNING: Disk geometry is based on capacity data.

The current rpm value 0 is invalid, adjusting it to 3600
done

c1t0d0: configured with capacity of 734.08GB


AVAILABLE DISK SELECTIONS:
0. c1t0d0 <DGC-RAID5-0326 cyl 47914 alt 2 hd 255 sec 126>
/pci@8,600000/lpfc@1/fp@0,0/ssd@w5006016941e0ca40,0
1. c2t0d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
/pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w500000e010a8ec21,0
2. c2t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>  bootmirr
/pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w2100000c50fd5457,0
Specify disk (enter its number):


The drive that is reporting offline, /pci@8,600000/lpfc@1/fp@0,0/ssd@w5006016941e0ca40,0, is the LUN from the EMC Clariion.

Here is the mapped drive:
c1t0d0s0 -> ../../devices/pci@8,600000/lpfc@1/fp@0,0/ssd@w5006016941e0ca40,0:a

It almost seems like I can read the drive but cannot write to it. There isn't any security setting within the EMC Navisphere tool that restricts write access, so I'm stumped as to why I cannot format this LUN.

Any help is very much appreciated.
kelleg
5 Rhenium

Re: Unable to format LUN on Solaris 10

On the Clariion, look in the Storage Group that contains the LUN and Server B. Right-click on the storage group name and choose "Select LUNs"; this should give you a list of the LUNs assigned to this host. There is a column called Host ID, and the first LUN should be 0. If it is not, move the LUN out of the storage group, add it back in, and before clicking Apply make sure that the Host ID is zero (you can click in the column to open a drop-down and select the Host ID number). If you do not have a Host ID 0, the array will present a LUNZ to the host. This looks like a LUN to the host, but you can't do anything with it.

glen

Todd_1215
1 Nickel

Re: Unable to format LUN on Solaris 10

Hi Glen, and thanks for the reply. The LUN ID from the Clariion side is presented as LUN 9 on the non-working host. The Host ID is '0'. I tried removing and re-adding the LUN, but it is still presented as LUN 9. The working server's LUN is presented as LUN 8.

I ran luxadm against both servers to compare. Server B (non-working) has a Path status of Not Ready, while Server A (working) has a Path status of O.K. and also shows the Read Cache information, which Server B does not. So I'm not sure whether this is an issue on the host side or the Clariion side. How could I check the Clariion side to make sure everything there is OK?

Non-Working Server

luxadm display /dev/rdsk/c1t0d0s2
DEVICE PROPERTIES for disk: /dev/rdsk/c1t0d0s2
  Vendor:               DGC
  Product ID:           RAID 5
  Revision:             0326
  Serial Num:           APM00082001710
  Unformatted capacity: 751735.000 MBytes
  Device Type:          Disk device
  Path(s):

  /dev/rdsk/c1t0d0s2
  /devices/pci@8,600000/lpfc@1/fp@0,0/ssd@w5006016941e0ca40,0:c,raw
    LUN path port WWN:          5006016941e0ca40
    Host controller port WWN:   10000000c9468c76
    Path status:                Not Ready

Working Server

luxadm display /dev/rdsk/c1t0d0s2
Password:
DEVICE PROPERTIES for disk: /dev/rdsk/c1t0d0s2
  Vendor:               DGC
  Product ID:           RAID 5
  Revision:             0326
  Serial Num:           APM00082001710
  Unformatted capacity: 819200.000 MBytes
  Read Cache:           Enabled
    Minimum prefetch:   0x0
    Maximum prefetch:   0x0
  Device Type:          Disk device
  Path(s):

  /dev/rdsk/c1t0d0s2
  /devices/pci@8,600000/lpfc@1/fp@0,0/ssd@w5006016141e0ca40,0:c,raw
    LUN path port WWN:          5006016141e0ca40
    Host controller port WWN:   10000000c946878a
    Path status:                O.K.
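
The two luxadm listings above differ in more than the path status: the LUN path port WWN is not the same on the two hosts. A rough sketch of pulling those fields out and guessing which storage processor each port belongs to is shown here. Note the assumption: the mapping from WWN to SP (prefix 5006016, then a hex digit where 0-7 means SP A and 8-f means SP B) is a commonly cited Clariion convention, not something stated in this thread, so verify it against the array before relying on it. The function names are hypothetical.

```python
def parse_fields(text):
    """Parse 'Key: Value' lines (luxadm display-style output) into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def guess_sp(lun_port_wwn):
    """Guess the owning SP from a Clariion front-end port WWN.

    ASSUMPTION: ports are numbered 5006016X..., where X in 0-7 is an
    SP A port and X in 8-f is an SP B port. This is a convention, not
    something confirmed in this thread.
    """
    wwn = lun_port_wwn.lower()
    if len(wwn) != 16 or not wwn.startswith("5006016"):
        return "unknown"
    return "SP A" if int(wwn[7], 16) < 8 else "SP B"


# Relevant lines copied from the two listings above.
NON_WORKING = """LUN path port WWN:          5006016941e0ca40
Host controller port WWN:   10000000c9468c76
Path status:                Not Ready
"""

WORKING = """LUN path port WWN:          5006016141e0ca40
Host controller port WWN:   10000000c946878a
Path status:                O.K.
"""

for name, text in (("Server B", NON_WORKING), ("Server A", WORKING)):
    f = parse_fields(text)
    print(name, f["Path status"], guess_sp(f["LUN path port WWN"]))
```

If the assumed convention holds, this suggests the non-working host reaches the LUN through an SP B port and the working host through an SP A port.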

kelleg
5 Rhenium

Re: Unable to format LUN on Solaris 10

Todd,

The Array LUN number is OK; it's the Host ID number that I was concerned about.

Open the storage group for the host. You should see three tree items listed. Open the Host tree item and you should see the host listed. Right-click on the host and select Connectivity Status. The paths should be listed as Logged In and Registered; make sure all paths are listed correctly.

For LUN 9, right-click on the LUN and select Properties. The General tab should list the Default Owner and Current Owner. Make sure that the Current Owner is the same as the Default Owner, and that the SP that owns the LUN is also on one of the paths from the host to the array shown in the Connectivity Status.

Are you using PowerPath?

glen

Todd_1215
1 Nickel

Re: Unable to format LUN on Solaris 10

In looking at the Storage Processor event log I see a few events related to that host but am not sure what it means:

Initiator (20:00:00:00:C9:46:8C:76:10:00:00:00:C9:46:8C:76) on Server (celhsol0200) registered with the storage system is now inactive. It does not have a working physical connection. See Navisphere Manager for details.
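
For reference, the initiator string in that event appears to be the HBA's node WWN followed by its port WWN, which lets it be matched against the fcinfo output for Server-B. A small sketch, assuming that node-then-port layout (the function name is hypothetical):

```python
def split_initiator(initiator):
    """Split a 16-octet initiator ID into (node WWN, port WWN).

    ASSUMPTION: the first 8 octets are the node WWN and the last 8
    are the port WWN, as the values in this thread suggest.
    """
    octets = initiator.strip().split(":")
    if len(octets) != 16:
        raise ValueError("expected 16 colon-separated octets")
    node_wwn = "".join(octets[:8]).lower()
    port_wwn = "".join(octets[8:]).lower()
    return node_wwn, port_wwn


# Initiator from the SP event log entry above.
node, port = split_initiator("20:00:00:00:C9:46:8C:76:10:00:00:00:C9:46:8C:76")
print("Node WWN:", node)
print("Port WWN:", port)
```

Both values match the Node WWN and HBA Port WWN that fcinfo reported for Server-B, so the event refers to that host's HBA.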

Todd_1215
1 Nickel

Re: Unable to format LUN on Solaris 10

The Host connectivity status looks good. The Default Owner and Current Owner of the LUN are the same: SP-A. Here's a thought... the host is plugged into SP-B. Would that matter? Oh, and no, I am not using PowerPath.

Hmm - The working host is plugged into SP-A...could this be the issue because SP-B is not active at the moment?

kelleg
5 Rhenium

Re: Unable to format LUN on Solaris 10

Right-click on the LUN and select Trespass. That will move the LUN to SPB.

glen

Todd_1215
1 Nickel

Re: Unable to format LUN on Solaris 10

That was it....Thanks for your help....
bertog
2 Iron

Re: Unable to format LUN on Solaris 10

Hello,

Your host now has access to the LUN, but only through SPB. Your problem looks like there was an issue with multipathing (especially considering you noted that there is no PowerPath), which means there is a high likelihood that you have a single point of failure. I would recommend you continue to troubleshoot the issue until you identify and resolve the reason why the host could not access the LUN through SPA. Otherwise, a failure of SPB, the cable, the switch port, or the HBA would leave the host with no access to the LUN.

Also, if you have a switch in your environment (a single HBA connected to a switch, which is connected to both SPs) you may consider installing the free version of PowerPath to provide basic failover functionality.

kelleg
5 Rhenium

Re: Unable to format LUN on Solaris 10

Todd,

There are two issues at work - the zoning from the host to the array and the use of failover software.

If you only have a single HBA on the host, you can zone the HBA to both SPA and SPB. That gives you some level of protection in case one of the paths fails, but none if the HBA fails. The host will only see the actual LUN down the path to the SP that is the Current Owner. With one path to SPB and the LUN owned by SPA, you could not access the LUN until you trespassed the LUN to SPB. You should now change the Default Owner to SPB. This also means that on the array the failover mode is set to 1 - see below for more info on the failover.
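
The ownership rule described above can be sketched as a toy model. This is only an illustration of the behaviour in this thread, not of how the array works internally; it ignores failover modes entirely, and the names (usable_paths, trespass) are hypothetical.

```python
def usable_paths(zoned_sps, current_owner):
    """Return the SPs through which the host can reach the real LUN.

    Simplified rule from the post above: the LUN is only usable down
    a path to the SP that is its Current Owner.
    """
    return sorted(sp for sp in set(zoned_sps) if sp == current_owner)


def trespass(current_owner):
    """Move ownership to the peer SP (two-SP array assumed)."""
    return "SPB" if current_owner == "SPA" else "SPA"


# Server B in this thread: one path, to SPB, while the LUN is owned by SPA.
owner = "SPA"
before = usable_paths(["SPB"], owner)   # no usable path
owner = trespass(owner)
after = usable_paths(["SPB"], owner)    # usable through SPB

# A host zoned to both SPs keeps a usable path whichever SP owns the LUN.
both = usable_paths(["SPA", "SPB"], "SPA")

print(before, after, both)
```

The last case shows why zoning the HBA to both SPs helps: ownership can move without the host losing every usable path, provided something on the host or array handles the failover.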

For the above to work, you need failover software on the host - if you do not have failover software, then you need to be aware of the failover mode settings on the array for the host. Without failover software on the host, the failover mode must be set to 0 (zero) and with failover software the mode is set to 1 - depending on the type of failover software.

On PowerLink look for Knowledgebase Article emc99467 for more information on the array settings for different operating systems and failover software.

There is also a little-used setting on the LUN properties called "Auto-Trespass". This is only enabled when the host does not have any failover software, and it will trespass a LUN if one of the SPs dies. Failover mode would probably be set to 0 in this case.

glen