October 3rd, 2018 10:00

Equallogic PS6000 Lost Block after power outage

Hi

I have an EqualLogic PS6000 running FW V5.2.6 with 4400 lost blocks after a power outage.

Elsewhere in this support forum I have seen mention of a script that can be run on the device. Since the devices are out of support, where can I obtain the script?

Kind regards

Christian

3 Apprentice

 • 

1.5K Posts

October 6th, 2018 15:00

Hello, 

 Yes, mark valid is your best bet at this point.  You can create a snapshot first if you wish.  Then you can revert quickly w/o needing 2x the space. 

 You need to rescan or even reboot the hosts afterwards. 
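
 A minimal sketch of the host-side rescan mentioned above, assuming it is run in the ESXi shell of each host, where a Python interpreter is normally available. The three commands are standard ESXi tools, but the choice and order of steps is an assumption and not something stated in this thread; `RESCAN_STEPS` and `rescan` are hypothetical names. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Commands to run on each ESXi host after the volumes are back online.
# The selection and order of steps is an assumption for illustration.
RESCAN_STEPS = [
    ["esxcli", "storage", "core", "adapter", "rescan", "--all"],  # rescan all storage adapters
    ["vmkfstools", "-V"],                                         # refresh VMFS volumes
    ["esxcli", "storage", "filesystem", "list"],                  # list mounted datastores
]


def rescan(dry_run=True):
    """Print each step, or execute it when dry_run is False.

    Returns the commands as a list of strings so they can be logged.
    """
    lines = []
    for argv in RESCAN_STEPS:
        line = " ".join(argv)
        lines.append(line)
        if dry_run:
            print(line)
        else:
            # check=True stops at the first failing command.
            subprocess.run(argv, check=True)
    return lines


if __name__ == "__main__":
    rescan()  # dry run: only prints the commands
```

 If the datastores still do not appear in the last listing, a host reboot is the fallback.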

 Regards, 

Don 

3 Apprentice

 • 

1.5K Posts

October 3rd, 2018 11:00

Hello, 

 There's no script for this.  

You can open a one time support call for this.  They can help you.  

 Are there any volumes showing lost blocks?   

 Regards,

Don 

 

16 Posts

October 4th, 2018 01:00

Hi Don

Thanks for your reply.

Correction: It's Equallogic PS3000

I'm quite 'green' at this, to be honest, so I'm not sure how to check the volumes, but if I show the volumes they all list with status "online" :)

The ESX servers running on the SAN show the volumes but don't see any datastores on them and offer to initialize them. Based on that, I concluded that the lost blocks seem to contain key data for the environment.

Running the su exec 'raidtool' command, RAID 0 shows ~4400 lost blocks.

How much is a normal one-time support call?
The setup is running services for a volunteer-based non-profit organisation for young musicians, so funding is pretty much non-existent..

Have a great day

Christian

3 Apprentice

 • 

1.5K Posts

October 4th, 2018 05:00

Hello Christian, 

 First is this data backed up? 

 In the GUI, if you look at each volume, any volume with lost blocks will indicate that and allow you to mark them valid.  However, the volume typically shows offline - lost blocks as the status in those cases.  

 I don't have the cost of the one time call.  I don't think it's terribly expensive.  They can tell you how much when you call in. 

 Regards,

Don 

 

16 Posts

October 5th, 2018 23:00

Hi Don

Thanks for your reply

No backup :'( Only a replication partner that has been paused for 15+ months..

I have problems getting the GUI running, but that was an issue before the power outage so I'm not reading too much into that.
The SAN is running 5 volumes, VOL01-05. VOL04 and 05 are working fine and in a DesiredStatus: online, while VOL01-03 are like below:

Fulrik> volume show VOL01
_____________________________ Volume Information ______________________________
Name: VOL01 Size: 500GB
VolReserve: 500.01GB VolReserveInUse: 484.86GB
ReplReserveInUse: 0MB iSCSI Alias: VOL01
iSCSI Name: ActualMembers: 1
iqn.2001-05.com.equallogic:0-8a0906- Snap-Warn: 10%
326932403-d2a0000004c560b0-vol01 Snap-Depletion: delete-oldest
Description: Snap-Reserve: 20%
Snap-Reserve-Avail: 100% (100GB) Permission: read-write
DesiredStatus: Status: online
online-lost-cached-blocks Connections: 2
Snapshots: 0 Bind:
Type: replicated ReplicationReserveSpace: 0MB
Replicas: 0 ReplicationPartner:
Pool: default Transmitted-Data: 5.6TB
Received-Data: 8.89TB Pref-Raid-Policy: none
Pref-Raid-Policy-Status: none Thin-Provision: disabled
Thin-Min-Reserve: 0% (0MB) Thin-Growth-Warn: 0% (0MB)
Thin-Growth-Max: 0% (0MB) ReplicationTxData: 0MB
MultiHostAccess: enabled iSNS-Discovery: disabled
Replica-Volume-Reserve: 0MB Thin-Clone: N
Template: N NAS File System: N
Administrator: Thin-Warn-Mode: offline

The ESX servers show different lock modes for the impacted volumes (not sure if that says anything):
[root@vmh4:/var/log] esxcli storage vmfs lockmode list
Volume Name  UUID                                 Type    Locking Mode  ATS Compatible  ATS Upgrade Modes  ATS Incompatibility Reason
-----------  -----------------------------------  ------  ------------  --------------  -----------------  --------------------------
VOL04        581e5fec-dc9b0336-00ff-bc305bd53c8a  VMFS-5  ATS           true            No upgrade needed
VOL02        560b0a67-41eb18d0-4242-b8ac6f96d01e  VMFS-5  ATS+SCSI      true            Online/Offline
VOL05        595c10c5-8663949a-ddbf-b8ac6f96d01e  VMFS-5  ATS           true            No upgrade needed
VOL03        57041ed7-2c963740-0d4d-b8ac6f96d01e  VMFS-5  ATS+SCSI      true            Online/Offline
VOL01        560b08e1-eadb57cc-4be5-b8ac6f96d01e  VMFS-5  ATS+SCSI      true            Online/Offline

VOMA is unable to check VOL01-03 but works fine on VOL04 and 05
[root@vmh4:/var/log] voma -m vmfs -f check -d /vmfs/devices/disks/naa.6090a0384032f968856eb5938c737240
Checking if device is actively used by other hosts
Running VMFS Checker version 1.2 in check mode
Initializing LVM metadata, Basic Checks will be done
Phase 1: Checking VMFS header and resource files
Detected VMFS file system (labeled:'VOL03') with UUID:57041ed7-2c963740-0d4d-b8ac6f96d01e, Version 5:61
ERROR: IO failed: Input/output error
ERROR: Failed to check pb2.sf.
VOMA failed to check device : IO error

Total Errors Found: 0
Kindly Consult VMware Support for further assistance

Have a great weekend when you get to that :)

Christian

1 Message

October 6th, 2018 03:00


Morning 🙂

Could you try replying with this to Don via Dell at
https://www.dell.com/community/EqualLogic/Equallogic-PS6000-Lost-Block-after-power-outage/m-p/6193178#M13055

Hi Don

Thanks for your reply

No backup 😢 Only a replication partner that has been paused for 15+ months..

I have problems getting the GUI running, but that was an issue before the power outage so I'm not reading too much into that.
The SAN is running 5 volumes, VOL01-05. VOL04 and 05 are working fine and in a DesiredStatus: online, while VOL01-03 are like below:

Fulrik> volume show VOL01
___________________________ Volume Information ____________________________
Name: VOL01 Size: 500GB
VolReserve: 500.01GB VolReserveInUse: 484.86GB
ReplReserveInUse: 0MB iSCSI Alias: VOL01
iSCSI Name: ActualMembers: 1
iqn.2001-05.com.equallogic:0-8a0906- Snap-Warn: 10%
326932403-d2a0000004c560b0-vol01 Snap-Depletion: delete-oldest
Description: Snap-Reserve: 20%
Snap-Reserve-Avail: 100% (100GB) Permission: read-write
DesiredStatus: Status: online
online-lost-cached-blocks Connections: 2
Snapshots: 0 Bind:
Type: replicated ReplicationReserveSpace: 0MB
Replicas: 0 ReplicationPartner:
Pool: default Transmitted-Data: 5.6TB
Received-Data: 8.89TB Pref-Raid-Policy: none
Pref-Raid-Policy-Status: none Thin-Provision: disabled
Thin-Min-Reserve: 0% (0MB) Thin-Growth-Warn: 0% (0MB)
Thin-Growth-Max: 0% (0MB) ReplicationTxData: 0MB
MultiHostAccess: enabled iSNS-Discovery: disabled
Replica-Volume-Reserve: 0MB Thin-Clone: N
Template: N NAS File System: N
Administrator: Thin-Warn-Mode: offline

The ESX servers show different Lock modes for the impacted volumes (not sure if that says anything):
[root@vmh4:/var/log] esxcli storage vmfs lockmode list
Volume Name  UUID                                 Type    Locking Mode  ATS Compatible  ATS Upgrade Modes  ATS Incompatibility Reason
-----------  -----------------------------------  ------  ------------  --------------  -----------------  --------------------------
VOL04        581e5fec-dc9b0336-00ff-bc305bd53c8a  VMFS-5  ATS           true            No upgrade needed
VOL02        560b0a67-41eb18d0-4242-b8ac6f96d01e  VMFS-5  ATS+SCSI      true            Online/Offline
VOL05        595c10c5-8663949a-ddbf-b8ac6f96d01e  VMFS-5  ATS           true            No upgrade needed
VOL03        57041ed7-2c963740-0d4d-b8ac6f96d01e  VMFS-5  ATS+SCSI      true            Online/Offline
VOL01        560b08e1-eadb57cc-4be5-b8ac6f96d01e  VMFS-5  ATS+SCSI      true            Online/Offline

VOMA is unable to check VOL01-03 but works fine on VOL04 and 05
[root@vmh4:/var/log] voma -m vmfs -f check -d /vmfs/devices/disks/naa.6090a0384032f968856eb5938c737240
Checking if device is actively used by other hosts
Running VMFS Checker version 1.2 in check mode
Initializing LVM metadata, Basic Checks will be done
Phase 1: Checking VMFS header and resource files
Detected VMFS file system (labeled:'VOL03') with UUID:57041ed7-2c963740-0d4d-b8ac6f96d01e, Version 5:61
ERROR: IO failed: Input/output error
ERROR: Failed to check pb2.sf.
VOMA failed to check device : IO error

Total Errors Found: 0
Kindly Consult VMware Support for further assistance

16 Posts

October 6th, 2018 09:00

Hi Don

My replies to you keep getting marked as spam :(

No backup :'( Only a replication partner that has been paused for 15+ months..

Christian

16 Posts

October 6th, 2018 09:00

Trying the first bit of my reply, hoping it isn't getting the spam tag..

I have problems getting the GUI running, but that was an issue before the power outage so I'm not reading too much into that.
The SAN is running 5 volumes, VOL01-05. VOL04 and 05 are working fine and in a DesiredStatus: online, while VOL01-03 are like below:

Fulrik> volume show VOL01
_____________________________ Volume Information ______________________________
Name: VOL01 Size: 500GB
VolReserve: 500.01GB VolReserveInUse: 484.86GB
ReplReserveInUse: 0MB iSCSI Alias: VOL01
iSCSI Name: ActualMembers: 1
iqn.2001-05.com.equallogic:0-8a0906- Snap-Warn: 10%
326932403-d2a0000004c560b0-vol01 Snap-Depletion: delete-oldest
Description: Snap-Reserve: 20%
Snap-Reserve-Avail: 100% (100GB) Permission: read-write
DesiredStatus: Status: online
online-lost-cached-blocks Connections: 2
Snapshots: 0 Bind:
Type: replicated ReplicationReserveSpace: 0MB
Replicas: 0 ReplicationPartner:
Pool: default Transmitted-Data: 5.6TB
Received-Data: 8.89TB Pref-Raid-Policy: none
Pref-Raid-Policy-Status: none Thin-Provision: disabled
Thin-Min-Reserve: 0% (0MB) Thin-Growth-Warn: 0% (0MB)
Thin-Growth-Max: 0% (0MB) ReplicationTxData: 0MB
MultiHostAccess: enabled iSNS-Discovery: disabled
Replica-Volume-Reserve: 0MB Thin-Clone: N
Template: N NAS File System: N
Administrator: Thin-Warn-Mode: offline

16 Posts

October 6th, 2018 09:00

Trying my reply in bits, hoping it isn't getting spam tagged

I have problems getting the GUI running, but that was an issue before the power outage so I'm not reading too much into that.
The SAN is running 5 volumes, VOL01-05. VOL04 and 05 are working fine and in a DesiredStatus: online, while VOL01-03 are like below:

Volume VOL01 Info.JPG

16 Posts

October 6th, 2018 09:00

Trying my reply in bits, hoping it isn't getting spam tagged; here goes the next part :)

The ESX servers show different Lock modes for the impacted volumes (not sure if that says anything):
ESX Locks.JPG

VOMA is UNABLE to check VOL01-03 but works fine on VOL04 and 05

[root@vmh4:/var/log] voma -m vmfs -f check -d /vmfs/devices/disks/naa.6090a0384032f968856eb5938c737240
Checking if device is actively used by other hosts
Running VMFS Checker version 1.2 in check mode
Initializing LVM metadata, Basic Checks will be done
Phase 1: Checking VMFS header and resource files
Detected VMFS file system (labeled:'VOL03') with UUID:57041ed7-2c963740-0d4d-b8ac6f96d01e, Version 5:61
ERROR: IO failed: Input/output error
ERROR: Failed to check pb2.sf.
VOMA failed to check device : IO error

Total Errors Found: 0
Kindly Consult VMware Support for further assistance

Thanks for any guidance you can give and have a great weekend when you get to that :)

Christian

3 Apprentice

 • 

1.5K Posts

October 6th, 2018 13:00

Hello Christian, 

 On that one volume that shows lost blocks you can run: 

GrpName> vol sel VOLUMENAME lost-blocks mark-valid

 That may allow the ESX servers to see the Datastore again. 

 When you are in a lost cache condition you should NOT run utilities like VOMA until after the condition is fully corrected. 
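
 As a rough sketch, the command above could be generated for each affected volume from a status listing. The status values below are assumptions for illustration: only the VOL01 output was posted, VOL02 and VOL03 were described as looking the same, and VOL04 and VOL05 as working. `mark_valid_commands` is a hypothetical helper that only builds the command strings; nothing is sent to the array.

```python
def mark_valid_commands(statuses):
    """Return one mark-valid command per volume whose status reports lost cached blocks.

    statuses maps volume name to the status text reported by the array.
    """
    return [
        f"vol sel {name} lost-blocks mark-valid"
        for name, status in sorted(statuses.items())
        if "lost-cached-blocks" in status
    ]


# Assumed statuses, based on the description in this thread.
statuses = {
    "VOL01": "online-lost-cached-blocks",
    "VOL02": "online-lost-cached-blocks",
    "VOL03": "online-lost-cached-blocks",
    "VOL04": "online",
    "VOL05": "online",
}

for command in mark_valid_commands(statuses):
    print(command)
```

 Each printed line is meant to be reviewed and then typed at the group prompt by hand, one volume at a time.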

 Is the replication up to date?   (Just in case) 

 Regards, 

Don 

 

16 Posts

October 6th, 2018 14:00

Hi Don

Thanks a lot for your reply and sorry for the message mess.

I was also thinking that marking the blocks valid might be my best bet. Sounds like you are thinking likewise?

The replication partner has been paused and powered down for something like 15 months, so it is way out of date, but it could be a chance to recover an old version of the environment. I assume that option will be lost if I power it up and un-pause the replication?

So if mark valid is the best shot I've got(?)
Should I try to clone the volume first or something like that?
And/or should I try to bring the other SAN box in sync, or is it better to save the old version as some kind of safety net?

Your help is much appreciated :)

Have an awesome day

Christian 

16 Posts

October 8th, 2018 00:00

Hi Don

Thanks for all your help!

12 out of 14 hosts are now running!

Have a great day!

Christian

3 Apprentice

 • 

1.5K Posts

October 8th, 2018 02:00

Hello Christian, 

 You are very welcome.  I am glad that I was able to assist you. 

 Regards,

Don 

 
