159 Posts

November 29th, 2010 15:00

To further clarify, all drives have soft media errors. Before drives leave EMC facilities (or whoever manufactures them for EMC), they are usually run through a series of tests that check for and re-map soft media errors. This is normal. Where it stops being normal is when you receive 25-100+ of these on a single drive within a short period of time. In that case, I would usually suggest that you work with EMC to copy the disk out to a hot spare and replace it. A preemptive copy to hot spare is actually better than a fail to hot spare and typically takes less time because it does not need to do a full rebuild. Hope that helps.

For future reference, not all soft media errors reported by Flare are simply bad sector related (the ones you are seeing are).  The codes to watch for are 801, 901, 803, and 20 (sometimes).  These typically indicate that there is a larger issue and/or that the predictive failure algorithms have found issues.

For now, what you are seeing is probably nothing; just watch the error counts per drive for 6A0/820 errors.
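One way to watch the per-drive counts is to tally them from exported event log lines. The following is only a rough sketch: it assumes the log lines look like the Navisphere entries quoted elsewhere in this thread (date, time, hex event code, description, serial, "Enclosure N Disk M"), and the function name `count_soft_media_errors` is made up for illustration. Other log exports may be laid out differently.

```python
import re
from collections import Counter

# Event codes this thread treats as soft media errors (0x820 and 0x6a0).
SOFT_MEDIA_CODES = {0x820, 0x6A0}

# Assumed line layout, based on the entries quoted in this thread:
# "YYYY-MM-DD HH:MM:SS 0xNNN <description> <serial> Enclosure N Disk M ..."
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<code>0x[0-9a-fA-F]+) .*?"
    r"Enclosure (?P<enclosure>\d+) Disk (?P<disk>\d+)"
)


def count_soft_media_errors(lines):
    """Count 0x820/0x6a0 events per (enclosure, disk) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        if int(match.group("code"), 16) in SOFT_MEDIA_CODES:
            key = (int(match.group("enclosure")), int(match.group("disk")))
            counts[key] += 1
    return counts


sample = [
    "2010-11-22 03:31:58 0x820 Soft Media Error CKM00083700085 Enclosure 2 Disk 12 SPB CX4_960_SPB",
    "2010-11-22 03:32:06 0x689 Sector Reconstructed CKM00083700085 Enclosure 2 Disk 12 SPB CX4_960_SPB",
    "2010-11-22 05:44:33 0x6a0 Disk soft media error CKM00083700085 Enclosure 2 Disk 12 SPB CX4_960_SPB",
]

for (enclosure, disk), total in count_soft_media_errors(sample).items():
    print(f"Enclosure {enclosure} Disk {disk}: {total} soft media errors")
```

The 0x689 "Sector Reconstructed" entry is deliberately not counted here, since only the 6A0/820 codes are the ones to watch.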

159 Posts

November 28th, 2010 20:00

6A0/820 is a standard soft media error warning. Can you look at the extended error code (probably in the fix text) and tell me what the hex value is? Typically this is a simple sector or partial-sector problem and indicates that the data was read and re-mapped elsewhere on the drive. Unless you have a bunch of these, it is typically not that big of a deal. The extended error code can, however, tell you where the root cause occurred for various types of soft media errors.

1 Rookie

 • 

41 Posts

November 28th, 2010 23:00

Here is the error code from Navisphere Manager.

2010-11-22 03:31:58 0x820 Soft Media Error CKM00083700085 Enclosure 2 Disk 12 SPB CX4_960_SPB
2010-11-22 03:32:06 0x689 Sector Reconstructed CKM00083700085 Enclosure 2 Disk 12 SPB CX4_960_SPB
2010-11-22 05:44:33 0x6a0 Disk soft media error CKM00083700085 Enclosure 2 Disk 12 SPB CX4_960_SPB

The support engineer says there are three specific possible causes of this error:

1) Environmental issue -- no issues found

2) Highly intensive use of the LUN -- I have no idea how to calculate LUN performance. (:

3) Disk drive firmware issue (needs an upgrade)

Is there any way to find the exact cause of this error? Please help.

4 Operator

 • 

5.7K Posts

November 29th, 2010 03:00

Soft errors can be caused by a bad disk or a bad section on a disk. Having soft errors means they can be repaired. If you're having lots of these, you should open a service request to get EMC to look at it. Although soft errors will not prevent data from being written, a lot of them is suspicious. You could have the disk replaced for this.

I'd open a SR if in doubt.

4 Operator

 • 

4.5K Posts

November 29th, 2010 13:00

As RRR says, soft media errors are problems with the disk surface (a bad spot), and the data is automatically relocated to another location on the disk. With high-capacity SATA disks this is normal and expected because of imperfections in the surface, which is why there are a lot of spare locations on the disk to handle the expected media failures.

EMC has a threshold that they look for: x number of soft media errors in x number of days, and you replace the disk.
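A "so many errors in so many days" rule can be sketched as a sliding-window check over the error timestamps for one disk. This is only an illustration: the actual numbers EMC uses are not stated in this thread, so `min_errors` and `window` are placeholders, and `exceeds_threshold` is a made-up name.

```python
from datetime import datetime, timedelta


def exceeds_threshold(timestamps, min_errors, window):
    """Return True if at least min_errors events fall within any span of length window."""
    ordered = sorted(timestamps)
    start = 0
    for end, current in enumerate(ordered):
        # Drop events from the front that are older than the window.
        while current - ordered[start] > window:
            start += 1
        if end - start + 1 >= min_errors:
            return True
    return False


# Hypothetical error times for a single disk: three close together, one much later.
events = [
    datetime(2010, 11, 22, 3, 31, 58),
    datetime(2010, 11, 22, 4, 10, 0),
    datetime(2010, 11, 22, 5, 44, 33),
    datetime(2010, 12, 2, 9, 0, 0),
]

# Placeholder values only; substitute whatever threshold EMC support gives you.
print(exceeds_threshold(events, min_errors=3, window=timedelta(days=1)))
print(exceeds_threshold(events, min_errors=4, window=timedelta(days=1)))
```

A window check like this separates a burst of errors on one drive from the same number of errors spread over months, which is the distinction described above.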

glen

3 Posts

August 8th, 2013 17:00

How do I clear these errors in the counter under the errors tab for the disk? I want to set it back to 0 now that the disk has been replaced.

89 Posts

April 13th, 2015 10:00

Hello gprosser,

Were you able to clear these errors and reset the counter?

I replaced multiple disks and the error counter stayed the same as well.
