December 5th, 2012 02:00

Avamar Exchange 2010 VSS backup and NTFS corruption

Hi All,

We are in the process of migrating from tape backup (Backup Exec) to Avamar. We have run into serious problems when trying to back up an Exchange 2010 mailbox server. Soon after the backup starts, we see multiple warning and error events (see attachments) stating that the system is unable to complete shadow copies (event 14, volsnap).

The backup fails and the system reports corrupted NTFS volumes. The Avamar console reports only this error code: 10013 Dropped session - no progress reported; Command failed: externally cancelled.

We have previously backed up single databases from this server successfully.

This is Exchange 2010 SP2 RU4 on a Windows 2008 R2 SP1 server running on ESXi 5.0. The database and log volumes are on an EMC VNX5300. Avamar version: 6.1

We are not aware of any other problems. Backups with Backup Exec run without problems.

Please help.

4 Attachments

15 Posts

January 8th, 2013 22:00

Just to share the information: support suggested changing the Windows Exchange plug-in options. The values changed from their defaults are:

1. Group by: Database

2. Enable consistency check throttling: Checked

2a. Both values (#IOs between pauses and duration of pauses) remain at the default: 1000ms

exchangeVSSconfig.jpg
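
As a rough illustration of what consistency check throttling does to the read load, the sketch below estimates the effective I/O rate when the check pauses after a fixed number of I/Os. The post gives a single default of 1000 for both settings; the sketch reads that as 1000 I/Os between pauses and a 1000 ms pause, which is an assumption. The unthrottled rate of 5000 IOPS is an invented figure, not a measurement from this system, and the function name is hypothetical.

```python
def throttled_iops(raw_iops, ios_between_pauses, pause_ms):
    """Effective I/O rate when the reader sleeps pause_ms after every batch of I/Os."""
    busy_seconds = ios_between_pauses / raw_iops   # time needed to issue one batch
    pause_seconds = pause_ms / 1000.0              # idle time after each batch
    return ios_between_pauses / (busy_seconds + pause_seconds)


# Assumed unthrottled rate of 5000 IOPS (illustrative only).
effective = throttled_iops(raw_iops=5000, ios_between_pauses=1000, pause_ms=1000)
print(f"effective rate: {effective:.0f} IOPS")
```

Under these assumptions the check can never exceed 1000 I/Os per second, however fast the storage is, which would leave more headroom for Exchange itself during the backup.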

2K Posts

December 5th, 2012 06:00

It sounds like there may be some kind of file system problem on this client. Have you run chkdsk on the file systems? What were the results?

I have also seen these types of messages if there is a corrupted shadow copy on the file system. Could you provide the output of the following commands?

vssadmin list writers

vssadmin list providers

vssadmin list shadows
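
When the writer list is long, it can help to filter it for anything that is not healthy. The following is a minimal sketch that parses text in the layout printed by `vssadmin list writers` and reports writers that are not in the Stable state or that show a last error. The function names are hypothetical, and the failing entry in the sample is made up for illustration; it does not come from this client.

```python
import re


def parse_writers(output):
    """Parse `vssadmin list writers` text into one dict per writer."""
    writers = []
    current = None
    for line in output.splitlines():
        line = line.strip()
        match = re.match(r"Writer name: '(.+)'$", line)
        if match:
            current = {"name": match.group(1)}
            writers.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return writers


def unhealthy(writers):
    """Names of writers that are not Stable or that report a last error."""
    return [
        w["name"] for w in writers
        if "Stable" not in w.get("State", "") or w.get("Last error") != "No error"
    ]


# Sample text: the second entry is a hypothetical failure, for illustration only.
sample = """\
Writer name: 'System Writer'
   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
   State: [1] Stable
   Last error: No error

Writer name: 'Microsoft Exchange Writer'
   Writer Id: {76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}
   State: [7] Failed
   Last error: Timed out
"""
print(unhealthy(parse_writers(sample)))
```

To use it on a real client, the output of the command can be saved to a file and passed to `parse_writers` as a string.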

15 Posts

December 5th, 2012 09:00

I ran chkdsk on all volumes and one needs to be fixed. The problem with this volume is: "Volume Bitmap is incorrect".

Tonight I will fix it, take a full backup with Backup Exec and try again with Avamar. But I don't think it is the cause; it may be a leftover from one of the previous system crashes during an Avamar backup.

C:\>vssadmin list writers

vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool

(C) Copyright 2001-2005 Microsoft Corp.

Writer name: 'Task Scheduler Writer'

   Writer Id: {d61d61c8-d73a-4eee-8cdd-f6f9786b7124}

   Writer Instance Id: {1bddd48e-5052-49db-9b07-b96f96727e6b}

   State: [1] Stable

   Last error: No error

Writer name: 'VSS Metadata Store Writer'

   Writer Id: {75dfb225-e2e4-4d39-9ac9-ffaff65ddf06}

   Writer Instance Id: {088e7a7d-09a8-4cc6-a609-ad90e75ddc93}

   State: [1] Stable

   Last error: No error

Writer name: 'Performance Counters Writer'

   Writer Id: {0bada1de-01a9-4625-8278-69e735f39dd2}

   Writer Instance Id: {f0086dda-9efc-47c5-8eb6-a944c3d09381}

   State: [1] Stable

   Last error: No error

Writer name: 'System Writer'

   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}

   Writer Instance Id: {b8ab5df5-8c10-4ff4-a1bb-3d3b497fdb8a}

   State: [1] Stable

   Last error: No error

Writer name: 'ASR Writer'

   Writer Id: {be000cbe-11fe-4426-9c58-531aa6355fc4}

   Writer Instance Id: {e52f387a-9381-42fd-a7b3-4bdf13e6e865}

   State: [1] Stable

   Last error: No error

Writer name: 'WMI Writer'

   Writer Id: {a6ad56c2-b509-4e6c-bb19-49d8f43532f0}

   Writer Instance Id: {1fe23f57-92cb-4253-b39c-fa54564bb5c9}

   State: [1] Stable

   Last error: No error

Writer name: 'Shadow Copy Optimization Writer'

   Writer Id: {4dc3bdd4-ab48-4d07-adb0-3bee2926fd7f}

   Writer Instance Id: {eae6ea73-04b6-46dc-b9de-934670873f61}

   State: [1] Stable

   Last error: No error

Writer name: 'IIS Config Writer'

   Writer Id: {2a40fd15-dfca-4aa8-a654-1f8c654603f6}

   Writer Instance Id: {5bb0d3c0-136e-4b50-88ef-086e974765de}

   State: [1] Stable

   Last error: No error

Writer name: 'Microsoft Exchange Writer'

   Writer Id: {76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}

   Writer Instance Id: {8f5701c0-aa41-45a7-b571-fe4e757bb0a0}

   State: [1] Stable

   Last error: No error

Writer name: 'Registry Writer'

   Writer Id: {afbab4a2-367d-4d15-a586-71dbb18f8485}

   Writer Instance Id: {5c7d32aa-693c-4f02-a15f-5690779dcf58}

   State: [1] Stable

   Last error: No error

Writer name: 'COM+ REGDB Writer'

   Writer Id: {542da469-d3e1-473c-9f4f-7847f01fc64f}

   Writer Instance Id: {a12f1bc5-7965-4acd-8961-6b014e1fa724}

   State: [1] Stable

   Last error: No error

Writer name: 'IIS Metabase Writer'

   Writer Id: {59b1f0cf-90ef-465f-9609-6ca8b2938366}

   Writer Instance Id: {ebd6ce0e-5728-4562-ba51-bde236b0fbe9}

   State: [1] Stable

   Last error: No error

C:\>

C:\>vssadmin list providers

vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool

(C) Copyright 2001-2005 Microsoft Corp.

Provider name: 'Microsoft Software Shadow Copy provider 1.0'

   Provider type: System

   Provider Id: {b5946137-7b9f-4925-af80-51abd60b20d5}

   Version: 1.0.0.7

C:\>

C:\>vssadmin list shadows

vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool

(C) Copyright 2001-2005 Microsoft Corp.

No items found that satisfy the query.

C:\>

2K Posts

December 5th, 2012 13:00

What happens if you create a VSS snapshot manually? Does this snapshot also fail?

Avamar does not touch the filesystem directly. All requests for file data, snapshots, etc. are sent through the Windows API. It looks like there is a serious data integrity issue happening on this client that needs to be investigated.

15 Posts

December 5th, 2012 14:00

I repaired the volume that chkdsk reported as incorrect.

I have checked that I can create a VSS snapshot manually for each volume (vssadmin create shadow /for= ) and then deleted the snapshots. Everything is OK, with no errors. I ran chkdsk on every volume and found no problems.

I understand that Avamar just uses the API to access the data, but this client shows no other problems. It is a small production Exchange server with 260 mailboxes / 360GB, and it has been backed up daily without any problems for months.

Now I will try to back it up with Avamar.

2K Posts

December 6th, 2012 07:00

This is a very thorough analysis and seems to point to back-end storage performance.

The speed of Avamar backups tends to be limited only by the performance of the client filesystem(s), where more traditional backup software is often bottlenecked by the network. In general, this makes Avamar backups more I/O intensive than other backups. If a number of databases are on the same LUN, the backup may be placing the underlying spindles into contention which would certainly cause a performance drop like the one you are seeing.

I would recommend opening a service request for this issue. The Avamar support team will likely have to collaborate with the VNX support team to review the back-end performance.

15 Posts

December 6th, 2012 07:00

I began testing with backing up just one small database and it finished successfully.

The only error that is logged during the backup is VSS Event 8194:

Volume Shadow Copy Service error: Unexpected error querying for the IVssWriterCallback interface.  hr = 0x80070005, Access is denied.

. This is often caused by incorrect security settings in either the writer or requestor process.

Operation:

   Gathering Writer Data

Context:

   Writer Class Id: {e8132975-6f93-4464-a53e-1050253ae220}

   Writer Name: System Writer

   Writer Instance ID: {dd9acb93-989d-479d-8e9e-8acffe04ce3b}


But I didn't notice it affecting the backup in any way. I can browse the backup, so it should be OK.

Then I added more databases one by one and tested each time. I ended up with 6 databases (that is 12 volumes, one for the database and one for the logs, all being backed up simultaneously).

The problem started when the dataset contained 8 databases (16 out of 23 volumes on this client). Soon after the backup started, the system was unresponsive for several seconds. These events showed up in the event log:

Error ESE 481:

Information Store (2332) WRODB01: An attempt to read from the file "D:\exchsrv\WRODB\WRODB01\WRODB01.edb" at offset 17215848448 (0x0000000402250000) for 262144 (0x00040000) bytes failed after 52 seconds with system error 1117 (0x0000045d): "The request could not be performed because of an I/O device error. ". The read operation will fail with error -1022 (0xfffffc02).  If this error persists then the file may be damaged and may need to be restored from a previous backup.


Error ExchangeStoreDB 203:

At '12/6/2012 1:09:26 AM' database copy 'WRODB01' on this server appears to have an I/O error that it may be able to repair.  To help identify the failure, consult the Event log on the server for other storage and "ExchangeStoreDb" events. Service recovery was attempted by failover to another copy. The failover was unsuccessful in restoring the service because of the following error: 'There is only one copy of this mailbox database (WRODB01). Automatic recovery is not available.


Error ExchangeStoreDB 233 :

At '12/6/2012 1:09:26 AM' database copy 'WRODB01' on this server appears to have an I/O error that it may be able to repair.  To help identify the failure, consult the Event log on the server for other storage and "ExchangeStoreDb" events. Service recovery was attempted by failover to another copy. The failover was unsuccessful in restoring the service because of the following error: 'There is only one copy of this mailbox database (WRODB01). Automatic recovery is not available..

Warning  ESE 507:

Information Store (2332) WRODB01: A request to read from the file "D:\exchsrv\WRODB\WRODB01\WRODB01.edb" at offset 17239965696 (0x0000000403950000) for 262144 (0x00040000) bytes succeeded, but took an abnormally long time (47 seconds) to be serviced by the OS. This problem is likely due to faulty hardware. Please contact your hardware vendor for further assistance diagnosing the problem.


Error ExchangeStoreDB 218:

At '12/6/2012 1:10:30 AM' the copy of database 'WRODB01' on this server experienced a performance problem. Failover returned the following error: "There is only one copy of this mailbox database (WRODB01). Automatic recovery is not available.". For more detail about the failure, consult the Event log on the server for other storage and "ExchangeStoreDb" events.


Warning ESE 531:
Information Store (2332) The database engine attempted a clean write operation on page 526121 of database D:\exchsrv\WRODB\WRODB01\WRODB01.edb. This acton was performed in an attempt to correct a previous problem reading from the page.


Error  volsnap 14:

The shadow copies of volume D:\exchsrv\WROLG\WROLG09 were aborted because of an IO failure on volume D:\exchsrv\WROPF0.


Error Storage Group Consistency Check 310:

Instance 8: An attempt to read the header of log file '\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy2\e0a.log' failed  with error code -1023 (0xfffffc01). Transaction log file validation failed with this error code.


Error Storage Group Consistency Check 401:

Instance 8: The physical consistency check has completed, but one or more errors were detected. The consistency check has terminated with error code of -106 (0xffffff96).

And then at the end of the backup:


Error MSExchangeIS 9782:

Exchange VSS Writer (instance 8feaf36e-859f-4cc0-8642-81ff91eb26d0:1) has completed the backup of database 'WRODB09' with errors. The backup did not complete successfully, and no log files were truncated for this database.

The same error was repeated for the other databases: WRODB04, WRODB02, WRODB06, WRODB03, WRODB07, WRODB05, WRODB08. These are all the databases that were selected in the dataset for backup.


Error ESE 2007:

Information Store (2332) Shadow copy instance 1 aborted.
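
The events above quote each error code twice, once as a signed decimal and once as 32-bit hexadecimal. A small sketch of the conversion between the two forms follows, which can help when searching for a code that is documented in only one notation. The helper names are hypothetical.

```python
def to_hex32(value):
    """Format a (possibly negative) 32-bit integer the way the event log prints it."""
    return f"0x{value & 0xFFFFFFFF:08x}"


def to_signed32(value):
    """Interpret an unsigned 32-bit value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# Codes taken from the ESE and consistency check events quoted above.
for code in (-1022, -1023, -106):
    print(code, to_hex32(code))

# The VSS event 8194 HRESULT: its low 16 bits hold Win32 error 5 (access denied).
hresult = 0x80070005
print(to_signed32(hresult), hresult & 0xFFFF)
```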

I wanted to attach the log from Avamar Administrator and the event log from the client for the period when the backup was running, but I can't attach them to the post. I can send them by email.

The backup started at 1:07 and ended at 1:35 with error code 10007.

After the backup finished I checked every file system and they were all OK, but I suppose that adding more databases to the dataset would put more stress on the system, causing higher latency, some timeouts, file system problems and even BSODs.

During the backup I can see warning events on the VMware host that runs the client machine:


Device naa.600601608af02e007e9505bcb03ce111 performance has deteriorated. I/O latency increased from average value of 1426 microseconds to 30073 microseconds.

Warning 12/6/2012 1:12:45 AM
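
To put the warning into perspective, the short sketch below pulls the two latency figures out of the message text and expresses the change in milliseconds and as a ratio. The regular expression is written against the wording of this one message and is an assumption about the general format.

```python
import re

message = (
    "Device naa.600601608af02e007e9505bcb03ce111 performance has deteriorated. "
    "I/O latency increased from average value of 1426 microseconds "
    "to 30073 microseconds."
)

# Extract the baseline and current latency, both reported in microseconds.
match = re.search(
    r"from average value of (\d+) microseconds to (\d+) microseconds", message
)
baseline_us, current_us = (int(group) for group in match.groups())

ratio = current_us / baseline_us
print(
    f"baseline {baseline_us / 1000:.2f} ms, "
    f"current {current_us / 1000:.2f} ms, {ratio:.1f}x slower"
)
```

That is roughly a 21-fold increase, from about 1.4 ms to about 30 ms per I/O as seen by the host.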


We have seen this event before, when high stress is put on storage, e.g. during backup windows, but it never caused any problems with client systems.

There are no errors or warnings on the storage processors, so I assume that is only informational.

To be sure that there is no problem with the databases that I added to the dataset before the failed run, I backed them up on their own. That backup completed without problems, so the problem appears to be connected with performance.

Why does the problem not occur when we use Backup Exec? Maybe BE does not try to start VSS snapshots for all the volumes at the same time?

What can we do to resolve the problem? I can imagine that we could split the backup of this Exchange client into several datasets, but that wouldn't be an elegant solution.
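
The splitting idea mentioned above can be sketched as follows: databases are grouped into batches no larger than the biggest set that backed up cleanly (6 in the tests described here). This only illustrates the grouping; it does not create Avamar datasets. The database list is limited to the names that appear in the events, so the real client may have more, and the function name is hypothetical.

```python
def split_into_datasets(databases, max_per_dataset):
    """Group database names into consecutive batches of at most max_per_dataset."""
    return [
        databases[i:i + max_per_dataset]
        for i in range(0, len(databases), max_per_dataset)
    ]


# WRODB01..WRODB09 are the names seen in the event log; the full list is an assumption.
databases = [f"WRODB{n:02d}" for n in range(1, 10)]

for number, batch in enumerate(split_into_datasets(databases, 6), start=1):
    print(f"dataset {number}: {', '.join(batch)}")
```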

15 Posts

December 6th, 2012 08:00

Thank you very much for your help. I will open an SR.

355 Posts

March 5th, 2014 02:00

Hello,

I can see a KB article that looks similar to your issue and contains a solution for it.

Article Number: 000126263


It contains a hotfix for the issue. I would suggest getting in touch with EMC support for further assistance.


Regards,

Pawan

5 Practitioner • 274.2K Posts

March 5th, 2014 02:00

This ID is not accessible on support.emc.com.

May I ask you to paste the content or a link to it?

with regards

salvi

5 Practitioner • 274.2K Posts

March 5th, 2014 02:00

I am facing similar issues. Can someone let me know what the resolution is?

Issue:

Avamar GUI log

============

2014-03-05 09:54:10 avexvss Error <0000>: The plugin on remote client HQ-EXMBX-02 (172.20.4.183) terminated with error code 10007: Miscellaneous error (Log #2)
2014-03-05 09:54:10 avexvss Error <12967>: Remote client HQ-EXMBX-02 failed to start its subworkorder. (Log #2)
2014-03-05 09:54:10 avexvss Error <0000>: The remote client HQ-EXMBX-02 failed to create VSS snapshot in predefined time. (Log #2)
2014-03-05 09:54:10 avexvss Error <12977>: The remote client [HQ-EXMBX-02] has failed to backup, so its components will not be displayed for restore. (Log #2)

avagent.log

=========

2014-03-04 20:33:38 avagent Info <5964>: Requesting work from 172.20.7.41

2014-03-04 20:33:38 avagent Info <5264>: Workorder received: sleep

2014-03-04 20:33:38 avagent Info <5996>: Sleeping 240 seconds

2014-03-04 20:33:55 avagent Warning <14836>: Unable to find manager for (3001-progress-) while processing stop

2014-03-04 20:33:55 avagent Info <10684>: Setting ctl message version to 3 (from 1)

2014-03-04 20:33:55 avagent Info <16136>: Setting ctl max message size to 268435456

2014-03-04 20:34:24 avagent Error <11123>: Required CTL manager:[3018-Exch_DB_Level_DAG-1393948703478] does not exist!

2014-03-04 20:34:54 avagent Error <11123>: Required CTL manager:[3018-Exch_DB_Level_DAG-1393948703478] does not exist!

2014-03-04 20:34:56 avagent Warning <14836>: Unable to find manager for (3001-progress-8 PM Daily-Exch_DB_Level_DAG-1393952400449) while processing stop

2014-03-04 20:34:56 avagent Warning <15188>: failed to process message type 35 stop_progress_avtar

2014-03-04 20:34:5
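
The Avamar log lines above share one layout: date, time, component, severity, a numeric code in angle brackets, then the message. A minimal sketch for tallying such lines by component and severity follows. The pattern is inferred from the lines quoted in this post rather than from any documented format, and truncated lines are simply skipped.

```python
import re
from collections import Counter

# Pattern inferred from the quoted lines; not an official log format definition.
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<component>\S+) (?P<level>\w+) <(?P<code>\d+)>: (?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


log = """\
2014-03-05 09:54:10 avexvss Error <0000>: The plugin on remote client HQ-EXMBX-02 (172.20.4.183) terminated with error code 10007: Miscellaneous error (Log #2)
2014-03-05 09:54:10 avexvss Error <12967>: Remote client HQ-EXMBX-02 failed to start its subworkorder. (Log #2)
2014-03-04 20:33:38 avagent Info <5996>: Sleeping 240 seconds
2014-03-04 20:33:55 avagent Warning <14836>: Unable to find manager for (3001-progress-) while processing stop
2014-03-04 20:34:5
"""

entries = [entry for entry in map(parse_line, log.splitlines()) if entry]
by_level = Counter((entry["component"], entry["level"]) for entry in entries)
for (component, level), count in sorted(by_level.items()):
    print(component, level, count)
```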

Windows event viewer of Parent of DAG

==============================

Log Name:      Application

Source:        VSS

Date:          3/5/2014 12:43:54 PM

Event ID:      8194

Task Category: None

Level:         Error

Keywords:      Classic

User:          N/A

Computer:      HQ-EXMBX-02.bh.zain.com

Description:

Volume Shadow Copy Service error: Unexpected error querying for the IVssWriterCallback interface.  hr = 0x80070005, Access is denied.

. This is often caused by incorrect security settings in either the writer or requestor process.

another log

===========

Volume Shadow Copy Service error: Unexpected error querying for the IVssWriterCallback interface.  hr = 0x80070005, Access is denied.

. This is often caused by incorrect security settings in either the writer or requestor process.

Another log

==========

The log copier was unable to communicate with server 'HQ-EXMBX-01.bh.zain.com'. The copy of database 'MB04 Senior Staff\HQ-EXMBX-02' is in a disconnected state. The communication error was: Communication was terminated by server 'HQ-EXMBX-01.bh.zain.com': Data could not be read because the communication channel was closed. The copier will automatically retry after a short delay.

another log

=========

Volume Shadow Copy Service error: Unexpected error querying for the IVssWriterCallback interface.  hr = 0x80070005, Access is denied.

. This is often caused by incorrect security settings in either the writer or requestor process.

Please respond if you have any idea how to resolve this.

115 Posts

October 22nd, 2014 00:00

We have a 7.6 TB Exchange DAG cluster, and it takes 12-14 hours to complete a backup.

We tried splitting the policy into 4+4+2, which made things worse: the backup window increased to 34 hours.

We disabled the integrity check completely, and the grouping option is set to group by volume. Does this have some impact?

2 Posts

November 6th, 2014 23:00

Did you ever resolve your backup window issue?
