40 Posts

May 4th, 2009 13:00

SNAPView Clone - NTFS corruption

I have a recent issue with NTFS corruption reported on the mount host of a clone. The source LUN is mounted to a Windows 2003 SP2 (64-bit) server and the clone is mounted to another Windows 2003 SP2 (64-bit) host. These servers are identical: one is for production, the other is for reporting and backups. Recently the source LUN almost ran out of space and about 50GB of data was deleted. Ever since then, the clone shows NTFS corruption when it is remounted to the mount host, and the freed space is not reported on the clone.

The script that we use hasn't been changed in over a year and this issue is recent. I've upgraded the HBA firmware/drivers, applied an MS hotfix and upgraded PowerPath to 5.2 SP1. I've supplied EMC Grabs, SP Collects and the script to EMC support. We are currently at a loss. At this point, the only thing I can think of is that the issue is script related.

The steps I am running are:
-----------------------------------------------
1. net stop "Pervasive.SQL (relational)" on CLONE MOUNT HOST
2. net stop "Pervasive.SQL (transactional)" on CLONE MOUNT HOST
3. sleep 30
4. admsnap flush -o e: on CLONE MOUNT HOST
5. sleep 30
6. admsnap clone_deactivate -o e: on CLONE MOUNT HOST
7. naviseccli -address -user -password -scope global snapview -syncclone -name Clone_Name -cloneid 0100000000000000 -o
8. Loop that runs to check the status of the sync:
a. :scan3
cd\clone
sleep 60
cd "\Program Files (x86)\EMC\Navisphere CLI"
naviseccli -address -user -password -scope global snapview -listclone -name Clone_Name >c:\clone\Clone_Name.txt
cd\clone
:xdrive1
findstr /L /c:"CloneCondition: Normal" c:\clone\Clone_Name.txt
if "%errorlevel%" == "0" goto yes
if "%errorlevel%" == "1" goto scan3
9. NET SESSION /DELETE /Y
10. net stop "Distributed File System"
11. net stop "Computer Browser"
12. net stop "Server" /y
13. admsnap flush -o e: on SOURCE MOUNT HOST
14. sleep 30
15. cd "\Program Files (x86)\EMC\Navisphere CLI"
16. naviseccli -address -user -password -scope global snapview -fractureclone -name "Clone_Name" -cloneid 0100000000000000 -o
17. cd\clone
18. Sleep 5
19. admsnap clone_activate on CLONE MOUNT HOST
20. net start "Server" on CLONE MOUNT HOST
21. net start "Computer Browser" on CLONE MOUNT HOST
22. net start "Distributed File System" on CLONE MOUNT HOST
23. net start "Pervasive.SQL (relational)" on CLONE MOUNT HOST
24. net start "Pervasive.SQL (transactional)" on CLONE MOUNT HOST

---------------------------------------------------------------------
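For anyone reusing the status loop in step 8: as posted it keeps polling for as long as the clone never reports Normal. Below is a minimal sketch of the same loop with a retry cap added. MAX_TRIES, TRIES and the :timeout label are assumptions added for illustration and are not part of the script above. The naviseccli arguments are left exactly as posted, with the address and credentials elided, and sleep is assumed to be available on the host as it is in the original script.

```batch
@echo off
rem Sketch only: the status poll from step 8 with an upper bound on retries.
rem MAX_TRIES, TRIES and :timeout are assumptions, not part of the original script.
setlocal
set TRIES=0
set MAX_TRIES=120

:scan3
set /a TRIES+=1
if %TRIES% GTR %MAX_TRIES% goto timeout
sleep 60
cd /d "\Program Files (x86)\EMC\Navisphere CLI"
rem Address and credentials are elided here, exactly as in the post.
naviseccli -address -user -password -scope global snapview -listclone -name Clone_Name >c:\clone\Clone_Name.txt
findstr /L /c:"CloneCondition: Normal" c:\clone\Clone_Name.txt >nul
rem findstr returns 0 on a match; any other value (no match, or an error) retries.
if errorlevel 1 goto scan3
echo Clone is back to Normal, continuing with the fracture steps.
goto :eof

:timeout
echo Clone did not reach Normal after %MAX_TRIES% checks, stopping here.
exit /b 1
```

With a 60 second sleep, 120 tries gives up after roughly two hours. Pick a cap that comfortably exceeds a normal sync time for the LUN.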
Any and all help is appreciated. I do realize that I can add a security file to prevent having to pass credentials through, but I'd rather not worry about that right now and focus purely on correcting the corruption before making any other changes.

Again, thanks to all in advance for taking the time to assist me.

238 Posts

May 5th, 2009 14:00

Mike,
Can you please advise the service request number (SR#) that has been opened for this?

Thanks,
DGM

40 Posts

May 13th, 2009 06:00

Sorry, I believe I have discovered the issue and resolved it.

I think a couple of things were going on. First, the admsnap clone_deactivate command does not appear to be supported in an RDP session, per emc90130. My ops team that runs this script recently switched from using a Raritan KVM to an RDP session. Second, the same day that space was freed up, a new patch management application agent called Big Fix was installed. For some reason the Big Fix agent has all of the volumes open.

So, needless to say, I have added a line to stop the Big Fix agent service before the clone_deactivate and I have the ops team running the script via a Raritan KVM again.
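For reference, here is a rough sketch of what that guard could look like at the top of the deactivate step. It is not the exact change I made. AGENT_SERVICE is a placeholder, since the real service name of the agent has to be looked up on the host (for example in the output of "net start"). The RDP check assumes the SESSIONNAME variable is set the usual way by Terminal Services, which I have not verified for every way of connecting.

```batch
@echo off
rem Sketch only. AGENT_SERVICE is a placeholder, not the real service name.
setlocal
set AGENT_SERVICE=Name of the patch agent service

rem Assumption: SESSIONNAME is "Console" at the console or KVM
rem and starts with "RDP-" inside a Remote Desktop session.
echo %SESSIONNAME% | findstr /b /i "RDP-" >nul
if not errorlevel 1 (
    echo This looks like an RDP session. Run the script from the console or KVM instead.
    exit /b 1
)

rem Release any volume handles held by the agent before deactivating the clone.
net stop "%AGENT_SERVICE%"
admsnap clone_deactivate -o e:
```

The agent service would then be started again at the end of the script, alongside the other net start lines.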

Since these changes were made, I have not seen any more corruption.