Solved!

August 29th, 2017 09:00

restoring data from snapshots

I am working on restoring large amounts of data from SnapshotIQ snapshots. Many of the directories are not 'huge' in terms of size, but are massive in terms of file count, ranging anywhere from a few thousand files to an extreme of 500+ million files. Because of this, these restores take a VERY long time.

Unfortunately, for most of these the snapshot was taken at the parent level (something I am going to re-evaluate, lesson learned), so I cannot use SnapRevert because I don't want to revert every sub-directory under the snapshot.

So I am left with copying the directories manually out of the snapshots.   (/ifs/data/.snapshot/....)

Looking at the Isilon's cp man page, there are two OneFS-specific switches, -c (clones) and -B (use large buffers). I did some testing with -c on a test directory and it appears that I cannot use -c and -R (recursive) together, so that doesn't appear to be an option for me.

I am curious about -B, it says "Use larger buffers when copying data to regular files to reduce system calls".  I'd like to know the full impact if I use -B; will it speed up my copy jobs with millions of 'smallish' files?  What's the impact on the cluster?


Alternatively, rsync is another option and is obviously more robust than cp, but I am concerned about time and am not sure rsync would be faster.


thanks in advance for any information/recommendations

1.2K Posts

September 6th, 2017 12:00

If you have removed the directory on which the original snapshot was taken,

it appears to me that you will be out of luck with using SyncIQ for restoring.

If that directory still exists, and you want to restore only some subdirectory,

this can be specified with 

isi sync policies {create or modify} ... --source-include-directories /path/to/subdirectory

For local SyncIQ jobs, the target path must be outside the source path,

so one cannot restore a lost directory "in-place".

hth

-- Peter

August 30th, 2017 06:00

You can use SyncIQ to replicate out of a snapshot, but you can only do it on the command line.

If you do not have a SyncIQ license, contact your local Systems Engineer and they can provide you with a demo license.

Example:  If you wanted to replicate /ifs/data/.snapshot/snapshotname/dir-with-millions-of-files

  • On the CLI you would run a job like this: "isi sync job start --policy-name=name-of-synciq-policy --source-snapshot=source-snapshot-name"
  • Note: you do have to have a SyncIQ policy set up for the directory in question; you are simply changing the root of the replication for that job to the snapshot. I.e., in the above example you would have a sync job for /ifs/data/dir-with-millions-of-files

Your local SE can assist you with this. It is a very efficient way to recover large amounts of data from within snapshots.

1.2K Posts

August 30th, 2017 06:00

Cloning with cp -c is very efficient with large files because no data blocks are copied.

In your case it might not help much, in addition to not supporting recursive traversal

(where the latter can be solved by a home-made script of course).
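A minimal sketch of such a home-made script, for illustration only: it recreates the directory tree and then copies each regular file individually, so the copy command never needs -R. The function name clone_tree and the COPY_CMD variable are made up for this example; COPY_CMD defaults to the OneFS "cp -c" discussed above and can be overridden for testing elsewhere. It assumes file names contain no newlines, and since it starts one cp process per file it may still be slow on trees with many millions of files.

```shell
#!/bin/sh
# Sketch: copy a tree file by file, because -c and -R cannot be combined.
# COPY_CMD is a hypothetical knob; default is the OneFS clone switch.

clone_tree() {
    src=$1                        # e.g. a directory under /ifs/data/.snapshot/...
    dst=$2                        # restore location, created if missing
    copy_cmd=${COPY_CMD:-cp -c}

    # Pass 1: recreate the directory structure under the destination.
    (cd "$src" && find . -type d) | while IFS= read -r d; do
        mkdir -p "$dst/$d"
    done

    # Pass 2: copy every regular file on its own; report failures and go on.
    (cd "$src" && find . -type f) | while IFS= read -r f; do
        $copy_cmd "$src/$f" "$dst/$f" || echo "failed: $f" >&2
    done
}
```

Usage would be along the lines of: clone_tree /path/inside/snapshot /path/to/restore-location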

In modern OSes cp -B is default and doesn't need to be worried about.

You can give SyncIQ a try with a "copy" job within the cluster -- use localhost as target host.

If you don't have a SyncIQ license, ask your account team for a temporary license,

ideally free as for "demo" purposes.

hth

--Peter

4 Posts

August 30th, 2017 07:00

I actually tried SyncIQ; we are licensed and use it quite a bit to replicate to our other cluster. However, when I set up a policy to replicate from the snapshot directory back to the original location on the local cluster, the job failed on start, stating that it failed to snapshot the source, which makes sense given how SyncIQ works.

So I guess that's where the command line argument you mentioned comes into play. To clarify your recommendation: when you say I have to have a SyncIQ policy set up for the directory in question, do you mean to set up a policy like the one I described above (source=snapshot, dest=original location) and just run it from the CLI with that snapshot argument?

Otherwise, if I had a SyncIQ job set up like my normal ones (source cluster --> target cluster), it seems that running that policy with the snapshot source argument would simply replicate the snapshot to the remote target cluster.

August 30th, 2017 12:00

Actually, you need a normal SyncIQ policy (just like you would configure without a snapshot).

Then use the CLI flag to define the snapshot to replicate from.

This is essentially a change root for the single run of the SyncIQ Job.

You cannot configure the snapshot into the policy, or set source=snapshot.

What you set up would replicate an entire snapshot, and that is not what you want. You want to replicate a subset of a snapshot, so the CLI option is the only way.

4 Posts

September 5th, 2017 11:00

Thank you for the info. Though, I guess I need further clarification regarding the 'regular' policy that needs to be set up.

Since this is not normally a directory that I would have replicating locally, what am I defining as the 'source' in this regular policy?

For example, let's say /ifs/data/directory/directory_1 is the directory I am trying to restore to (the original location). I would obviously set this up as the destination, but other than the snapshot, there's no 'source' that contains this data. Would I just use a miscellaneous empty directory as the source, since the snapshot will be swapped in as the source via --source-snapshot on the CLI?

i.e. 

source= /ifs/data/empty_dir

dest = /ifs/data/directory/directory_1

but then run it with "isi sync job start --policy-name=temp_policy --source-snapshot=/ifs/.snapshot/snapname/data/directory/directory_1"?

4 Posts

September 7th, 2017 09:00

Thanks, I have successfully tested this now. That was my missing piece: the job cannot restore back to the original location.

So I have a policy with the original location as the source and a new location as the destination. I ran the SyncIQ policy with the snapshot specified via --source-snapshot, and I got exactly the data I needed.
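For anyone following along, here is a small dry-run sketch of that step. It only prints the job-start command in the form quoted earlier in this thread, it does not run it. The function name build_restore_job_cmd is made up, and temp_policy / snapname are placeholders. The exact isi syntax may differ between OneFS versions, so check the CLI help on your cluster before running anything.

```shell
#!/bin/sh
# Dry run: print the SyncIQ job-start command as quoted in this thread.
# Assumes the policy already exists with source = original path and
# target = a new path outside the source.

build_restore_job_cmd() {
    policy=$1      # name of the existing SyncIQ policy (placeholder)
    snapshot=$2    # snapshot to read from (placeholder)
    printf 'isi sync job start --policy-name=%s --source-snapshot=%s\n' \
        "$policy" "$snapshot"
}

build_restore_job_cmd temp_policy snapname
```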

I can then rename the original directory to something else with mv, and then rename the restore location to the original name.
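A rough sketch of that rename swap, with a couple of safety checks. The function name swap_in_restore and the three path arguments are placeholders for this example, not anything from OneFS. It assumes all three paths are on the same file system so that mv is a plain rename.

```shell
#!/bin/sh
# Sketch: park the current directory under a backup name, then move the
# restored copy into its place. Rolls back if the second rename fails.

swap_in_restore() {
    original=$1   # e.g. /ifs/data/directory/directory_1
    restored=$2   # location the restore job wrote to
    backup=$3     # name to park the current original under

    if [ ! -d "$restored" ]; then
        echo "missing restore dir: $restored" >&2
        return 1
    fi
    if [ -e "$backup" ]; then
        echo "backup name already in use: $backup" >&2
        return 1
    fi

    mv "$original" "$backup" || return 1
    if ! mv "$restored" "$original"; then
        mv "$backup" "$original"   # put the original back
        return 1
    fi
}
```

Keeping the old directory under the backup name until the restored data has been checked leaves a way back if something is missing.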

thanks everyone.
