July 11th, 2013 08:00

Checkpoints Appear/Disappear?

Hello,

We have checkpoint schedules set on file systems that run once a day. We retain 3 checkpoints.

However, when I review checkpoints listed via Unix/Linux "ls .ckpt" I usually see more than 3 available. Also, the checkpoints listed keep changing.

I first noticed this when trying to restore a file via CIFS (previous version restore) and found that the checkpoint I selected (the most current one listed) no longer existed.

Here are some sample "ls" queries taken from my home directory within seconds of each other. Note: this issue is not specific to home directories...

myhost:/gfs/home/me/.ckpt -> ls -la

total 193

dr-xr-xr-x   2 root root         512 Jul 11 10:53 ./

drwxr-xr-x  58 me user       11264 Jul  9 08:17 ../

drwxr-xr-x  58 me user       11264 Jul  8 14:52 2013_07_09_05.20.01_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_10_05.20.01_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_11_05.20.01_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_11_14.52.23_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_11_14.53.03_GMT/

myhost:/gfs/home/me/.ckpt ->

myhost:/gfs/home/me/.ckpt ->

myhost:/gfs/home/me/.ckpt -> ls -la

total 193

dr-xr-xr-x   2 root root         512 Jul 11 10:53 ./

drwxr-xr-x  58 me user       11264 Jul  9 08:17 ../

drwxr-xr-x  58 me user       11264 Jul  8 14:52 2013_07_09_05.20.01_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_10_05.20.01_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_11_05.20.01_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_11_14.53.03_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_11_14.53.43_GMT/

myhost:/gfs/home/me/.ckpt ->

myhost:/gfs/home/me/.ckpt ->

myhost:/gfs/home/me/.ckpt -> ls -la

total 193

dr-xr-xr-x   2 root root         512 Jul 11 10:54 ./

drwxr-xr-x  58 me user       11264 Jul  9 08:17 ../

drwxr-xr-x  58 me user       11264 Jul  8 14:52 2013_07_09_05.20.01_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_10_05.20.01_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_11_05.20.01_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_11_14.53.43_GMT/

drwxr-xr-x  58 me user       11264 Jul  9 08:17 2013_07_11_14.54.23_GMT/

myhost:/gfs/home/me/.ckpt ->



Please advise.


Thanks!
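
For readers following along, here is a minimal Python sketch that compares two of the listings above and reports which checkpoint names stayed, vanished, or appeared. The function name `diff_listings` is hypothetical; the names are copied from the first two `ls -la` outputs in this post.

```python
def diff_listings(before, after):
    """Return (stable, removed, added) checkpoint names between two listings."""
    before_set, after_set = set(before), set(after)
    stable = sorted(before_set & after_set)
    removed = sorted(before_set - after_set)
    added = sorted(after_set - before_set)
    return stable, removed, added


# Directory names from the first and second "ls -la" outputs above.
first = [
    "2013_07_09_05.20.01_GMT",
    "2013_07_10_05.20.01_GMT",
    "2013_07_11_05.20.01_GMT",
    "2013_07_11_14.52.23_GMT",
    "2013_07_11_14.53.03_GMT",
]
second = [
    "2013_07_09_05.20.01_GMT",
    "2013_07_10_05.20.01_GMT",
    "2013_07_11_05.20.01_GMT",
    "2013_07_11_14.53.03_GMT",
    "2013_07_11_14.53.43_GMT",
]

stable, removed, added = diff_listings(first, second)
print("stable: ", stable)
print("removed:", removed)
print("added:  ", added)
```

In this sample the three 05.20.01 entries persist, while the newest entries rotate: one drops off and a fresh one appears.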

20.4K Posts

July 11th, 2013 08:00

The Replicator product uses two SnapSure checkpoints to detect new and changed blocks and send them over to the target file system. What is confusing to users?

15 Posts

July 11th, 2013 08:00

Yes. All of our production file systems are replicated to a disaster recovery site.

Could one of you be so kind as to provide a quick explanation of replication's impact on checkpoints?

Also, this is quite confusing to end users trying to restore files for various reasons. Any suggestions on how to deal with that? Can I mask those temporary checkpoints somehow?

Thanks again!

8.6K Posts

July 11th, 2013 08:00

Is the file system replicated?

20.4K Posts

July 11th, 2013 08:00

You could see more if that file system is replicated to another Celerra/VNX.

15 Posts

July 11th, 2013 09:00

When you select "Restore Previous Versions" in a Windows environment, it shows the available checkpoints to choose from.

It is entirely possible that a user will select one of the latest checkpoints available. It is not uncommon (but hit or miss) for that selection to throw an error because the chosen checkpoint is no longer available. If the user chooses one of the older (scheduled) checkpoints to restore from, all is well with the world.

I assume it is what it is and that I will have to provide proper documentation for our service center to respond to such questions from the user community.

Thanks for your help!

15 Posts

July 11th, 2013 10:00

The scheduled checkpoints are available.

[nasadmin@cdcnas2-cs0 ~]$ fs_ckpt homedir -l

id    ckpt_name                creation_time           inuse fullmark   total_savvol_used  ckpt_usage_on_savvol

7212  ckpt_homedir_1_001       07/09/2013-01:20:03-EDT   y   90%        5%                 1%

7217  ckpt_homedir_1_002       07/10/2013-01:20:02-EDT   y   90%        5%                 1%

7220  ckpt_homedir_1_003       07/11/2013-01:20:02-EDT   y   90%        5%                 1%
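
The `creation_time` column above is in EDT, while the `.ckpt` directory names are in GMT, so 01:20 EDT corresponds to 05:20 GMT. A minimal Python sketch of that conversion follows. The helper name `ckpt_dir_name` is hypothetical, and the fixed UTC-4 offset is an assumption that only holds while daylight saving time is in effect.

```python
from datetime import datetime, timedelta, timezone

# Assumption: every timestamp is EDT (UTC-4); no DST transitions are handled.
EDT = timezone(timedelta(hours=-4), "EDT")


def ckpt_dir_name(creation_time: str) -> str:
    """Convert an fs_ckpt creation_time such as '07/09/2013-01:20:03-EDT'
    into the GMT naming style seen under .ckpt."""
    stamp, _, zone = creation_time.rpartition("-")
    if zone != "EDT":
        raise ValueError(f"unsupported zone: {zone!r}")
    local = datetime.strptime(stamp, "%m/%d/%Y-%H:%M:%S").replace(tzinfo=EDT)
    return local.astimezone(timezone.utc).strftime("%Y_%m_%d_%H.%M.%S_GMT")


for created in (
    "07/09/2013-01:20:03-EDT",
    "07/10/2013-01:20:02-EDT",
    "07/11/2013-01:20:02-EDT",
):
    print(created, "->", ckpt_dir_name(created))
```

The seconds in the output will not match the directory names exactly (the listings in this thread show 05.20.01), so it is safer to compare at minute granularity.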

Let me give you a couple of images of what a Windows user would see:

Restore Previous Versions 1.gif

Restore Previous Versions 2.gif

As you can see, the top entries in the list are changing. In these examples, the top 2 (which I now assume are replication related checkpoints) are "transient" and the ones following are from our scheduled checkpoints.

Depending on how long it takes the user to select one of them, they either get to work with the contents of the folder or they get an error (version not available).

Thanks for following up.

Dave
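
As a possible aid for service-center documentation, the sketch below separates checkpoint names that fall near the scheduled time of day from everything else. It is only an illustration: the function name `split_by_schedule`, the 05:20 GMT schedule time (taken from the directory names earlier in this thread), and the five-minute tolerance are all assumptions, and nothing here hides entries from the Previous Versions tab.

```python
from datetime import datetime


def split_by_schedule(names, hour=5, minute=20, tolerance_minutes=5):
    """Split .ckpt directory names into (scheduled, other) by time of day.

    A name counts as scheduled when its GMT time of day lies within
    tolerance_minutes of hour:minute.
    """
    scheduled, other = [], []
    target = hour * 60 + minute
    for name in names:
        ts = datetime.strptime(name, "%Y_%m_%d_%H.%M.%S_GMT")
        offset = abs(ts.hour * 60 + ts.minute - target)
        offset = min(offset, 24 * 60 - offset)  # handle wrap-around at midnight
        (scheduled if offset <= tolerance_minutes else other).append(name)
    return scheduled, other


names = [
    "2013_07_09_05.20.01_GMT",
    "2013_07_10_05.20.01_GMT",
    "2013_07_11_05.20.01_GMT",
    "2013_07_11_14.53.43_GMT",
    "2013_07_11_14.54.23_GMT",
]
scheduled, other = split_by_schedule(names)
print("scheduled:", scheduled)
print("other:    ", other)
```

A time-of-day match is only a heuristic. A replication checkpoint created near 05:20 GMT would be misclassified, so the `fs_ckpt` listing remains the authoritative source for which checkpoints are scheduled.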

20.4K Posts

July 11th, 2013 10:00

Dave,

That should not be happening. If you have scheduled checkpoints, they get refreshed automatically, so what users see in the "Previous Versions" tab should be available. Can you double-check that all snapshots are available?

fs_ckpt -l

20.4K Posts

July 11th, 2013 11:00

The 1:20 am ones are your scheduled checkpoints, and they always work? It's the two replication ones that sometimes work and sometimes do not?

15 Posts

July 11th, 2013 11:00

Correct.

My intent here was to understand 1) what exactly I am seeing here and 2) what, if anything, I can do about it.

We have recently moved a considerable number of users to our NAS environment. And, as you might imagine, they would like the ability to recover files without assistance. You might also imagine that a user looking at the screenshots I provided earlier would think "Hey, I can recover the file to a time right before I messed it up!". However, as you know, this may not be the case.

As I mentioned earlier, if there isn't much I can do about functionality at this point, I will simply have to document/train accordingly. This implementation will just appear a bit "clunky" to the users I am sure...

Dave

15 Posts

July 11th, 2013 11:00

I have not tested extensively, but in my experience:

1) The scheduled checkpoints (taken at 1:20am) have consistently worked as I would expect.

2) The replication checkpoints sometimes work and sometimes not. I believe these work if I can catch them before they change...

Thanks Dynomox.

20.4K Posts

July 11th, 2013 11:00

I am curious whether they happen to fail at the moment they are being refreshed by the Replicator software. Ultimately, you would like to hide those two replication checkpoints so they don't confuse your users? I have been looking through different params and can't seem to find a way to hide replication checkpoints only. I'll keep looking.
