Unsolved

October 24th, 2013 09:00

Replication shows errors in Admin Console


I have a customer that is running Avamar version 6.1.1-87. I'm seeing failed replication errors in the activity monitor but the replication.log file doesn't have any errors. When I check the target grid for the clients that are failing, I see all of the images. I also noticed that there may be up to 8 replication jobs active in the activity monitor but only 1 stream is configured. The server session only shows 1 active job. The extra running jobs will eventually fail after 15 minutes.

16 Posts

October 25th, 2013 13:00

Multiple streams sounds like there may be a custom replication setup. If the replication setup follows the naming standard (repl_cron.cfg, repl2_cron.cfg, ...), then there should be a corresponding log file for each job in the /usr/local/avamar/var/cron/ directory: replicate.log, replicate2.log, and so on, a different one for each replication job, assuming more than one is configured.

82 Posts

October 28th, 2013 00:00

Yeah, looks like multiple replication jobs are configured, as stated by JBalentine. You may want to check your crontab with the command:

crontab -l -u dpn

This shows the jobs that are configured. Then check the replication logs in the var/cron directory, where you should see the job details.

Multiple replication jobs are configured by making changes to /usr/local/avamar/bin/dpncron.pm and creating clones of repl_cron inside the same directory.

13 Posts

October 28th, 2013 12:00

Maybe I didn't explain this correctly. We only have 1 replication cron job running. What I'm seeing is errors in the activity monitor for replication jobs failing on certain clients. It is not the same client each day. When I look in the replication.log for the failed client, there are no errors for that client, only in the activity monitor. I even checked the target grid and all images are being replicated. I almost wonder if the MCS is not refreshing fast enough.

498 Posts

October 28th, 2013 12:00

to get a better look at what might be going on.

log into avamar command line

crontab -l -u dpn   [this will show you if you have more than one cron job scheduled]

the one at the bottom is the standard one you see in the gui. Any above that are custom.

 

15 17 * * * /usr/local/avamar/bin/cron_env_wrapper /usr/local/avamar/lib/mcs_ssh_add repl13_cron

30 17 * * * /usr/local/avamar/bin/cron_env_wrapper /usr/local/avamar/lib/mcs_ssh_add repl14_cron

# <<< BEGIN AVAMAR ADMINISTRATOR MANAGED ENTRIES -- DO NOT MANUALLY MODIFY >>>

0 4 * * * /usr/local/avamar/bin/cron_env_wrapper /usr/local/avamar/lib/mcs_ssh_add repl_cron

# <<< END AVAMAR ADMINISTRATOR MANAGED ENTRIES >>>

Note none of them should start at the same time.
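To check the "none should start at the same time" rule without reading the listing by eye, here is a minimal sketch. It assumes the crontab output looks like the sample above and that the minute and hour fields are plain numbers (no ranges or lists). The function names are made up for illustration and are not part of Avamar.

```python
from collections import defaultdict

# Sample lines copied from the crontab listing above.
CRONTAB = """\
15 17 * * * /usr/local/avamar/bin/cron_env_wrapper /usr/local/avamar/lib/mcs_ssh_add repl13_cron
30 17 * * * /usr/local/avamar/bin/cron_env_wrapper /usr/local/avamar/lib/mcs_ssh_add repl14_cron
# <<< BEGIN AVAMAR ADMINISTRATOR MANAGED ENTRIES -- DO NOT MANUALLY MODIFY >>>
0 4 * * * /usr/local/avamar/bin/cron_env_wrapper /usr/local/avamar/lib/mcs_ssh_add repl_cron
# <<< END AVAMAR ADMINISTRATOR MANAGED ENTRIES >>>
"""


def replication_jobs(crontab_text):
    """Return (minute, hour, name) for each entry whose last field is repl*_cron."""
    jobs = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment markers
        fields = line.split()
        if len(fields) < 6:
            continue  # not a full cron entry
        name = fields[-1]
        if name.startswith("repl") and name.endswith("_cron"):
            jobs.append((fields[0], fields[1], name))
    return jobs


def same_start(jobs):
    """Group job names by (hour, minute) and keep only start times used more than once."""
    by_start = defaultdict(list)
    for minute, hour, name in jobs:
        by_start[(hour, minute)].append(name)
    return {start: names for start, names in by_start.items() if len(names) > 1}


jobs = replication_jobs(CRONTAB)
clashes = same_start(jobs)
print(jobs)
print(clashes)
```

An empty result from same_start means no two replication entries share a start time. It only compares the literal minute and hour fields, so it will not catch jobs that overlap because an earlier one is still running.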

once you see your numbers you can do the following

replrpt.sh

     this command will show you the log for the default repl job

replrpt.sh --log=/usr/local/avamar/var/cron/replicate#.log   [replace # with what you see in cron for your repl#_cron entry]

this will then show you the log for each of these.
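The "replace # with what you see in cron" step can be sketched as a small helper. This assumes the naming standard described earlier in the thread (repl_cron goes with replicate.log, repl2_cron with replicate2.log, and so on, under /usr/local/avamar/var/cron). The helper name is hypothetical, and a custom setup that does not follow the naming standard will not match.

```python
import re

# Log directory as given earlier in the thread.
LOG_DIR = "/usr/local/avamar/var/cron"


def log_for_cron_entry(entry):
    """Map a cron entry name such as 'repl13_cron' to its expected log path."""
    match = re.fullmatch(r"repl(\d*)_cron", entry)
    if match is None:
        raise ValueError(f"not a repl#_cron style name: {entry!r}")
    # The default job has no number, so the captured group is empty.
    return f"{LOG_DIR}/replicate{match.group(1)}.log"


for entry in ("repl_cron", "repl13_cron", "repl14_cron"):
    print(entry, "->", log_for_cron_entry(entry))
```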

each job is given an amount of time to run, or it will be killed when the Blackout Window starts

so find out when your Blackout Window starts before you look at the logs

now, looking at the logs, find out the time of the errors and what kind of errors they are.

what you are looking for is this: do the errors happen at different times within the same log, or do they all error out at the start of the Blackout Window?

if the errors are at the start of the Blackout Window, then there is just not enough time between the start of the replication job and the Blackout Window for it to finish.

if it is random times there must be something else wrong, talk to support.
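The check described above (errors at the Blackout Window start versus errors at random times) could be sketched like this. The timestamps, the 20:00 blackout start, and the 5 minute tolerance are all invented for illustration. Take the real values from your own logs and your own maintenance window settings.

```python
from datetime import datetime, time, timedelta


def classify_failures(failure_times, blackout_start, tolerance=timedelta(minutes=5)):
    """Split failures into those close to the blackout start and all others."""
    near, other = [], []
    for t in failure_times:
        start = datetime.combine(t.date(), blackout_start)
        delta = abs((t - start).total_seconds())
        # Wrap around midnight so 23:59 counts as close to a 00:00 start.
        delta = min(delta, 86400 - delta)
        if delta <= tolerance.total_seconds():
            near.append(t)
        else:
            other.append(t)
    return near, other


# Hypothetical failure times, not taken from any real log.
failures = [
    datetime(2013, 10, 28, 20, 0, 30),
    datetime(2013, 10, 28, 12, 15, 0),
]
near, other = classify_failures(failures, time(20, 0))
print("near blackout start:", near)
print("other times:", other)
```

If nearly everything lands in the first list, the jobs are probably running out of time before the Blackout Window. If the failures are spread out, that points to something else.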

to see if your replication of backups is keeping up run the following script - it will tell you how far behind any server is in getting its images replicated

 

/usr/local/avamar/bin/replcnt.sh

if your replication jobs are not keeping up, and as it seems you are not the one who set them up, work with support to get them adjusted.

498 Posts

October 28th, 2013 13:00

ok, you keep saying it ERRORS - that means to me it is giving you an error number or something else that shows it did not work.

just what are you seeing that says it is erroring?


498 Posts

October 28th, 2013 13:00

well then what is the error you are getting - give us an example.

and again is the time at the start of the Blackout window?

13 Posts

October 28th, 2013 13:00

Replication starts at noon, 1 hour after the blackout window ends. There are no errors in the replication.log file, only in the activity monitor. It just fails after 15 minutes with 0 bytes transferred.

498 Posts

October 28th, 2013 14:00

and when you logged into the avamar utility node and ran the command

replrpt.sh

it showed no errors?

so the only thing I can think of is that your destination is full or down.

13 Posts

October 28th, 2013 14:00

Here is a screenshot. Once again there are no errors or warnings for these clients in the replication.log.

replication.jpg
