36 Posts
0
1662
Advice and guidance needed - various operational issues with Avamar servers
We have two 9+1 node Avamar servers with a spare node. Both are highly utilised: the primary grid is 90% used and the secondary is 95% used.
* The primary and secondary grids show almost a 5% difference in utilisation. The secondary grid is simply a replication target and should therefore be using the same amount. Whenever we have manually deleted snapups and clients, we have deleted them from both grids.
* GC does not complete all stripes; it currently gets through around 2000 of 4000 stripes in 4 hours. Does this mean we are leaving more and more redundant data on the system each day? If so, what can we do to reverse this trend?
* The second checkpoint fails on some days on the primary grid because hfscheck runs into the backup window, which also causes scheduling issues with large backups. The first daily checkpoint completes successfully and hfscheck does complete, but hfscheck takes much longer on the primary grid (11am to 2am) than on the secondary grid (11am to 9pm).
* The secondary Avamar grid is very close to 95% utilisation, so at some point we will need to run emergency garbage collection. Because GC does not complete all stripes on a daily basis, does this mean we have a large amount of redundant data, or will we need to look at deleting some snapups before we run the emergency GC?
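To put rough numbers on the GC and hfscheck points above, here is a minimal Python sketch. It assumes GC works through stripes at a roughly constant rate, which may not hold on a real grid, and the function names are made up for illustration; this is plain arithmetic, not an Avamar tool.

```python
def hours_for_full_gc_pass(stripes_done, stripes_total, hours_run):
    """Estimate the hours needed to process every stripe.

    Assumption (not confirmed for Avamar): stripes are processed
    at a constant rate throughout the GC run.
    """
    if stripes_done <= 0 or hours_run <= 0:
        raise ValueError("stripes_done and hours_run must be positive")
    rate_per_hour = stripes_done / hours_run
    return stripes_total / rate_per_hour


def duration_hours(start_hour, end_hour):
    """Run length in whole hours on a 24-hour clock, allowing a wrap past midnight."""
    return (end_hour - start_hour) % 24


print(hours_for_full_gc_pass(2000, 4000, 4))  # 8.0
print(duration_hours(11, 2))    # primary hfscheck, 11am to 2am: 15
print(duration_hours(11, 21))   # secondary hfscheck, 11am to 9pm: 10
```

On those figures a full GC pass would need about 8 hours, twice the 4-hour run, and hfscheck on the primary grid runs about 5 hours longer than on the secondary.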
Long term we will be adding some nodes to these grids, but we need a short-term solution. I believe 4 hours is the maximum for daily GC, so what are our options to rectify the scheduling issues we have? We are currently adding approximately 8 GB a day to the grids after GC: average new data is 106503 MB and average GC is -98242 MB.
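The daily growth figure can be checked with a short Python sketch. The two averages come from the numbers above; `days_until_threshold` and its `capacity_mb` parameter are hypothetical helpers, and the linear projection is only an assumption, since reported utilisation on a real grid need not move in step with raw MB stored.

```python
AVG_NEW_MB = 106503   # average new data per day (figure quoted above)
AVG_GC_MB = 98242     # average data removed by GC per day (figure quoted above)


def net_growth_mb(new_mb, removed_mb):
    """Net data added per day once GC has run."""
    return new_mb - removed_mb


def days_until_threshold(current_pct, threshold_pct, capacity_mb, net_mb_per_day):
    """Days until utilisation reaches threshold_pct, assuming linear growth.

    capacity_mb is a hypothetical input: the usable capacity of the grid
    is not stated in this thread.
    """
    if net_mb_per_day <= 0:
        return float("inf")  # not growing, so the threshold is never reached
    headroom_mb = (threshold_pct - current_pct) / 100 * capacity_mb
    return max(headroom_mb, 0) / net_mb_per_day


net = net_growth_mb(AVG_NEW_MB, AVG_GC_MB)
print(net)                    # 8261
print(round(net / 1024, 2))   # 8.07
```

That is 8261 MB a day, or roughly 8 GB. Passing the grid's real usable capacity to `days_until_threshold` would give a rough idea of how long remains before a given utilisation level is hit.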
Any help would be appreciated. Thanks
Hormigo
131 Posts
1
February 4th, 2010 04:00
Your Avamar is operating beyond capacity. You have too much data for this Avamar system.
Can you still run backups? I would have expected it to be in read-only mode. An environment of this size needs a large maintenance window, and I can see that the maintenance routines are not completing.
I recommend increasing the maintenance window: a blackout window of 6 hours and a maintenance window of 10 hours.
These routines must complete. Without them your environment is at risk!
Manually run a GC, a checkpoint and a full hfscheck. Take the backup window down until these routines have finished.
Then enter the original values again.
Hope that helps.
Regards
Hormigo
DaveH8888
36 Posts
0
February 4th, 2010 05:00
The GC, checkpoint and hfscheck all complete on a daily basis. The hfscheck just runs into the backup window, so the checkpoint that should take place after hfscheck sometimes fails.
When you mention too much data, do you mean too much raw data or too much changed data? Are there any guidelines on how much changed data an Avamar system can be expected to handle?
edit: servers are not read-only yet - the secondary grid is at 94.5% utilisation
Message was edited by: bobalob
Hormigo
131 Posts
0
February 4th, 2010 10:00
After a certain period of use, an Avamar system should reach steady state.
Steady state is when the amount of new data is less than the amount of expired or orphaned data.
What prevents a system from reaching steady state: new data being backed up too frequently in the environment, and very long retention times.
DaveH8888
36 Posts
0
February 5th, 2010 01:00
These grids have not yet reached the longest retention period, which is 6 months; that will happen in about 1 month. Retention is 6 monthly backups and 30 daily backups for all servers.
markaldridge
9 Posts
1
February 8th, 2010 06:00
DaveH8888
36 Posts
0
February 8th, 2010 07:00