7 Posts
0
539
June 13th, 2008 02:00
Autostart 5.2 Agent restarts
Hi
I have a customer who has been running 5.2 for over a year. All was OK until a few weeks ago, when the Agent started failing and restarting. I have seen this error before, and the simplest fix was to re-install AutoStart on both nodes.
We are running Windows Server 2003 SP1 and only use AutoStart to mirror disks between servers.
Can you advise what would cause the Agents to restart? Re-installing in a production environment is not always possible.
Many thanks in advance


tribicic
157 Posts
0
June 13th, 2008 02:00
It is almost always related to either a communication problem or cluster database corruption. It usually helps to change the heartbeat communication from broadcast to point-to-point (this can be done in the node properties). Also, make sure that no third-party application, such as a virus scanner or backup software, is trying to access the files in the AutoStart directory.
Is the agent restarting continuously or just randomly?
yito1
262 Posts
0
June 13th, 2008 02:00
Does the Agent service stop?
If the service terminated abnormally, the Service Control Manager will certainly have logged an event.
The problem might be in the configuration.
Has the Agent's heartbeat failed?
Yoshinobu Ito
swiftalliance
7 Posts
0
June 13th, 2008 06:00
Thanks for your help
tribicic
157 Posts
0
June 13th, 2008 06:00
smaf
5 Posts
0
June 18th, 2008 09:00
Previously, the heartbeat setting was point-to-point, but the IP addresses used for both nodes no longer existed (and had not been used in at least 2 years), so we are unsure why no errors or warnings were reported. Because of this, the setting was changed to the 'default' multicast. It should be noted that while AutoStart was being configured at this point, the AutoStart services were stable (they only started playing up a few hours after the data sources had finished their initial sync).
As it stands now, the heartbeat IP address has been set to the IP address of the opposite node in the cluster (i.e., on node 1, the point-to-point IP address is the IP address of node 2, and vice versa). We are not aware of any hard and fast rules governing these IP addresses, so we figured that using the physical IP address of a node in the cluster should suffice (since it should always be up and running anyway).
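For anyone wanting to sanity-check a similar setup, here is a minimal Python sketch of the cross-assignment described above: each node's point-to-point heartbeat target should be a valid address that belongs to the other node. It does not talk to AutoStart at all. The function name, the dictionary layout, and the 192.0.2.x addresses are illustrative assumptions, so the values would have to be copied in by hand from the node properties.

```python
import ipaddress


def check_point_to_point(node_ips, heartbeat_targets):
    """Return a list of problems in a point-to-point heartbeat layout.

    node_ips:          node name -> that node's own physical IP address
    heartbeat_targets: node name -> IP address the node sends heartbeats to
    Both dictionaries are hypothetical inputs and must use the same node names.
    """
    problems = []
    for node, target in heartbeat_targets.items():
        try:
            target_ip = ipaddress.ip_address(target)
        except ValueError:
            problems.append(f"{node}: {target!r} is not a valid IP address")
            continue

        own_ip = ipaddress.ip_address(node_ips[node])
        peer_ips = {
            ipaddress.ip_address(ip)
            for name, ip in node_ips.items()
            if name != node
        }

        if target_ip == own_ip:
            problems.append(f"{node}: heartbeat target is the node's own address")
        elif target_ip not in peer_ips:
            # This is the stale-address case: the target belongs to no node.
            problems.append(f"{node}: {target} does not belong to any other node")
    return problems


# Example with placeholder addresses (documentation range, not real hosts).
nodes = {"node1": "192.0.2.11", "node2": "192.0.2.12"}
targets = {"node1": "192.0.2.12", "node2": "192.0.2.11"}
print(check_point_to_point(nodes, targets))
```

An empty list means every node points at a peer's address. A stale address like the ones we found would show up as "does not belong to any other node". This only checks that the settings are consistent with each other, not that the peer is actually reachable on the network.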