Unsolved
November 16th, 2009 05:00
Complete Network Failure
What would happen to an AutoStart resource group if there is a complete network failure?
In our scenario, the switch went down. When the network was restored, the nodes did not respond. We had to restart the hosts and then the AutoStart services.
The resource groups come online only after that. Sometimes we need to restart the AutoStart services a couple of times to bring them online. EMC Support says the problem is that the isolation script has some special characters at the end. But we never modified the isolation script, so why would there be a problem with it? There is also no reference to an isolation script error in the AutoStart logs.
Please suggest how to resolve this situation. I am using AutoStart 5.3 SP1 and have faced this issue in both RHEL and Solaris environments with iSCSI shared storage.


tribicic
November 20th, 2009 03:00
Difficult to say without more info.
First of all, how many domain network lines are there, and how are they set up? Are any verification networks configured? How is isolation detection configured?
When you break the network communication between the nodes, the following happens:
- cluster nodes detect that heartbeat communication has stopped
- each cluster node tries to ping the other node through the domain and verification lines. If the ping returns, no action is taken and the other node is marked as "agent failed"
- if the ping fails, each node pings the list of IP addresses configured for isolation detection. Keep in mind that isolation detection can be configured globally at the domain level but also separately for each node.
- if the IP addresses configured for isolation detection do not respond, the node executes the isolation detection script, which by default reboots the node
- if they respond, the node brings up the resource group
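To make the decision order easier to follow, here is a rough Python sketch of the sequence above. It is only an illustration, not AutoStart code: the names `Action`, `decide`, and `ping` are made up for this example, and it assumes a node counts as isolated only when none of the isolation detection addresses respond (the product may apply a different rule).

```python
from enum import Enum
from typing import Callable, Iterable


class Action(Enum):
    """Hypothetical labels for the outcomes described in the list above."""
    NONE = "heartbeat is fine, nothing to do"
    MARK_AGENT_FAILED = "peer answers ping: mark it as agent failed, take no action"
    BRING_UP_RESOURCE_GROUP = "peer is unreachable but we are not isolated"
    RUN_ISOLATION_SCRIPT = "we are isolated: run the isolation script (reboots by default)"


def decide(
    heartbeat_ok: bool,
    peer_addresses: Iterable[str],
    isolation_addresses: Iterable[str],
    ping: Callable[[str], bool],
) -> Action:
    """Walk through the checks in the order given in the post.

    `ping` is a stand-in supplied by the caller; it returns True when the
    address answers. Nothing here talks to a real network.
    """
    if heartbeat_ok:
        return Action.NONE
    # Step 2: try the other node over the domain and verification lines.
    if any(ping(addr) for addr in peer_addresses):
        return Action.MARK_AGENT_FAILED
    # Step 3: peer is silent, so check the isolation detection addresses.
    # Assumption: one responding address is enough to count as "not isolated".
    if any(ping(addr) for addr in isolation_addresses):
        return Action.BRING_UP_RESOURCE_GROUP
    # Step 4: nothing answered, so this node considers itself isolated.
    return Action.RUN_ISOLATION_SCRIPT


# A switch that is completely down: every ping fails on every node.
print(decide(False, ["10.0.0.2"], ["10.0.0.1"], lambda addr: False).name)
```

With a complete switch failure every ping fails, so under these assumptions each node ends up at the isolation script, which would be consistent with all hosts needing attention afterwards.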
Hope this helps you work out where the problem might be. You should be able to see the above sequence of events noted in the cluster logs.
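As for the special characters support mentioned in the isolation script: a script can pick up stray bytes without anyone editing it on purpose, for example carriage returns from a file transfer. A small check like the one below can confirm or rule that out. This is a generic sketch, not an AutoStart tool; `find_suspicious_bytes` is a made-up helper, and you would need to supply the path to your own isolation script.

```python
from typing import List, Tuple


def find_suspicious_bytes(data: bytes) -> List[Tuple[int, int, int]]:
    """Return (line, column, byte value) for every byte that is not
    printable ASCII, a tab, or a newline. Carriage returns are reported."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte == 0x09 or 0x20 <= byte <= 0x7E:
                continue  # tab or printable ASCII
            hits.append((line_no, col, byte))
    return hits


# Example with an in-memory script whose last line ends in CRLF.
sample = b"#!/bin/sh\nreboot\r\n"
print(find_suspicious_bytes(sample))  # [(2, 7, 13)]

# For a real file (the path is yours to fill in):
# with open(path_to_isolation_script, "rb") as fh:
#     print(find_suspicious_bytes(fh.read()))
```

An empty list means the file contains only plain ASCII text, tabs, and newlines.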