August 30th, 2012 17:00
Storage node offline
I have a storage node showing offline - how do I put this back online from the command line?
admin@dcy-mis-ava01:~/>: status.dpn
Thu Aug 30 17:53:01 MDT 2012 [DCY-MIS-AVA01.TH.EPCHD.ORG] Thu Aug 30 23:53:01 2012 UTC (Initialized Wed Aug 29 14:40:04 2012 UTC)
Node IP Address Version State Runlevel Srvr+Root+User Dis Suspend Load UsedMB Errlen %Full Percent Full and Stripe Status by Disk
0.4 OFFLINE 0 0 0.0%
0.0 192.168.255.2 6.1.0-402 ONLINE fullaccess mhpu+0hpu+0hpu 2 false 0.10 35860 197765 2.8% 2%(onl:249) 2%(onl:229) 2%(onl:223)
0.1 192.168.255.3 6.1.0-402 ONLINE fullaccess mhpu+0hpu+0hpu 1 false 0.41 35884 202367 3.1% 3%(onl:249) 3%(onl:250) 3%(onl:250)
0.2 192.168.255.4 6.1.0-402 ONLINE fullaccess mhpu+0hpu+0hpu 1 false 0.12 36089 225591 2.9% 2%(onl:242) 2%(onl:237) 2%(onl:248)
0.3 192.168.255.5 6.1.0-402 ONLINE fullaccess mhpu+0hpu+0hpu 1 false 0.10 35255 312768 3.1% 3%(onl:251) 3%(onl:261) 3%(onl:248)
0.5 192.168.255.7 6.1.0-402 ONLINE fullaccess mhpu+0hpu+0hpu 1 false 0.07 35572 214596 3.0% 3%(onl:246) 3%(onl:250) 3%(onl:251)
0.6 192.168.255.8 6.1.0-402 ONLINE fullaccess mhpu+0hpu+0hpu 1 false 2.08 35162 193127 3.1% 3%(onl:245) 3%(onl:266) 3%(onl:239)
Srvr+Root+User Modes = migrate + hfswriteable + persistwriteable + useraccntwriteable


Avamar Exorcist
August 31st, 2012 00:00
It wouldn't be wise to restart an offline node or 'force' it to come back online without first identifying what caused it to go offline.
Possible causes include, but are not limited to:
1) Hardware failure
2) OS issues
3) Environmental issues (e.g. power problems)
4) Network connectivity
5) Time synchronisation
6) An issue with the Avamar GSAN or software
If the node is still reachable by SSH, you can check for hardware, network, time sync, and some other OS-related problems by looking through the /var/log/messages file.
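As a rough starting point for that log review, here is a minimal sketch of a filter over the syslog file. The function name `scan_messages` and the keyword list are my own assumptions for illustration, not an Avamar-defined set, so adjust the patterns to whatever your hardware and drivers actually log.

```shell
#!/bin/sh
# Hypothetical helper: show the last 50 lines of a syslog file that mention
# common hardware, network, or time-sync trouble.
# The keyword list is an assumption for illustration only.
scan_messages() {
    logfile="${1:-/var/log/messages}"
    grep -iE 'i/o error|link (is )?down|ntp|time reset|out of memory|segfault' "$logfile" | tail -n 50
}
```

Run it on the affected node as `scan_messages /var/log/messages`. An empty result only means none of these particular patterns matched, not that the node is healthy.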
To determine whether Avamar nodes are time synchronised, check the current time and date of each node on the Avamar system.
As the 'admin' user, load the SSH keys and run
The '--parallel' flag executes the command on each node simultaneously. On a system where time is synchronised you will see output similar to the following:
The date and time are fully synchronised between all the nodes on this system. Note that the utility node is set to the local time zone (in this example 'BST'), whereas the data nodes are set to the 'UTC' time zone. This is normal and expected behavior.
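If you would rather not compare timestamps by eye, the check can be sketched as a small filter that reads one epoch timestamp per line and prints the spread between the earliest and latest. The name `max_skew` is hypothetical. Using epoch seconds sidesteps the BST/UTC difference mentioned above, and lines that are not plain numbers are ignored.

```shell
#!/bin/sh
# Hypothetical helper: read one epoch timestamp (seconds) per line on stdin
# and print the difference between the largest and smallest value.
# Lines that are not purely numeric are skipped.
max_skew() {
    awk '/^[0-9]+$/ {
             if (n == 0 || $1 < min) min = $1
             if (n == 0 || $1 > max) max = $1
             n++
         }
         END { if (n > 0) print max - min; else print "no timestamps" }'
}

# Possible usage on the utility node, assuming mapall passes the quoted
# command through to each node unchanged:
#   mapall --all --parallel 'date +%s' | max_skew
```

A result of a second or two is what you would expect from commands launched in parallel. A much larger value suggests a node whose clock has drifted.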
Hopefully this will help you narrow down the cause of the problem. At that point I would recommend engaging the support team and sharing your findings, so that they can help confirm the root cause of the failure and bring the node back online in the most appropriate way.