7 Posts

July 23rd, 2013 08:00

Avamar Boot Issue

We moved our Avamar to a different server room and are now receiving an error at boot.  Our gsan and MCS services are not starting.

1 error seen in output of "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn"

Has anyone seen this error before?

Thank you,

Bryan

7 Posts

July 24th, 2013 05:00

I was able to get the unit up after running asktime and selecting 2 of our AD time servers.  It just took a really long time for the storage nodes to sync.  We are back up.  Thank you.

50 Posts

July 23rd, 2013 08:00

Hi Bryan,

Did anything else change when you moved server rooms, including the network? Could you give us some details about the Avamar configuration, such as the number of nodes and the version?

7 Posts

July 23rd, 2013 09:00

Hello. Our version is 6.0.1.66.  We have 6 nodes: 1 utility node, 1 spare node, and 4 storage nodes.  We have not changed the IP addresses of the nodes.  We were able to move the stack and keep the existing IP scheme.

1 error seen in output of "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn"

I am not sure what I am looking at with this error.  Is this telling me the location of the error log?

50 Posts

July 23rd, 2013 09:00

OK, I have a question. You have probably already done this, but I want to make sure we are on the same page.

1) Did you take a checkpoint, run dpnctl stop, and confirm that gsan came down gracefully before physically shutting down the boxes?

2) Once all the nodes were moved and racked, did you run dpnctl start to bring gsan up?

Please tell me if I'm mistaken anywhere.

7 Posts

July 23rd, 2013 13:00

This is from a verbose start.  How can we change the time server setting for each of the nodes?  It is probably incorrect now that we moved the equipment.

ERROR: Time on nodes too far out of sync: giving up, stopped at /usr/local/avamar/bin/dpn.pm line 5121.

- - - - - - - - - - - - - - - END

dpnctl: ERROR: error return from "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn --verbose" - exit status 255

dpnctl: ERROR: 1 error seen in output of "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn --verbose"

        rm -f /tmp/dpnctl-gsan-restart-status-9900 /tmp/dpnctl-gsan-restart-output-9900

gsan error log scan:

7 Posts

July 23rd, 2013 13:00

This is the information we get after running the dpnctl command.

dpnctl: INFO: Checking that gsan was shut down cleanly...

dpnctl: INFO: Restarting the gsan (this may take some time)...

dpnctl: INFO: To monitor progress, run in another window: tail -f /tmp/dpnctl-gsan-restart-output-9318

dpnctl: ERROR: error return from "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn" - exit status 255

dpnctl: ERROR: 1 error seen in output of "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn"

dpnctl: ERROR: error return from "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn" - exit status 255

dpnctl: INFO: [see log file "/usr/local/avamar/var/log/dpnctl.log"]

This is from the error log dpnctl.log



2013/07/18-10:00:54 ======= dpnctl 6.0.1-66 (1.127.2.1), running as admin, ENDING at 2013-07-18 06:00:54 EDT =======

2013/07/18-10:00:54 [user "admin"] program (pid 18343) exit status = 0 (normal)

2013/07/19-10:00:47 dpnctl: INFO: ======= dpnctl 6.0.1-66 (1.127.2.1) STARTING at 2013/07/19-06:00:47 EDT (UTC -0400), running as admin =======

2013/07/19-10:00:47 dpnctl: INFO: log file time stamps are in UTC; local time is 2013/07/19-06:00:47 EDT (UTC -0400)

2013/07/19-10:00:47 dpnctl: INFO: PATH=/usr/kerberos/bin:/usr/local/avamar/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/usr/local/apache/bin:/usr/local/ssl/bin

2013/07/19-10:00:47 dpnctl: INFO: Argument list: "--mcs_user=root" "status"

2013/07/19-10:00:47 ps auxww --forest >/tmp/dpnctl-ps-output.16385 2>&1

2013/07/19-10:00:47 dpnctl: INFO: output of preceding 'ps' command:

2013/07/19-10:00:47 dpnctl: INFO:  - - - - - - - - - BEGIN ps output

USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND

root         1  0.0  0.0  4752  596 ?        S     2012   0:01 init [3]        

root         2  0.0  0.0     0    0 ?        S     2012   0:12 [migration/0]

root         3  0.0  0.0     0    0 ?        SN    2012   0:01 [ksoftirqd/0]

root         4  0.0  0.0     0    0 ?        S     2012   0:12 [migration/1]

root         5  0.0  0.0     0    0 ?        SN    2012   0:00 [ksoftirqd/1]

root         6  0.0  0.0     0    0 ?        S     2012   0:12 [migration/2]

root         7  0.0  0.0     0    0 ?        SN    2012   0:01 [ksoftirqd/2]

"/usr/local/avamar/var/log/dpnctl.log" 6936L, 682215C


2013/07/23-13:59:21 /bin/cat /tmp/dpnctl-dtlt-status-output-31560

2013/07/23-13:59:21 [ "/bin/cat /tmp/dpnctl-dtlt-status-output-31560" exit status = 0 ]

2013/07/23-13:59:21 dpnctl: INFO: - - - - - - - - - - - - - - - BEGIN

2013/07/23-13:59:21 INFO: DTLT web application status: up

2013/07/23-13:59:21 dtltctl: INFO: Desktop/laptop services status: up.

2013/07/23-13:59:21 dpnctl: INFO: - - - - - - - - - - - - - - - END

2013/07/23-13:59:21 /bin/cat /tmp/dpnctl-dtlt-status-status-31560 2>&1

2013/07/23-13:59:21 [ "/bin/cat /tmp/dpnctl-dtlt-status-status-31560 2>&1" exit status = 0 ]

2013/07/23-13:59:21 dpnctl: INFO: "[ -r /etc/profile ] && . /etc/profile ; /usr/local/avamar/bin/../lib/dpnutils/dtltctl status" exit status = 0

2013/07/23-13:59:21 rm -f /tmp/dpnctl-dtlt-status-status-31560 /tmp/dpnctl-dtlt-status-output-31560

2013/07/23-13:59:21 dpnctl: INFO: "rm -f /tmp/dpnctl-dtlt-status-status-31560 /tmp/dpnctl-dtlt-status-output-31560" - exit status 0

2013/07/23-13:59:21 dpnctl: INFO: Shutting down dtlt...

2013/07/23-13:59:21 (/usr/local/avamar/bin/../lib/dpnutils/dtltctl stop ; echo $? >/tmp/dpnctl-dtlt-shutdown-status-31560) >/tmp/dpnctl-dtlt-shutdown-output-31560 2>&1

2013/07/23-13:59:29 dpnctl: INFO: "(/usr/local/avamar/bin/../lib/dpnutils/dtltctl stop ; echo $? >/tmp/dpnctl-dtlt-shutdown-status-31560) >/tmp/dpnctl-dtlt-shutdown-output-31560 2>&1" - exit status 0
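
For anyone wading through long dpnctl output like the above, here is a minimal Python sketch that pulls out only the distinct ERROR messages. It is not an Avamar tool. The function name `error_messages` and the regular expression are my own, and the line format it expects is inferred only from the output pasted in this thread, so treat it as an assumption.

```python
import re

# Matches lines such as 'dpnctl: ERROR: ...' or 'ERROR: ...', with an
# optional leading log timestamp like '2013/07/23-13:59:21 '.
# Assumption: this layout is inferred from the output posted in this thread.
ERROR_LINE = re.compile(
    r"^(?:\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\s+)?(?:\w+:\s+)?ERROR:\s+(.*)$"
)


def error_messages(log_text):
    """Return the distinct ERROR messages in log_text, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Lines copied from the output posted in this thread.
sample = """\
dpnctl: INFO: Checking that gsan was shut down cleanly...
ERROR: Time on nodes too far out of sync: giving up, stopped at /usr/local/avamar/bin/dpn.pm line 5121.
dpnctl: ERROR: error return from "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn" - exit status 255
dpnctl: ERROR: 1 error seen in output of "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn"
dpnctl: ERROR: error return from "/usr/bin/yes no | /usr/local/avamar/bin/restart.dpn" - exit status 255
"""

for message in error_messages(sample):
    print(message)
```

With the sample above, the repeated "error return" line is reported once, which makes the underlying message ("Time on nodes too far out of sync") easier to spot among the wrapper errors.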


7 Posts

July 23rd, 2013 13:00

We did not take a checkpoint before the move, although earlier checkpoints exist.  We did properly shut down gsan.  When we brought the nodes back up and ran dpnctl, the gsan did not start.

7 Posts

July 23rd, 2013 15:00

I ran asktime and configured it in accordance with our time servers.

Our utility node connects to our time server; however, none of the storage nodes are able to synchronize with it.

50 Posts

July 23rd, 2013 19:00

Sorry about the late reply. It is always worth tailing the following log, and you might also redirect it to another file so it can be analysed later:

"dpnctl: INFO: To monitor progress, run in another window: tail -f /tmp/dpnctl-gsan-restart-output-9318 "

Regarding asktime, are you running it as user dpn? Could you give us the output of the following commands?

1) mapall --noerror 'date'

2) date

3) date -u

And if you are running it as user dpn, where does asktime fail?
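
Once you have the `date -u` output from each node, a rough way to see how far apart the clocks are is sketched below in Python. This is only an illustration: the node labels and sample times are hypothetical (not from your system), the parsing assumes the default `date -u` layout in the C locale, and I do not know the threshold at which restart.dpn gives up, so the sketch only reports the difference.

```python
from datetime import datetime

# Assumed format: default `date -u` output in the C locale,
# e.g. "Tue Jul 23 17:00:00 UTC 2013".
DATE_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def max_skew_seconds(readings):
    """Return the largest difference, in seconds, between any two node clocks.

    `readings` maps a node label to the string printed by `date -u`
    on that node.
    """
    times = [
        datetime.strptime(text.strip(), DATE_FORMAT)
        for text in readings.values()
    ]
    return (max(times) - min(times)).total_seconds()


# Hypothetical node labels and times, for illustration only.
sample = {
    "utility": "Tue Jul 23 17:00:00 UTC 2013",
    "storage-1": "Tue Jul 23 17:00:02 UTC 2013",
    "storage-2": "Tue Jul 23 16:52:30 UTC 2013",
}

skew = max_skew_seconds(sample)
print(f"largest clock difference: {skew:.0f} s")
```

The readings should be captured at as close to the same moment as possible, otherwise the delay between running the commands shows up as apparent skew.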

50 Posts

July 24th, 2013 07:00

Good to hear that you are back in production. If the issue is resolved, could you please mark this discussion as answered?
