Unsolved

March 2nd, 2012 13:00

Global FS backup failed

Hi,

We have a Global FS backup that failed on only one of our platforms.

There is no obvious issue on that platform (IP connectivity, hardware defect, ...).

Is there anything specific we should check to find the cause of this failure?

Thanks

544 Posts

March 2nd, 2012 15:00

Would you please provide the following details to clarify the issue:

1- NetWorker version

2- Operating system version and architecture (x86 or x64)

3- GFS version

4- Error messages for backup failures.

Thanks,

Ahmed Bahaa

11 Posts

March 5th, 2012 08:00

Hi Ahmed,

1) NetWorker version: 7.4A00 Network

2) OS: Solaris 10, architecture x86

3) Could you please advise what you mean by GFS version?

4) The error is shown in the GUI, which reports that srv_fs_gp failed

Thanks

544 Posts

March 5th, 2012 16:00

Hi elmagik,

1) First, please note that NetWorker 7.4 has reached the end of its support life and is no longer supported by EMC, so you need to plan an upgrade of the NetWorker server and storage nodes to a supported version in order to be under EMC support.

Is NetWorker 7.4A00 the RTM version of 7.4, or is there a service pack applied to it?

2) Is the Solaris 10 x86 machine the backup server or the client? If it is the backup server, as far as I remember Solaris 10 x86 is supported as a backup server only with NetWorker 7.4 without any service packs, so I am not sure of your NetWorker server version. Please double-check whether it is 7.4 RTM or 7.4 with service packs.

I don't think this Solaris machine is the client, since the GFS filesystem runs on Linux flavours (Red Hat, SUSE, Oracle Linux), so please specify the client OS as well.

3) For the GFS version, there are notes on GFS support in the NetWorker compatibility guide. Here they are:

Both GFS 6.0 and GFS 6.1 work with 32-bit Linux and 32-bit NetWorker; they do NOT work with 64-bit Linux and 32-bit NetWorker.

GFS 6.0 is supported only by RH AS 3.0.

GFS 6.1 is supported only by RH AS 4.0, as detailed below:

--  GFS 6.1 on RH AS 4.0 x86  –  NW 7.4 32-bit version for Linux

--  GFS 6.1 on RH AS 4.0 x64  –  NW 7.4 64-bit version for Linux

4) For the error message, I think srv_fs_gp is the group name, so that message is not informative. Would you please provide the backup failure messages from daemon.raw so we can understand how the backup ran and why it failed?

I will be waiting for your update. Again, you need to plan an upgrade of the backup environment to one of the latest versions.

Thanks,

Ahmed Bahaa

11 Posts

March 6th, 2012 05:00

Hi Ahmed,

Sorry for the inconvenience.

1) NetWorker version: 7.4A00 Network  Edition/29

2) The OS for both the client and the backup server is Solaris 10, and the architecture for both is sun4v sparc

uname -a output from Backup server

uname -a

SunOS bar1 5.10 Generic_137111-08 sun4v sparc SUNW,Netra-T2000

3)

4) The error messages that I saw:

42506 03/03/12 01:40:01  2 0 0 1 2227 0 bar nsrd savegroup info: starting Srv_ts3nw1_FS_grp (with 1 client(s))

42506 03/03/12 01:40:01  2 0 0 1 2227 0 bar nsrd savegroup info: starting Srv_ts3nw1-backup_FS_grp (with 1 client(s))

1045 03/03/12 01:40:02  2 0 0 1 2227 0 bar nsrd GSS Legato authentication from ts3n1-backup failed...

40473 03/03/12 01:40:02  2 0 0 1 16349 0 bar savegrp command ' savefs -s bar-backup -c ts3nw1 -g Srv_ts3nw1_FS_grp -p -l full -R -v -F /global/Oam /global/OamFt' for client ts3nw1 exited with return code 1.

32496 03/03/12 01:40:02  2 0 0 1 16349 0 bar savegrp job (515514) host: ts3nw1 savepoint: ts3nw1:Probe had ERROR indication(s) at completion.

7340 03/03/12 01:40:02  2 0 0 1 16349 0 bar savegrp ts3n1:probe failed.

38718 03/03/12 01:40:03  0 0 2 1 2227 0 bar nsrd bar2:index:ts3nw1 saving to pool 'DiskPool' (bar.003)

38714 03/03/12 01:40:03  0 0 2 1 2227 0 bar nsrd bar2:index:ts3nw1 done saving to pool 'DiskPool' (bar.003)

38758 03/03/12 01:40:07  2 0 0 1 2227 0 bar nsrd savegroup failure alert: Srv_ts3n1_FS_grp Completed/Aborted, Total 1 client(s), 0 Clients disabled, 0 Hostname(s) Unresolved, 1 Failed, 0 Succeeded.

38758 03/03/12 01:40:07  2 0 0 1 2227 0 bar nsrd savegroup alert: Srv_ts3nw1_FS_grp completed, Total 1 client(s), 1 Failed. Please see group completion details for more information.

12662 03/03/12 01:40:08  2 0 0 1 2227 0 bar nsrd runq: NSR group Srv_ts3nw1_FS_grp exited with return code 1.

Thank you

544 Posts

March 6th, 2012 19:00

Thanks elmagik for the information,

It seems you have a communication issue between the client and the server, since there is a "probe failed" message. A probe failure can have one or more causes, such as a DNS issue, the NetWorker client software not being installed or not running, client aliases not being configured in NetWorker, or the client-side NSRLA resource containing an old or incorrect hostname.

So first we need to make sure that the client can communicate with the server and vice versa, using the ping, rpcinfo and nsradmin commands to verify that communication and name resolution are correct.
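
A minimal sketch of the name-resolution and RPC checks mentioned above, not an official EMC procedure. It assumes `getent` and `rpcinfo` are available on both hosts and that nsrexecd is registered under RPC program number 390113; please verify that number for your release. The function names `resolve_name` and `check_peer` are made up for this example, and the host names in the usage comment are the ones that appear in the log excerpt.

```shell
#!/bin/sh
# Sketch only: check that a peer resolves and that its nsrexecd answers RPC.
# Assumption: 390113 is the RPC program number of nsrexecd.

# Print the first address a name resolves to; fail if nothing comes back.
resolve_name() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }' | grep .
}

# Run both checks for one peer and return non-zero if either fails.
check_peer() {
    peer=$1
    status=0
    if addr=$(resolve_name "$peer"); then
        echo "OK   $peer resolves to $addr"
    else
        echo "FAIL $peer does not resolve"
        status=1
    fi
    if rpcinfo -t "$peer" 390113 >/dev/null 2>&1; then
        echo "OK   nsrexecd answers on $peer"
    else
        echo "FAIL no RPC answer from nsrexecd on $peer"
        status=1
    fi
    return $status
}

# Usage: run on the backup server against the client,
# then on the client against the backup server.
#   check_peer ts3nw1
#   check_peer bar-backup
```

Running it in both directions matters because a probe can fail when only one side resolves the other correctly.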

Then disable nsrauth (strong authentication) on both the client and the backup server. Please follow these steps to disable nsrauth:

nsradmin -p nsrexec

Use the "help" command for help, "visual" for full-screen mode.

nsradmin> . type: NSRLA

Current query set

nsradmin> show auth methods

nsradmin> p

                 auth methods: "0.0.0.0/0,nsrauth/oldauth";

nsradmin> update auth methods: "0.0.0.0/0,oldauth"

                 auth methods: "0.0.0.0/0,oldauth";

Update? y

updated resource id 3.0.156.101.198.4.188.67.137.69.101.64(137)

nsradmin> p

                 auth methods: "0.0.0.0/0,oldauth";

After turning off nsrauth, a restart of the nsrexecd service is required:

# /etc/init.d/networker stop

# /etc/init.d/networker start

After that, try the backup again and let me know how it goes.

Don't forget to plan an upgrade of the backup environment to one of the latest versions.

Thanks,
Ahmed Bahaa

11 Posts

March 6th, 2012 21:00

Hi Ahmed,

Thank you for your answer.

I wanted to add one point: this backup completed successfully the following day and has kept succeeding until today.

But this problem (FS backup failure) appears from time to time on different platforms.

Thanks

544 Posts

March 7th, 2012 18:00

Hi elmagik,

First, try disabling nsrauth as described in my previous post.

Do the backup failures that appear from time to time all fail due to probe issues? If so, it seems that you have connectivity or communication issues. You need to double-check that name and IP resolution work correctly and that the client can communicate with the server and vice versa, using the rpcinfo and nsradmin commands.
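
To answer the question of whether the intermittent failures are all probe failures, a rough count over the rendered daemon log may help. This is only a sketch: the function name `summarize_failures` is made up, and the three patterns are taken from the wording of the log lines posted earlier in this thread, so other failure messages would not be counted.

```shell
#!/bin/sh
# Sketch only: count failure lines in rendered daemon log text read on stdin.
# The patterns match the message wording seen in this thread's log excerpt.
summarize_failures() {
    awk '
        /probe failed/                   { probe++ }
        /authentication from .* failed/  { auth++ }
        /exited with return code [1-9]/  { nonzero++ }
        END {
            printf "probe failures: %d\n", probe
            printf "authentication failures: %d\n", auth
            printf "non-zero exits: %d\n", nonzero
        }'
}

# Usage, assuming the default log location and that nsr_render_log
# is on the PATH of the backup server:
#   nsr_render_log /nsr/logs/daemon.raw | summarize_failures
```

If the authentication count rises together with the probe count, that would point at the nsrauth setting rather than at plain name resolution.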

Hope that this helps,

Thanks,

Ahmed Bahaa

11 Posts

March 9th, 2012 05:00

Thank you very much, Ahmed, for your help.

544 Posts

March 9th, 2012 12:00

Hi elmagik,

Is the issue solved now? Do the backups for those servers work correctly?

Thanks,

Ahmed Bahaa
