Global FS backup failed
Hi,
A Global FS backup failed on only one of our platforms.
There is no obvious issue on that platform (IP connectivity, HW defect, ...).
Is there anything specific we should check to find the cause of this failure?
Thanks
Bebo2k
March 2nd, 2012 15:00
Would you please provide the following details to clarify the issue:
1- NetWorker version
2- Operating System version and architecture (x86 or x64)
3- GFS version
4- Error messages for backup failures.
Thanks,
Ahmed Bahaa
elmagik
March 5th, 2012 08:00
Hi Ahmed,
1) NetWorker version: 7.4A00 Network
2) OS: Solaris 10, architecture x86
3) Could you please advise what you mean by GFS version?
4) The error is shown in the GUI, which reports that srv_fs_gp failed
Thanks
Bebo2k
March 5th, 2012 16:00
Hi elmagik,
1) First, please note that NetWorker 7.4 has reached the end of its support life and is no longer supported by EMC. You should plan to upgrade the NetWorker server and the storage nodes to a supported version of NetWorker in order to be under EMC support.
Is NetWorker 7.4A00 the RTM version of 7.4, or has a Service Pack been applied to it?
2) Is Solaris 10 x86 the OS of the backup server or of the client? If it is the backup server, as far as I remember it is supported as a backup server only with NetWorker 7.4 without any service packs, so I am not sure about your NetWorker server version. Please double-check whether it is 7.4 RTM or 7.4 with service packs.
I don't think this Solaris host is the client, as the GFS filesystem runs on Linux flavours (Red Hat, SUSE, Oracle Linux), so please specify the client OS as well.
3) For the GFS version, the NetWorker compatibility guide has the following notes on GFS support:
Both GFS 6.0 and GFS 6.1 work with 32-bit Linux and 32-bit NetWorker; they do NOT work with 64-bit Linux and 32-bit NetWorker.
GFS 6.0 is supported only by RH AS 3.0.
GFS 6.1 is supported only by RH AS 4.0, as detailed below:
-- GFS 6.1 on RH AS 4.0 x86 – NW 7.4 32-bit version for Linux
-- GFS 6.1 on RH AS 4.0 x64 – NW 7.4 64-bit version for Linux
4) For the error message, I think srv_fs_gp is the group name, which by itself is not informative. Would you please provide the backup failure messages from daemon.raw so we can see how the backup ran and why it failed?
I will be waiting for your update. Again, you should plan to upgrade the backup environment to one of the latest versions.
Thanks,
Ahmed Bahaa
elmagik
March 6th, 2012 05:00
Hi Ahmed,
Sorry for the inconvenience.
1) NetWorker version: 7.4A00 Network Edition/29
2) The OS for both the client and the backup server is Solaris 10, and the architecture for both is sun4v sparc
uname -a output from the backup server:
uname -a
SunOS bar1 5.10 Generic_137111-08 sun4v sparc SUNW,Netra-T2000
3)
4) The error messages that I saw:
42506 03/03/12 01:40:01 2 0 0 1 2227 0 bar nsrd savegroup info: starting Srv_ts3nw1_FS_grp (with 1 client(s))
42506 03/03/12 01:40:01 2 0 0 1 2227 0 bar nsrd savegroup info: starting Srv_ts3nw1-backup_FS_grp (with 1 client(s))
1045 03/03/12 01:40:02 2 0 0 1 2227 0 bar nsrd GSS Legato authentication from ts3n1-backup failed...
40473 03/03/12 01:40:02 2 0 0 1 16349 0 bar savegrp command ' savefs -s bar-backup -c ts3nw1 -g Srv_ts3nw1_FS_grp -p -l full -R -v -F /global/Oam /global/OamFt' for client ts3nw1 exited with return code 1.
32496 03/03/12 01:40:02 2 0 0 1 16349 0 bar savegrp job (515514) host: ts3nw1 savepoint: ts3nw1:Probe had ERROR indication(s) at completion.
7340 03/03/12 01:40:02 2 0 0 1 16349 0 bar savegrp ts3n1:probe failed.
38718 03/03/12 01:40:03 0 0 2 1 2227 0 bar nsrd bar2:index:ts3nw1 saving to pool 'DiskPool' (bar.003)
38714 03/03/12 01:40:03 0 0 2 1 2227 0 bar nsrd bar2:index:ts3nw1 done saving to pool 'DiskPool' (bar.003)
38758 03/03/12 01:40:07 2 0 0 1 2227 0 bar nsrd savegroup failure alert: Srv_ts3n1_FS_grp Completed/Aborted, Total 1 client(s), 0 Clients disabled, 0 Hostname(s) Unresolved, 1 Failed, 0 Succeeded.
38758 03/03/12 01:40:07 2 0 0 1 2227 0 bar nsrd savegroup alert: Srv_ts3nw1_FS_grp completed, Total 1 client(s), 1 Failed. Please see group completion details for more information.
12662 03/03/12 01:40:08 2 0 0 1 2227 0 bar nsrd runq: NSR group Srv_ts3nw1_FS_grp exited with return code 1.
Thank you
Bebo2k
March 6th, 2012 19:00
Thanks elmagik for the information,
It seems you have a communication issue between the client and the server, as there is a "probe failed" message. A probe failure can have one or more causes, such as: a DNS issue, the NetWorker client software not installed or not running, client aliases not configured in NetWorker, or the client-side NSRLA containing an old or incorrect hostname.
So first we need to make sure that the client and the server can communicate with each other in both directions. Use the ping, rpcinfo and nsradmin commands to verify that communication works and that name resolution is correct.
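As a rough illustration of the name resolution part of that check, here is a minimal Python sketch. It is not a NetWorker tool: the function name `check_name_resolution` is hypothetical, and it only tests whether a hostname resolves to an IP address and whether that IP resolves back to the same name (or the same short name). It would have to be run on both the backup server and the client, and it does not replace the ping, rpcinfo and nsradmin checks.

```python
import socket


def check_name_resolution(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve the IP back, and compare the names.

    The forward/reverse parameters exist only so the lookups can be
    replaced in tests; by default the system resolver is used.
    """
    result = {"hostname": hostname, "ip": None, "reverse_names": [],
              "consistent": False, "error": None}
    try:
        ip = forward(hostname)
        result["ip"] = ip
        primary, aliases, _addresses = reverse(ip)
        names = [primary] + list(aliases)
        result["reverse_names"] = names
        wanted = hostname.lower()
        short = wanted.split(".")[0]
        # Assumption: a match on either the full name or the short
        # (first label) name counts as consistent.
        result["consistent"] = any(
            n.lower() == wanted or n.lower().split(".")[0] == short
            for n in names
        )
    except OSError as exc:  # gaierror and herror are subclasses of OSError
        result["error"] = str(exc)
    return result


if __name__ == "__main__":
    # Host names taken from the log excerpt in this thread; adjust as needed.
    for host in ("bar", "bar-backup", "ts3nw1"):
        print(check_name_resolution(host))
```

A result with `consistent` set to False, or with `error` filled in, would point towards the DNS or hosts file entries for that name.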
Then disable nsrauth (strong authentication) on both the client and the backup server. Please follow these steps to disable nsrauth:
nsradmin -p nsrexec
Use the "help" command for help, "visual" for full-screen mode.
nsradmin> . type: NSRLA
Current query set
nsradmin> show auth methods
nsradmin> p
auth methods: "0.0.0.0/0,nsrauth/oldauth";
nsradmin> update auth methods: "0.0.0.0/0,oldauth"
auth methods: "0.0.0.0/0,oldauth";
Update? y
updated resource id 3.0.156.101.198.4.188.67.137.69.101.64(137)
nsradmin> p
auth methods: "0.0.0.0/0,oldauth";
After turning off nsrauth, the nsrexecd service must be restarted:
# /etc/init.d/networker stop
# /etc/init.d/networker start
After that, try the backup again and let me know how it goes.
Don't forget to plan for upgrading the backup environment to one of the latest versions.
Thanks,
Ahmed Bahaa
elmagik
March 6th, 2012 21:00
Hi Ahmed,
Thank you for your answer.
I wanted to add one point: this backup completed successfully the next day and has kept succeeding until today.
But this problem (FS backup fails) appears from time to time on different platforms.
Thanks
Bebo2k
March 7th, 2012 18:00
Hi elmagik,
First, try disabling nsrauth as described in my previous post.
Do the backup failures that appear from time to time all fail because of probe issues? If so, it seems that you have connectivity or communication issues. You need to double-check that name resolution and IP resolution work correctly and that the client and the server can communicate with each other in both directions, using the rpcinfo and nsradmin commands.
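To help answer whether the intermittent failures are all probe failures, the log could be scanned with something like the Python sketch below. It assumes the log is already available as plain text in the same layout as the excerpt posted earlier in this thread (message ID, date, time, six numeric fields, host, program, message). The function name `summarize_failures` and both patterns are illustrative and are not part of NetWorker.

```python
import re
import sys
from collections import Counter

# Assumed line layout, based only on the excerpt in this thread:
# <msg id> <MM/DD/YY> <HH:MM:SS> <six numeric fields> <host> <program> <message>
LINE_RE = re.compile(
    r"^(?P<msgid>\d+)\s+(?P<date>\d{2}/\d{2}/\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?:\d+\s+){6}"
    r"(?P<host>\S+)\s+(?P<program>\S+)\s+(?P<message>.*)$"
)
# e.g. "ts3n1:probe failed."
PROBE_RE = re.compile(r"^(?P<client>[^:\s]+):probe failed", re.IGNORECASE)
# e.g. "GSS Legato authentication from ts3n1-backup failed..."
AUTH_RE = re.compile(r"authentication from (?P<peer>\S+?) failed")


def summarize_failures(lines):
    """Count probe and authentication failures per (date, host name)."""
    summary = {"probe": Counter(), "auth": Counter()}
    for line in lines:
        parsed = LINE_RE.match(line.strip())
        if not parsed:
            continue  # skip lines that do not follow the assumed layout
        message = parsed.group("message")
        probe = PROBE_RE.match(message)
        if probe:
            summary["probe"][(parsed.group("date"), probe.group("client"))] += 1
        auth = AUTH_RE.search(message)
        if auth:
            summary["auth"][(parsed.group("date"), auth.group("peer"))] += 1
    return summary


if __name__ == "__main__":
    # Usage: python scan_log.py <rendered log file>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        report = summarize_failures(handle)
    for kind, counts in report.items():
        for (date, name), count in sorted(counts.items()):
            print(f"{kind:5s} {date} {name} x{count}")
```

If every failed night shows a probe failure together with an authentication failure for the same client, that would support the connectivity or authentication explanation rather than a problem with the filesystem itself.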
Hope that this helps,
Thanks,
Ahmed Bahaa
elmagik
March 9th, 2012 05:00
Thank you very much Ahmed for your help.
Bebo2k
March 9th, 2012 12:00
Hi elmagik,
Is the issue solved now? Are the backups for those servers working correctly?
Thanks,
Ahmed Bahaa