Unsolved



February 23rd, 2010 06:00

Need index/SSID replication to DR site

Hi All,

We have 3 sites

First, the main site: NW 7.3.3 on a W2k3 server, 70 clients, all W2K or W2k3.

Clients are backed up to an attached tape library.

Second site (soon to be the "warm DR" site): NW 7.3.3 on a W2k3 server (hardware due to be replaced), with a matching number of (dummy) clients.

A single tape drive is attached for recovery.

Third site, a sister company: NW 7.4, unknown hardware.

Due to the excessive time taken to recover indexes at this site, I need to put a similar solution in place for them.

Our problem is that, to hold a DR test at our second site, we must scan the tapes (six LTO3s) to rebuild the indexes and recover the SSIDs etc.

This wastes over a full day with tape swaps etc.
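(What we do today is essentially one scanner run per loaded tape, roughly as below; the device name is just an example for our W2k3 server:

   scanner -i \\.\Tape0

repeated for each of the six LTO3s, which is where the day goes.)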

To avoid this I would like to set up NetWorker to replicate the indexes and SSIDs from the first site to the second.

That way we can recover data on day 1 of the DR test, or straight away at the third site.

How best to achieve this? Convert the NW server at the second site to a storage node? How do I replicate to that site?

I'm afraid a simple "oh, just do X" won't really help, as I'm still somewhat of a newbie here.

A steer to a white paper, or a previous solution in the forums or on the net, would be a big help.

Thanks for your patience, and in advance for any advice or help.

2 Intern


202 Posts

February 28th, 2010 13:00

Hi!

I met this problem a few years ago. The customer wanted to create a DR site

which works as a storage node during normal hours, with some tapes cloned to it,

and in case of disaster it becomes the server.


I will find the doc I wrote myself and will send you the DR steps.

(I still have to find the doc.)

As far as I remember, I had to write a lot of scripts to automate this process.

Regards:

Paul

2 Intern


202 Posts

March 1st, 2010 02:00

Hi!

Let me describe our environment:
We have one storage node, which has some cloned tapes. In case of disaster
we have to convert our storage node to a server.


According to our docs, the requirements for converting the storage node to a server are:
1. The storage node version should match the server version

2. We have to know the client IDs of the storage node and the server:
   mminfo -av -q client=comshare -r client,ssid,name

3. The save set ID of the bootstrap

4. Where the latest bootstrap is located (see the sketch after this list)
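For items 3 and 4, a rough sketch of the query we use (output columns and volume names will of course differ per site):

   mminfo -B

The last line of the output is the newest bootstrap; note its ssid, file and record numbers, and the volume name tells you where that bootstrap is located.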



Our recovery steps are:
1. Shut down NetWorker on the storage node
2. Custom script which does the following:
   - back up the /nsr dir
   - change the hostname
   - change /etc/nsswitch.conf and /etc/hosts
   - copy the preconfigured /nsr directory to the storage node
   - start NetWorker
3. Custom script which runs the following:
   - add the licenses
4. Start the NetWorker Administrator

5. Custom script which gives the following information about the last bootstrap:
   - cartridge name, bootstrap ssid, file number, record number

6. Load the cartridge into the jukebox

7. Run the mmrecov command

8. Stop NetWorker

9. Custom script which does the following (see the sketch after this list):
   - rename res.R to res

10. Delete some clients from the Administrator screen

11. Unmount volumes from the Administrator screen

12. Delete all the tape drives

13. Custom script which does the following:
    - change the IP of the server

14. Create a new client from the Administrator screen

15. Configure a new jukebox with jbconfig

16. Make an inventory

17. Custom script which does the following:
    - remove volumes whose tape locations are not correct

18. Custom script which does the following:
    - run nsrck -L7

19. The new server is ready to use
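For illustration, a minimal sketch of what the step 9 script does, assuming a Unix host with the default /nsr location (the init script name may differ per platform):

   nsr_shutdown                      # stop all NetWorker daemons
   mv /nsr/res /nsr/res.orig         # keep the old resource files just in case
   mv /nsr/res.R /nsr/res            # mmrecov restored the resources as res.R; make them live
   /etc/init.d/networker start       # start NetWorker with the recovered resources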

We recover the Oracle DB after that. The recovery process is faster if we run multiple sessions
simultaneously (multiple directories or datafiles on the same machine).






As you see it is quite a long process. I think two options are good for DR:
1. NetWorker cluster. It is a bit expensive, because you have to store the /nsr dir on
   shared storage.

2. Manual failover to the DR site. As you see, this needs a lot of manual scripting.
   We don't have very big index files, so the nsrck -L7 command runs fast.

   Optionally you can save all the index files to disk with the savegrp -O command; in
   case of disaster this disk area is replicated to the DR site and you can quickly recover the indexes and res files too.

   The main problem with the manual method is that if we want to recover to another NetWorker node, the host name also changes, and
   that generates a lot of scripting work.
      
The NetWorker Disaster Recovery Guide is a good step-by-step guide for you.


I hope this helps.

BR: Paul

2 Intern


202 Posts

March 1st, 2010 02:00

One tip:

If you want to save the indexes to disk, do the following:

- create a group named, for example, dr_index_backup

- create new clients with the save set: index:

- create a disk pool for this group

- start the group from the command line: savegrp -O -l full

This will save the indexes to the disk area. Once the indexes are on disk you can replicate them to another server and mount them there.

(The server names should be the same.)
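A rough sketch of how this looks from the command line, assuming the group is named dr_index_backup as above:

   # index-only (-O) backup of every client in the group, forced to level full
   savegrp -O -l full dr_index_backup

Afterwards the client file indexes sit on the disk device in the group's pool, and that disk area is what you replicate to the DR site.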

445 Posts

March 1st, 2010 02:00

Pal.

You could also take the licensing step out by authorising the licenses against a composite hostid (made up of the storage node hostid and the NetWorker server hostid). This allows the software to run on either host without any changes. You can achieve a composite hostid by creating the file /nsr/res/hostids containing one line in the form hostid1:hostid2, and then restarting the software. Obviously when you do this you will have to authorise the licenses on the NetWorker server again once completed, but going forward this should be the last time you have to do so.
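For what it's worth, a minimal sketch of that file on a Unix host (the two host ID values are made-up examples; use the ones NetWorker reports for your server and storage node):

   # composite hostid file: serverhostid:storagenodehostid (example values only)
   echo "c0a80101:c0a80102" > /nsr/res/hostids
   # restart NetWorker so the composite hostid takes effect
   nsr_shutdown
   /etc/init.d/networker start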

You can also put the file in place to get the composite hostid and then rename it away until you have transferred the licenses to the composite hostid; the software will revert back to the original hostid.

If you have scripted this then it will not take you much time to re-authorise the licenses; however, it's one less step.

Regards,

Bill Mason

4 Operator


14.4K Posts

April 14th, 2010 06:00

There are products which can do replication at block level, either via SAN or LAN, so you may wish to investigate whether any of those would match your requirement.

20 Posts

April 28th, 2010 08:00

Hi,

We have a solution that encompasses SRDF. Our indexes are 30-40 GB for the biggest file servers, totalling over half of the 500 GB LUN.

After many issues in our environment, support have told us that SRDF is not supported because the disk response times are too slow, so we have to keep the SRDF split whilst NetWorker is running. Our EMC team has changed a few times, and when we complain that the solution is flawed, the response is "I did not know that, it was before my time. I cannot be responsible".

To work around this, on a weekly basis we re-establish the SRDF pair, and stop the NetWorker services before splitting the disk again (see the sketch below).
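Our weekly window is scripted roughly along these lines (the SYMCLI device group name nsr_dg is made up, and the NetWorker stop/start commands depend on the platform):

   # resync the R2 side with production
   symrdf -g nsr_dg establish -noprompt
   # check the pair state; repeat until it reports Synchronized
   symrdf -g nsr_dg verify -synchronized
   # quiesce NetWorker so the split copy of the index LUN is consistent
   nsr_shutdown
   # split again so the R2 side holds a usable point-in-time copy
   symrdf -g nsr_dg split -noprompt
   /etc/init.d/networker start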

However, being backup related, if we have a DR occurrence then the most recent CFI (client file index) changes (which I suspect are the most valuable) would not be available.

We do however clone the indexes to the remote site.

Phil

4 Operator


14.4K Posts

May 4th, 2010 04:00

I do not have any issues with SRDF. In my case the whole cluster-protected /nsr is SRDFed, and the index/bootstrap is saved once and then cloned to the remote site. I only once had to restore from the mirror (and only because I was too lazy to run mmrecov) and I had no issues. My biggest index is just 2 GB (we use a retention of only 9 days) and we are at 40% of the total capacity of the disk (of which the indexes take only 26 GB).
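(The clone step is essentially just this, with the pool name being a made-up example:

   # find the ssid of the newest bootstrap, then clone it to the offsite pool
   mminfo -B
   nsrclone -b OffsiteClone -S <ssid-from-mminfo>

so a copy of the bootstrap always exists at the remote site.)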

2 Intern


326 Posts

April 19th, 2012 21:00

Hi

Good thread. Good to know replication is supported for the index. Just one question: what happens if a shutdown occurs or the site is lost while the index is being replicated from the server to the storage node (which will become the server during DR)? Corruption?

Thank you!

4 Operator


14.4K Posts

April 30th, 2012 14:00

You won't get any corruption. The part which is not committed is broken anyway, and as such you would not use it for a restore. Your biggest concern is the media database, which should be OK. Of course, replication as such does not protect you from logical errors and corruption if they happen; that's why you do backups. I have 6 scheduled jobs per day with a bootstrap copy, and I find those enough to give me a point of return with acceptable loss if that ever happened in the first place.
