July 1st, 2013 13:00

Ask the Expert: NetWorker Day-to-Day Operations - A Customer's Perspective

YOU MAY ALSO BE INTERESTED IN THESE ATE EVENTS...

Ask the Expert: Networker Server Best Practices

Ask the Expert: Avamar Plugin for Microsoft Applications

https://community.emc.com/message/739300#739300

Welcome to this EMC Support Community Ask the Expert conversation. This discussion will focus on day-to-day NetWorker operations from the perspective of an EMC customer.

Your host:

Dan Gauld (an EMC Partner) is in residency working at Mainland Information Systems. Dan has focused on backup/recovery and data storage for the last 8 years, and has performed numerous assessments, deployments and upgrades for NetBackup, Backup Exec, NetWorker, Avamar and Commvault infrastructures. Dan writes at backupbuddha.ca about backup and recovery for both the enterprise and the home consumer. You can follow Dan on Twitter @backupbuddha.

This discussion begins on July 29 and concludes on August 9. Get ready by following this page to receive updates in your activity stream or through email.

55 Posts

July 28th, 2013 17:00

Hey everybody! I want to thank Mark and EMC for the invite to share some knowledge here. Being asked to participate in a forum where I am the expert is a little intimidating, especially given the subject matter. NetWorker has a long history. There are probably some fellow grey beards out there who may have first encountered NetWorker when it was a product of a company called Legato in the late 80s. Back then, data centers were growing, as were network bandwidth capabilities. Prior to this, if you wanted to protect a server, you connected a tape drive, wrote some scripts and cron’d them. Legato identified that this was quickly becoming an administrative nightmare. The introduction of a centralized backup server, drive sharing and job scheduling solved some problems.

Back then, nobody wanted to deal with backups, and the same is true today. Luckily for us, that vacuum created another niche in the IT industry: the backup recovery specialist. One of the great things about this role is that BRS touches all parts of an IT infrastructure. A backup recovery expert needs to have some rudimentary knowledge of databases and of virtual and physical systems, as well as specific OS knowledge of said systems. Never mind storage, which may include SAN connectivity and protection options specific to NDMP. It goes on….

I started in storage and data protection in 1996 when I joined the storage team for a growing Oil and Gas company here in Calgary.  After a few years taking care of their backup infrastructure I landed at Mainland Information Systems again in residency, but quickly moved to professional services. I now find myself back in residency, helping a large client with an ever changing dynamic infrastructure trying to stay ahead of their growing data protection challenges. When I first landed here a couple of years ago, I was new to NetWorker. I quickly realized that although a lot of the concepts and ins and outs of the product were shared with other products I had supported in the past, this creature required some special care and feeding. Not to sound dramatic, but if this beast doesn't get enough attention it will bite you.

Creating documentation specific to new learnings is a great way to solidify knowledge, but I, like most of you, really dislike creating and maintaining documentation. So I started backupbuddha.ca as a way of documenting the NetWorker-specific knowledge I gain in day-to-day operations. I want to share here in this forum the little bit of what I’ve learned about the inner workings of NetWorker for those who may be new to the product. What do you, the NetWorker administrator, need to do on a daily basis to ensure the infrastructure is healthy? Feel free to ask any questions, but EMC support is your best bet if you have an issue. I open tickets with EMC ALL THE TIME. Their support engineers are the best in the business. I take those learnings and share them on backupbuddha.ca to ensure I retain some of that knowledge in my back pocket.

In addition to EMC support, I would be remiss if I did not mention my acquaintance Preston de Guise. If you have ever had to perform a Google search on anything NetWorker related, you’ve probably stumbled across his website nsrd.info. His site is as valuable as support.emc.com and ECN to the NetWorker administrator. His book, Enterprise Systems Backup and Recovery: A Corporate Insurance Policy, is also a great resource for backup administrators of all products and managers of backup portfolios alike. I very much hope to someday buy him more than a few pints at EMC World in Vegas to thank him for his invaluable contributions to the NetWorker community.

Anyway, let’s get started. The first post I’m planning on is specific day to day operations of NetWorker. You just stumbled into the office in the morning. Coffee in hand, you need to check out NetWorker and make sure all is well. What should you look at? What should you look for?  Exciting? Not really, but stay tuned. I’ll try to make this as entertaining as possible.

2 Intern

 • 

326 Posts

July 28th, 2013 21:00

coool...good start..will brain dump soon

55 Posts

July 29th, 2013 09:00

Good Day!

So here is my day so far. I woke up and attempted to determine if that was a good or bad thing. I decided I was indifferent and then wondered why I was in pain. Oh yeah! Leg day at the gym yesterday! Then my morning routine begins. This usually entails dragging my wife out of bed. She is not a morning person, AT ALL.

Today was a little easier as she was starting her new job! Cat fed and watered, then the Dan gets fed and watered. I dress in my typical uniform of jeans and a black v neck t-shirt. I should not be dressing so casually when I work on site with a client. I usually apologize when I run into my client or manager and say I was pulling cable in the data centre and didn't want to get my nice clothes dirty. I pull a lot of cable. Today, I actually am at the data centre. It's nice here. We have a small office to work from where it’s quiet and I can hide from the world.

Now I need coffee. I'll be right back.....

photo.JPG.jpg

OK, coffee in hand! I may or may not also have two chocolate glazed donuts. I will neither confirm nor deny.

So first, let's log in to the NMC console. Did it work? Excellent! NetWorker is up and running! Small victories, my friends. Embrace these.

Before launching the app, I like to take a look at the events.

nw2.png

Here we see any outstanding alerts. These may be failed backups or other warnings. Today is looking like a good day. A few failures, nothing we can’t handle.

Let’s launch the console and go to the monitoring tab.

I know what you're thinking. Really, this guy is going to show us the NetWorker monitoring tab? Stay with me, it will get better.

nw3.jpg

The key thing I want to impart here is what to look for. You want to see which backups failed last night. If you have a healthy environment, most of your backup failures can be chalked up to what I call "an act of god". We live in an imperfect world. People sometimes spill milk. Maybe you burnt the toast, and sometimes backups fail. For most backup failures you will not go down a rabbit hole to find the root cause. If you can, re-run it. Does it work? Great! Go on with your life. This may sound obvious to some of you, but I have actually had debates with co-workers who thought every backup failure should be investigated to the nth degree. OK, this happened once, and the guy was kind of a tool.

The point is, in a healthy environment, you are going to have backup failures. In my environment most of the failures are related to the server team shutting down or decommissioning a server AND NOT TELLING ME! It's not like I get mad about it or anything, so yeah. I investigate when a client or a specific client component fails repeatedly. We will get into some basic client backup troubleshooting later.

Also look at the alerts and log sections. These areas are good for some immediate feedback on the current health of the environment. There are some more in-depth checks we will get into later.

So I have noted the backup failures. I see I have some jobs running. I like to check whether the jobs are indeed running or are hung up.

This morning, the running jobs are for some of the filers that are backing up over NDMP.

Let's look at one of the jobs.

nw4.jpg

What I want to do is see if the job is running and actually has tapes mounted.

Note the volume names in the job details.

Then let’s go over to devices.

nw5.jpg

What I'm looking for is some correlation with the job details on the previous screen. Are the volumes loaded? More importantly, are the drives writing, and at what speed?

Part deux will be coming up later today, but I have some work to do now and more coffee to drink.

55 Posts

July 29th, 2013 14:00

Hey!

I’m back! So let’s finish off our daily checks. Next up is checking devices. Given everything appears OK, I’m not anticipating an issue, but I like to keep an eye on these things.


Most larger environments will have a mix of disk and probably tape. Tape!? Yup it’s still around. Don’t pretend it doesn't have a place. It does and will continue to for some time. Check out the devices and look for any that may be down or in service mode. If there are any down you will need to go to your library admin console for further troubleshooting.


nw6.jpg

The same goes for any disk storage units. I have a couple of DD990s here. I like to check to make sure the disks are mounted and good to go.

nw7.jpg

Also, if you have a very busy environment, you may want to have a look at any backups that run during the day. With tape, there are a lot of moving parts. Not just the robot and the drives themselves. There is also the NetWorker database that tracks and allocates media. Not to mention the relationship to the devices from the OS hosting NetWorker. In short, there are a lot of points of failure for your devices. So, sometimes I will watch the library portion of the GUI. Tomorrow we will look a little at the native NetWorker alerts that can be configured to help you keep on top of some of these.

55 Posts

July 29th, 2013 16:00

Please share,

1 Message

July 30th, 2013 08:00

Have you ever seen this error during a routine saveset restore? warning: ASDF type 0x65 version 0 not recognized

55 Posts

July 30th, 2013 13:00

Busy day here today kids, I'll check in tomorrow.

2 Intern

 • 

1.2K Posts

July 30th, 2013 19:00

Hello Dan,

Do you have any disaster recovery plan and operation for your backup system/backup data in your data center?

How many Data Domain systems are in your data center?

55 Posts

July 31st, 2013 13:00

Hey Tim

We back up the bootstrap to DD and clone to a secondary DD offsite.

55 Posts

July 31st, 2013 14:00

Hey guys, it has been a busy week. The next topic I wanted to touch on is some of the other items you should keep an eye on that may indicate issues. The daemon.log captures all NetWorker operations and associated alerts, warnings and errors. I have seen issues that resulted in unrecoverable data, where it was not obvious that there was a problem, despite all the previously mentioned daily checks being completed. It's a good idea to go through the daemon.log and grep out any warnings or errors. All warnings and errors should be actioned. Some of the messages may, and most likely will, seem esoteric. EMC support should be engaged to help understand the error and the ramifications that may result.
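As a rough sketch of that grep step, here is one way it could be done in Python. It assumes the log has already been rendered to plain text at /nsr/logs/daemon.log and that problem lines contain the words "warning", "error" or "critical"; the keyword list and function names are illustrative only, not anything NetWorker defines.

```python
import re
from pathlib import Path

# Assumption: rendered log lines are plain text and problem lines
# contain one of these whole words (case-insensitive).
PATTERN = re.compile(r"\b(warning|error|critical)\b", re.IGNORECASE)


def find_problem_lines(lines):
    """Return (line_number, text) pairs for lines that mention a problem keyword."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line)
    ]


def scan_log(path="/nsr/logs/daemon.log"):
    """Scan a rendered log file; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return find_problem_lines(handle)


if __name__ == "__main__":
    log = Path("/nsr/logs/daemon.log")  # path taken from the post
    if log.exists():
        for number, text in scan_log(log):
            print(f"{number}: {text}")
```

Whatever it prints is only a starting list. Deciding which messages matter is still a job for you and, where the message is esoteric, EMC support.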

The daemon.raw file can be configured for realtime rendering. Let's give props to Preston.

Basics – Realtime rendered logs and other log options « NetWorker Blog

This should be done from nsradmin:

nsradmin> update runtime rendered log: /nsr/logs/daemon.log

Otherwise, you can use the nsr_render_log utility.

Some useful switches...

-R hostname: renders the log from a remote host

-Y severity: outputs messages that match this severity

-F devicename: outputs only messages related to a specific device

These are just a few. Now a couple of warnings. The daemon file can get chatty and especially verbose.

You can read about a fun day I had with this here.

NetWorker Log File Size Management

In short, it's important to ensure the max file size is somewhat restricted. Otherwise, your issue may be compounded by the disk space filling up on your server.
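Separately from capping the size inside NetWorker, a small check can warn when a log file has already grown large. The sketch below is only an illustration. The /nsr/logs directory and the 500 MB threshold are assumptions, so pick values that suit your server.

```python
from pathlib import Path


def oversized_files(log_dir, limit_bytes):
    """Return sorted (file_name, size_in_bytes) pairs for files larger than limit_bytes."""
    root = Path(log_dir)
    if not root.is_dir():
        return []
    return sorted(
        (entry.name, entry.stat().st_size)
        for entry in root.iterdir()
        if entry.is_file() and entry.stat().st_size > limit_bytes
    )


if __name__ == "__main__":
    # Hypothetical values: adjust the directory and threshold for your environment.
    for name, size in oversized_files("/nsr/logs", 500 * 1024 * 1024):
        print(f"{name} is {size / (1024 * 1024):.0f} MB")
```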

Another area to keep an eye on is /nsr/cores. Not sure if Windows has an equivalent.

Inside there are directories specific to the NetWorker daemons and processes. I should really write a script to check these for new cores and email me when a core dump for a specific process occurs. Also, I like to keep an eye on syslog (and/or Event Viewer) for anything unusual.
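A minimal sketch of such a core-checking script might look like the following. It assumes /nsr/cores holds one sub-directory per daemon, as described above, and treats any file modified in the last 24 hours as "new". The function names and the 24-hour window are illustrative choices. The emailing step is left out; Python's standard smtplib module could be wired in wherever the results are printed.

```python
import time
from pathlib import Path


def new_core_files(cores_dir, max_age_hours=24, now=None):
    """Return files under cores_dir modified within the last max_age_hours, sorted by path."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    root = Path(cores_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_mtime >= cutoff
    )


def summarize(paths, cores_dir):
    """Group file names by their top-level directory under cores_dir (assumed one per daemon)."""
    summary = {}
    for path in paths:
        daemon = path.relative_to(cores_dir).parts[0]
        summary.setdefault(daemon, []).append(path.name)
    return summary


if __name__ == "__main__":
    cores = "/nsr/cores"  # path taken from the post
    for daemon, names in summarize(new_core_files(cores), cores).items():
        print(f"{daemon}: {len(names)} new core file(s): {', '.join(names)}")
```

Run from cron once a day, this would report only the cores that appeared since the previous run, provided the window matches the schedule.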

2 Intern

 • 

1.2K Posts

August 4th, 2013 19:00

Hello Dan,

Thanks for your great sharing. We are hoping you can share more about the common mistakes we may make on daily NetWorker management/operation and more best practice.

55 Posts

August 7th, 2013 10:00

Apologies for being absent! It was a long weekend up here in Canada and so yesterday I was kind of swamped. You know how it is.

I have a couple of other posts planned before we wrap up. Next up is database maintenance!

Excited? I know you are!

55 Posts

August 7th, 2013 10:00

Hey Jon!


Sorry, I had never had the pleasure of seeing that error before. When I googled it all I came back with was another post you made in another forum. lol.

I'll poke around a little more, time permitting. Please share here if you find the answer before this is locked.

666 Posts

August 7th, 2013 10:00

Hi all. As part of our continuing effort to garner feedback to improve EMC Ask the Expert and other content offerings, we'd like you to participate in this short survey (see link below). It's five questions long and will take less than 5 minutes.

It will provide valued feedback for future Ask the Expert NetWorker topics, and provide valued information for what you, our community, wants out of the NetWorker product going forward.

It will also help us focus on the correct training material you require to get the best out of the EMC NetWorker portfolio.

We really appreciate your participation in this survey.

Click here to take the EMC Ask the Expert NetWorker survey

55 Posts

August 7th, 2013 10:00

Thanks Tim! It was a nice exercise to sit back and really think about exactly what we do day to day.

We take so much of what we do for granted. Operations is often seen as boring in relation to PS type work, but really without us to keep the lights on it falls apart.
