Assess health of storage system

Question

Is there a guide to assessing the health of my VNX storage system?  It has been running a few years.  What should I check to make sure it is healthy?  How often should I be checking these things? Thanks!

dynamox · Answer

Make sure there are no faults, make sure you stay on a supported code level, and make sure call-home/email-home is properly configured (verify the SMTP server address in the template).

Rojizo · Answer

OK thanks.  How often do I need to check for faults and where do I check for them?

dynamox · Answer

Typically you don't have to check for faults manually, because the array is supposed to call home for events that require attention. But in my experience I always like to have another set of "eyes" checking on the health of my arrays. Back in the days when call-home events were actually "dial-home" events over regular phone lines, there were instances in my data center where the phone lines got disconnected because someone thought they were not in use. So you have a disk failure in your array, it cannot dial out, and you end up with a failure you don't know about unless you happen to log in to the GUI. Nowadays notification is done via email, but that does not guarantee anything either. I have had instances where a customer changed their SMTP server IP address and the array was no longer able to send its notifications.

So I always create little Perl/Bash scripts that use naviseccli commands for CLARiiON/VNX and symcli for DMX/VMAX to verify the health of my arrays. For VNX there are other tools that help you monitor health; I can't remember the app name now, but you can add multiple VNX arrays and monitor their status from there.
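As a rough illustration of the kind of script described above, here is a minimal sketch built around the `faults -list` command. The function names, the `NAVICLI` variable and the SP hostnames are hypothetical. The check also assumes that a healthy array reports a line containing "operating normally", so verify that string against your own array's output before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: flag any SP whose fault list does not look healthy.
# ASSUMPTIONS: NAVICLI path, SP names, and the "operating normally"
# marker string are illustrative and must be checked on your system.

NAVICLI="${NAVICLI:-naviseccli}"   # override if the binary lives elsewhere

# Return 0 if the given fault-list text looks healthy, 1 otherwise.
faults_ok() {
    printf '%s\n' "$1" | grep -qi 'operating normally'
}

# Query one SP and print a one-line OK/ALERT summary.
# On ALERT the raw CLI output is printed too, so nothing is hidden.
check_sp() {
    local sp="$1" out
    out="$("$NAVICLI" -h "$sp" faults -list 2>&1)"
    if faults_ok "$out"; then
        echo "OK $sp"
    else
        echo "ALERT $sp"
        printf '%s\n' "$out"
        return 1
    fi
}
```

Calling `check_sp SPA` and `check_sp SPB` from cron on a host other than the array, and sending any ALERT output through a notification path you already trust, gives you the second set of eyes even when the array's own email-home is broken.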

Jon_hope · Answer

I manage about 8 different VNX arrays across our infrastructure, and what I do, besides the normal day-to-day alerts and monitoring, is run a series of manual checks on each array once a month. It's very quick. I've included what I run and am intrigued to hear what others do.

I'm sure there are easier, more automated ways, but nothing beats hands-on assurance that things are working properly. I also just run them quickly from the CLI and pipe the output to a file, so it acts as an "oh yeah, that was the state of it on such-and-such day" record.

These are for Unified systems:

/nas/sbin/getserial

/nas/sbin/getreason

nas_storage -c -a (SP issues)

/nas/sbin/navicli -h SPA faults -list

/nas/sbin/navicli -h SPA getlun -trespass (check for trespassed LUNs)

>> /nas/sbin/navicli -h SPA trespass mine (only if trespassed LUNs exist)

nas_cs -info

nas_inventory -tree

nas_fs -list (check filesystems)

server_df server_2 (or server_3, depending on whether the filesystems live on DM 2 or 3)

nas_checkup (this will verify call-home/auto-transfer status)
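To make the "pipe the output to a file" habit repeatable, the read-only checks above could be wrapped roughly like this. It is a sketch, not a tested procedure: the function names, log file name and directory argument are hypothetical, and the `trespass mine` command is deliberately left out because it changes array state.

```shell
#!/usr/bin/env bash
# Sketch only: run the read-only checks from the list above and keep
# a dated record of the array state. Log name/location are assumptions.

# Run one command; append a header, its output and exit status to the log.
run_check() {
    local log="$1"; shift
    {
        echo "=== $* ==="
        "$@" 2>&1
        echo "exit status: $?"
        echo
    } >> "$log"
}

# Run the monthly checks into <dir>/vnx-health-YYYYMMDD.log and print the path.
monthly_checks() {
    local dir="${1:-.}"
    local log="$dir/vnx-health-$(date +%Y%m%d).log"
    run_check "$log" /nas/sbin/getserial
    run_check "$log" /nas/sbin/getreason
    run_check "$log" nas_storage -c -a
    run_check "$log" /nas/sbin/navicli -h SPA faults -list
    run_check "$log" /nas/sbin/navicli -h SPA getlun -trespass
    run_check "$log" nas_cs -info
    run_check "$log" nas_inventory -tree
    run_check "$log" nas_fs -list
    run_check "$log" server_df server_2   # or server_3, per the note above
    run_check "$log" nas_checkup
    echo "$log"
}
```

Running `monthly_checks /some/log/dir` on the Control Station leaves one file per run, which makes it easy to diff this month's state against last month's.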

khanz1 · Answer

Another tool you may want to consider is Unisphere Service Manager (USM). It has a Diagnostics tab that gives you the options 'Verify Storage System' and 'Capture Diagnostic Data'. You also get a list of technical advisories applicable to your system, and you can generate an HTML report of the system configuration that includes a list of issues and a system fault summary. There is also an option to run a health check.

anre51801 · Answer

VNX Monitoring and Reporting is free with the purchase of a VNX and can be installed on CentOS 6, which is also free.

For heavier lifting there is ViPR SRM, which lets you correlate and monitor more components: VMware, network, etc.
