Large service delivery accounts often find it difficult to perform health checks and monitor capacity on hundreds of arrays and switches spread across different environments and locations. To meet Service Level Agreements (SLAs), a health check report is prepared several times a day to closely monitor the arrays and to ensure that all failures are handled appropriately. This time-consuming, tedious task requires multiple engineers dedicated to this purpose. The complexity increases when fabrics are spread across different environments and locations. Moreover, because the arrays and switches are heterogeneous, a single tool may not be able to monitor the entire environment.
In this Knowledge Sharing article, Mumshad Mannambeth and Salem Sampath list the methodologies used to implement a time-saving, automated health check and capacity report generation process for large numbers of different types of arrays and switches in a shared cloud environment. The process uses numerous scripts developed in OS-specific shells, together with presentation tools such as Excel, so that reports can be prepared in minutes. It eliminates the need for users to log in to each management workstation to inspect the array. Instead, health check scripts are run on the workstations as scheduled tasks, and reports are automatically emailed to the administrators group.
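The scheduled "run a check, then email the result" step described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the OS-specific shell scripts the authors use; the function names, the `results` dictionary format, the addresses, and the SMTP host are all assumptions made for the example, not details from the article.

```python
from email.message import EmailMessage
import smtplib


def build_report(results):
    """Turn {device_name: status} into a one-line summary and a text body.

    Assumption: a status of "OK" means healthy; anything else is a failure.
    """
    failed = {name: status for name, status in results.items() if status != "OK"}
    lines = [f"{name}: {status}" for name, status in sorted(results.items())]
    summary = f"{len(failed)} of {len(results)} devices need attention"
    return summary, "\n".join(lines)


def build_email(results, sender, recipients):
    """Wrap the report in an email message (hypothetical addresses)."""
    summary, body = build_report(results)
    msg = EmailMessage()
    msg["Subject"] = f"Health check: {summary}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(body)
    return msg


def send_email(msg, smtp_host):
    """Send the message through an SMTP relay (host name is an assumption)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```

A scheduled task (cron, Windows Task Scheduler, or similar) could call `build_email` with the output of whatever vendor-specific check runs on that workstation and then pass the message to `send_email`.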
The article explores health check routines for heterogeneous arrays and switches, including:
Hitachi HDS Arrays
IBM XIV Arrays
HP EVA/XP Arrays
The authors also discuss:
Scripts to perform health checks on the above-mentioned arrays and switches
Data gathering techniques
Emailing techniques from different OS platforms
Automatic FTP uploading and downloading scripts
Analysis of collated reports using Excel VBA: automated consolidation of health check reports from multiple Outlook emails into a single Excel-based report
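The collation idea in the last item, merging many per-site reports into one spreadsheet-friendly view, can be illustrated with a short sketch. The article does this with Excel VBA over Outlook emails; the Python below is a stand-in for that technique, not the authors' implementation. The `device: status` line format, the site names, and the function names are assumptions for illustration.

```python
import csv
import io


def parse_report(site, text):
    """Parse lines of the assumed form 'device: status' from one site's report."""
    rows = []
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip lines that do not look like a device entry
        device, status = line.split(":", 1)
        rows.append({"site": site, "device": device.strip(), "status": status.strip()})
    return rows


def collate(reports):
    """Merge {site: report_text} into one list of rows, failures listed first."""
    rows = []
    for site, text in reports.items():
        rows.extend(parse_report(site, text))
    rows.sort(key=lambda r: (r["status"] == "OK", r["site"], r["device"]))
    return rows


def to_csv(rows):
    """Render the rows as CSV text that Excel can open directly."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["site", "device", "status"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Sorting failures to the top is one possible design choice: it lets an administrator see what needs attention without scanning the whole sheet.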
The procedure described in this article has been successfully implemented in multiple major accounts, reducing the manual effort and time needed to perform health checks on a large number of arrays and switches. It is especially suitable for accounts that monitor multiple domains and heterogeneous products that cannot be monitored with a single tool.