Introducing openUSM – simplifying server management & insight log analytics using docker containers


Containers are changing the dynamics of the modern data center. The technology is growing quickly and drawing widespread attention across enterprise IT. One of the primary reasons for the rise and adoption of containers is that they let developers move faster. Compared to VMs, which take minutes to stand up, containers typically start in seconds or less. As organizations prioritize shipping new products and features faster to keep up in a software-eaten world, developers are favoring technology that lets them scale applications and deploy resources much faster than traditional VMs on public and private clouds can support.

Docker containers bring a variety of cost and performance benefits. They make it possible to run multiple applications on the same server or OS without a hypervisor, which eliminates the hypervisor's drag on system resources. Your workloads therefore have a lighter footprint: the overhead of a container is close to zero, because it is simply a boundary of permissions and resources within Linux. Containers also fire up and decommission very rapidly compared to virtual machines, a perfect fit for the ephemeral nature of today's short-lived workloads, which are often tied to real-world events.

Docker is an open platform that makes applications and workloads more portable and distributed in an effective and standardized way. Combining Docker containers and microservices with software-defined infrastructure makes the datacenter more agile and allows quick resource reallocation. Hence, this architecture works well to improve datacenter operations.

Most DellEMC server management software offerings, as well as the entire Software Defined Infrastructure, are built on a standard RESTful architecture called Redfish. Redfish is a next-generation system management standard that uses a data model representation inside a hypermedia RESTful interface. The data model is defined in terms of a standard, machine-readable schema, with the payload of the messages expressed in JSON and the protocol using OData v4. Since it is a hypermedia API, Redfish is capable of representing a variety of implementations through a consistent interface. It has mechanisms for discovering and managing data center resources, handling events, and managing long-lived tasks. It is easy to implement and easy to consume, and it offers scalability advantages over older technologies. Because it runs over HTTPS, it is usable by clients, scripts, and browser-based GUIs.
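To make the "hypermedia" part concrete, here is a minimal Python sketch of how a client might walk from the Redfish service root to the individual systems by following "@odata.id" links rather than hard-coding paths. The sample payloads and the fake_fetch helper are made up for illustration; a real client would issue authenticated HTTPS GET requests against the iDRAC instead, and this is not taken from the OpenUSM code.

```python
from typing import Callable, Dict, List

# Sample payloads shaped like Redfish resources. The system IDs are
# invented for this sketch; a real service returns its own IDs.
SAMPLE_RESOURCES: Dict[str, dict] = {
    "/redfish/v1/": {"Systems": {"@odata.id": "/redfish/v1/Systems"}},
    "/redfish/v1/Systems": {
        "Members": [
            {"@odata.id": "/redfish/v1/Systems/1"},
            {"@odata.id": "/redfish/v1/Systems/2"},
        ]
    },
}


def fake_fetch(uri: str) -> dict:
    """Stand-in for an HTTPS GET that returns the decoded JSON body."""
    return SAMPLE_RESOURCES[uri]


def list_system_uris(fetch: Callable[[str], dict]) -> List[str]:
    """Follow links from the service root to every member of Systems."""
    root = fetch("/redfish/v1/")
    # The root advertises where the Systems collection lives.
    systems_uri = root["Systems"]["@odata.id"]
    collection = fetch(systems_uri)
    # Each collection member is itself a link to a single resource.
    return [member["@odata.id"] for member in collection.get("Members", [])]


system_uris = list_system_uris(fake_fetch)
print(system_uris)
```

Passing the fetch function in as a parameter keeps the link-following logic independent of the HTTP library you choose.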

What is OpenUSM?

Figure 1: Open Universal Systems Manager

OpenUSM stands for Open Universal Systems Manager. It is a multi-tool product, like a Swiss army knife: a suite of open source tools and scripts that uses containers and related tooling to perform server management tasks, monitoring, and insight log analytics. It is a 100% container-based solution that relies heavily on Docker to build microservices for system management tasks such as BIOS token changes and firmware updates. It is a completely out-of-band system management solution based purely on the Redfish API. It is platform agnostic (it can run from a laptop, a server, or the cloud) and works on any Linux or Windows platform with Docker Engine running on top of it.

Value Proposition:

OpenUSM follows an easy deployment model. It uses developer tools such as Docker Compose and the Docker CLI to bring up the microservices that carry out system management tasks. It uses modern tools and technologies and integrates well with near-real-time search and analytics tools like the ELK stack. It enables sensor log analytics and visualization for operations teams using the popular Grafana tool. It can scale both vertically and horizontally. As it is completely open source, you are free to build and customize it based on your needs, and its components and functionalities are plug-and-play.

Technology Overview of OpenUSM:

OpenUSM is an integrated container solution that brings three basic functionalities: system management, monitoring, and insight log analytics. The system management stack sits in the middle of the architecture and uses Python, Redfish, Django, Docker, and Docker Compose to enable system management tasks. On the left-hand side of the architecture is the monitoring stack, which uses Prometheus, Grafana, Alertmanager, and Pushgateway to retrieve GPU and sensor metrics. On the right-hand side is the open source version of Elasticsearch, Logstash, and Kibana, which handles sensor, Lifecycle Controller, and SEL logs.

Figure 2: Open USM Technology Overview
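For readers new to the monitoring side, the sketch below shows how a single sensor reading could be rendered in the Prometheus text exposition format, which is what Pushgateway accepts on its /metrics/job/<job> endpoint. The metric name, label names, and job name are hypothetical and chosen only for illustration; OpenUSM may name and ship its metrics differently.

```python
from urllib.parse import quote


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_metric(name: str, labels: dict, value: float) -> str:
    """Render one sample as a line of Prometheus text exposition format."""
    label_text = ",".join(
        '{}="{}"'.format(key, escape_label_value(str(val)))
        for key, val in sorted(labels.items())
    )
    return "{}{{{}}} {}\n".format(name, label_text, value)


def pushgateway_path(job: str) -> str:
    """URL path on a Pushgateway for pushing metrics grouped under a job."""
    return "/metrics/job/" + quote(job, safe="")


# Hypothetical metric and labels for an inlet temperature sensor.
line = format_metric(
    "idrac_sensor_temperature_celsius",
    {"idrac": "192.0.2.10", "sensor": "Inlet"},
    41.5,
)
print(line, end="")
print(pushgateway_path("openusm_sensors"))
```

The rendered line would be sent as the body of an HTTP POST or PUT to the Pushgateway path, after which Prometheus scrapes it and Grafana can graph it.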

Understanding OpenUSM System Management WorkFlow

OpenUSM uses a "Container-Per-Server (CPS)" model. For each server management task, there is a script which, when executed, builds and runs a Docker container for each server. It uses the Redfish API exclusively to communicate directly with the Dell iDRAC, collects the iDRAC/LC logs, and pushes them to a syslog server. Logstash collects the logs from the syslog server and pushes them to Elasticsearch, and finally they are visualized through Kibana. OpenUSM uses the Prometheus stack to monitor system components such as GPU and CPU, using nvidia-docker and node exporter.

Figure 3: Open USM System Management WorkFlow
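The Container-Per-Server idea can be sketched in a few lines of Python: for every iDRAC IP, plan one "docker run" invocation. The image name, the container naming scheme, and the IDRAC_IP environment variable below are assumptions made for this sketch, not the names the OpenUSM scripts actually use.

```python
from typing import Iterable, List

# Hypothetical image name, used only for this illustration.
IMAGE = "openusm-task-example"


def docker_run_command(idrac_ip: str, image: str = IMAGE) -> List[str]:
    """Build the argv for one container dedicated to one server."""
    container_name = "usm-" + idrac_ip.replace(".", "-")
    return [
        "docker", "run", "--rm",
        "--name", container_name,
        "-e", "IDRAC_IP=" + idrac_ip,  # hypothetical variable read by the task
        image,
    ]


def plan_containers(idrac_ips: Iterable[str]) -> List[List[str]]:
    """One command per server: the Container-Per-Server model."""
    return [docker_run_command(ip) for ip in idrac_ips]


commands = plan_containers(["192.0.2.10", "192.0.2.11"])
for command in commands:
    # To actually launch it: subprocess.run(command, check=True)
    print(" ".join(command))
```

Because each server gets its own short-lived container, a failure on one iDRAC does not affect the tasks running against the others.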

In this blog post, I am going to demonstrate how OpenUSM simplifies overall system management tasks with the help of Docker containers and Redfish. For this demonstration, I will use an Ubuntu 17.10 VM running on my ESXi 6.0 system. The code should work on any Linux or Windows platform too.

The source code for this project is open to the public and is available at https://github.com/openusm/openusm

Cloning the Repository

$ git clone https://github.com/openusm/openusm

Bootstrapping Docker

If you already have Docker installed on your system, you can skip to the next step. If not, run the command below to install Docker and Docker Compose on your system.

$ sh bootstrap.sh install_docker

Depending on your network connectivity, this step will take 1-2 minutes to complete.

Bootstrapping ELK

OpenUSM is a 100% containerized solution, so we will run ELK inside Docker containers. To keep it simple, we designed a docker-compose file that can get you started in a matter of seconds.

Execute the same bootstrap file to bring up the ELK stack as shown below:

$ sh bootstrap.sh provision_elk

Wait 30-40 seconds for the ELK stack to come up.

Figure 4: ELK stack

Verifying the ELK services

Run the command below to check whether the ELK services are up and running:

$ sh bootstrap.sh logs
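If you would rather script the wait than watch the logs, a small poller like the one below can block until a service port accepts TCP connections. It is only a sketch: it assumes the stack publishes the usual defaults (9200 for Elasticsearch, 5601 for Kibana), which may differ from what the OpenUSM compose file exposes, and an open port does not guarantee the service has finished initializing.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example usage (assumed default ports, adjust to your deployment):
#   wait_for_port("127.0.0.1", 9200)   # Elasticsearch
#   wait_for_port("127.0.0.1", 5601)   # Kibana
```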

Pushing DellEMC iDRAC Logs to ELK Stack

In this section, I am going to demonstrate how OpenUSM makes it easy to push the logs of DellEMC system management tasks to a centralized ELK stack and visualize them in Kibana. Let us pick a simple "BIOS Token Change" task and apply it to a multitude of DellEMC servers.

To keep it simple, we designed a script named "bios-token.py", which is placed at the root of the OpenUSM Git repository. Let us first look at the parameters that can be supplied to the bios-token.py script:


$ python bios-token.py --help
usage: bios-token.py [-h] [--verbose] [-i IDRAC] [-n NFS] [-s SHARE]
                     [-c CONFIG] [-f IPS]

Welcome to Universal Systems Manager Bios Token Change

optional arguments:
  -h, --help            show this help message and exit
  --verbose             Turn on verbose logging
  -i IDRAC, --idrac IDRAC
                        iDRAC IP of the Host
  -n NFS, --nfs NFS     NFS server IP address
  -s SHARE, --share SHARE
                        NFS Share folder
  -c CONFIG, --config CONFIG
                        XML File to be imported
  -f IPS, --ips IPS     IP files to be updated
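For reference, an argparse definition that would produce a help screen like the one above might look as follows. This is a reconstruction from the help text only, not the actual source of bios-token.py, and the NFS address in the example call is a placeholder from the documentation range.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser reconstructed from the help output; the real script may differ."""
    parser = argparse.ArgumentParser(
        prog="bios-token.py",
        description="Welcome to Universal Systems Manager Bios Token Change",
    )
    parser.add_argument("--verbose", action="store_true",
                        help="Turn on verbose logging")
    parser.add_argument("-i", "--idrac", help="iDRAC IP of the Host")
    parser.add_argument("-n", "--nfs", help="NFS server IP address")
    parser.add_argument("-s", "--share", help="NFS Share folder")
    parser.add_argument("-c", "--config", help="XML File to be imported")
    parser.add_argument("-f", "--ips", help="IP files to be updated")
    return parser


# Parse an explicit argument list so the sketch runs without a shell.
args = build_parser().parse_args(
    ["-f", "ips.txt", "-s", "/var/nfsshare", "-c", "biosconfig.xml",
     "-n", "192.0.2.5"]
)
print(args.ips, args.share, args.config, args.nfs)
```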

As shown above, the script can target a single Dell server as well as a multitude of DellEMC servers listed in a plain text file. We are currently looking at an autodiscovery feature to automate this. You need to provide the NFS server IP, the share name, and the BIOS token configuration file as arguments for it to execute successfully. Once the script is invoked, it creates one Docker container per DellEMC server, collects the iDRAC logs from each server, and pushes them to a syslog server that runs inside a Docker container. Logstash collects them from syslog and stores them in Elasticsearch so they can be visualized in the Kibana UI.

$ python bios-token.py -f ips.txt -s /var/nfsshare -c biosconfig.xml -n <NFS server-IP>

where:

/var/nfsshare => NFS share

ips.txt => list of DellEMC iDRAC IPs

biosconfig.xml => XML defining the BIOS token entries
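Since a typo in ips.txt would otherwise only surface when a container fails to reach its iDRAC, it can help to validate the list up front. The sketch below assumes the file holds one IP address per line; skipping blank lines and "#" comments is a convention added here for illustration, not something the OpenUSM scripts are known to support.

```python
import ipaddress
from typing import Iterable, List


def parse_ip_lines(lines: Iterable[str]) -> List[str]:
    """Return the valid IP addresses found in lines, one address per line."""
    ips = []
    for number, raw in enumerate(lines, start=1):
        entry = raw.strip()
        # Assumed convention: ignore blank lines and '#' comments.
        if not entry or entry.startswith("#"):
            continue
        try:
            ips.append(str(ipaddress.ip_address(entry)))
        except ValueError:
            raise ValueError(
                "line {}: not a valid IP address: {!r}".format(number, entry)
            )
    return ips


sample = ["192.0.2.10\n", "\n", "# lab rack\n", "192.0.2.11\n"]
print(parse_ip_lines(sample))

# With a real file:
#   with open("ips.txt") as handle:
#       idrac_ips = parse_ip_lines(handle)
```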

Figure 5: iDRAC logs

By now, you should be able to see the iDRAC logs visualized in the Kibana UI. The Kibana UI can be customized extensively to display the logs on a per-server basis.

Figure 6: Log Analytics

Insight Log Analytics (LC, SEL & Sensor Logs)

Unpacking OpenUSM further, insight log analytics is an interesting use case and a robust capability. A DellEMC server generates a variety of logs, such as system event logs (SEL), RAID controller logs, and Lifecycle Controller (LC) logs. When a system event occurs on a managed system, it is recorded in the System Event Log (SEL). The SEL page displays a system health indicator, a time stamp, and a description for each event logged. The same SEL entry is also available in the Lifecycle Controller (LC) log.

Consider a use case where a datacenter administrator wants to collect LC logs for the last year: collecting that much data and analyzing it calls for a robust, modern tool.
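To give a feel for the kind of processing involved, here is a small Python sketch that keeps only the log entries from the last year and counts them by severity. It assumes each entry is a dictionary with the "Created" and "Severity" properties of a Redfish log entry and that timestamps carry a UTC offset; the sample entries are invented, and in OpenUSM this sort of aggregation is done in Elasticsearch and Kibana, not in a script like this.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


def parse_created(value: str) -> datetime:
    """Parse an ISO 8601 timestamp that includes a UTC offset."""
    # fromisoformat() before Python 3.11 rejects a trailing "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def entries_since(entries: Iterable[dict], now: datetime,
                  days: int = 365) -> List[dict]:
    """Keep the entries created within the last `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return [e for e in entries if parse_created(e["Created"]) >= cutoff]


def count_by_severity(entries: Iterable[dict]) -> Counter:
    """Count entries per Severity value, e.g. to feed a pie chart."""
    return Counter(e.get("Severity", "Unknown") for e in entries)


# Invented sample entries for illustration.
sample_entries = [
    {"Created": "2018-08-01T10:15:00-05:00", "Severity": "OK"},
    {"Created": "2017-01-15T08:00:00-06:00", "Severity": "Warning"},
    {"Created": "2018-03-20T23:59:59Z", "Severity": "Critical"},
]
now = datetime(2018, 8, 27, tzinfo=timezone.utc)
recent = entries_since(sample_entries, now)
print(count_by_severity(recent))
```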

We recently designed a script that simplifies this kind of log analytics using Docker, Redfish, and ELK. You can find the "lclogexporter.py" script at the root of the GitHub repository:

$ python lclogexporter.py -f ips.txt -ei <ip-address of ELK> -eu elastic -ep changeme

Figure 7: list of iDRAC IPs

This script requires the Elasticsearch IP address, credentials, and a list of iDRAC IPs in order to push all iDRAC LC logs to the ELK stack. Whenever you execute the script, it first builds a Docker image called "openusm-analytics" and then runs a container from it, which automatically pushes all LC, SEL, and sensor logs to the ELK stack.
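One common way to load many log entries into Elasticsearch is its bulk API, which takes newline-delimited JSON: an action line followed by the document itself. The sketch below only builds such a request body. Whether lclogexporter.py uses the bulk API is an assumption, the index name is hypothetical, and depending on the Elasticsearch version the action line may need additional fields.

```python
import json
from typing import Iterable


def to_bulk_body(entries: Iterable[dict], index: str) -> str:
    """Build a newline-delimited JSON body for an Elasticsearch _bulk request."""
    lines = []
    for entry in entries:
        # Action line: index the document that follows into `index`.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(entry))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


# Invented log entries; "idrac-lclogs" is a hypothetical index name.
body = to_bulk_body(
    [
        {"idrac": "192.0.2.10", "Severity": "OK", "Message": "Log cleared."},
        {"idrac": "192.0.2.11", "Severity": "Warning", "Message": "Fan redundancy lost."},
    ],
    index="idrac-lclogs",
)
print(body, end="")
```

The resulting body would be sent with an HTTP POST to the _bulk endpoint using the Elasticsearch credentials passed to the script.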

Figure 8: Kibana graph

Below is the Kibana UI showing a pie chart of the Dell Lifecycle Controller logs collected over the last year.

Figure 9: Pie-Chart for Dell Lifecycle

Insight Log Metrics (Sensor Logs) using Grafana

Figure 10a: Temperature Graph

Figure 10b: Temperature Graph 2

Did you find OpenUSM an interesting project? Want to contribute? OpenUSM is just a 3-month-old project, and we are looking for contributors across the globe. If you think the project looks cool, come and join us to make it more robust. We welcome your participation.

Figure 11: Join OpenUSM




Article ID: SLN312577

Last Date Modified: 08/27/2018 03:42 PM

