Taking DevOps Productivity to New Heights with AIOps Automation

Discover where AIOps integration with other IT management applications is heading.

AIOps applications are a hot topic, and their ability to integrate with other tools in the IT management ecosystem has big implications for how teams can step up daily operations productivity.

As Stephen Elliott from the IDC analyst group explains in the video below, AIOps is a new class of application that applies knowledge, based on machine learning and analytics, to provide insights and recommendations that improve and accelerate IT Operations. Leveraging automation, AIOps can further accelerate positive IT, DevOps and, ultimately, business outcomes.

CloudIQ is Dell’s AIOps proactive monitoring and predictive analytics application for Dell servers, storage, data protection and network systems. With hundreds of software engineers continuously churning out new CloudIQ features, we’re constantly soliciting feedback from the thousands of CloudIQ users.

AIOps Users Speak Out

User surveys show that CloudIQ’s AI/ML-driven capabilities result in 2X to 10X faster time-to-resolution of issues¹ and save IT specialists an average of one workday (nine hours) per week.¹ CloudIQ user surveys also reveal how IT teams are thinking about ways to leverage AIOps insights with automation and increase gains. While nearly a third of users are undecided about how to leverage AIOps insights with automation, more than two-thirds want to use them to automate actions, but with various degrees of oversight and control.²

CloudIQ Webhook and the CloudIQ REST API are new features that will address those needs by enabling your ITOps, DevOps and SRE (Site Reliability Engineering) teams to integrate data and insights from CloudIQ into your IT Monitoring and Orchestration application ecosystem. CloudIQ’s trusted data and insights become the intelligence that will drive IT automation to further reduce toil.    

CloudIQ User Survey: AIOps Automation Preferences

Why AIOps Integration?

The CloudIQ AIOps application for proactively managing infrastructure health, performance, capacity and cybersecurity was extended over the past twelve months to integrate with other tools via Webhook and REST API. During a three-month technical preview for each feature, we took users’ feedback into account before releasing them into production. We were amazed by the large number of participants and huge appetite to leverage CloudIQ insights through automation.

Integration with ITSM (IT Service Management), collaboration and orchestration tools, to name a few use cases, allows customers to reduce toil and streamline processes and flows. Based on CloudIQ’s reliable data and intelligent insights, DevOps and SRE (Site Reliability Engineering) teams can now automate the creation of incident tickets and the remediation of well-known and repetitive issues. They can also integrate CloudIQ with mechanisms other than email (such as Teams and Slack) to communicate with team members.

The key here, in terms of user experience, is that we’ve built our Webhook and REST API with consistency in mind: consistency in the way you interact with them and consistency in the way they work across Dell’s server, storage, data protection, storage area networking and IP networking systems.

Webhook

Because system health is the most pressing concern, we started with the CloudIQ Webhook, which sends health issue occurrences to remote endpoints via a simple HTTP POST with an easy-to-parse JSON payload. Webhook endpoints can only be managed by users with the right role, and Webhook messages are signed (HMAC-SHA256) and sent over HTTPS.
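To make the signing step concrete, here is a minimal sketch of how a receiving endpoint might check an HMAC-SHA256 signature before trusting a Webhook message. It assumes the signature arrives as a hex string and that both sides share a secret. The function names are hypothetical, and the way CloudIQ actually transmits and encodes the signature is not described here, so check the product documentation before relying on any of these details.

```python
import hashlib
import hmac
import json
from typing import Optional


def verify_signature(secret: bytes, body: bytes, received_hex: str) -> bool:
    """Return True if received_hex is the HMAC-SHA256 of body under secret.

    Assumption: the signature is hex encoded. The real encoding may differ.
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    candidate = received_hex.strip().lower()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode("ascii"), candidate.encode("utf-8"))


def handle_webhook(secret: bytes, body: bytes, received_hex: str) -> Optional[dict]:
    """Parse the JSON payload only after the signature has been verified."""
    if not verify_signature(secret, body, received_hex):
        return None  # reject unsigned or tampered messages
    return json.loads(body)
```

The signature is computed over the raw request bytes, so the check should run before any JSON parsing or re-serialization changes the body.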

Leveraging Webhook, you can create incident tickets in your favorite ITSM tool, pre-populating fields with data coming from CloudIQ and avoiding manual steps. You can also send health issue occurrences to a collaboration tool to notify the SRE team that there’s something wrong to look at. In both cases, flows can be designed to ask for the user’s approval before allowing automated remediation to be launched.

This will reduce time to restore service or to mitigate risks and is especially suited for common, repeatable tasks such as:

    • Storage capacity is rapidly approaching full; automate a storage object (e.g., file system) expansion
    • A server component is not responding; automate a reboot to potentially resolve the issue
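The approval-gated flow described above can be sketched as a small dispatcher. This is an illustration only: the HealthIssue fields, the category names and the two remediation functions are hypothetical stand-ins, not part of the CloudIQ Webhook payload or any Dell API. In practice the approval callback would be a ticket or chat prompt in your ITSM or collaboration tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class HealthIssue:
    # Hypothetical fields; a real payload will be shaped differently.
    system: str
    category: str
    description: str


def expand_storage_object(issue: HealthIssue) -> str:
    # Placeholder: a real runbook would call your orchestration tool here.
    return f"expand storage object on {issue.system}"


def reboot_component(issue: HealthIssue) -> str:
    # Placeholder: a real runbook would call your orchestration tool here.
    return f"reboot component on {issue.system}"


# Only well-known, repeatable issues get an automated runbook.
RUNBOOKS: Dict[str, Callable[[HealthIssue], str]] = {
    "capacity": expand_storage_object,
    "component": reboot_component,
}


def plan_remediation(
    issue: HealthIssue, approved: Callable[[HealthIssue], bool]
) -> Optional[str]:
    """Return the action to run, or None if there is no runbook or no approval."""
    runbook = RUNBOOKS.get(issue.category)
    if runbook is None:
        return None  # unknown issue type: leave it to a human
    if not approved(issue):
        return None  # the user declined, so nothing is automated
    return runbook(issue)
```

Keeping the approval check as a separate callback lets a team start with a human in the loop for every action and relax that control later for the issues it trusts most.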

CloudIQ Webhook is available today for the full range of Dell infrastructure systems that CloudIQ monitors.

REST API

Dell recently introduced a CloudIQ public REST API, based on a Dell standard REST API style for Dell’s infrastructure products. Dell API standards enable all Dell product APIs to look and behave consistently so that you don’t have to re-invent the wheel when integrating with various Dell products.

The CloudIQ REST API uses a common model across Dell infrastructure products to expose systems, components, attributes and metrics, making it easier for customers to list all objects across products. For example, listing all volumes across storage systems allows DevOps teams to automate chargeback or showback flows without having to rely on multiple element managers and integration points.
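As a rough illustration of the showback idea, the sketch below totals used capacity per owner from a flat list of volume records. The record fields (system, owner, used_gib) are assumptions made for the example and are not the CloudIQ REST API schema. A real integration would first fetch and map the API's volume objects into this shape.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def showback_by_owner(volumes: Iterable[Mapping[str, object]]) -> Dict[str, float]:
    """Sum used capacity (GiB) per owner across all storage systems."""
    totals: Dict[str, float] = defaultdict(float)
    for vol in volumes:
        totals[str(vol["owner"])] += float(vol["used_gib"])
    return dict(totals)


# Hypothetical records from two different storage systems in one list.
SAMPLE_VOLUMES = [
    {"system": "array-a", "owner": "finance", "used_gib": 120.0},
    {"system": "array-b", "owner": "web", "used_gib": 80.0},
    {"system": "array-b", "owner": "finance", "used_gib": 30.0},
]
```

Calling showback_by_owner(SAMPLE_VOLUMES) returns {"finance": 150.0, "web": 80.0}. Because every product exposes volumes through the same model, the aggregation does not need per-product branches.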

Three typical use cases we heard from our customers during our REST API tech preview were:

    • ITSM integration to synchronize system inventory from CloudIQ with a CMDB (Configuration Management Database)
    • Enterprise Manager of Managers integration to pull key metrics and KPIs from CloudIQ into a central repository for creating dashboards that combine systems monitored by CloudIQ with systems it does not monitor
    • Showback dashboard integration to pull capacity utilization metrics at the storage object and compute level from CloudIQ so stakeholders can see what’s in use
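The CMDB synchronization use case above mostly comes down to comparing two inventories. The sketch below assumes each inventory is a dictionary keyed by a system identifier, with a dictionary of attributes as the value. Those shapes and the plan_cmdb_sync name are hypothetical. The function only computes a plan and leaves the CMDB writes to your ITSM tool.

```python
from typing import Dict, List, NamedTuple


class SyncPlan(NamedTuple):
    to_create: List[str]  # in the monitored inventory, missing from the CMDB
    to_update: List[str]  # in both, but the attributes differ
    to_retire: List[str]  # in the CMDB, no longer in the monitored inventory


def plan_cmdb_sync(source: Dict[str, dict], cmdb: Dict[str, dict]) -> SyncPlan:
    """Compare the monitored inventory (source) with the CMDB contents."""
    to_create = sorted(key for key in source if key not in cmdb)
    to_update = sorted(
        key for key in source if key in cmdb and source[key] != cmdb[key]
    )
    to_retire = sorted(key for key in cmdb if key not in source)
    return SyncPlan(to_create, to_update, to_retire)
```

Producing a plan first, rather than writing changes directly, makes it easy to review or require approval for retirements before anything in the CMDB changes.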

CloudIQ REST API is read-only today, and we’re working on additional key features to enable even more use cases.

As of this writing, CloudIQ REST API supports the Dell storage portfolio (PowerStore, PowerMax, PowerScale, PowerVault, Unity/Unity XT, SC Series, XtremIO), Dell Connectrix SAN switches and Dell PowerEdge servers. Our strategy is to extend REST API support to the other Dell products that CloudIQ monitors.

There is much more to come and you can learn more about CloudIQ here:

1 Based on a Dell Technologies survey of CloudIQ users conducted May-June 2021. Actual results may vary.

2 Based on a Dell Technologies survey of CloudIQ users conducted March-May 2022.

Frederic Meunier

About the Author: Frederic Meunier

Frederic drives the innovation roadmap for CloudIQ, the AIOps application for Dell’s IT infrastructure system portfolio. He has been with Dell for over 10 years, joining from the Watch4net acquisition where he was a co-creator and then Product Manager of APG, Watch4net's flagship performance management product. Working in data center and network infrastructure monitoring for over 20 years, Frederic has a deep understanding of infrastructure domains and the monitoring and reporting use cases that are critical to IT operations’ success.