Zero Touch Provisioning is Essential to Meet Infrastructure Demand

How Dell IT is using zero touch provisioning to simplify and accelerate infrastructure provisioning.

Every IT organization is on a mission to get its infrastructure logistics out of the way of its developers with on-demand provisioning. We want to offer seemingly endless capacity to on-prem private cloud users, echoing their public cloud experience. Our goal is to drive adoption and lower costs. At Dell Digital, Dell’s IT organization, we took on end-to-end hardware automation through Zero Touch Provisioning (ZTP) to keep pace with relentless capacity demand while delivering a reliable, scalable on-prem private cloud.

The largest investment in an on-prem private cloud is always going to be compute and storage hardware. Depending on the scale, it will require hundreds or even thousands of servers that will all need to be racked, cabled, tested, configured and installed as virtualization clusters. The operational effort to turn servers into clusters by hand, step by step, does not scale. The more repetitive a task is, the higher the risk of human shortcuts and oversights. These can impact the reliability of your on-prem private cloud for years. Your hardware is the foundation, and your on-prem private cloud can only ever be as scalable and reliable as that foundation.

As the Director of Zero Touch Engineering for Dell Digital, I built a team to address the efficiency, scalability and reliability of our cluster build process. We collaborated with the infrastructure teams to standardize and automate the core steps of the provisioning process. We now configure new blank servers into standard virtualization clusters, test and validate all functions, and enable them as capacity in a quarter of the time it previously took.

This shrinks the window where the hardware is depreciating while sitting idle in the warehouse and gets capacity to our users much more quickly.

If you continue to do infrastructure hardware deployments manually, you will never achieve the economies of scale needed to meet that capacity demand curve. Standardizing and automating not only makes provisioning faster and more efficient, but also establishes detailed knowledge about the hardware in your environment, which gives day-two operations a basis for maintaining your ecosystem going forward.

Transforming Manual Processes

When we kicked off the Zero Touch Provisioning effort two years ago, nine infrastructure teams were involved in the day-to-day effort of building clusters. Many of the construction steps required physically or virtually touching all servers in each cluster to complete the same set of tasks over and over, then moving on to the next cluster. The overall process was high friction and required daily calls and multiple full-time project managers to push clusters through the pipeline from team to team.

What’s more, Dell Digital tripled its hardware asset spending over the previous five years, tripling our deployment burden.

To address this challenge, we first built our ZTP team by bringing in engineers skilled in microservice architecture, workflow management, automation frameworks and front-end design. We identified and clarified existing standards, helped resolve discrepancies and closed standards gaps. We also discovered and added new steps to the provisioning process to increase validation for reliability.

Today, there are 25 steps in our workflow, each with clearly established inputs that are structured and encoded in our ZTP database. Each step replaces dozens of tasks that were previously manual.
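
To make the idea of a step with clearly established inputs concrete, here is a minimal Python sketch. It is not our ZTP implementation: the `Step` class, the `assign_hostnames` step and the field names `rack_id` and `server_count` are all hypothetical, chosen only to show how a step can declare what it needs and refuse to run until the record holds it.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Step:
    """One workflow step with the inputs it needs before it can run."""
    name: str
    required_inputs: List[str]
    run: Callable[[Dict[str, Any]], Dict[str, Any]]


def missing_inputs(step: Step, record: Dict[str, Any]) -> List[str]:
    """Return the required inputs that the cluster record does not yet hold."""
    return [key for key in step.required_inputs if key not in record]


def execute(step: Step, record: Dict[str, Any]) -> Dict[str, Any]:
    """Run a step only when every declared input is present."""
    missing = missing_inputs(step, record)
    if missing:
        raise ValueError(f"{step.name}: missing inputs {missing}")
    # Merge the step's outputs into a new record so later steps can use them.
    return {**record, **step.run(record)}


# Hypothetical step: derive hostnames from the rack ID and server count.
assign_hostnames = Step(
    name="assign_hostnames",
    required_inputs=["rack_id", "server_count"],
    run=lambda r: {
        "hostnames": [
            f"{r['rack_id']}-node{i:02d}"
            for i in range(1, r["server_count"] + 1)
        ]
    },
)

cluster = {"rack_id": "r12", "server_count": 3}
updated = execute(assign_hostnames, cluster)
```

Declaring inputs up front means a missing value is caught before any hardware is touched, rather than surfacing as a half-configured cluster.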

Automating a Step at a Time

Like the proverbial elephant, we tackled the end-to-end automation one bite at a time. We delivered steps to the build process as soon as they were ready, rather than waiting for the end-to-end workflow to be complete. By actively participating in the existing process during the shift to automation, we stayed in close collaboration with the specialized infrastructure teams while also delivering value quarter over quarter.

The automation process is ongoing. We consult infrastructure subject matter experts to analyze each deployment step in detail. The ZTP team then selects the appropriate match from our toolkit to tackle that integration. To automate installation and management of commercial or open-source tools, we start with open-source automation modules as much as possible. When needed, we develop our own integration libraries, always focusing on reusable components.

Automating the actual execution of build tasks to complete each step is about one quarter of the overall effort. We had to develop a system to encode all the required information – the DNA of Zero Touch Provisioning. Nearly 10,000 pieces of information need to be collected, captured or tracked about every cluster for it to flow through the steps. We compile and structure information about all layers of the platform and serve it via APIs, so it is self-service and a shared source of truth for automation and validation processes. This eliminates the friction from meetings, handovers, reliance on institutional knowledge and the endless flow of spreadsheets.
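
As a rough illustration of a shared source of truth, the Python sketch below has one lookup function feeding both an automation consumer and a validation consumer. Everything in it is an assumption made for the example: the in-memory `CLUSTER_DATA` dictionary stands in for an API-backed store, and the layer names, keys and rendered configuration text are illustrative rather than our schema or any vendor's syntax.

```python
from typing import Any, Dict

# Hypothetical cluster record, grouped by platform layer.
CLUSTER_DATA: Dict[str, Dict[str, Dict[str, Any]]] = {
    "cluster-a": {
        "network": {"vlan_id": 120, "mtu": 9000},
        "compute": {"server_count": 4, "bios_profile": "virtualization"},
    }
}


def get_value(cluster: str, layer: str, key: str) -> Any:
    """Single lookup path used by every consumer (stands in for an API call)."""
    try:
        return CLUSTER_DATA[cluster][layer][key]
    except KeyError as exc:
        raise LookupError(f"no value for {cluster}/{layer}/{key}") from exc


def build_switch_config(cluster: str) -> str:
    """Automation consumer: render the intended settings from the record."""
    vlan = get_value(cluster, "network", "vlan_id")
    mtu = get_value(cluster, "network", "mtu")
    return f"vlan {vlan}\nmtu {mtu}"


def validate_observed(cluster: str, observed: Dict[str, Any]) -> Dict[str, Any]:
    """Validation consumer: report observed network values that differ from the record."""
    return {
        key: {"expected": get_value(cluster, "network", key), "observed": value}
        for key, value in observed.items()
        if get_value(cluster, "network", key) != value
    }


config = build_switch_config("cluster-a")
drift = validate_observed("cluster-a", {"vlan_id": 120, "mtu": 1500})
```

Because both consumers read through the same function, the value that was configured and the value that is later checked cannot silently diverge the way two copies of a spreadsheet can.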

Finally, after the tasks are automated and the data is structured, the step can be integrated with the overall workflow so that it is truly automated enough to ‘run itself.’
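
One way to picture a workflow that can "run itself" is a small runner that chains steps and halts at the first failure. The Python sketch below assumes each step is a plain function that takes the cluster record and returns new fields; the `discover`, `validate` and `enable` steps and the three-server minimum are hypothetical and exist only for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

StepFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_workflow(
    steps: List[Tuple[str, StepFn]], record: Dict[str, Any]
) -> Tuple[Dict[str, Any], List[str]]:
    """Run steps in order, feeding each step's output into the next.

    Returns the final record and a log of step outcomes. Stops at the
    first failing step so the problem can be investigated before the
    build continues.
    """
    log: List[str] = []
    for name, fn in steps:
        try:
            record = {**record, **fn(record)}
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            break
        log.append(f"{name}: ok")
    return record, log


# Hypothetical steps for illustration only.
def discover(record: Dict[str, Any]) -> Dict[str, Any]:
    return {"servers": [f"node{i}" for i in range(1, record["server_count"] + 1)]}


def validate(record: Dict[str, Any]) -> Dict[str, Any]:
    if len(record["servers"]) < 3:
        raise RuntimeError("need at least 3 servers")
    return {"validated": True}


def enable(record: Dict[str, Any]) -> Dict[str, Any]:
    return {"status": "capacity"}


WORKFLOW = [("discover", discover), ("validate", validate), ("enable", enable)]
final, log = run_workflow(WORKFLOW, {"server_count": 3})
```

A step that has not been automated yet could sit in the same list as a function that waits for a manual sign-off, which is one way to deliver steps incrementally without changing the runner.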

Realizing the benefits of ZTP takes time. Initially your organization is trading infrastructure engineers for software engineers to build out the automation. But as your on-prem private cloud expands, the speed, efficiency and maintainability benefits will accumulate.

We invested heavily in ZTP to develop the end-to-end workflow that gets us to day one of production – the first day that a cluster goes live as capacity in our on-prem private cloud. That same diligently structured and standardized design and build also gives us the capability to continually audit, maintain and eventually decommission the capacity clusters.

The demand for capacity only continues to increase. Development teams continue innovating and growing their user bases. Internal tools and services grow along with your business. ZTP is an investment that will continue to pay dividends in the efficiency and reliability of your infrastructure capacity.

Keep up with our Dell Digital strategies and more at Dell Technologies: Our Digital Transformation.

About the Author: Gabi Sweda

Gabi Sweda is Director of Zero Touch Engineering in Dell Digital’s Infrastructure Platform Enablement organization. She is responsible for transforming physical infrastructure deployment and on-prem private cloud capacity delivery. Gabi leads efforts in automated provisioning, maintenance and audit of infrastructure hardware at scale, along with standardized infrastructure architecture. Gabi and team also promote and support modern event-driven automation approaches across the organization. Before joining Dell, Gabi led physical infrastructure automation efforts at EMC and Virtustream.