AI Inferencing is at the Edge

With NVIDIA and Dell Technologies, AI inferencing delivers insights at the network edge in fractions of a second.

In today’s enterprises, there is an ever-growing demand for AI-enabled services, from image and speech recognition to natural language processing, visual search and personalized recommendations. At the same time, datasets are exploding in size, networks are growing more complex and latency requirements are becoming more demanding as many AI applications now require responses in real time.

All of this is an argument for moving AI inferencing to the network edge — where the data is captured, the insights are gained and the action takes place. This is a point underscored in a recent report from Moor Insights & Strategy, titled “Delivering the AI-Enabled Edge with Dell Technologies.”

“Nowhere is data as meaningful as the point at which it is generated,” the firm notes in its report. “In many cases, the value of data is highest when insights are generated, and appropriate actions taken nearly instantaneously — sometimes in just fractions of a second. The ability to generate insights and actions at the edge, coupled with the widespread availability of affordable, high-performance AI processing capabilities, has led to AI being the number one workload for edge deployments. Indeed, some of the most exciting applications enabled by edge computing depend upon it.”

Here’s a case in point. Taboola, the world’s largest content recommendation platform, uses inferencing at the edge to serve the right recommendation 30 billion times daily across four billion web pages, processing up to 150,000 requests per second. The engine driving all of this consists of two components: front-end AI for inferencing, which processes and delivers real-time content recommendations to generate the desired clicks, views and shares; and back-end servers that host cutting-edge deep learning models, continually trained using sophisticated neural networks to infer user preferences. Read the case study.
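To put those figures in perspective, here is a short back-of-the-envelope sketch. The inputs come straight from the paragraph above; the derived quantities are illustrative arithmetic, not Taboola’s published performance targets. (The average recommendation rate exceeding the peak request rate suggests each request returns multiple recommendations.)

```python
# Scale check using the figures quoted above (30 billion daily, 150,000 req/s).
SECONDS_PER_DAY = 24 * 60 * 60           # 86,400

recs_per_day = 30e9                       # 30 billion recommendations daily
peak_requests_per_sec = 150_000           # peak request rate

avg_recs_per_sec = recs_per_day / SECONDS_PER_DAY
# Mean gap between requests across the whole fleet at peak load, in microseconds.
interarrival_us = 1e6 / peak_requests_per_sec

print(f"average recommendations/second: {avg_recs_per_sec:,.0f}")    # ~347,222
print(f"mean time between requests at peak: {interarrival_us:.1f} us")
```

At that arrival rate, even a few milliseconds of added network or queuing delay per request compounds quickly, which is the core argument for placing inferencing close to where requests originate.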

Getting started

So how do you get there? For many organizations, the answer lies in the NVIDIA inference platform. This software is designed to deliver the performance, efficiency and responsiveness critical to powering the next generation of AI products and services — in the cloud, in the data center, at the network’s edge and in autonomous machines.

This inference platform is designed to unleash the full potential of NVIDIA GPUs inside Dell PowerEdge servers, which support both the open-source NVIDIA Triton Inference Server software and NVIDIA TensorRT, an inference optimizer and runtime that delivers low latency and high throughput for inference applications. This inference software is also part of the NVIDIA AI Enterprise software suite, which is optimized, certified and supported by NVIDIA to enable customers to run AI workloads on VMware vSphere on Dell Technologies infrastructure.
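As a concrete illustration of how Triton and TensorRT fit together, here is a minimal, hypothetical Triton model configuration (`config.pbtxt`) for serving a TensorRT engine. The model name, tensor names, shapes and batching values are placeholders for illustration, not taken from any Dell or NVIDIA reference deployment:

```protobuf
name: "recommender"            # placeholder model name
platform: "tensorrt_plan"      # serve a compiled TensorRT engine
max_batch_size: 64
input [
  {
    name: "input__0"           # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 256 ]
  }
]
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ 10 ]
  }
]
dynamic_batching {
  # Batch concurrent requests together, waiting at most 100 µs,
  # trading a tiny amount of latency for much higher GPU throughput.
  max_queue_delay_microseconds: 100
}
```

Dynamic batching is one way a latency-sensitive edge service can keep per-request response times low while still saturating the GPU.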

To bring it all together, Dell systems are NVIDIA-Certified so enterprises can confidently deploy solutions that securely and optimally run their accelerated workloads. The certification test suite is designed to exercise the performance and functionality of the configured server or hyperconverged infrastructure by running a set of software that represents a wide range of real-world applications.

This certification process includes deep learning training, AI inference, AI frameworks such as NVIDIA Riva and NVIDIA Clara, data science with NVIDIA RAPIDS and Spark, intelligent video analytics, high-performance computing, CUDA functions and rendering. It also covers infrastructure performance acceleration, such as network and storage offload, security features and remote management capabilities.

Dell Technologies offers a wide range of NVIDIA-Certified Systems for use cases ranging from the core of the data center to the network edge. We work closely with NVIDIA to optimize our systems for top performance, including AI inferencing and training. This kind of work makes Dell Technologies an optimal choice for organizations deploying the NVIDIA inference platform.

This is a point that Moor Insights & Strategy makes in its recent report, commissioned by Dell Technologies:

“IT organizations that are ready to embrace the edge need both a strategy and a partner that has experience with AI and the edge,” the firm notes. “Dell Technologies has the expertise, products, services and reach to simplify the edge with intrinsic security and deliver insights where they’re needed the most.”

For a look at the infrastructure to drive inference workloads at the edge, visit the Dell Technologies edge page and AI page. Please also check out NVIDIA GTC, November 8–11 (registration required).

About the Author: Janet Morss

Janet Morss previously worked at Dell Technologies, specializing in machine learning (ML) and high-performance computing (HPC) product marketing.