Accelerate Modern Workloads with AMD and Dell AI Innovation

Dell Technologies introduces Generative AI Solutions with AMD, a flexible, scalable design to power AI development.

Driving value from generative AI (GenAI), one of the most disruptive paradigm shifts of our era, is a critical priority for leaders as they work to bring this technology into their organizations and overcome business challenges. But deploying best-fit AI infrastructure that powers the organization and enables developers is hampered by the complexity of planning AI strategies, which often depend on proprietary, closed-system solutions.

Dell Technologies and AMD are making available new “easy button” AI solutions that give developers and IT the flexibility to deploy an architecture that enables innovation within an open ecosystem and open AI frameworks. Furthermore, customers can rely on proven methodologies to create a winning strategy that accelerates AI outcomes with Dell Services.

Scale Up with GenAI Value

Scientists and application developers often already have a wealth of experience in AI and ML. But with larger LLMs, getting started with GenAI depends on powerful GPUs, more AI frameworks and integrated tools, often requiring a significant investment in platforms, AI software licensing and more. Furthermore, developers can face significant barriers when integrating home-grown tools and open-market models, customizing drivers and, most importantly, adding company IP, because many all-in-one AI software suites limit custom integrations and can require additional investment.

This is where a suite of standards-based AI tools, open-source software and frameworks can allow developers to integrate and scale their workflows, code and familiar tools into the organization’s value chain to accelerate innovation.

Harness Open-source Flexibility for Your AI Starting Point

Enter Dell Technologies’ newest open-source AI framework approach, which enables custom application development while taking the limits off GenAI with open-framework foundations.

The new Dell Validated Design for Generative AI with AMD Instinct™ and ROCm™ powered AI frameworks extends the Dell ecosystem to help accelerate outcomes with a multi-node design based on the latest innovations, the AMD Instinct™ MI300X accelerators and AMD ROCm AI suite.

This solution is built on the fastest-ramping solution in Dell history,1 the Dell PowerEdge XE9680 server, which supports eight AMD Instinct™ MI300X accelerators. With 192 GB of memory per GPU (a total capacity of about 1.5 TB per server), the PowerEdge XE9680 with AMD further enables organizations to train larger models, lower TCO and gain a competitive edge. The PowerEdge XE9680 with AMD GPUs is already available for orders, with full availability in June.
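As a rough check on the capacity figures above, the following Python sketch multiplies out the per-GPU memory and estimates the memory needed for model weights alone. The 70-billion-parameter model size and the 2 bytes per parameter (16-bit precision) are illustrative assumptions, not figures from Dell or AMD, and the estimate ignores activations, KV cache and optimizer state.

```python
GPU_MEMORY_GB = 192      # per AMD Instinct MI300X, from the text above
GPUS_PER_SERVER = 8      # per PowerEdge XE9680, from the text above


def server_memory_gb(gpus: int = GPUS_PER_SERVER, per_gpu_gb: int = GPU_MEMORY_GB) -> int:
    """Total accelerator memory in one server, in GB."""
    return gpus * per_gpu_gb


def weights_gb(params_billions: float, bytes_per_param: int) -> float:
    """Memory for model weights only, in decimal GB.

    Excludes activations, KV cache and optimizer state, which can
    dominate during training.
    """
    return params_billions * bytes_per_param


total = server_memory_gb()
print(f"Server total: {total} GB (~{total / 1000:.1f} TB)")

# Hypothetical example: 70B parameters stored in 16-bit precision.
needed = weights_gb(70, 2)
print(f"70B weights at 16-bit: {needed:.0f} GB")
print(f"Fits on one GPU: {needed <= GPU_MEMORY_GB}")
```

Eight GPUs at 192 GB each give 1,536 GB, which is where the roughly 1.5 TB per-server figure comes from.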

Developers can also start building applications at their own desks on AMD-based Precision workstations, such as the Precision 7875 Tower, which features an AMD Ryzen™ Threadripper™ PRO processor with up to 96 cores and scalable professional GPUs.

Building Blocks to Drive Scale

With an on-premises approach, the new GenAI solution delivers a faster experience with optimized AI storage and AI fabric connectivity from the latest Dell PowerScale F710 and Dell PowerSwitch Z9664F-ON.

The new PowerScale F710 delivers faster time to AI insights with massive gains in streaming performance that accelerate all phases of the AI pipeline. It offers double the write throughput per RU over flash-only competitors2 and features up to 10 NVMe SSDs in a compact 1U form factor to further enhance storage efficiency and minimize data center footprint. PowerScale leverages OneFS software and features the latest technology to bring multicloud agility with APEX File Storage integrations, federal-grade security and exceptional efficiency for AI infrastructure.

Network performance is critical to support AI operations between GPUs, servers and storage. The Dell PowerSwitch Z9664F-ON, offering 64 ports of 400GbE, delivers low-latency, high-throughput fabrics for modern AI clusters, along with the Enterprise SONiC Distribution by Dell Technologies. Upcoming enhancements this summer and new network interface cards from Broadcom will boost AI fabric performance. Dell’s participation in the Ultra Ethernet Consortium (UEC) ensures organizations can rely on open, standards-based networking solutions to scale out their tailored AI strategy.
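To put the port count in perspective, this short Python sketch works out the switch's aggregate line rate and an idealized transfer time over a single 400GbE link. The 140 GB payload is a hypothetical example, and the calculation assumes full line rate with no protocol overhead or congestion, so real transfers will take longer.

```python
PORTS = 64               # Z9664F-ON port count, from the text above
PORT_SPEED_GBPS = 400    # gigabits per second per port, from the text above


def aggregate_tbps(ports: int = PORTS, gbps: int = PORT_SPEED_GBPS) -> float:
    """Sum of one-direction port line rates, in terabits per second."""
    return ports * gbps / 1000


def ideal_transfer_seconds(size_gb: float, link_gbps: int = PORT_SPEED_GBPS) -> float:
    """Best-case time to move size_gb gigabytes over one link.

    Converts gigabytes to gigabits (x8) and divides by the line rate;
    ignores protocol overhead, congestion and storage limits.
    """
    return size_gb * 8 / link_gbps


print(f"Aggregate line rate: {aggregate_tbps():.1f} Tb/s")

# Hypothetical 140 GB payload, such as a large model checkpoint.
print(f"Ideal single-link transfer: {ideal_transfer_seconds(140):.1f} s")
```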

Integrate Value with AI Open Software Platform and Tools

[Figure. Source: Accelerate HPC Innovation and AI Insights with an Open Ecosystem.]

Open-standards software suites enhance application development and workflow automation. The open-source AMD ROCm suite is designed to unleash the power of AMD GPUs, with greater choice and interoperability from popular software frameworks and tools (including PyTorch, TensorFlow and other AI applications). Built on open standards, AMD ROCm reduces the need for proprietary AI software suites, enabling developers to simplify development and freely customize their workflows. Furthermore, developers can readily develop with open-source LLMs from partners including Hugging Face and Meta.
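As one illustration of that interoperability, ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` interface used on other GPU builds, so existing device-selection code typically runs unchanged. The sketch below is a minimal check of which backend an installed PyTorch build targets; the helper names `describe_backend` and `pick_device` are hypothetical, and the code falls back gracefully when PyTorch is not installed.

```python
def describe_backend() -> str:
    """Report which accelerator backend the installed PyTorch build targets."""
    try:
        import torch
    except ImportError:
        return "pytorch-not-installed"

    if not torch.cuda.is_available():
        return "cpu"
    # ROCm builds of PyTorch set torch.version.hip to a version string;
    # other builds leave it as None.
    hip_version = getattr(torch.version, "hip", None)
    return "rocm" if hip_version else "cuda"


def pick_device() -> str:
    """Device string for torch.device().

    On ROCm builds, "cuda" also selects AMD GPUs.
    """
    backend = describe_backend()
    return "cuda" if backend in ("rocm", "cuda") else "cpu"


print(f"backend={describe_backend()} device={pick_device()}")
```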

Scaling GenAI development into production is enabled by Dell Omnia open-source software, which deploys and manages high-performance clusters for AI workloads, among others. Omnia installs open-source Kubernetes for managing jobs and services. Developers are continually extending Omnia to speed deployment of new infrastructure into resource pools that can easily be allocated to different workloads. With Omnia, IT can optimize further to rapidly provision their AI infrastructure. In addition, Dell provides enterprise support for Omnia, giving customers the confidence to deploy Omnia in mission-critical environments.

Accelerate Your AI Journey

To quickly put the power of the Dell Generative AI solutions with AMD to work, Dell Services, recently recognized by Forbes as one of the world’s leading management consulting firms, brings deep expertise across every stage of your AI journey, whether development starts on a dedicated workstation or on servers in the data center.

From aligning a winning AI strategy or getting data ready to power AI projects, to implementing the infrastructure needed to quickly realize a secure, optimized model for key use cases or fully operating the solution for you, we will meet you wherever you are. By minimizing time-consuming operational efforts and providing your team with skills, best practices and time to focus on innovation, you will realize maximized ROI now and into the future. Leverage our Accelerator Workshop to start developing a point of view for how your business will maximize benefits from GenAI.

Comprehensive Approach to GenAI with Open Foundations

Dell Technologies and AMD aim to enable developers and IT to accelerate their AI initiatives with a proven, open solutions approach that:

    • Powers AI-assisted use-case and application development
    • Delivers secure, on-premises AI applications at scale
    • Reduces barriers to integrating company IP, custom development processes and tools
    • Lowers TCO and investments with best-fit infrastructure

See how Dell and AMD can enable rapid AI development in recent Dell blogs. The Dell Validated Design with AMD will be available this summer.

To learn more, visit AI at Dell.

1 Based on Dell analysis, August 2023
2 Based on Dell internal testing comparing write throughput per node, May 2024. Performance rates based on FIO over a remote file system. Actual performance may vary.

About the Author: Greg Findlen

Greg is Senior Vice President of Product Management of Data Management at Dell Technologies. He is focused on efforts in the data management space to build solutions that enable customers to better unlock the value from the data they generate each day across their organization. Prior to this role, he spent years leading engineering teams focused on developing products that scale across Dell’s enterprise portfolio, such as APEX, CloudIQ and others. Greg has over 25 years of experience in the technology industry and joined the company in 2006. Since joining Dell, he has held a variety of positions across the Infrastructure Solutions Group and has led key functions including development, data analytics, pricing, business operations, program management and strategy development. Greg also served as a leader of the Dell and EMC integration, driving all key aspects of the merger planning related to the products and engineering teams. Prior to his time at Dell, he worked in the semiconductor test industry, where he drove quality improvement efforts and supplier relationships and worked across development and operations roles.