Rack-scale AI and HPC with the Dell AI Factory at SC25

Discover new compute and networking solutions at SC25 that allow you to scale AI and HPC with confidence

tl;dr: Dell AI Factory brings together servers, cooling, networking and intelligent software to accelerate AI and HPC. At SC25 we’re highlighting five themes that matter to IT leaders: performance, scalability, efficiency, intelligent infrastructure and future-readiness. Expect faster time to insights, simpler scale-out, lower TCO and designs that are ready for what’s next.



The Dell AI Factory integrates cutting-edge technology—spanning servers, networking, cooling and the latest in intelligent software—into a unified ecosystem tailored for scalable AI and High-Performance Computing (HPC). With next-generation innovation centered on efficiency, intelligence, and future-ready design, Dell PowerEdge servers and Integrated Racks with PowerCool design innovations, along with Dell PowerSwitch networking, form the foundation of your AI Factory.

Let’s explore five key themes at SC25 that show how innovations in power, interconnects, and intelligent systems management are shaping the future of AI-driven data centers: Performance, Scalability, Efficiency, Intelligent Infrastructure, and Future-Readiness.

Performance optimization: Elevating innovation

While others may focus only on GPU counts, Dell optimizes the entire data pipeline for balanced, real-world system performance. This ensures that data flows seamlessly even during the most demanding operations, giving you the power to innovate without limits.

For instance, the R770AP server with Intel® Xeon® 6 processors delivers a 2.1x performance boost for latency-sensitive applications, providing dense, predictable computing performance designed for trading and analytics. PowerEdge has the right fit for every advanced computing challenge.

Delivering breakthrough performance for AI and HPC workloads is essential for progress. Dell PowerEdge servers lead the industry in speed, scalability, and reliability. Combining AMD Instinct™ GPUs with Pollara 400 AI NICs, these systems deliver:

    • Up to 2.7x faster machine learning (MLPerf) model training, accelerating your time to insight.
    • 50% more GPU memory to handle larger, more complex models.
    • Up to 44% higher memory bandwidth.

Flexibility and scalability: Building infrastructure your way

Every organization’s IT evolution is unique. Whether you prefer an air-cooled system for simplified deployment or direct-liquid cooling for maximum efficiency, our platforms provide the flexibility to build your infrastructure, your way. This flexibility ensures that your data center can grow alongside your business without vendor lock-in, empowering continuous progress.

Key advantages of this flexible approach include:

    • Open ecosystem support across CPU and GPU families, with validated accelerator choices and scalable network fabrics, giving customers the flexibility to choose the components that best fit their unique architectures.
    • Scalability with future-ready rack infrastructure that grows with your workload needs, and a PowerSwitch Z9964 series network architecture that allows you to scale to over 100,000 GPUs in a multi-plane two-tier network.
    • Flexibility and customization with Enterprise SONiC Distribution by Dell Technologies, a network operating system optimized for demanding HPC and AI workloads, delivering seamless multi-silicon interoperability and support for the latest platforms like Broadcom’s Tomahawk 6.
    • Simplified upgrades and multi-generation reuse, with integrated racks built on an open-standards, Open Compute Project-based infrastructure design.
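
To give a feel for why switch radix matters when scaling a two-tier network, here is a minimal sketch of the standard leaf-spine capacity arithmetic. It assumes a single non-blocking plane in which every switch has the same port count and each leaf splits its ports evenly between endpoints and spines. The function name and the radix values are illustrative assumptions, not Dell product specifications.

```python
def two_tier_max_endpoints(radix: int) -> int:
    """Upper bound on endpoints for one non-blocking two-tier plane.

    Assumes every switch has `radix` ports of equal speed and every leaf
    splits its ports evenly between endpoints (down) and spines (up).
    """
    if radix < 2 or radix % 2 != 0:
        raise ValueError("radix must be a positive even number")
    down_ports_per_leaf = radix // 2  # the other half are uplinks to spines
    max_leaves = radix                # one link from each spine to each leaf
    return max_leaves * down_ports_per_leaf


if __name__ == "__main__":
    # Illustrative radix values only, not product specifications.
    for radix in (64, 128, 256, 512):
        print(f"radix {radix:>3}: up to {two_tier_max_endpoints(radix):,} endpoints")
```

In this simplified model, capacity grows with the square of the radix, so doubling the port count quadruples the number of endpoints a two-tier plane can reach. Of the values listed, only a radix of 512 passes the 100,000 mark (131,072 endpoints). Real designs also account for oversubscription, port breakout and plane count.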

Efficiency and cost savings: Practical innovations, tangible results

By aligning server, cooling, and networking technologies, Dell transforms infrastructure into a strategic advantage, reducing total cost of operations while maximizing performance. Dell Integrated Rack Scalable Systems provide a cohesive system that integrates PowerEdge servers and PowerSwitch networking with advanced PowerCool air or liquid-cooling technologies and intelligent systems management to deliver measurable returns on investment.

Our smart designs deliver concrete outcomes:

    • Direct-to-chip cooling in the XE9785L eliminates up to 80% of system heat, minimizing energy costs and your environmental footprint.
    • The PowerCool Rack-m Cooling Distribution Unit (RCDU) handles an impressive 160 kW of cooling with unmatched space efficiency.
    • Automation tools like SmartFabric Manager and OpenManage Enterprise simplify operations, accelerate deployments, and proactively track system health.
    • Increased switch radix makes it possible to replace three-tier networks with two-tier networks, removing an entire layer of switches. That reduces component needs by up to 67% for switches and 40% for optics, and it means fewer racks, less cabling, and lower power and cooling costs.
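
The effect of removing a network tier can be sketched with some simple port-budget arithmetic. The example below is a rough illustration under stated assumptions: non-blocking fabrics, identical switches within each design, and switch counts derived from port budgets alone. The function names, the cluster size and the radix values are hypothetical, and the sketch does not attempt to reproduce the percentages quoted above.

```python
from math import ceil


def two_tier_switches(endpoints: int, radix: int) -> int:
    """Leaf + spine switch count for a non-blocking two-tier fabric."""
    half = radix // 2
    leaves = ceil(endpoints / half)
    if leaves > radix:
        raise ValueError("too many endpoints for two tiers at this radix")
    spines = ceil(leaves * half / radix)  # spine ports must absorb all uplinks
    return leaves + spines


def three_tier_switches(endpoints: int, radix: int) -> int:
    """Edge + aggregation + core switch count for a non-blocking three-tier fabric."""
    if endpoints > radix ** 3 // 4:
        raise ValueError("too many endpoints for three tiers at this radix")
    half = radix // 2
    edge = ceil(endpoints / half)
    aggregation = edge                         # same count keeps the pod non-blocking
    core = ceil(aggregation * half / radix)    # core ports absorb aggregation uplinks
    return edge + aggregation + core


if __name__ == "__main__":
    endpoints = 8192                      # hypothetical cluster size
    low_radix, high_radix = 64, 128       # illustrative port counts
    before = three_tier_switches(endpoints, low_radix)
    after = two_tier_switches(endpoints, high_radix)
    print(f"three-tier, radix {low_radix}: {before} switches")
    print(f"two-tier, radix {high_radix}: {after} switches")
    print(f"reduction: {1 - after / before:.0%}")
```

With these example inputs, the three-tier design needs 640 switches and the two-tier design needs 192, roughly 70% fewer. The lower-radix switch cannot serve 8,192 endpoints in two tiers at all in this model, which is why the extra layer is needed in the first place. Actual savings depend on radix, oversubscription ratio and cluster size.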

Intelligent infrastructure: Simplifying complexity, amplifying confidence

Managing the complexity of modern AI and HPC environments requires more than just powerful hardware—it demands intelligent, unified systems that simplify operations and enhance confidence. By uniting intelligence, automation, and resilience, Dell’s systems management and software innovations, including OpenManage Enterprise and SmartFabric Manager, transform infrastructure into a seamless, proactive ecosystem.

With OpenManage Enterprise, IT teams gain centralized visibility and control across up to 25,000 devices, eliminating operational silos and reducing manual intervention. Real-time telemetry and advanced automation ensure that every component—from servers to cooling—operates efficiently and securely.

SmartFabric Manager complements this by simplifying network management with AI-optimized blueprints, automated deployment, and real-time insights. Together, these tools empower IT decision-makers to:

    • Streamline operations: Reduce manual configuration tasks by up to 93%, enabling faster deployments and fewer errors.
    • Enhance visibility: Monitor and manage every layer of infrastructure through a single, integrated interface.
    • Proactively protect uptime: Detect and mitigate risks, such as thermal deviations or network congestion, before they impact performance.

Future-readiness: Built for tomorrow’s complex demands

Longevity and adaptability are critical in the fast-evolving world of AI. The Dell AI Factory is designed to let you scale your infrastructure over time, so you are ready for what comes next. By embracing open standards, modular designs, and multi-silicon support, we provide a foundation that evolves with your needs, protecting your investment and keeping your systems relevant for years to come.

    • New features like sub-millimeter leak detection in the Integrated Rack Controller, enhanced telemetry, and unified management give you proactive visibility and protection.
    • SmartFabric Manager and Dell services ensure your fabric and rack designs adapt as cluster sizes, topologies, and workloads evolve.

These innovations enhance infrastructure protection by proactively mitigating risks, which in turn ensures business continuity and peace of mind.

Together, let’s build what’s next

The Dell AI Factory is underpinned by the seamless integration of PowerEdge servers, PowerCool cooling innovations, and PowerSwitch networking, delivering integrated performance, cooling efficiency, and scalable design. By investing in a cohesive system, you’re not just preparing for the challenges of tomorrow—you’re building a foundation for sustainable, high-performance growth today. With Dell’s infrastructure solutions, the AI Factory isn’t just a vision; it’s a reality we can build together.

About the Author: Alison Biers

Alison Biers is a sales and marketing professional with 20-plus years of IT industry experience. In her role at Dell, she leads a team of edge solution professionals to increase awareness of the Dell edge portfolio and help customers gain the benefits of edge computing by simplifying the path to edge success.