Achieve Greater ROI with AI in Your Datacenter

As organizations look to scale AI, on-premises infrastructure holds the key to achieving better business value.

AI is migrating from computers and phones to robots, self-driving cars, and nearly every digital space imaginable. NVIDIA CEO Jensen Huang called out this shift at the company’s GTC conference in March.

Generative AI is growing rapidly: according to Accenture, 75% of knowledge workers use it to create content such as sales collateral or to automate coding.

The key to effective AI outcomes is a centralized AI strategy that takes into account various technical and operational factors. Different use cases will require different models and processes, as well as access to high-quality data to help maximize productivity.  

Increased productivity isn’t the only consideration. As organizations put their AI projects through their paces, they must also respect their budgets and protect their data. And as with all emerging technologies, AI presents implementation hurdles. 

AI Deployment Options

Chief among these hurdles is working with large language models (LLMs), which require training, inferencing, fine-tuning, and optimization techniques to ground GenAI outputs. Some organizations also struggle with AI skills gaps, making it difficult to use these technologies effectively.

To ease these burdens, some organizations choose to consume LLMs managed by public cloud providers, or to run LLMs of their choosing on those third-party platforms.  

As attractive as the public cloud is for shrinking launch timelines, it also comes with tradeoffs. Variable costs, higher latency, and data security and sovereignty concerns can make running AI workloads there unappealing or even untenable.

AI workloads also present more variables for IT decision makers to consider. As attractive as more choice is, it can also compound complexity. 

Accordingly, running infrastructure, including compute, storage, and GPUs, on premises gives organizations control over every aspect of deployment. On-premises infrastructure is especially valuable for large, predictable AI workloads: keeping compute close to where the data is processed helps organizations stay within their budgets.

Ensuring strong organizational control is also table stakes for protecting AI models, inputs, and outputs, which may include sensitive IP, from bad actors and data leakage.  

Some organizations must also prioritize data security and comply with data sovereignty mandates that require data to remain in specific geographic locales under local regulations. By running AI workloads where the data already exists, organizations can remain compliant while avoiding duplicative transfers between systems and locations.

To that end, many organizations today are customizing open-source LLMs using retrieval-augmented generation (RAG), which grounds a model’s responses in an organization’s own data at inference time. Organizations can use RAG to tailor chatbots so they give accurate, context-specific answers for particular use cases, as illustrated in the sketch below.
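As a rough illustration of the RAG pattern (not Dell’s specific implementation), the sketch below shows the core loop in plain Python: index a few internal documents, retrieve the passages most similar to a user’s question, and prepend them to the prompt sent to a locally hosted model. The toy bag-of-words retriever and the generate_with_local_llm function are hypothetical stand-ins for a real vector database and an on-premises inference endpoint.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# Assumptions: a toy bag-of-words retriever stands in for a real vector
# database, and generate_with_local_llm() is a hypothetical wrapper around
# an on-premises inference endpoint.
from collections import Counter
import math

DOCUMENTS = [
    "Refunds are processed within 5 business days of approval.",
    "Enterprise support tickets are answered within 4 hours.",
    "All customer data is stored in the Frankfurt datacenter.",
]

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def generate_with_local_llm(prompt: str) -> str:
    """Hypothetical call to an on-premises LLM endpoint; echoes the prompt here."""
    return f"[model response grounded in]\n{prompt}"

def answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate_with_local_llm(prompt)

if __name__ == "__main__":
    print(answer("Where is customer data stored?"))
```

Because retrieval happens at query time, the underlying model never needs to be retrained on proprietary data, which is a key reason RAG pairs well with data that must stay on premises.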

Moreover, as LLMs continue to shrink while maintaining high performance and reliability, more models are running on devices such as AI PCs and workstations, on premises and at the edge.

These factors underscore why 73% of organizations prefer to self-deploy LLMs on infrastructure in their datacenters, on devices, and at edge locations, according to Enterprise Strategy Group.

On-Premises AI Can Be More Cost Effective

Empirical data comparing the value of on-premises deployments with the public cloud is scarce, but Enterprise Strategy Group (ESG) recently examined the question.

The research firm compared the expected costs of delivering inferencing for a text-based chatbot powered by a 70B-parameter open-source LLM running RAG on premises versus a comparable public cloud solution from Amazon Web Services.

The analysis, which estimated infrastructure and system administration costs for 5,000 to 50,000 users over a four-year period, found that running the workload on premises was as much as 62% more cost-effective for inferencing than the public cloud.

Moreover, the same on-premises implementation was as much as 75% more cost-effective when compared to the cost of running an API-based service from OpenAI. Of course, every organization’s savings will vary per use case and modeling scenario. 
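To make the shape of such a comparison concrete, the short sketch below walks through the basic four-year arithmetic with purely hypothetical figures. The numbers are illustrative only and do not come from the ESG study, whose actual inputs and cost models are far more detailed.

```python
# Illustrative four-year cost comparison using hypothetical numbers only;
# the ESG study's actual inputs and cost models are more detailed.
YEARS = 4
USERS = 10_000

# Hypothetical on-premises costs: one-time hardware purchase plus annual operations.
onprem_hardware = 1_200_000          # servers, GPUs, and storage (one-time)
onprem_ops_per_year = 250_000        # power, datacenter space, administration
onprem_total = onprem_hardware + onprem_ops_per_year * YEARS

# Hypothetical cloud costs: pay-as-you-go inferencing billed per user per month.
cloud_cost_per_user_month = 9.50
cloud_total = cloud_cost_per_user_month * USERS * 12 * YEARS

savings = 1 - onprem_total / cloud_total
print(f"On-premises total: ${onprem_total:,.0f}")
print(f"Cloud total:       ${cloud_total:,.0f}")
print(f"On premises is {savings:.0%} less expensive in this hypothetical scenario")
```

The same structure applies whatever the real inputs are: amortized capital and operating costs on one side, usage-based fees on the other, compared over the expected life of the workload.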

The on-premises solution featured in the ESG study comprised the Dell AI Factory, a modern approach designed to help organizations scale their AI solutions and build better business outcomes.  

The Dell AI Factory blends Dell infrastructure and professional services, with hooks into an open ecosystem of software vendors and other partners who can support your AI use cases today and in the future. 

Articulating a centralized strategy for this new era of AI everywhere is great, but the work doesn’t stop there. The Dell AI Factory can help guide you on your AI journey. 

Learn more about the Dell AI Factory here. 


About the Author: Fuzz Hussain

Fuzz Hussain leads business value discussions for Dell Technologies AI and APEX portfolio marketing. With over 15 years of experience in technology, manufacturing and healthcare industries, Fuzz has had the privilege of helping multiple organizations launch and accelerate customer-centric digital platforms. He holds a Master of Business Administration from Indiana University and a Bachelor of Science from Purdue University.