AI Anywhere on Data Everywhere

Dell Technologies and Starburst announce a modern data lakehouse solution, unlocking data management from edge to core to multicloud.

AI loves data, but many enterprises are still scratching the surface. While significant advancements have been made in AI, locating, accessing and processing data across disparate environments remains a major roadblock. Organizations are struggling with the rapid growth of data (and its copies) across multicloud environments; proliferating data sources, formats and tool choices; and an inability to scale as requirements grow. These challenges prevent them from fully unlocking the potential of AI to drive business outcomes.

Since most data remains on-premises, customers are left with two broad, and equally suboptimal, choices: either stitch together a complex web of tools and technologies and manage it on their own, or replicate their entire data estate to the public cloud.

Organizations are stuck between a rock and a hard place and deserve a better answer: one that works with their data gravity and not against it, and that doesn’t force them to undertake a multi-year journey to centralize all their data in the public cloud or to constantly play catch-up with the myriad choices of proprietary and open-source technologies. And while we’re at it, an answer that works for their current analytics and AI needs as well as their emerging GenAI needs. We call this “AI anywhere on data everywhere.”

Earlier this year, together with Starburst, Dell Technologies started working on that answer. We heard customers across the spectrum tell us about their challenges and needs, and we listened.

Today, I’m excited to announce our vision for an open, modern data lakehouse:

  • A highly secure single point of access to ALL data across the enterprise.
  • The ability to interoperate freely with the broader ecosystem of tools across on-prem and cloud.
  • A unified stack for any workload—BI, AI, ML and GenAI.
  • Blazing-fast performance with decoupled compute and storage.
  • Predictable costs for the entire stack, including infrastructure.
  • A turnkey experience that simplifies purchase, deployment and lifecycle management.
  • A reduction in data movement by discovering and querying data in place before consolidating.

To deliver on these promises, Dell is developing the first phase of this solution, partnering with Starburst.

Starburst is built on top of Trino, the open-source, high-performance distributed SQL engine known for running fast analytic queries against data lakes, lakehouses and distributed data sources at internet scale. It integrates global security with fine-grained access controls, supports ad-hoc and long-running ELT workloads and is a gateway to building excellent data products.
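To make "querying distributed data sources" a little more concrete, here is a minimal sketch of what a federated query can look like. Trino addresses every table as catalog.schema.table, where each catalog maps to a configured data source, so a single SQL statement can join tables that live in different systems. The catalog, schema, table and column names below are hypothetical, chosen only for illustration, and the helper functions are not part of any Dell or Starburst product.

```python
# Illustrative sketch only: every catalog, schema, table and column name
# here is hypothetical and not taken from the announcement.


def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a Trino-style fully qualified table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def federated_join_sql(left: str, right: str, key: str, limit: int = 10) -> str:
    """Build one SQL statement that joins tables from two different catalogs."""
    return (
        f"SELECT l.{key}, count(*) AS row_count\n"
        f"FROM {left} AS l\n"
        f"JOIN {right} AS r ON l.{key} = r.{key}\n"
        f"GROUP BY l.{key}\n"
        f"LIMIT {limit}"
    )


# Hypothetical sources: a lakehouse table on object storage and an
# operational database table, each exposed through its own catalog.
orders = qualified("lakehouse", "sales", "orders")
customers = qualified("crm_postgres", "public", "customers")

query = federated_join_sql(orders, customers, key="customer_id")
print(query)
```

The sketch only builds the SQL text. Actually running it would require a Trino or Starburst cluster with both catalogs configured, which is outside the scope of this post.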

Combining Starburst with Dell’s leading compute (PowerEdge) and leading storage (ECS, ObjectScale and PowerScale) platforms, we’re giving customers the ability to set the foundation for a high-performance, infinitely scalable lakehouse on-premises and across multicloud environments.

This overall solution enables customers to achieve a more cost-effective and predictable outlay versus cloud-based options. This is crucial given the escalating scale that analytics and AI workloads demand of modern organizations. For example, Dell ECS storage already saves up to 76% in total cost of ownership versus public cloud offerings.1 Furthermore, IT and data teams can realize a positive ROI faster thanks to the simplified deployment and management experience, as well as the ability to tap into existing datasets across the enterprise without spending additional time or money on migration and consolidation.

By leveraging open table formats like Iceberg and Delta Lake, customers can interoperate freely with the broader data ecosystem, avoid vendor lock-in and harness all the innovation from these robust communities. Data scientists and engineers will finally have access to the data they need when they need it, allowing for AI anywhere on data dispersed across the IT landscape.

There is a lot of work to do to deliver on our vision, and we’re excited to share that our first solution release is planned for the first half of 2024. More details will be shared in the coming months. If you want to learn more, speak with your Dell representative or visit the Dell website or Starburst website.

We always promise to meet our customers wherever they are in their data journey. This announcement and our collaboration with Starburst exemplify Dell’s unwavering commitment to innovating alongside our customers to deliver exceptional experiences.

1 ESG Economic Validation sponsored by Dell Technologies, “Analyzing the Economic Benefits of Dell ECS: Economic Benefit Analysis of On-premises Object Storage versus Public Cloud,” by Tony Palmer, July 2022. Cost savings based on ESG comparison of ECS to a leading public cloud in active storage scenarios.

About the Author: Greg Findlen

Greg is Senior Vice President of Product Management of Data Management at Dell Technologies. He is focused on building data management solutions that enable customers to better unlock the value of the data they generate each day across their organization. Prior to this role, he spent years leading engineering teams that developed products that scale across Dell’s enterprise portfolio, such as APEX, CloudIQ and others. Greg has over 25 years of experience in the technology industry and joined the company in 2006. Since joining Dell, he has held a variety of positions across the Infrastructure Solutions Group and has led key functions including development, data analytics, pricing, business operations, program management and strategy development. Greg also served as a leader of the Dell and EMC integration, which drove all key aspects of the merger planning related to the products and engineering teams. Prior to his time at Dell, he worked in the semiconductor test industry, where he drove quality improvement efforts, managed supplier relationships and worked across development and operations roles.