Building the Exploration & Production Data Lake

Oil and water, when combined, seem to carry only negative connotations.  In the kitchen, water and hot oil are a recipe for disaster, and an oil spill in the ocean is an environmental disaster.  Is there ever a positive outcome when oil is combined with water?  One comes to mind, as long as the body of water is your Data Lake and the oil takes the form of seismic interpretation and reservoir modeling data.

In talking to customers, we hear that more and more of them are being asked by their internal customers (namely geoscientists) to make more pre- and post-stack data available online, where it can be analyzed and interpreted. The majority of this data is offline and not readily accessible because it is on tape.  We are talking hundreds to thousands of terabytes of valuable data sitting idle on tape.  What if you could have all this data online via a Data Lake?

EMC has been an important part of infrastructure management in oil & gas for over 25 years, in an industry where companies have been drilling for hydrocarbons even longer. That’s an enormous amount of potential data. Think of all the subsurface pattern-matching we could be using to accelerate discovery. Think of all the best practices for running operations efficiently and safely, and the models for optimized logistics, that could have been created and implemented much earlier if we had been able to harness that data in a Data Lake. It’s true that 25 years ago we did not have the technology we enjoy today (sensors that capture real-time data for analysis, computing power to store and crunch terabytes of information in a fraction of the time, global communications and mobility that bring new levels of collaboration and business agility), but think about the next 5, 10 or 25 years. The trajectory of innovation possible from harnessing an affordable Data Lake now could be exponential. We could simulate large parts of oil & gas operations and make more economically, financially and environmentally sound decisions, and make them more quickly, before a well is even drilled. The results will never be perfect, but every order-of-magnitude improvement leaves us better off in the continued pursuit of energy.

I will pass on providing my own definition of a Data Lake and leave that to the analysts and others.  Instead, I will give you a few characteristics that I would want from my Data Lake.  I would want my Oil & Gas Data Lake to:

  • Scale to double-digit petabytes
  • Scale easily and remain simple to manage
  • Support multiple protocols to provide broad data ingestion and analysis capabilities
  • Run Hadoop analytics via HDFS
  • Deliver a strong TCO advantage

Our Isilon Exploration and Production customers are quickly realizing that their Isilon storage investment is in fact the foundation of their Data Lake, and that they are already getting all these benefits and more today.  If you are using Isilon, you may already have a Data Lake and not even know it!


EMC Isilon will be showcasing our E&P solutions at the 76th European Association of Geoscientists and Engineers (EAGE) Conference & Exhibition in Amsterdam from June 16th to 19th.  Please stop by our booth, #3206, for presentations, demos, and discussions with our staff of subject-matter experts, or to ask any questions. We look forward to meeting you!

About the Author: John Nishikawa