Recap: Data Science Summit 2012

Did you know that “a dog can detect a teaspoon of sugar diluted in a million gallons of water: two Olympic-sized pools full?” I got that insight from a great book I’m currently reading by Alexandra Horowitz called Inside of a Dog. Now, what does that have to do with the Data Science Summit that I attended as a host last week?

Well, imagine data as varied and nuanced as a mix of colors, scents, sounds, surfaces, movements, shadows, and lights. Imagine if we could see data in the same way we can see a blossoming garden, or the way a dog can differentiate and comprehend the world of scents. Simply put, the Data Science Summit was sublime, expanding the world of data to a place I had never been, experienced, or known about: an explosion of new ways of thinking.

“There’s a new spirit — an inspiration — to make something happen,” said Richard Snee, the event’s chairman. “Don’t wait. There’s a global transformation happening.”

Andreas Weigend, former chief scientist of Amazon, now heading up the Social Data Lab at Stanford University, spoke on a panel moderated by Jim Frederick, international editor and executive editor for TIME. Andreas was asked whether the data from Facebook or Twitter or any other community-oriented site was truly useful.

“We are moving from content to context,” Andreas replied, “from conversion to conversation, and conversations become markets.”

Markets are not only places to sell things (as Wall Street hoped from Facebook’s IPO), but places to learn, glean trends, and help the world. This was the case with John Brownstein, co-founder of HealthMap and associate professor at Harvard Medical School.

“In 1996,” he explained, “it used to take over 160 days to detect [outbreaks of] viruses. Now, with Big Data technology and the data scientist, it’s down to 20 days.”

HealthMap takes information from 50,000 cities around the world, updating its health database 2,000 times per day, to unearth potential viruses (like SARS or H1N1) that might have global impact.

Nora Denzel, Intuit senior vice president, noted that Big Data holds the power to help Main Street: in other words, Big Data for The Little Guy, from the small business person to the consumer.

“There’s a primal need to compare oneself with a larger community,” she noted. “Small businesses want to know whether they are spending more than their peers, whether they should hire now or later, whether their increase (or decrease) in revenue is on par with others like themselves…”

The power of data, in its raw and vast form, can foster new questions and new answers; it can help those in need, and, at the same time, it can uproot and obsolete those who do not see its force and ubiquity.

That’s a bit dramatic, yet as I listened throughout the day, I became convinced that this was not a rehash of some traditional trend or era. Something here was different. Despite all the talk in the press and in Silicon Valley that data science is merely some permutation or evolution of traditional business intelligence or data mining, I realized that there is a difference, and that it lies not only in the analytic results but also in the way the subject was handled and examined at the Data Science Summit. This was not a day about vendors evangelizing their products, methods, ROIs, TCOs, dividends, or features; rather, it was a platform or launch point for full introspection, without the explicit or implicit lobbyists who mask the pros and cons.

Michael Chui, senior fellow at McKinsey Global Institute, for example, echoed this sentiment in his fireside chat with Richard Snee. “Yes,” he agreed, “Big Data is a big deal, but it’s like the first quarter of an NBA game.” And going back to Horowitz’s book about dogs, the world of scents can be thwarted and misconstrued. The wind can change. Darkness can fall. Sound can impair.

Nate Silver, in his opening keynote, discussed the challenges of distinguishing a “real signal” from “noise.” He retold the story of the chess champion Garry Kasparov, who in May 1997 played Deep Blue, the chess-playing computer developed by IBM. It turned out that Kasparov misinterpreted a bug in Deep Blue’s system as intelligence and became utterly confused. He was blinded by the noise, didn’t recognize the real signal, and lost the match. He went on to retire from competitive chess in 2005.

If only Garry Kasparov had a data scientist with him!

With this in mind, there is no doubt that data science is a team sport. Detecting the real signal takes an amalgamation of technology, people, and aspirations, a grouping whose cost or results may not initially fit easily into a traditional ROI mindset. But I think you have to take the plunge and experiment. The technology stack has changed. The people who use it have changed. The way we communicate has changed. And, as a result, information can morph into knowledge, and connections can turn into purpose.

Watch videos from Data Science Summit 2012 and follow #bigdatasci to join the conversation.

About the Author: Michael Howard