AI May Not Be So Intelligent without an Underlying Data Strategy

By Marty Graham, Contributor

With all of the hype around artificial intelligence (AI) and machine learning (ML), businesses are feeling a lot of pressure to embrace and get enthusiastic about these technologies. That’s all well and good, business advisors say, but it puts the cart before the horse.

“The technology-hype cycle is so powerful that enterprises continually stumble in their initial projects for the latest wave,” explained business analyst Rick Sherman. “Vendors, analysts, and pundits typically pitch the latest technology wave as solving all the problems encountered in the previous cycles. Just as the Wizard of Oz created the myth of his power — don’t look behind the curtain — so do pundits proclaim the latest technology’s wizardry.”

Compounding the hype problem, management of data analytics has escaped from the IT department and is roaming the halls, sweeping up enthusiasts from departments that don’t necessarily understand the thoughtful, reasoned approach that will make the effort a success.

Before businesses dive in, experts are saying, they need to understand the focus of their efforts. What’s more, the insights and revelations business leaders seek will be a direct result of applying careful effort and thought to current data efforts.

The ‘Wizard of Oz’ Technique

Andrew White, distinguished analyst and vice president at Gartner, has reviewed dozens of businesses’ data and analytics plans in his work for Gartner. He’s found these new data skills and predicted outcomes are vulnerable to misunderstanding and overstatement.

“Anecdotally, I would argue that over 80 percent of ‘data and analytics’ strategies are really just up-to-date or modern ‘analytics strategies,’” he wrote in 2018 on Gartner’s blog. “[Adopters] are enamored with the newest analytic tool and do not concern themselves with operational use of data, business process change, or how decision making may actually change the business process.”

As clients continue to seek access to “transformative” tools, some businesses, including handwriting-scanning companies, have resorted to misrepresenting their AI technology capabilities. While these businesses say their clients are benefitting from a huge pile of data funneled through a mysterious algorithm, they’re actually using humans to do the work.

The Guardian’s reporting reveals at least six companies are guilty of this approach, including one that is linked to Facebook. And although this is not the norm, it may also not be as uncommon as we might think.

“It’s not magic,” said Ken Durazzo, a vice president for technology research and innovation at Dell Technologies. “Business leaders must understand that in order to get the outcome and results, you have to have the data that gets you there.”

Garbage In, Garbage Out

Those eager to expand analytics capabilities and productivity, Durazzo emphasized, have to start thinking about what data they have, how to make it useful, and what queries it might be able to answer. Ideally, companies will design a data strategy that identifies how AI and ML tools improve customer service and reduce business costs before they launch the official effort.

One of the longstanding (and occasionally maligned) principles of computing still holds when it comes to neural networks, machine learning, and artificial intelligence: garbage in, garbage out. Feeding these systems good data, of course, is not always easy.

It takes a lot of effort to identify and prepare the right data sets to get the outcome you’re looking for, Durazzo said. A business seeking insights needs to know what data it already has, how to make it useful, and how to get more.

“Many people, instead of thinking through what questions we should ask, they think through what outcomes we want to achieve,” he explained. “The focus should be what data do I need to help the machine understand what it should know and what it needs to process and get to the outcome set I’m looking for.”

This is why obtaining good, current data is an ongoing and primary challenge. Durazzo points to how Google is constantly refreshing its data, for example, with vehicles gathering the most current information to be used for maps, self-driving cars, and apps like Waze.

Company officials must understand that in order to get the outcome and results they want, they have to have the best data possible to get there.

Durazzo also points to the importance of breaking down department silos and involving experts from areas other than data or computing. “Right now, there’s a lot of dialogue about how to have your data scientist linked to your subject matter experts,” he said. “Those are the people with context.”

“If we don’t understand the context for the data, we are unable to give the data scientists the right data sets, and they can’t fine tune the algorithm to get to the outcome we’re seeking,” he emphasized.

Not long ago, Durazzo worked with a large, lower-priced retailer on a project that left the national chain in a much better position for selecting clothing that would not linger on the rack long enough to suffer price cuts and drain profits.

He learned that it takes five to seven years for high fashion looks by celebrated couture designers such as Versace and Ralph Lauren to trickle down to less expensive retailers.

“The fashion houses set the color, cut, and pattern for the whole industry,” he said. “The retailers have seasonal patterns and a good level of historical sales data – and experts. Looking over that four-year data pattern, we found the retail chain was better off when they followed Ralph Lauren than when it was Versace.”

There’s so much you can do with that data, he went on, that will help retailers offer more of what shoppers want. “Data will increase sales for most stores [including those that] sell more stuff like Versace,” he said. “I’m not a fashion guy, but when I presented the use case, the buyers lit up.”

Having the buyers involved spared the project some missteps, too. Their insider knowledge explained, for example, that a month-long spike in sales of red shirts coincided with the release of an action flick about a red-costumed superhero.

“There’s a huge amount of human factor involved—it’s a necessity to engage the people like the buyer with 24 years of experience because that person will have the knowledge and the instincts to bring insight and clarity to the data scientist,” he said. “With that experienced person, they can refine what they’re looking for and get that data in 15 minutes.”

The Human Factor

Durazzo advises companies to begin their AI journey with a trained data scientist. That person may or may not already be in the organization’s IT department, but it’s important to look for data science skills. These are not the same as programming skills: data scientists have extensive training in statistics, probability, and regression, and in what makes data viable. The data scientist will review existing data sets and strategize on where the data may be able to take the business, he explained.

“Once you start to walk through correlation and causation factors, you need someone who is an expert to sit back and think through the variances and not-useful data,” he said. “It takes a lot of interworking between the subject matter experts and data scientists to be able to extract the right probabilistic data set and the right outcomes.”

For experts like Durazzo, it’s important that business leaders understand that even good data has limitations without experts there to put it into context. A good example of what can go wrong, Durazzo mused, was when an algorithm trained solely on images was unable to distinguish Chihuahuas, with their tan snouts and dark eyes, from blueberry muffins. Without an expert supervising, legs, smells, textures, and feeling were absent; no human context was brought into the picture.

(Another discordant outcome is the neural network that mistook bushes for sheep in Janelle Shane’s odd experiments with neural networks.)

Durazzo suggested that while algorithms and data, on their own, won’t provide slam-dunk answers, they will provide ranked probabilities. He encouraged companies to start small because that gives those trying to learn from data a taste of the full experience while attaining outcomes fairly quickly.

“Everyone wants to spend their time doing the most amount of impact,” he said. “If I can do the job in half the time by figuring out what I really need to look at, that’s a really good thing.”
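
For readers who want to see what “ranked probabilities” can look like in practice, here is a minimal Python sketch. It is an illustration only, not a description of any system Durazzo or Dell Technologies uses. The function name `ranked_probabilities`, the labels, and the scores are all hypothetical, and the conversion shown (softmax) is just one common way an image classifier turns raw scores into probabilities.

```python
import math


def ranked_probabilities(scores):
    """Convert raw model scores into probabilities, sorted highest first.

    Uses the softmax function, a common (assumed) choice for classifiers.
    """
    top = max(scores.values())  # subtract the max score for numerical stability
    exps = {label: math.exp(score - top) for label, score in scores.items()}
    total = sum(exps.values())
    probs = {label: value / total for label, value in exps.items()}
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores for a single photo; the numbers are invented.
example_scores = {"chihuahua": 2.1, "blueberry muffin": 1.9, "sheep": -1.0}

for label, probability in ranked_probabilities(example_scores):
    print(f"{label}: {probability:.2f}")
```

With these made-up scores, the top two labels land close together. That is the kind of close call where a subject matter expert, rather than the algorithm alone, has to decide what the result means.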